6.1 Correlation and Causation

From Sense & Sensibility & Science
Revision as of 01:07, 22 August 2023 by Gpe

An introduction to the scientific approach to determining causal relationships.



The Lesson in Context

We introduce one definition of causation—as (statistically significant) correlation under intervention—and the Randomized Controlled Trial (RCT), which is a method of isolating and studying a causal relationship between two variables, even when the world is full of complex causal structures and random variations.

1.2 Shared Reality and Modeling
  • Causation is a part of the shared reality and thus can be studied by empirical observation and experimentation.
2.1 Senses and Instrumentation
  • When we used interactive exploration to establish some trust in our instruments, we implicitly applied the idea of "causation as correlation under intervention".
2.2 Systematic and Statistical Uncertainty
  • The measurement of correlation and causation is subject to both statistical and systematic uncertainty, and RCTs are designed to mitigate these uncertainties.
  • Statistical uncertainty: An apparent correlation between two variables can arise from randomness alone, so an RCT should be performed on a sufficiently large, representative sample.
  • Systematic uncertainty: Subjects are randomly assigned to either the intervention or the control group, so that any systematic differences between the two groups that the method of assignment might otherwise introduce are averaged out.
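The averaging-out effect of random assignment can be illustrated with a small simulation. This is only a sketch under stated assumptions: the hidden trait ("age"), its distribution, the sample size, and the helper name `randomize` are all hypothetical and chosen for illustration, not taken from the lesson.

```python
import random
import statistics

def randomize(subjects, rng):
    """Shuffle a copy of the subjects and split it into two equal-sized groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical hidden trait (say, age in years) for 1000 subjects.
ages = [rng.gauss(40, 12) for _ in range(1000)]

# Biased assignment: the oldest half receives the intervention.
by_age = sorted(ages)
biased_gap = statistics.mean(by_age[500:]) - statistics.mean(by_age[:500])

# Randomized assignment: group membership ignores age entirely.
intervention, control = randomize(ages, rng)
random_gap = statistics.mean(intervention) - statistics.mean(control)

print(f"age gap, biased assignment:     {biased_gap:.1f} years")
print(f"age gap, randomized assignment: {random_gap:.1f} years")
```

Under the biased assignment, the two groups differ in age before any intervention takes place, so a later difference in outcomes could not be attributed to the intervention alone. Under randomization, the gap should be close to zero and should shrink further as the sample grows.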
4.1 Signal and Noise
  • We can conclude a causal relationship from an RCT if there is a statistically significant difference in the dependent variable between the intervention and control groups. However, a small difference between them is inevitable due to random variations (statistical uncertainty).
  • We are trying to detect a significant difference (signal), which may sometimes be difficult to distinguish from a difference that arises by random chance (noise).
  • A strong signal would be a difference that is much larger than what could be expected from random chance alone.
4.2 Finding Patterns in Random Noise
  • Statistical concepts such as the [math]\displaystyle{ p }[/math]-value help us quantify the statistical significance of the result of an RCT: the [math]\displaystyle{ p }[/math]-value is the probability that random chance alone would produce a correlation at least as strong as the one observed, assuming there is no real effect.
  • No RCT can claim 100% confidence, but it should report a [math]\displaystyle{ p }[/math]-value, which quantifies how likely it is that a result this strong is a random fluke, however small that possibility may be.
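One way to make the idea of a [math]\displaystyle{ p }[/math]-value concrete is a permutation test, sketched below. This is an illustrative choice rather than the method prescribed by the lesson: the function name `permutation_p_value`, the number of shuffles, and the outcome scores are all hypothetical. The test repeatedly reshuffles the group labels and counts how often chance alone produces a difference in means at least as large as the observed one.

```python
import random
import statistics

def permutation_p_value(treated, control, n_shuffles=2000, seed=0):
    """Estimate how often random relabeling gives a mean difference
    at least as large (in absolute value) as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(treated) - statistics.mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group labels were assigned by chance
        diff = abs(statistics.mean(pooled[:n_treated])
                   - statistics.mean(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Made-up outcome scores for two small groups (illustration only).
treated_scores = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.3, 12.2]
control_scores = [10.2, 11.0, 10.8, 9.9, 10.5, 11.3, 10.1, 10.7]

p_effect = permutation_p_value(treated_scores, control_scores)
p_no_effect = permutation_p_value(control_scores, control_scores)
print(f"groups that differ: p = {p_effect:.4f}")
print(f"identical groups:   p = {p_no_effect:.4f}")
```

A small value means that a difference this large would rarely arise from random relabeling alone. When the two groups are identical, every shuffle matches or exceeds the observed difference of zero, so the estimate is 1.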
6.2 Hill's Criteria
  • Despite the power of RCTs in studying causal relationships, there are still many cases in which an experimental intervention or control condition is not feasible because of resource or ethical constraints. It is still possible to extract valuable causal information from non-RCT studies using Hill's criteria.
7.1 Causation, Blame, and Policy
  • RCTs probe general causation about a population or a collection of phenomena, but they do not make claims about the precise causal pathway or about whether a causal relationship occurs in any single individual in this population.


Takeaways

After this lesson, students should

  1. Be able to explain why correlation alone is insufficient to demonstrate causation, given that several different causal structures can produce a correlation between [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]:
    1. [math]\displaystyle{ A }[/math] causes [math]\displaystyle{ B }[/math] (direct causation)
    2. [math]\displaystyle{ B }[/math] causes [math]\displaystyle{ A }[/math] (reverse causation)
    3. [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are both caused by [math]\displaystyle{ C }[/math] (common cause, or confounding)
    4. [math]\displaystyle{ A }[/math] causes [math]\displaystyle{ B }[/math] and [math]\displaystyle{ B }[/math] causes [math]\displaystyle{ A }[/math] (bidirectional or cyclic causation)
    5. There is no connection between [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math], and the correlation is a coincidence
    6. The effect of [math]\displaystyle{ A }[/math] on [math]\displaystyle{ B }[/math] depends on [math]\displaystyle{ C }[/math]
  2. Be able to explain and justify the essential features of a Randomized Controlled Trial (RCT): an attempt to identify causal relations by randomly assigning subjects to two groups and then performing an experimental intervention on the subjects in only one of the groups.
    1. Be able to recognize and explain the function of a control condition.
    2. Be able to recognize and explain the function of randomized assignment.
  3. Recognize the epistemic power of a well-designed RCT as evidence for causation when the outcome in the experimental condition turns out to be statistically significantly different from the outcome in the control condition.
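The contrast between observed correlation and correlation under intervention can be sketched with a toy simulation. Everything here is an assumption made for illustration: the function names `pearson` and `simulate`, the linear relationships, and the noise levels are hypothetical. A hidden common cause C drives both A and B. In the observational setting A follows C; under intervention the experimenter sets A at random, independently of C.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate(effect_of_a_on_b, intervene, n=5000, seed=0):
    """Return the A-B correlation in a toy world with a hidden common cause C."""
    rng = random.Random(seed)
    a_values, b_values = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)              # hidden common cause
        if intervene:
            a = rng.gauss(0, 1)          # A is set at random, ignoring C
        else:
            a = c + rng.gauss(0, 1)      # A is driven by C
        b = c + effect_of_a_on_b * a + rng.gauss(0, 1)
        a_values.append(a)
        b_values.append(b)
    return pearson(a_values, b_values)

r_observed = simulate(effect_of_a_on_b=0.0, intervene=False)
r_intervened = simulate(effect_of_a_on_b=0.0, intervene=True)
r_real_cause = simulate(effect_of_a_on_b=1.0, intervene=True)

print(f"no causal effect, observational: r = {r_observed:.2f}")
print(f"no causal effect, intervention:  r = {r_intervened:.2f}")
print(f"real causal effect, intervention: r = {r_real_cause:.2f}")
```

In this toy world, A and B are clearly correlated in the observational data even when A has no effect on B, because both depend on C. Once A is assigned at random, that correlation should vanish unless A really does influence B, which is the sense in which the lesson treats causation as correlation under intervention.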
