6.1 Correlation and Causation

Does taking this vaccine help prevent this disease? How can we be sure? We explain the mantra that "correlation does not equal causation" by defining causation as "correlation under intervention." We introduce randomized controlled trials, a widely used type of experiment that can tell us with a high degree of confidence whether two variables are causally linked.

The Lesson in Context

We introduce one definition of causation—as (statistically significant) correlation under intervention—and the Randomized Controlled Trial (RCT), which is a method of isolating and studying a causal relationship between two variables, even when the world is full of complex causal structures and random variations.

Earlier Lessons

1.2 Shared Reality and Modeling

Causation is a part of the shared reality and thus can be studied by empirical observation and experimentation.

2.1 Senses and Instrumentation

When we used interactive exploration to establish some trust in our instruments, we implicitly applied the idea of "causation as correlation under intervention".

2.2 Systematic and Statistical Uncertainty

The measurement of correlation and causation is subject to both statistical and systematic uncertainty, and RCTs are designed to mitigate these uncertainties.
Statistical uncertainty: Apparent correlation between two variables might occur simply due to randomness. An RCT should be performed on a sufficiently large representative sample.
Systematic uncertainty: Samples are randomly assigned to either the intervention or the control group, in order to remove (by averaging out) any potential systematic differences between the two groups due to the way they are assigned.

4.1 Signal and Noise

We can conclude a causal relationship from an RCT if there is a statistically significant difference in the dependent variable between the intervention and control groups. However, a small difference between them is inevitable due to random variations (statistical uncertainty).
We are trying to detect a significant difference (signal), which may sometimes be difficult to distinguish from a difference that arises by random chance (noise).
A strong signal would be a difference that is much larger than what could be expected from random chance alone.

4.2 Finding Patterns in Random Noise

Statistical concepts such as the [math]\displaystyle{ p }[/math]-value help us quantify the statistical significance of the result of an RCT. The [math]\displaystyle{ p }[/math]-value is the probability that random chance alone would produce a difference at least as large as the one observed, assuming the intervention has no real effect.
No RCT can claim 100% confidence, but it should report a [math]\displaystyle{ p }[/math]-value, which quantifies how likely it is that a result this strong could be a random fluke.
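One way to make the idea of a [math]\displaystyle{ p }[/math]-value concrete is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the observed one. The sketch below is only an illustration, not a method prescribed by this lesson; the function name `permutation_p_value`, the group sizes, and the synthetic data are all assumptions made for the example.

```python
import random


def permutation_p_value(treated, control, n_shuffles=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    The group labels are reshuffled many times; the p-value is the fraction
    of shuffles whose difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(treated) / len(treated) - sum(control) / len(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_treated = pooled[:n_treated]
        shuffled_control = pooled[n_treated:]
        diff = abs(sum(shuffled_treated) / len(shuffled_treated)
                   - sum(shuffled_control) / len(shuffled_control))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Synthetic data (assumed for illustration): one intervention shifts the
# outcome by 3 units, the other does nothing at all.
rng = random.Random(1)
control = [rng.gauss(10.0, 2.0) for _ in range(50)]
treated_with_effect = [rng.gauss(13.0, 2.0) for _ in range(50)]
treated_without_effect = [rng.gauss(10.0, 2.0) for _ in range(50)]

p_effect = permutation_p_value(treated_with_effect, control)
p_none = permutation_p_value(treated_without_effect, control)
print(f"p-value with a real effect: {p_effect:.4f}")
print(f"p-value with no effect:     {p_none:.4f}")
```

With a real effect of this size the estimated value should be very small, while the no-effect comparison can land anywhere between 0 and 1, which is why a single "significant" result can still be a fluke.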

Later Lessons

6.2 Hill's Criteria

Despite the power of RCTs in studying causal relationships, there are still many cases in which an experimental intervention or control condition is not feasible because of resource or ethical constraints. It is still possible to extract valuable causal information from non-RCT studies using Hill's criteria.

7.1 Causation, Blame, and Policy

RCTs probe general causation about a population or a collection of phenomena, but they do not make claims about the precise causal pathway or about whether the causal relationship holds for any single individual in this population.

Takeaways

After this lesson, students should

Be able to explain why correlation is insufficient to demonstrate causation because there are multiple causal structures that lead to correlation:
1. [math]\displaystyle{ A }[/math] causes [math]\displaystyle{ B }[/math] (direct causation)
2. [math]\displaystyle{ B }[/math] causes [math]\displaystyle{ A }[/math] (reverse causation)
3. [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] are both caused by [math]\displaystyle{ C }[/math]
4. [math]\displaystyle{ A }[/math] causes [math]\displaystyle{ B }[/math] and [math]\displaystyle{ B }[/math] causes [math]\displaystyle{ A }[/math] (bidirectional or cyclic causation)
5. There is no connection between [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math], and the correlation is a coincidence
6. The effect of [math]\displaystyle{ A }[/math] on [math]\displaystyle{ B }[/math] depends on [math]\displaystyle{ C }[/math]
Be able to explain and justify the essential features of a Randomized Controlled Trial (RCT): An attempt to identify causal relations by randomly assigning subjects into two groups and then performing an experimental intervention on the subjects in one of the groups.
1. Be able to recognize and explain the function of a control condition.
2. Be able to recognize and explain the function of randomized assignment.
Recognize the epistemic power of a well-designed RCT as evidence for causation, if the outcome in the experimental condition turns out to be significantly different from the outcome in the control condition.

Randomized Controlled Trial (RCT)

An attempt to identify causal relations by randomly assigning subjects into two or more groups and then performing an experimental intervention on the subjects in one or more of the groups. It consists of three essential components.

Random Assignment

Individual samples are randomly assigned to the intervention or control group. With a large enough sample, this makes the variable under study the only systematic difference between the groups: differences in any other variable, including ones the experimenter is unaware of, tend to average out instead of being introduced by the way the samples were assigned.
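The balancing effect of random assignment can be sketched with a small simulation. Everything here is assumed for illustration: the subjects, their ages, and the "biased" scheme in which the oldest half ends up in the intervention group. Age stands in for any variable the experimenter is not studying.

```python
import random

rng = random.Random(42)

# Hypothetical subjects, each with an age that is not the variable under study.
ages = [rng.uniform(18, 80) for _ in range(10_000)]


def mean(values):
    return sum(values) / len(values)


# Random assignment: shuffle the subjects, then split them in half.
shuffled = ages[:]
rng.shuffle(shuffled)
random_intervention = shuffled[:5_000]
random_control = shuffled[5_000:]

# Non-random assignment: the oldest half ends up in the intervention group.
ordered = sorted(ages)
biased_control = ordered[:5_000]
biased_intervention = ordered[5_000:]

random_gap = abs(mean(random_intervention) - mean(random_control))
biased_gap = abs(mean(biased_intervention) - mean(biased_control))
print(f"age gap with random assignment: {random_gap:.2f} years")
print(f"age gap with biased assignment: {biased_gap:.2f} years")
```

Under random assignment the two groups should have nearly the same average age, so age cannot explain a difference in outcome. Under the biased scheme, any effect of age would be mixed into the apparent effect of the intervention.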

Control Group/Condition

A subset of the study sample, often half, treated the same as the rest except that the experimental intervention is withheld. This yields a baseline against which the part of the sample subjected to the experimental intervention (the "experimental condition") can be compared.

Trial/Experimental Intervention

The act of the experimenter changing a variable under study (the "independent variable") on a subset of the sample, to see if it influences a second variable (the "dependent variable").

Students often think that the "random" in "randomized controlled trial" refers to how people are selected. Subjects do not need to be sampled randomly from the general population. They just need to be randomly assigned between the groups. The lack of random sampling does not invalidate the study. It just affects how representative the sample is of the larger population.

Correlation

The correlation between two numerical variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] is a measure of how strongly they tend to increase or decrease together.

Positive Correlation

[math]\displaystyle{ Y }[/math] increases as [math]\displaystyle{ X }[/math] increases.

Negative/Inverse Correlation

[math]\displaystyle{ Y }[/math] decreases as [math]\displaystyle{ X }[/math] increases.
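The lesson does not commit to a particular formula for correlation. One common choice, assumed here, is the Pearson correlation coefficient, which ranges from -1 (perfect negative correlation) to +1 (perfect positive correlation). The helper `pearson_r` and the small data sets below are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # tends to increase as x increases
falling = [9, 7, 6, 4, 1]   # tends to decrease as x increases

print(f"positive correlation: r = {pearson_r(x, rising):.2f}")
print(f"negative correlation: r = {pearson_r(x, falling):.2f}")
```

The sign of the coefficient tells you the direction of the relationship and its magnitude tells you how tightly the points follow a straight line. Neither says anything about which variable, if either, is the cause.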

Causation

[math]\displaystyle{ X }[/math] causes [math]\displaystyle{ Y }[/math] if and only if [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are correlated under interventions on [math]\displaystyle{ X }[/math].

Almost all students understand on some level that "correlation does not imply causation", but they may still feel that a really strong correlation "has got to say something." A correlation can be a good reason to investigate further, but without directly controlling one of the correlated variables, you don't in general know anything about their causal relationship.
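The definition of causation as correlation under intervention can be illustrated with a simulation. In this sketch, which rests entirely on made-up assumptions, a hidden variable `c` drives both `a` and `b`, so they are strongly correlated when merely observed. When the experimenter sets `a` by coin flip, the correlation disappears unless `a` really does influence `b`.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


rng = random.Random(7)
n = 20_000

# Hidden common cause C (assumed): it drives both A and B.
c = [rng.gauss(0, 1) for _ in range(n)]

# Observation only: A and B each follow C, and B never depends on A.
a_observed = [ci + rng.gauss(0, 0.5) for ci in c]
b_observed = [ci + rng.gauss(0, 0.5) for ci in c]

# Intervention: the experimenter sets A by coin flip, ignoring C.
a_set = [rng.choice([0.0, 1.0]) for _ in range(n)]
b_no_effect = [ci + rng.gauss(0, 0.5) for ci in c]

# Contrast case (assumed): a world in which A really does raise B.
b_real_effect = [2.0 * ai + ci + rng.gauss(0, 0.5) for ai, ci in zip(a_set, c)]

r_observed = pearson_r(a_observed, b_observed)
r_intervened = pearson_r(a_set, b_no_effect)
r_causal = pearson_r(a_set, b_real_effect)
print(f"observed, common cause only:  r = {r_observed:.2f}")
print(f"intervened, no causal link:   r = {r_intervened:.2f}")
print(f"intervened, real causal link: r = {r_causal:.2f}")
```

The strong observed correlation in the first case says nothing about whether changing `a` would change `b`; only the two intervention cases distinguish a real causal link from a common cause.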

Spurious Correlations

Website with many fun and obviously spurious correlations.

[math]\displaystyle{ B }[/math] causes [math]\displaystyle{ A }[/math]

Many parents worry that children sitting too close to a TV will cause nearsightedness, because they've observed that children who do sit very close to a TV end up having to get glasses. The reality is that children develop nearsightedness without their parents' knowledge and as a result have to sit close to a TV to see clearly.

[math]\displaystyle{ C }[/math] causes [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math]

Red wine consumption is correlated with longer life span, but it could really just be that wealthy people consume more red wine and have better resources to stay healthy. In this case, the wealth of the individual is a confounding variable. (Study)

[math]\displaystyle{ A }[/math] causes [math]\displaystyle{ B }[/math], which affects [math]\displaystyle{ A }[/math]

Predator-prey relationship. For example, wolves prey on hares, so a higher wolf population causes a reduction in hare population, but a reduced hare population in turn causes a later reduction in wolf population due to lack of food.

Effect of [math]\displaystyle{ A }[/math] on [math]\displaystyle{ B }[/math] depends on [math]\displaystyle{ C }[/math]

Intense studying can cause worse grades if one is studying instead of sleeping before an exam. Here the effect of studying on grades depends on a third variable, sleep.

You cannot infer anything from an RCT about anything that wasn't in the study.

What's important is that the sample you're studying is, in key ways, representative of the group you want to make conclusions about. What these "key ways" are varies a lot depending on what's being studied. But the more similar the group you're making conclusions about is to the one sampled in the RCT, the more likely the conclusion is to hold for that group. For example, a conclusion about students at one university may be very likely to hold for students at a similar school. It could also hold for university students in general. Depending on what you're studying, it might hold for people of that age group in general. And if you want to generalize the conclusion to an even broader population (the country or world as a whole), then you need to think very carefully about whether your sample reflects the range of potential confounding variables in this larger group.

An RCT doesn't tell you very much if it's only a small fraction of the total population that you want to study.

There is some sense in which this is true. You need to have a sufficiently large sample to "average out" all the differences within the group you're studying. But, once this criterion is met, the sheer size of the sample is no longer a concern. The main issue then is what group your sample is representative of.
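A quick simulation can show why the fraction of the population sampled matters much less than the absolute sample size. The populations, sample sizes, and the helper `spread_of_sample_means` below are assumptions chosen for illustration; the code draws repeated random samples and measures how much the sample mean fluctuates from one sample to the next.

```python
import random
import statistics

rng = random.Random(3)


def spread_of_sample_means(population, sample_size, repeats=200):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# Two hypothetical populations with the same distribution but different sizes.
small_population = [rng.gauss(50, 10) for _ in range(20_000)]
large_population = [rng.gauss(50, 10) for _ in range(200_000)]

# Same sample size (500), very different fractions of the population.
spread_small_pop = spread_of_sample_means(small_population, 500)
spread_large_pop = spread_of_sample_means(large_population, 500)

# Much smaller sample (20) from the large population.
spread_tiny_sample = spread_of_sample_means(large_population, 20)

print(f"n=500 from  20,000: spread = {spread_small_pop:.2f}")
print(f"n=500 from 200,000: spread = {spread_large_pop:.2f}")
print(f"n=20  from 200,000: spread = {spread_tiny_sample:.2f}")
```

The two samples of 500 should fluctuate by about the same amount even though one covers ten times the fraction of its population, while the sample of 20 is far noisier. Whether the sample is representative of the target group is a separate question that this simulation does not address.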

The control and intervention groups need to be roughly the same size.

The groups do not need to be the same size, so long as both are sufficiently large to capture the differences within the sample.
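As a rough check on the role of group sizes, the standard error of a difference between two group means can be computed directly. This sketch assumes independent measurements and the same standard deviation in both groups, neither of which is stated in the lesson; the function name and the example numbers are hypothetical.

```python
import math


def standard_error_of_difference(sd, n_intervention, n_control):
    """Standard error of (mean of intervention group - mean of control group).

    Assumes independent observations with the same standard deviation `sd`
    in both groups.
    """
    return sd * math.sqrt(1 / n_intervention + 1 / n_control)


sd = 10.0  # assumed spread of the outcome within each group

equal_split = standard_error_of_difference(sd, 500, 500)
lopsided_split = standard_error_of_difference(sd, 900, 100)
unequal_but_large = standard_error_of_difference(sd, 5_000, 1_000)

print(f"500 vs 500:     standard error = {equal_split:.3f}")
print(f"900 vs 100:     standard error = {lopsided_split:.3f}")
print(f"5,000 vs 1,000: standard error = {unequal_but_large:.3f}")
```

For a fixed total number of subjects, an equal split gives the smallest uncertainty, which is why it is common. Unequal groups still work when the smaller group is itself large, as in the last case.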

If an RCT shows strong evidence that X causes Y, then X must cause Y in every single case.

The RCT demonstrates that a relationship tends to exist on the scale of an overall population. It does not mean that the relationship necessarily holds in every individual case. For example, a particular drug may reduce headaches in most people but not work for every individual.
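The difference between a population-level effect and an individual-level effect can be sketched as follows. The numbers are invented for illustration: a hypothetical drug removes 3 headache days per month, but only in the 70% of people who respond to it.

```python
import random

rng = random.Random(11)
n = 4_000

# Random assignment: shuffle the subject indices and treat the first half.
indices = list(range(n))
rng.shuffle(indices)
treated = set(indices[: n // 2])

# Assumption for illustration: only 70% of people respond to the drug.
responds = [rng.random() < 0.7 for _ in range(n)]
baseline = [rng.gauss(8.0, 2.0) for _ in range(n)]  # headache days per month

# The outcome drops by 3 days only for treated subjects who respond.
outcome = [
    baseline[i] - (3.0 if (i in treated and responds[i]) else 0.0)
    for i in range(n)
]

treated_outcomes = [outcome[i] for i in range(n) if i in treated]
control_outcomes = [outcome[i] for i in range(n) if i not in treated]

average_effect = (sum(control_outcomes) / len(control_outcomes)
                  - sum(treated_outcomes) / len(treated_outcomes))
unhelped_share = sum(1 for i in treated if not responds[i]) / len(treated)

print(f"average reduction in headache days: {average_effect:.2f}")
print(f"share of treated subjects not helped at all: {unhelped_share:.0%}")
```

The trial would report a clear average benefit, yet a sizable share of treated subjects get no benefit at all. In a real trial the experimenter usually cannot see who the responders are, which is why the result speaks to the population and not to any one person.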

If the samples are not randomly chosen from the population, then it is not an RCT.

Random sampling is not necessary for an RCT. It only affects the generalizability of the results of an RCT.
