6.2 Hill's Criteria

In the messy real world, an ideal randomized controlled trial may not always be feasible, for ethical or practical reasons. Even so, it is still possible to present compelling evidence for causation by considering whether the observed data satisfy a set of intuitive criteria introduced by Bradford Hill.

The Lesson in Context

This is a discussion-based lesson that familiarizes students with Hill's criteria, which are used when an ideal experiment (e.g. an RCT) cannot be run because of resource or ethical considerations. The criteria themselves are not difficult, but students typically have trouble associating their names with their meanings, so they benefit from a diverse range of illustrative examples.

Earlier Lessons

2.2 Systematic and Statistical Uncertainty

When coming up with alternative explanations for an apparent correlation between two variables, it helps to consider factors that contribute to systematic and statistical uncertainties.

3.1 Probabilistic Reasoning

Any scientific claim of causation carries an associated credence level. Each of Hill's criteria that is satisfied adds to the case for causation and so raises our credence in the causal relation. Unlike with RCTs, however, the size of that increase may be difficult to quantify.

6.1 Correlation and Causation

RCTs are introduced as the ideal experiments for establishing causal relationships, with causation defined as correlation under intervention. When an RCT is not possible due to resource or ethical considerations, so that we cannot intervene directly, Hill's criteria let us examine the plausibility of causation and can still build a strong case for it.

Later Lessons

11.2 When Is Science Suspect

Students will explore how science has sometimes been used to justify the oppression of certain human groups. For example, current differences in achievement between human subgroups have been used to infer fundamental differences in biological or cognitive capacity, when in fact they may be sufficiently explained by differences in opportunity. These socially problematic causal inferences stem in part from the impossibility of an RCT that intervenes on genetics while keeping social opportunities equal between subgroups. It is important to recognize potential social implications when evaluating non-RCT evidence for causation, such as when drawing conclusions from observational studies alone.

Takeaways

After this lesson, students should be able to:

Identify cases in which "ideal" RCT experiments are not possible, due to ethical or practical constraints.
For a given scenario in which a causal hypothesis/claim is being made, identify plausible alternative hypotheses that could be consistent with the data.
Identify additional sources of evidence that could be used to help mitigate flawed experiments, including prior plausibility, dose-response relationships, specificity, temporal ordering, and consistency across contexts.
Recognize when causal evidence in the absence of an RCT can be fairly compelling, especially if there are many different types of evidence combined.

Hill's criteria

A group of criteria that suggest possible causation even in the absence of an RCT.

Prior Plausibility

Can a plausible mechanism be constructed, or is there some other basis for interpreting the current evidence in terms of one causal structure over another, such as data from other studies?

Temporality/Temporal Sequence

Did the hypothesized cause precede the effect?

Specificity

Does the causal hypothesis make specific predictions about specific consequences, and have they come true? Consequences that were specifically predicted are less likely to be explained by other factors.

Dose-response Curve

Do the quantities of the hypothesized cause correlate with the quantity, severity, or frequency of the hypothesized effect across ranges?

Consistency Across Contexts

Does the correlation appear across diverse contexts?

This list may differ from the one on Wikipedia or elsewhere. It may be worth mentioning to students that these are the criteria we have chosen to focus on in this course.

Not all of Hill's criteria need to be satisfied to infer causation; each one that is satisfied adds to the case. Some criteria do not apply in certain situations (e.g. a dose-response curve is not meaningful when asking whether flipping a light switch causes the light to turn on and off).
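The idea that each satisfied criterion adds to the case for causation can be illustrated with a small odds-updating sketch. This is only a toy: the prior of 0.10 and the likelihood ratios below are invented for illustration, the function name update_credence is hypothetical, and the calculation assumes the pieces of evidence are independent, which real evidence rarely is. In practice these numbers are hard to pin down, which is why credence from non-RCT evidence is difficult to quantify.

```python
def update_credence(prior, likelihood_ratios):
    """Return the posterior probability after applying each likelihood ratio.

    Works in odds form: posterior odds = prior odds * product of ratios.
    Assumes (unrealistically) that the pieces of evidence are independent.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios: how much more likely each observation is
# if the causal claim is true than if it is false. These values are made up.
evidence = {
    "prior plausibility": 2.0,
    "temporality": 1.5,
    "dose-response curve": 3.0,
    "consistency across contexts": 2.0,
}

credence = 0.10  # assumed starting credence, also made up
for criterion, ratio in evidence.items():
    credence = update_credence(credence, [ratio])
    print(f"after {criterion}: {credence:.2f}")
```

With these made-up numbers the credence climbs from 0.10 to about 0.67. No single criterion is decisive, but together they move the estimate substantially. A criterion that does not apply corresponds to a ratio of 1 and leaves the credence unchanged.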

Leaded Gasoline and Violent Crimes

This example examines the proposed causal connection between the worldwide use of leaded gasoline and violent crime in many countries.

Prior Plausibility: High levels of lead are known to cause cognitive damage. It is conceivable that extended exposure to lower levels of lead could have similar effects.
Temporality: In the graph shown in the video, the levels of violent crime correlate with the use of leaded gasoline, but delayed by about 20 years.
Specificity: Within the same demographic group, delinquents are 4 times more likely to have elevated bone lead concentrations than non-delinquents.
Dose-response Curve: See temporality. In the same graph, the level of violent crime tracks the amount of leaded gasoline used.
Consistency Across Contexts: The delayed rise in violent crime after increased use of leaded gasoline is observed in many industrialized countries.
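
The temporality and dose-response points above both rest on a delayed correlation between two time series. The sketch below shows one way such a delay could be looked for: correlate the hypothesized cause with the effect shifted by each candidate lag and see which lag fits best. The data are synthetic and are not real lead or crime measurements. The 20-step delay is built into the toy data by construction, so the code demonstrates the method only. All function and variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(cause, effect, lag):
    """Correlate cause[t] with effect[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(cause, effect)
    return pearson(cause[:-lag], effect[lag:])


def best_lag(cause, effect, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(cause, effect, lag))


# Synthetic toy series (NOT real data): exposure rises and falls,
# and the outcome is the same curve delayed by TRUE_LAG time steps.
YEARS = 60
TRUE_LAG = 20
exposure = [math.exp(-((t - 20) / 8.0) ** 2) for t in range(YEARS)]
outcome = [0.0] * TRUE_LAG + exposure[:YEARS - TRUE_LAG]

print("best lag:", best_lag(exposure, outcome, max_lag=30))
```

A cause that precedes its effect should show up as a positive best lag, and a curve whose shape is reproduced after the delay is the time-series analogue of a dose-response relationship. A good fit at some lag is still only correlational evidence, so alternative explanations need to be considered separately.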

Misconception: Since we can't run a randomized controlled trial on whether CO2 emissions cause global warming, we can't ever know whether they do.

Correction: A non-RCT study can still serve as evidence for causation. Such evidence is potentially weaker than an RCT, but sometimes just as strong.

Students are quick to notice small sample size, slower to notice problems with experimental design.

Students struggle to generate non-RCT types of evidence for causality, although they are better at recognizing it.
