10.2 Blinding

Blind analysis has emerged as a recent addition to the scientific process that guards against scientists' own confirmation bias, especially in high-precision experiments involving complex analyses: it prevents scientists from seeing the result of their analysis while they refine and debug the analysis procedure. We illustrate blinding techniques and their dramatic effect through a simple measurement experiment.

The Lesson in Context

This lesson offers solutions to potential pitfalls in scientific studies raised in 10.1 Confirmation Bias and 4.2 Finding Patterns in Random Noise. We use a stick measurement activity to illustrate the effects of confirmation bias and to motivate techniques that reduce it, especially blind analysis. These techniques are not yet employed universally across the sciences, and students pursuing a scientific career are encouraged to introduce them in their own work.

Relation to Other Lessons

Earlier Lessons

4.2 Finding Patterns in Random Noise
  • Blind analysis is one way to prevent some forms of [math]\displaystyle{ p }[/math]-hacking. Choices about a study, such as the exact statement of the hypothesis, the definitions of terms, and the statistical techniques, can be preregistered or made while "blinded" to the data, so that the effect of each choice on the final result is hidden from the researcher during analysis. This prevents motivated reasoning from steering analysis decisions.
10.1 Confirmation Bias
  • Blind analysis helps scientists resist the temptation to make choices that push the result toward confirming their own prediction or matching currently accepted knowledge. Without it, surprising, non-confirming results may be missed.

Later Lessons

13.1 Denver Bullet Study
  • In group decision making, having factual evaluation and values evaluation performed by two different groups, each without knowledge of the other's work, prevents evaluations motivated by the need to confirm a personal belief.

Takeaways

After this lesson, students should

  1. Recognize which types of blinding are useful for addressing which types of errors.
  2. Be able to explain why blind analysis might be needed, by explaining the errors that can arise in its absence.
  3. Recognize when blind analysis is being used and explain what function it serves. Identify situations and decisions in which blind analysis would be useful.
  4. Be able to evaluate techniques (e.g., registered replication, adversarial collaboration, peer review)
    1. for their ability to address confirmation bias, and
    2. in comparison to blind analysis.
  5. Propose how to use blind analysis for simple studies.

Blind Analysis

Making all decisions regarding data analysis before the results of interest are unveiled, such that expectations about the results do not bias the analysis. Usually co-occurs with a commitment to publicize the results however they turn out.
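
To make this concrete, here is a minimal sketch in Python of one common implementation, in which a third party adds a hidden numerical offset to the data before the analysts see it. The offset mechanism, values, and function names are illustrative assumptions, not a standard recipe.

```python
import numpy as np

rng = np.random.default_rng(seed=2024)

# A third party draws a secret offset and locks it away; analysts only
# ever see data with the offset applied.
secret_offset = rng.uniform(-5.0, 5.0)  # hypothetical blinding offset

def blind(measurements, offset):
    """Shift the data so analysts cannot compare results to expectations."""
    return measurements + offset

def unblind(blinded_result, offset):
    """Remove the offset only after the analysis pipeline is frozen."""
    return blinded_result - offset

# Analysts refine cuts, fits, and error estimates on the blinded data...
raw_data = rng.normal(loc=10.0, scale=1.0, size=1000)  # true values, unseen
blinded_mean = blind(raw_data, secret_offset).mean()   # frozen analysis result

# ...and only then is the secret offset revealed and removed.
final_result = unblind(blinded_mean, secret_offset)
print(f"Unblinded measurement: {final_result:.3f}")
```

The essential point is the order of operations: the analysis pipeline is frozen before the offset is removed, so no analysis choice can be steered by how close the result looks to expectations.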

Double Blind

Studies in which both the participants and the administrators of the experiment are blinded to whether a given participant is in the control or intervention group.
Students may confuse blind analysis with a double-blind experiment. The latter is used primarily in treatment testing in conjunction with a placebo: the patient is prevented from knowing whether they received the real treatment or the placebo, and the doctor is also kept from knowing, so as not to inadvertently reveal the answer to the patient through subtle signs. Blind analysis, by contrast, applies to the analysis process once the data have been collected. In treatment testing, blind analysis may be employed whether or not double blinding is.
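
As a toy sketch of the bookkeeping behind double blinding, consider a third party that holds the key linking opaque kit codes to treatment groups; all identifiers and numbers below are hypothetical.

```python
import random

random.seed(7)
participants = [f"P{i:03d}" for i in range(1, 9)]

# A third party assigns groups and issues opaque kit codes; neither the
# participants nor the administrators ever see the group labels.
groups = {p: random.choice(["treatment", "placebo"]) for p in participants}
kit_numbers = random.sample(range(1000, 10000), len(participants))
codes = {p: f"KIT-{n}" for p, n in zip(participants, kit_numbers)}

# What the clinic works with: participant -> kit code, nothing more.
clinic_view = {p: codes[p] for p in participants}
print(clinic_view)

# The key linking codes to groups stays sealed until data collection ends.
sealed_key = {codes[p]: groups[p] for p in participants}
```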

Preregistration

A research group publicly commits to a specific set of methods and analyses before they conduct their research.

Registered Replication

One or more research groups commit to a specific set of methods and procedures to replicate earlier work to see if they get the same results (typically with the input of the original research team). Results are publicized regardless of outcome.

Registered Reports

Studies are peer reviewed, and journals commit to publishing them, before the research is undertaken. This reduces publication bias, in which journals prioritize interesting or statistically significant findings over null results.

Adversarial Collaboration

Scientists with opposing views agree to all the details of how data should be gathered and analyzed before any of the results are known.

Peer Review

New results are evaluated by other experts in the same field to determine whether they are valid. This only reduces confirmation bias insofar as reviewers don't share the same biases.

Muon [math]\displaystyle{ g-2 }[/math] Experiment

This experiment performed highly precise measurements of the magnetic dipole moment of the muon to test the theoretical predictions of the currently accepted model of elementary particles. Blinding was done by injecting a secret code into all of the data undergoing analysis, so that the scientists involved could not make analysis choices that nudge the final value toward the theoretical prediction. The secret code was kept in a physical locker, whose opening was highly publicized at the announcement event. Once the data were "unscrambled", the result showed a sizeable deviation of the measured value from the theoretical prediction.
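
In that spirit, here is a schematic sketch of blinding by scrambling the measured values with a secret factor. The numbers, units, and function names are illustrative, not the collaboration's actual procedure or code.

```python
import numpy as np

rng = np.random.default_rng()

# A secret scale factor, set by people outside the analysis teams and
# stored away; analysts only ever see the scaled ("scrambled") values.
secret_scale = 1.0 + rng.uniform(-25e-6, 25e-6)  # illustrative ppm-level shift

# Hypothetical precession-frequency measurements, in arbitrary units.
true_frequencies = rng.normal(loc=229_000.0, scale=10.0, size=10_000)
blinded_frequencies = true_frequencies * secret_scale

# All fits, corrections, and cross-checks are tuned on blinded data only.
blinded_estimate = blinded_frequencies.mean()

# Unblinding: the secret factor is revealed and divided out at the end.
final_estimate = blinded_estimate / secret_scale
print(f"Final estimate: {final_estimate:.2f} (arbitrary units)")
```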

[math]\displaystyle{ p }[/math]-hacking and Preregistration

One way in which [math]\displaystyle{ p }[/math]-hacking can occur is by choosing or altering the analysis method after seeing that its results are undesirable. As an example, suppose a psychologist performs an experiment with 100 participants and finds a statistical significance of [math]\displaystyle{ p }[/math] = 0.06, just shy of the [math]\displaystyle{ p }[/math] < 0.05 threshold for publication. They then decide to recruit another 100 participants to "improve the results", finally reaching [math]\displaystyle{ p }[/math] = 0.04, good enough for publication. This is a form of [math]\displaystyle{ p }[/math]-hacking: simply by random chance, [math]\displaystyle{ p }[/math]-values can dip below 0.05 as one gradually increases the sample size. To guard against this phenomenon, the sample size of a study is a required item in the preregistration process.
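
The inflation from this kind of optional stopping is easy to demonstrate by simulation. Below is a minimal sketch, assuming a one-sample [math]\displaystyle{ t }[/math]-test and a single "peek" at the data; the sample sizes, threshold, and function names are illustrative choices, not a prescribed protocol.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def one_study(n_first=100, n_extra=100, alpha=0.05):
    """One simulated study with NO real effect: test after n_first
    participants; if not significant, add n_extra and test again."""
    data = rng.normal(size=n_first)  # null is true: the mean really is 0
    if stats.ttest_1samp(data, 0.0).pvalue < alpha:
        return True
    data = np.concatenate([data, rng.normal(size=n_extra)])
    return stats.ttest_1samp(data, 0.0).pvalue < alpha

n_sims = 10_000
false_positives = sum(one_study() for _ in range(n_sims))
# A single fixed-size test would be wrong ~5% of the time; peeking once
# and topping up the sample pushes the false-positive rate above that.
print(f"False-positive rate with one peek: {false_positives / n_sims:.3f}")
```

Preregistering the sample size removes this degree of freedom: the second test is never performed unless it was planned in advance.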
