10.2 Blinding: Difference between revisions

From Sense & Sensibility & Science


</tabber>
</tabber>
<restricted>


== Useful Resources ==
{{#restricted:{{10.2 Blinding}}}}{{NavCard|prev=10.1 Confirmation Bias|next=11.1 Pathological Science}}
 
<tabber>
 
|-|Lecture Video=
 
<br /><center><youtube>tovt651VVZ4</youtube></center><br />
 
|-|Discussion Slides=
 
{{LinkCard
|url=https://docs.google.com/presentation/d/1JY2QW7yJ0VhlVrLgOLWhzgnWnPhAIxs1KZ1_QZlCPkU/
|title=Discussion Slides Template
|description=The discussion slides for this lesson.
}}
<br />
 
|-|Handouts and Activities=
 
{{LinkCardInternal
|url=:File:Kabob Kalimba Worksheet.pdf
|title=Kabob Kalimba Worksheet
|description=A handout with an explanation of the activity for the students to carefully read.}}
{{LinkCard
|url=https://docs.google.com/spreadsheets/d/1e2joE4obZttn6rKqeQUfT0GikA80iK7-fXgBM1wfDTo/
|title=Kabob Kalimba Spreadsheet Template
|description=The spreadsheet that students should have the option of copying and using for the Kabob Kalimba activity.}}
{{LinkCard
|url=https://docs.google.com/forms/d/1qQjyLoZiJC5n6ZkpW3iwJqWyexF8jBTWf6zi4MPKj34/
|title=Kabob Kalimba Response Form Template
|description=The Google form students fill out with their results from the Kabob Kalimba activity.}}
<br />
 
|-|Readings and Assignments=
 
{{LinkCardInternal
|url=:File:Hide Results to Seek the Truth - MacCoun, Perlmutter.pdf
|title=Hide Results to Seek the Truth
|description=Reading on blind analysis by SSS professors.}}
<br />
 
</tabber>
 
== Recommended Outline ==
 
=== Before Class ===
 
The [[#Preparation|kabob kalimba activity]] requires some preparation. Make sure you're very familiar with the activity, print out the worksheets, and have ''lots'' of kabob skewers on hand. You will likely also want to be ready to visualize the results of the experiment. Additionally, you may want to briefly look at the [[11.1 Pathological Science|next lesson]] so as to tell your students what to read for it.
 
=== During Class ===
 
{| class="wikitable" style="margin-left: 0px; margin-right: auto;"
|80 Minutes
|This entire class is spent on the [[#Kabob Kalimba|kabob kalimba activity]].
|}
 
== Lesson Content ==
 
=== Kabob Kalimba ===
 
This activity gives students a chance to make a scientific measurement while falling into, or avoiding, several of the pitfalls that blind analysis can help with. In groups of three, students cantilever wooden skewers off the edge of a table and measure the lengths at which the skewers produce notes of different frequencies. Their ultimate task is to find the ratio of the skewer lengths (measured from the edge of the table to the end of the stick) for two notes that are an octave apart. The actual value is <math>\sqrt{2}\approx1.41</math>. However, the activity contains substantial priming that suggests the expected value is two. Students who think to apply the methods of blind analysis should be able to avoid this trap.
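
The reason the ratio is <math>\sqrt{2}</math> rather than 2 can be sketched as follows, assuming the skewer behaves like an ideal uniform cantilever beam (an idealization that is not stated on the worksheet). For such a beam, the fundamental frequency <math>f</math> scales inversely with the square of the overhanging length <math>L</math>:

<math>f \propto \frac{1}{L^2}</math>

An octave doubles the frequency, so the lengths <math>L_\text{low}</math> and <math>L_\text{high}</math> that produce the lower and higher notes satisfy

<math>\frac{f_\text{high}}{f_\text{low}} = \left(\frac{L_\text{low}}{L_\text{high}}\right)^2 = 2 \quad\Rightarrow\quad \frac{L_\text{low}}{L_\text{high}} = \sqrt{2} \approx 1.41</math>

A stretched string at fixed tension, by contrast, has <math>f \propto 1/L</math>, which is where the familiar 2:1 length ratio for an octave comes from.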
 
==== Preparation ====
 
# Print out at least one copy of the worksheet for every three students.
# Acquire at least ten wooden kabob sticks for every three students.
# Make sure you have some method to number the groups the students work in.
# Have at least one sheet of paper for students to write down their final answers on.
 
==== Instructions ====
 
{| class="wikitable" style="margin-left: 0px; margin-right: auto;"
|5 Minutes
|Introduce and set up the activity.
# Go through the slides briefly introducing the activity. You may also have to demonstrate it.
# Split the class into groups of three.
# Assign each group a number.
# Give every group a copy of the worksheet.
# Remind the students to whisper during this activity so that they do not disturb other groups' measurements.
|-
|30 Minutes
|Let the students work on the activity and try to get their measurements.
|-
|5 Minutes
|Debrief the students on the gimmick of the activity.
# Have every group submit their final ratio answer to the Google form along with their group number.
# Reveal the final answer and the "[[#Priming|trick]]" of this activity.
# Display and go through the answers the students came up with to see if anyone used or could have benefitted from blind analysis.
|-
|20 Minutes
|Go through the [[#Whole-class Discussion Questions|whole-class discussion questions]].
|-
|10 Minutes
|Have students discuss the [[#Small Group Discussion Questions|small group discussion questions]].
|-
|10 Minutes
|Call everyone back to go over the [[#Small Group Discussion Questions|same discussion questions]] with the entire class.
|}
 
==== Whole-class Discussion Questions ====
 
Ask these questions to the entire class. Spend about a minute on the first question, three on the second, and four minutes on each of the other questions.
 
# What do you think the correct answer is? (Hint: It is not 2.)
# Which senses or measurement instruments did you use?
# What is the signal you are trying to measure, and what are the sources of noise?
# What are the sources of systematic and statistical uncertainty? How did you address them?
# What are some choices you had to make in your measurement and analysis? What are some places into which confirmation bias could have crept?
# Did you use blind analysis? If so, how did you do it?
 
==== Small Group Discussion Questions ====
 
Have your students spend about five minutes on each of these questions in small groups. When done with both questions, call everyone back and ask the questions to the entire class.
 
<ol start=1><li>Is there anything you would do differently to guard against bias in the ''measurement'' process?</li></ol>
{{BoxAnswer|There are many ways to do this, but they are all based on division of labor. The person listening for the octaves should probably not be the person using the ruler and making the measurements. You could also have someone independently make rulers with new units whose conversion to inches or centimeters is unknown to the measurer.}}
<ol start=2><li>Is there anything you would do differently to guard against bias in the ''analysis'' process?</li></ol>
{{BoxAnswer|This also depends on division of labor. What's most important is that the person doing the analysis has no sense of what the actual ratio values are when deciding which data points to include. The person measuring may also add a secret number to their measurements that's not revealed to the analyzer until after the analysis work is complete.}}
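
The secret-number approach in the answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measured ratios are made-up numbers, and the helper names <code>blind</code> and <code>trimmed_mean</code>, the offset range, and the outlier rule are all hypothetical choices rather than part of the worksheet or spreadsheet.

```python
import random
import statistics


def blind(values, offset):
    """Return the values shifted by a secret offset."""
    return [v + offset for v in values]


def trimmed_mean(values, max_dev):
    """Mean after dropping points farther than max_dev from the median.

    The cut depends only on the spread of the data, so it selects the
    same points whether or not the hidden offset has been added.
    """
    center = statistics.median(values)
    kept = [v for v in values if abs(v - center) <= max_dev]
    return statistics.mean(kept)


# Hypothetical length ratios from one group (illustrative, not real data).
measured = [1.38, 1.45, 1.40, 1.97, 1.43, 1.39]

# The measurer draws a secret offset and hands over only blinded values.
rng = random.Random(2023)
secret = rng.uniform(1.0, 3.0)
blinded = blind(measured, secret)

# The analyzer chooses the outlier rule while seeing only blinded values.
blinded_result = trimmed_mean(blinded, max_dev=0.2)

# Only after the analysis is frozen is the offset revealed and removed.
final_ratio = blinded_result - secret
print(round(final_ratio, 2))
```

Because the analyzer never sees how close any point is to 2 or to 1.41, the decision to drop the outlying point cannot be steered by the expected answer.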
==== Priming ====
 
[[File:Kabob.png|thumb|Results from running this activity in Spring 2023 at UC Berkeley.]]
There are many places where students are primed to think the answer is two instead of <math>\sqrt{2}\approx1.41</math>.
* The worksheet mentions that "[Pythagoras] found that when two strings of different lengths are plucked simultaneously, a pleasant harmony between the notes would be heard when the string lengths formed a simple ratio, for example, 2:1."
* The worksheet also explains that "An octave interval then corresponds to exactly doubling the frequency of this strongest component."
* The spreadsheet has a prefilled example ratio that is approximately two.
* The spreadsheet automatically calculates results and presents them on a plot where the center value is two.
<!-- == Overflow ==
 
<div class="toccolours mw-collapsible mw-collapsed" style="overflow:auto;">
<div style="font-weight:bold;line-height:1.6;">Extra content that's not currently part of the official lesson plan.</div>
<div class="mw-collapsible-content">
 
=== Tube Measurement ===
 
<youtube>https://youtu.be/2JyOok92sCQ</youtube>
 
For an in-person version of this activity, students would blow some plastic tubes to produce pitches. A short (fixed-length) tube, a long (adjustable-length) tube, and some measurement tools are provided. They are asked to find the ratio of lengths of the long tube to the short tube when the pitches they produce are one octave apart. Since the tuning and measurement are both difficult, some variation is inevitable in the final result. When asked to report a final ratio, the students may be inclined to ignore certain outlying points in order to confirm the expectation that the ratio is equal to 2, even when the true ratio is closer to 2.1. (Refer to old handouts before 2019 for this version.) This lets students experience first-hand the pull of confirmation bias and motivates certain blind analysis techniques.
 
In the Zoom version of this activity, the tuning and measurement are done by the course staff and video recorded. Half the groups are given "unblinded" worksheets and the other half "blinded" worksheets. The unblinded worksheets contain the actual measured ratios and try to prime students to confirm the ratio of 2. The blinded worksheets contain the same measured ratios, but with a secret number ''X'' added to all the values, so that students have to make choices about which data points to include without seeing whether the result will be close to 2.
 
==== Instructions ====
 
# Send the worksheets to students. Half the groups should receive the blinded worksheet and the other half the unblinded worksheets. Do not refer to them as blinded/unblinded or tell them what the difference between them is.
# Send the video link above to students. They may refer to parts of the video during their data analysis, but they do not need to watch the whole video.
# (X min) Let students work on the task and offer assistance wherever needed.
# After X min, remind each group to agree on a single ratio.
# Ask each group to tell you their final ratio privately, without revealing to other groups. For the blinded groups, you need to tell them that the secret number ''X'' = 1.37.{{Caution|You may alternatively use this [https://forms.gle/gMh9deSihge3Kc7T7 Google Form], filled by one person per group.}} {{Todo|Update Google Form link.}}
# Close all breakout rooms. Reveal the difference between the blinded/unblinded worksheets and share the results of the whole class. We expect that the blinded groups would report something close to 1.41, while the unblinded groups would report something close to 1.5 due to confirmation bias.{{Caution|It's okay if this result doesn't occur. You could say that SSS students have often defied common human psychology.}}
# Ask students in the unblinded groups to describe what they were thinking when they were doing the analysis. Then ask the blinded groups.
 
==== Discussion Questions ====
 
# What was the purpose of adding a secret number ''X'' to all the measured ratios for the blinded groups? {{Answer|It shifts all the numbers away from the expected value of 2, in order to prevent confirmation bias.}}
# Are there other places where confirmation bias may creep in? Can you come up with other methods (during measurement or analysis) to further reduce confirmation bias?
 
</div></div> --></restricted>{{NavCard|prev=10.1 Confirmation Bias|next=11.1 Pathological Science}}
[[Category:Lesson plans]]

Revision as of 00:13, 22 August 2023


Blind analysis, the practice of deciding how we will analyze data before finding out if the analysis we have chosen supports our hypothesis, counteracts confirmation bias.



The Lesson in Context

This lesson offers solutions to potential pitfalls in scientific studies raised in 10.1 Confirmation Bias and 4.2 Finding Patterns in Random Noise. We use a stick measurement activity to illustrate the effects of confirmation bias and to motivate techniques to reduce its effect, especially blind analysis. These techniques are not universally employed in all fields of science today, and students pursuing a scientific career are encouraged to introduce these techniques in their own work.

4.2 Finding Patterns in Random Noise
  • Blind analysis is one way to prevent some forms of <math>p</math>-hacking. For example, choices to be made about a study, such as the exact statement of the hypothesis, definitions of terms, and statistical techniques, can be preregistered or made "blinded" to the data. The effects of these choices on the final result can be hidden from the researcher during analysis. These techniques prevent motivated reasoning during analysis decisions.
10.1 Confirmation Bias
  • Blind analysis helps scientists resist the temptation to make choices that push the result toward their own prediction or toward currently accepted knowledge. Without it, surprising, non-confirming results may be incorrectly missed.
13.1 Denver Bullet Study
  • In group decision making, having two separate groups carry out the factual evaluation and the values evaluation, each without knowledge of the other's conclusions, prevents evaluations motivated by the need to confirm a personal belief.


Takeaways

After this lesson, students should

  1. Recognize what types of blinding are useful for solving what types of errors.
  2. Be able to explain why blind analysis might be needed, by explaining the errors that can arise in its absence.
  3. Recognize when blind analysis is being used and explain what function it serves. Identify situations and decisions in which blind analysis would be useful.
  4. Be able to evaluate techniques (e.g., registered replication, adversarial collaboration, peer review)
    1. for ability to address confirmation bias, and
    2. in comparison to blind analysis.
  5. Propose how to use blind analysis for simple studies.

Blind Analysis

Making all decisions regarding data analysis before the results of interest are unveiled, such that expectations about the results do not bias the analysis. Usually co-occurs with a commitment to publicize the results however they turn out.

Double Blind

Studies where both the participants and the administrators of the experiment are blinded as to which participants are in the control group and which are in the intervention group.
Students may confuse blind analysis with a double blind experiment. The latter is used primarily in treatment testing in conjunction with a placebo: the patient is prevented from knowing whether they received the real treatment or the placebo, and the doctor is also prevented from knowing, so as not to inadvertently reveal it to the patient through subtle signs. Blind analysis, by contrast, applies to the analysis process once the data have been collected. In the case of treatment testing, blind analysis may be employed whether or not double blinding is.

Preregistration

A research group publicly commits to a specific set of methods and analyses before they conduct their research.

Registered Replication

One or more research groups commit to a specific set of methods and procedures to replicate earlier work to see if they get the same results (typically with the input of the original research team). Results are publicized regardless of outcome.

Registered Reports

Studies are peer reviewed, and journals commit to publishing them, before the research is undertaken. This reduces publication biases where journals prioritize interesting or statistically significant findings over null results.

Adversarial Collaboration

Scientists with opposing views agree to all the details of how data should be gathered and analyzed before any of the results are known.

Peer Review

New results are evaluated by other experts in the same field to determine whether they are valid. This only reduces confirmation bias insofar as reviewers don't share the same biases.

Muon <math>g</math>−2 Experiment

This experiment performed highly precise measurements of the magnetic dipole moment of muons to test the theoretical predictions of the currently accepted model of elementary particles. Blinding was done by injecting a secret code into all of the data to be analyzed, so that the scientists involved could not steer their analysis choices toward agreement with the theoretical prediction. The secret code was kept in a physical locker, the opening of which was highly publicized in the announcement event. Once the data was "unscrambled", the result showed a sizeable deviation of the measured value from the theoretical prediction.

<math>p</math>-hacking and Preregistration

One way <math>p</math>-hacking can occur is when a researcher chooses or alters the analysis method after seeing that its results are undesirable. As an example, suppose a psychologist performs an experiment with 100 participants and sees that the results are at a statistical significance of <math>p</math> = 0.06, just shy of the <math>p</math> < 0.05 threshold for publication. They then decide to recruit another 100 participants to "improve their results", finally leading to <math>p</math> = 0.04, good enough for publication. This is a form of <math>p</math>-hacking, as <math>p</math>-values can dip below 0.05 simply by random chance as one slowly increases the sample size. To guard against this phenomenon, the sample size of a study is a required item in the preregistration process.
