Revision as of 13:08, 24 July 2023
The challenges of finding the information we want amidst messy data.
Useful Links
- Guess the Message Game Handout
- Three Column Overview of the Week
- Lesson Slides (2023 Master)
- Website Page
Readings and Assignments
Lecture Video
Learning Goals
After this lesson, students should
- Be able to explain what scientists mean by "signal," "noise," and "signal-to-noise ratio."
- Be able to identify examples of "signal" and "noise," recognizing that these examples are context-dependent.
- Be able to roughly compare measurement techniques in terms of their resultant signal-to-noise ratios.
- Be able to describe examples of techniques and tools to suppress noise and/or amplify signal.
Definitions
- Signal
- Aspects of observations or stimuli that provide useful information about the target of interest, as opposed to noise.
Please hold off on introducing the concept of false positive/negative or thresholds in detections, as students have previously been overwhelmed and confused. We will properly discuss them in 5.1 False Positives and Negatives.
- Noise
- The aspects of observations that get confused with signal but do not provide the same useful information about the target of interest. Noise is frequently, but not always, the result of random measurement fluctuations.
Some students mistakenly think that noise is anything that prevents you from detecting the signal, for instance, a law banning the use of ultrasound to detect the sex of a foetus. In fact, noise is something that an instrument detects in the same way it would detect a signal, except that it is not caused by the source of the signal and could be confused with the signal. There is always random background noise, but noise doesn't have to be random, and it doesn't have to be sound.
- Signal-to-noise Ratio
- The strength of the signal relative to the strength of the noise in a given context. Obtaining meaningful information from the world requires distinguishing signal from noise. Therefore, human cognition (both scientific and otherwise) relies on techniques and tools to suppress noise and/or amplify signal (i.e., increase the signal-to-noise ratio). If you know where the noise is going to appear, it is possible to design filters that increase the signal-to-noise ratio.
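One way to make this definition concrete is the sketch below. It assumes one particular convention (signal amplitude divided by the standard deviation of the noise; power ratios and decibels are also common) and uses averaging of repeated measurements as an example of noise suppression. Averaging is chosen here for illustration and is not the filter design mentioned in the definition; the function names are hypothetical.

```python
import math

def snr(signal_amplitude, noise_std):
    """Signal-to-noise ratio under the assumed convention:
    signal amplitude divided by the noise standard deviation."""
    return signal_amplitude / noise_std

def averaged_noise_std(noise_std, n_measurements):
    """Noise standard deviation of the mean of n independent,
    equally noisy measurements: it shrinks by a factor of sqrt(n)."""
    return noise_std / math.sqrt(n_measurements)

# A weak signal (amplitude 2) buried in stronger noise (std 4).
single = snr(2.0, 4.0)

# Averaging 16 independent measurements leaves the signal unchanged
# but reduces the noise, so the ratio improves by sqrt(16) = 4.
averaged = snr(2.0, averaged_noise_std(4.0, 16))

print(single, averaged)
```

The same ratio can be improved from either side: a larger `signal_amplitude` or a smaller `noise_std` both raise the result.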
Examples
- As a member of the Bajau people of Southeast Asia, you are diving to collect shellfish for food. While the shellfish themselves are the signal, there are several sources of noise: rocks and other creatures that resemble shellfish, and wavering sunlight patterns on the seafloor. The signal-to-noise ratio may be low if the water is murky (higher noise), the shellfish are camouflaged (lower signal), or the light is dim (lower signal). BBC article
- Detecting fish jumps (signal) on a lake on a day when the wind is causing waves (noise). Some splashing waves may be misidentified as fish jumps.
- Getting the words of a radio personality through static.
- Hearing your conversational partner at a party where lots of conversations are happening.
- Figuring out if there's a meaningful difference between the control condition and experimental condition in an RCT. Random fluctuations in the chosen experimental sample may cause a spurious difference between the two groups; this is a source of noise.
- Finding the facts on a topic where there's a lot of disinformation floating around.
- Saul's story of an exoplanet around a pulsar, when it was not really there.
- Palate cleansing with water or crackers between tasting different wines. The subtle differences between wines are the signal, while lingering flavours and scents from the previous wine are the noise.
- Covid symptom screening, where the signal is the actual Covid infection, and the noise is all the other illnesses/allergies/etc causing similar symptoms.
- Smoke detectors detect the presence of smoke from a fire (signal) by measuring the opacity of air. Steam is a possible source of noise.
Context
We introduce the concept of signal and noise in "detection problems" and teach students how to identify the signal and various sources of noise in diverse scenarios. This foreshadows the ethical considerations in deciding how strong a signal must be to be counted as a "positive" (5.1 False Positives and Negatives).
Before
- 2.2 Systematic and Statistical Uncertainty
- Both systematic and statistical uncertainties introduce noise to every measurement.
- 3.1 Probabilistic Reasoning
- The presence of noise, which sometimes disguises itself as a signal, is inevitable in any measurement. The identification of a signal always comes with a roughly quantifiable level of confidence.
After
- 4.2 Finding Patterns in Random Noise
- In addition to the signal-to-noise ratio, there are other statistical tools (e.g. p-value) to quantify the strength of the signal amidst all the noise.
- 5.1 False Positives and Negatives
- "Positive" and "negative" refer to whether or not we identify what we detect as a signal. Choosing the "threshold" of strength at which a signal counts as positive inevitably involves a human value judgment, because it trades off the rate of false positives against the rate of false negatives.
- 5.2 Scientific Optimism
- Some signals in nature seem hopelessly weak, such as the tiny fluctuations in the distance between two mirrors caused by gravitational waves from faraway black holes, but scientists spend decades developing new instruments to increase the strength of the signal, as well as new analysis techniques to filter out the noise.
- 6.1 Correlation and Causation
- The detection of a "statistically significant" difference between conditions in an RCT is the identification of a signal. The random variations that exist between experimental subjects are a source of noise.
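The RCT point above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: both groups are drawn from the same normal distribution, so any difference between their means is pure noise, and the function name and parameters are hypothetical.

```python
import random
import statistics

def null_group_differences(n_per_group, n_trials, seed=0):
    """Simulate experiments with NO real effect and return the observed
    difference in group means for each simulated experiment."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treatment = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diffs.append(statistics.mean(treatment) - statistics.mean(control))
    return diffs

# Spread of the spurious differences for small and large samples.
spread_small = statistics.pstdev(null_group_differences(10, 300))
spread_large = statistics.pstdev(null_group_differences(1000, 300))

print(spread_small, spread_large)
```

With small groups the spurious differences are much larger than with large groups, which is one reason a real effect (the signal) is easier to distinguish from subject-to-subject variation (the noise) in a larger study.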
Recommended Outline
Before Class
- Review PlayPosit and discussion questions and ask faculty, Gabriel, or Emlen any questions you have.
- Print the handouts for Guess the Message Game.
During Class
- (5 min) Come up with some fun way to assign the roles of spokesperson and notetaker (e.g. earliest birthday in the year, lives furthest from campus). Remind them of the responsibilities of these roles.
- (5 min) Opening clicker question.
- (20 min) Go through several scenarios in the scenario analysis activity.
- (30 min) Play the guess the message game.
- (5 min) Teach a little about signal-to-noise ratio.
- (5 min) Collect questions for plenary.
After Class
- [Any essential logistical things that need to be done as followup for this class]
- Collect answers from notetakers for the forum / plenary.
Lesson Content
Warmup Poll
- How can we definitely tell if a single stimulus is signal or noise?
- By improving the sensitivity of the instrument.
- The stimulus is definitely a signal if it is stronger than most of the previous stimuli.
- It is impossible to tell for sure if a single stimulus is a signal or noise.
c. Noise can masquerade as signal, and random fluctuations can sometimes produce a single strong stimulus. For a given stimulus, we can only estimate the likelihood that it is signal rather than noise. We then have to decide what confidence level we need in order to classify stimuli appropriately.
- Any single supposed signal might be a rare (or not-so-rare) spike in noise.
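The idea that noise alone occasionally produces a strong stimulus can be sketched as follows. The example assumes the noise is Gaussian with mean 0 and standard deviation 1 and that there is no signal at all; the function name and the thresholds are hypothetical choices for illustration.

```python
import random

def count_noise_spikes(n_samples, threshold, seed=0):
    """Count how many pure-noise samples exceed `threshold`.
    No signal is present, so every count is a noise spike."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_samples) if rng.gauss(0.0, 1.0) > threshold)

# Even a demanding threshold (3 standard deviations) is crossed
# now and then when enough noise samples are recorded.
spikes_at_3 = count_noise_spikes(100_000, 3.0)
spikes_at_4 = count_noise_spikes(100_000, 4.0)

print(spikes_at_3, spikes_at_4)
```

Raising the threshold makes spikes rarer but does not rule them out, which is why a single strong stimulus cannot be classified with certainty.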
Scenario Analysis
For each of the following scenarios, answer the following questions about signal and noise.
- What is the sense/instrument that you are using?
- What does the sense/instrument actually measure?
- What is the signal from the sense/instrument you are expecting?
- What sources of noise do you anticipate in this measurement? List two or more if you can.
- (Optional) How would you reduce these sources of noise?
Example: Bajau Divers
Members of the Bajau people of Southeast Asia collecting shellfish for food.
- What is the sense/instrument that you are using?
Their eyes.
- What does the sense/instrument actually measure?
Visible light reflecting off nearby surfaces.
- What is the signal from the sense/instrument you are expecting?
A shell shape/pattern.
- What sources of noise do you anticipate in this measurement? List two or more if you can.
Murky water, low light, creatures and rocks that look like shellfish.
- (Optional) How would you reduce these sources of noise?
Clean the water, dive during the day, etc.
Scenarios
- Catching gossip about you from across the room at a party. What about understanding what the person you're talking to is saying?
- Detecting a metal knife in the luggage of someone boarding an airplane.
- Detecting an ongoing earthquake in Berkeley.
- Determining if your arch nemesis put cyanide in your almond milk.
- Determining whether there are birds around you on your weekly birding expedition, then determining whether owls are in the mix.
- Is that a creepy crawly on your neck right now?
- Identifying a budding new wave of COVID in the US. (Suppose you're a health official provided with daily updates of the following data from hospitals across the country: rates of people coming into the ER with fever, coughs, broken bones, wounds, diarrhea, and cardiac arrest...)
Signal and Noise Game
The 140-page handout for this game is designed to work with up to 35 students (each student gets four pages). The last of the four pages has a different randomized pair of letter and number grids for each student. Hence, the handout is slightly different for each student. If you have more than 35 students, you'll have to print more copies.
In this game the students will write a message and corrupt it to varying degrees. Each student will have a partner with whom they share the corrupted messages. Each student will try to decode the messages from the other student, working from most to least corrupted. Full instructions are available here.
They should not share the original messages or the decoding process until the game is done.
Instructions
- Hand each student a copy of the worksheet. It contains the full instructions.
- (2 min) Explain the game as per the instructions linked above. Make sure the students know not to share their uncorrupted messages with each other until the game is complete.
- (10 min) Have students pair up and play the game.
- (3 min) Ask the discussion questions below.
Discussion Questions
- What was the highest corruption level at which you could understand the message?
- What are the factors affecting the signal-to-noise ratio?
The more letters are corrupted, the higher the noise and the lower the ratio. The strength of the original message is also important. If the original message is short, this also lowers the signal-to-noise ratio. Furthermore, if the message is very obscure to begin with (one that another student is unlikely to recognize), the signal is also less clear.
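The effect of the corruption level can be sketched in code. This is not the handout's procedure, which relies on printed letter and number grids. It is a hypothetical stand-in that replaces a chosen fraction of letters with different random letters; the function names and the sample message are assumptions.

```python
import random
import string

def corrupt(message, fraction, seed=0):
    """Replace roughly `fraction` of the letters in `message`
    with different, randomly chosen lowercase letters."""
    rng = random.Random(seed)
    letter_positions = [i for i, ch in enumerate(message) if ch.isalpha()]
    n_corrupt = round(fraction * len(letter_positions))
    chars = list(message)
    for i in rng.sample(letter_positions, n_corrupt):
        # Exclude the original letter so every chosen position really changes.
        choices = [c for c in string.ascii_lowercase if c != chars[i].lower()]
        chars[i] = rng.choice(choices)
    return "".join(chars)

def fraction_intact(original, corrupted):
    """Fraction of the original letters that survived unchanged."""
    pairs = [(a, b) for a, b in zip(original, corrupted) if a.isalpha()]
    return sum(a == b for a, b in pairs) / len(pairs)

msg = "meet me at the library at noon"
for level in (0.0, 0.25, 0.5, 0.75):
    noisy = corrupt(msg, level)
    print(level, noisy, fraction_intact(msg, noisy))
```

Reading the printed lines from bottom to top mirrors the game, in which students decode from the most corrupted version to the least corrupted one.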
Takeaway
- The ratio of signal to noise determines how easy it is to distinguish between true signal and the noise that "pretends" to be signal.
- We could have a poor ratio because the signal is very low or because the noise is very high.
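- Example: Why is it hard to have a conversation at a cocktail party? Your partner's voice (signal) competes with the background conversations (noise), and you can improve the ratio from either side.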
- Signal: Increase the sound of your voice.
- Noise: Go outside to get away from the background conversations.