|
|
(13 intermediate revisions by the same user not shown) |
Line 1: |
Line 1: |
| [[File:Topic Cover - 4.1 Signal and Noise.png|thumb]]
| | {{Cover|4.1 Signal and Noise}} |
|
| |
|
| The challenges of finding the information we want amidst messy data.
| | To make sense of this complex world, how do we confidently identify a meaningful pattern amongst a myriad of distractions? Scientists call the pattern "signal" and the distractions "noise." We clarify this subtle distinction and introduce techniques to make the signal stand out from the noise, such as with the use of filters. |
|
| |
|
| {{Navbox}}
| | == The Lesson in Context == |
|
| |
|
| == Useful Links ==
| | <!-- Always begin section with a description of this lesson in relation to the course as a whole. --> |
| | We introduce the concept of signal and noise in "detection problems" and teach students how to identify the signal and various sources of noise in diverse scenarios. This foreshadows the [[5.1 False Positives and Negatives|ethical considerations in deciding how strong a signal must be to be counted as a "positive"]]. |
|
| |
|
| * [[:File:Guess the Message Game - Complete.pdf|Guess the Message Game Handout]]
| | <!-- Expandable section relating this lesson to other lessons. --> |
| * [https://docs.google.com/document/d/1afjmqlZJFqnB-CK1VeJNnUH2tbmalWcj4OW2OgEiKJ8/edit?usp=sharing Three Column Overview of the Week]
| | {{Expand|Relation to Other Lessons| |
| * [https://docs.google.com/presentation/d/1xwdRq6FVkyz4dgdygAitSZLx-kZt54njGHrWF4lc2mg/edit?usp=drivesdk Lesson Slides (2023 Master)]
| | '''Earlier Lessons''' |
| * [https://sensesensibilityscience.berkeley.edu/topic/9 Website Page]
| | {{ContextLesson|2.2 Systematic and Statistical Uncertainty}} |
| | {{ContextRelation|Both systematic and statistical uncertainties introduce noise to every measurement.}} |
| | {{ContextLesson|3.1 Probabilistic Reasoning}} |
| | {{ContextRelation|The presence of noise, which sometimes disguises itself as a signal, is inevitable in any measurement. The identification of a signal always comes with a roughly quantifiable level of confidence.}} |
| | {{Line}} |
| | '''Later Lessons''' |
| | {{ContextLesson|4.2 Finding Patterns in Random Noise}} |
| | {{ContextRelation|In addition to the signal-to-noise ratio, there are other statistical tools (e.g. <math>p</math>-value) to quantify the strength of the signal amidst all the noise.}} |
| | {{ContextLesson|5.1 False Positives and Negatives}} |
| | {{ContextRelation|"Positive" and "negative" refer to whether or not we identify what we detect as a signal. Choosing any "threshold" of strength for a signal to count as positive inevitably involves a human value judgment, trading off the rate of false positives against the rate of false negatives.}} |
| | {{ContextLesson|5.2 Scientific Optimism}} |
| | {{ContextRelation|Some signals in nature seem hopelessly weak, such as the tiny fluctuations in the distance between two mirrors caused by gravitational waves from faraway black holes. Yet scientists spend decades developing new instruments to increase the strength of the signal, as well as new analysis techniques to filter out the noise.}} |
| | {{ContextLesson|6.1 Correlation and Causation}} |
| | {{ContextRelation|The detection of a "statistically significant" difference between conditions in an RCT is the identification of a signal. The random variations that exist between experimental subjects are a source of noise.}} |
| | }} |
| | == Takeaways == |
|
| |
|
| === Readings and Assignments ===
| | <tabber> |
|
| |
|
| ==== Lecture Video ====
| | |-|Learning Goals= |
| | |
| <youtube>https://youtu.be/S2AIAv_sTq8</youtube>
| |
| | |
| == Learning Goals ==
| |
|
| |
|
| After this lesson, students should | | After this lesson, students should |
| | <!-- Learning goals are written as a numbered list. --> |
| # Be able to explain what scientists mean by "signal," "noise," and "signal-to-noise ratio." | | # Be able to explain what scientists mean by "signal," "noise," and "signal-to-noise ratio." |
| # Be able to identify examples of "signal" and "noise," recognizing that these examples are context-dependent. | | # Be able to identify examples of "signal" and "noise," recognizing that these examples are context-dependent. |
Line 26: |
Line 39: |
| # Be able to describe examples of techniques and tools to suppress noise and/or amplify signal. | | # Be able to describe examples of techniques and tools to suppress noise and/or amplify signal. |
|
| |
|
| === Definitions ===
| | |-|Definitions= |
| | |
| * '''Signal'''
| |
| *: Aspects of observations or stimuli that provide useful information about the target of interest, as opposed to noise. {{Caution|Please hold off on introducing the concept of false positive/negative or thresholds in detections, as students have previously been overwhelmed and confused. We will properly discuss them in [[5.1 False Positives and Negatives]].}}
| |
| * '''Noise'''
| |
| *: The aspects of observations that get confused with signal but do not provide the same useful information about the target of interest. Noise is frequently, but not always, the result of random measurement fluctuations. {{Caution|Some students mistakenly think that noise is anything that prevents you from detecting the signal, for instance, a law banning the use of ultrasound to detect the sex of the foetus. In fact, noise is something that an instrument detects in the same way it would detect a signal, except that it is not caused by the source of the signal and could be confused with the signal.}} {{Caution|There is always random background noise. But noise doesn't have to be random.|small=right}} {{Caution|Noise does not have to be sound.|small=right}}
| |
| * '''Signal-to-noise Ratio'''
| |
| *: The strength of the signal relative to the strength of the noise in a given context. Obtaining meaningful information from the world requires distinguishing signal from noise. Therefore, human cognition (both scientific and otherwise) relies on techniques and tools to suppress noise and/or amplify signal (i.e., increase the signal-to-noise ratio). If you know where the noise is going to appear, it is possible to design filters that increase the signal-to-noise ratio.
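To make the last definition concrete, here is a minimal sketch of a signal-to-noise ratio and of a filter that raises it. Everything in it is an assumption made for illustration and is not part of the lesson materials: the "signal" is a slow sine wave, the "noise" is independent random fluctuation at every reading, the ratio is taken as signal power over noise power, and the filter is a simple moving average. The filter helps only because we know in advance that the noise changes rapidly while the signal changes slowly.

```python
import math
import random
import statistics


def snr(signal, noise):
    """Ratio of signal power to noise power (mean of squared values)."""
    signal_power = statistics.fmean(s * s for s in signal)
    noise_power = statistics.fmean(n * n for n in noise)
    return signal_power / noise_power


def moving_average(values, window):
    """Smooth a sequence by averaging each run of `window` neighbours."""
    return [
        statistics.fmean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


rng = random.Random(0)
n = 2000
# Hypothetical slow signal: one sine cycle over the whole record.
signal = [math.sin(2 * math.pi * i / n) for i in range(n)]
# Hypothetical noise: independent random fluctuations at each reading.
noise = [rng.gauss(0, 1) for _ in range(n)]

window = 25
snr_before = snr(signal, noise)
# A moving average is linear, so filtering signal and noise separately
# shows what the filter does to each part of the combined measurement.
snr_after = snr(moving_average(signal, window), moving_average(noise, window))

print(f"SNR before filtering: {snr_before:.2f}")
print(f"SNR after filtering:  {snr_after:.2f}")
```

The averaging barely changes the slow signal but cancels much of the rapid noise, so the ratio goes up. If the noise varied as slowly as the signal, the same filter would not help.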
| |
| | |
| === Examples ===
| |
| | |
| * As a member of the Bajau people of Southeast Asia, you are diving to collect shellfish for food. While the shellfish themselves are the signal, there are several sources of noise: rocks and other creatures resembling shellfish, and shifting sunlight patterns on the seafloor. The signal-to-noise ratio may be low if the water is murky (higher noise), the shellfish are camouflaged (lower signal), or the light is dim (lower signal). [https://www.bbc.com/news/science-environment-43823885 BBC article]
| |
| * Detecting fish jumps (signal) on a lake on a day when the wind is causing waves (noise). Some splashing waves may be misidentified as fish jumps.
| |
| * Getting the words of a radio personality through static.
| |
| * Hearing your conversational partner at a party where lots of conversations are happening.
| |
| * Figuring out if there's a meaningful difference between the control condition and experimental condition in an RCT. Random fluctuations in the chosen experimental sample may cause a spurious difference between the two groups; this is a source of noise.
| |
| * Finding the facts on a topic where there's a lot of disinformation floating around.
| |
| * Saul's story of the apparent detection of an exoplanet around a pulsar that was not really there.
| |
| * Palate cleansing with water or crackers between tasting different wines. The subtle differences between wines are the signal, while lingering flavours and scents from the previous wine are the noise.
| |
| * COVID symptom screening, where the signal is the actual COVID infection, and the noise is all the other illnesses, allergies, etc. causing similar symptoms.
| |
| * Smoke detectors detect the presence of smoke from a fire (signal) by measuring the opacity of air. Steam is a possible source of noise.
| |
| | |
| == Context ==
| |
| | |
| We introduce the concept of signal and noise in "detection problems" and teach students how to identify the signal and various sources of noise in diverse scenarios. This foreshadows the ethical considerations in deciding how strong a signal must be to be counted as a "positive" ([[5.1 False Positives and Negatives]]).
| |
| | |
| === Before ===
| |
| | |
| : '''[[2.2 Systematic and Statistical Uncertainty]]'''
| |
| :: Both systematic and statistical uncertainties introduce noise to every measurement.
| |
| : '''[[3.1 Probabilistic Reasoning]]'''
| |
| :: The presence of noise, which sometimes disguises itself as a signal, is inevitable in any measurement. The identification of a signal always comes with a roughly quantifiable level of confidence.
| |
| | |
| === After ===
| |
| | |
| : '''[[4.2 Finding Patterns in Random Noise]]'''
| |
| :: In addition to the signal-to-noise ratio, there are other statistical tools (e.g. p-value) to quantify the strength of the signal amidst all the noise.
| |
| : '''[[5.1 False Positives and Negatives]]'''
| |
| :: "Positive" and "negative" refer to whether or not we identify what we detect as a signal. Choosing any "threshold" of strength for a signal to count as positive inevitably involves a human value judgment, trading off the rate of false positives against the rate of false negatives.
| |
| : '''[[5.2 Scientific Optimism]]'''
| |
| :: Some signals in nature seem hopelessly weak, such as the tiny fluctuations in the distance between two mirrors caused by gravitational waves from faraway black holes. Yet scientists spend decades developing new instruments to increase the strength of the signal, as well as new analysis techniques to filter out the noise.
| |
| : '''[[6.1 Correlation and Causation]]'''
| |
| :: The detection of a "statistically significant" difference between conditions in an RCT is the identification of a signal. The random variations that exist between experimental subjects are a source of noise.
| |
| | |
| == Recommended Outline ==
| |
| | |
| === Before Class ===
| |
| | |
| * Review PlayPosit and discussion questions and ask faculty, Gabriel, or Emlen any questions you have.
| |
| * Print the handouts for [[#Guess the Message Game|Guess the Message Game]].
| |
| | |
| === During Class ===
| |
| | |
| * (5 min) Come up with some fun way to assign the roles of spokesperson and notetaker (e.g. earliest birthday in the year, lives furthest from campus). Remind them of the responsibilities of these roles.
| |
| * (5 min) Opening [[#Warmup Poll|clicker question]].
| |
| * (20 min) Go through several scenarios in the [[#Scenario Analysis|scenario analysis]] activity.
| |
| * (30 min) Play the [[#Guess the Message Game|guess the message game]].
| |
| * (5 min) Teach a little about signal-to-noise ratio.
| |
| * (5 min) Collect questions for plenary.
| |
| | |
| === After Class ===
| |
| | |
| * Collect answers from notetakers for the forum / plenary.
| |
| | |
| == Lesson Content ==
| |
| | |
| === Warmup Poll ===
| |
| | |
| # How can we definitely tell if a ''single'' stimulus is signal or noise?
| |
| ## By improving the sensitivity of the instrument.
| |
| ## The stimulus is definitely a signal if it is stronger than most of the previous stimuli.
| |
| ## It is impossible to tell for sure if a single stimulus is a signal or noise. {{Answer|c. Noise can masquerade as signal, and random fluctuations can sometimes produce a single strong stimulus. For a given stimulus, we can only come up with a likelihood for whether it is signal or noise. We then have to determine the confidence level we need in order to classify stimuli appropriately.}}
| |
| # Any ''single'' supposed signal might be a rare (or not-so-rare) spike in noise.
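For instructors who want to see how often random fluctuations alone produce a strong stimulus, the following is a minimal sketch. It assumes a hypothetical instrument whose readings are pure noise drawn from a bell curve, with no signal present at all; the number of readings and the levels of 2 and 4 standard deviations ("sigma") are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)
# Hypothetical instrument: 100,000 readings of pure noise, no signal present.
readings = [rng.gauss(0, 1) for _ in range(100_000)]

# Count readings that look "strong" even though nothing caused them.
strong = sum(1 for r in readings if r > 2)
very_strong = sum(1 for r in readings if r > 4)

print(f"Readings above 2 sigma: {strong / len(readings):.2%}")
print(f"Readings above 4 sigma: {very_strong / len(readings):.4%}")
print(f"Strongest single reading: {max(readings):.2f} sigma")
```

Even with no signal, a small but steady share of readings stands well above the typical level, which is why a single strong stimulus can only be assigned a likelihood of being signal.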
| |
| | |
| === Scenario Analysis ===
| |
| | |
| For each scenario below, answer these questions about signal and noise.
| |
| # What is the sense/instrument that you are using?
| |
| # What does the sense/instrument actually measure?
| |
| # What is the signal from the sense/instrument you are expecting?
| |
| # What sources of noise do you anticipate in this measurement? List two or more if you can.
| |
| # (Optional) How would you reduce these sources of noise?
| |
| | |
| ==== Example: Bajau Divers ====
| |
| | |
| Members of the Bajau people of Southeast Asia collecting shellfish for food.
| |
| # What is the sense/instrument that you are using? {{Answer|Their eyes.}}
| |
| # What does the sense/instrument actually measure? {{Answer|Visible light reflecting off nearby surfaces.}}
| |
| # What is the signal from the sense/instrument you are expecting? {{Answer|A shell shape/pattern.}}
| |
| # What sources of noise do you anticipate in this measurement? List two or more if you can. {{Answer|Murky water, low light, creatures and rocks that look like shellfish.}}
| |
| # (Optional) How would you reduce these sources of noise? {{Answer|Clean the water, dive during the day, etc.}}
| |
| | |
| ==== Scenarios ====
| |
| | |
| * Catching gossip about you from across the room at a party. What about understanding what the person you're talking to is saying?
| |
| * Detecting a metal knife in the luggage of someone boarding an airplane.
| |
| * Detecting an ongoing earthquake in Berkeley.
| |
| * Determining if your arch nemesis put cyanide in your almond milk.
| |
| * Determining whether there are birds around you on your weekly birding expedition, then determining whether owls are in the mix.
| |
| * Is that a creepy crawly on your neck right now?
| |
| * Identifying a budding new wave of COVID in the US. (Suppose you're a health official provided with daily updates of the following data from hospitals across the country: rates of people coming into the ER with fever, coughs, broken bones, wounds, diarrhea, and cardiac arrest...)
| |
| | |
| === Guess the Message Game ===
| |
| | |
| {{Caution|The 140-page handout for this game is designed to work with up to 35 students (each student gets four pages). The last of the four pages has a different randomized pair of letter and number grids for each student. Hence, the handout is ''slightly'' different for each student. If you have more than 35 students, you'll have to print more copies.}}
| |
| In this game, each student writes a message and corrupts it to varying degrees. Each student has a partner with whom they share the corrupted messages, and each tries to decode the partner's messages, working from the most corrupted to the least corrupted. Full instructions are available [[:File:Guess the Message Game - Complete.pdf|here]]. {{Caution|Students should not share the original messages or the decoding process until the game is done.|small=right}}
| |
| | |
| ==== Instructions ====
| |
| | |
| # Give each student a copy of the [[:File:Guess the Message Game - Complete.pdf|worksheet]]. It contains the full instructions.
| |
| # (2 min) Explain the game as per the instructions linked above. Make sure the students know not to share their uncorrupted messages with each other until the game is complete.
| |
| # (10 min) Have students pair up and play the game.
| |
| # (3 min) Ask the discussion questions below.
| |
| | |
| ==== Discussion Questions ====
| |
| | |
| # What was the highest corruption level at which you could understand the message?
| |
| # What are the factors affecting the signal-to-noise ratio?
| |
| {{Answer|The more letters are corrupted, the more noise there is, which lowers the ratio. The strength of the original message also matters: a short message gives less signal, which also lowers the signal-to-noise ratio. Furthermore, if the message is very obscure to begin with (one that another student is unlikely to recognize), the signal is also less clear.}}
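The effect of corruption level on readability can also be sketched in code. This is only an illustration under stated assumptions, not the procedure in the handout (which uses letter and number grids): here the hypothetical `corrupt` function replaces a chosen fraction of the letters with random letters, and the example message is made up.

```python
import random
import string


def corrupt(message, fraction, rng):
    """Replace roughly `fraction` of the letters with random letters."""
    chars = list(message)
    letter_positions = [i for i, c in enumerate(chars) if c.isalpha()]
    n_corrupt = round(fraction * len(letter_positions))
    # Spaces are left alone; a random letter may happen to equal the original.
    for i in rng.sample(letter_positions, n_corrupt):
        chars[i] = rng.choice(string.ascii_uppercase)
    return "".join(chars)


rng = random.Random(1)
message = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"  # hypothetical message
for fraction in (0.0, 0.25, 0.5, 0.75):
    print(f"{fraction:>4.0%}  {corrupt(message, fraction, rng)}")
```

Reading the printed lines from bottom to top mirrors the game: the message becomes recognizable as the share of corrupted letters falls. Trying a shorter or more obscure message shows the other factors named in the answer above.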
| |
| | |
| ==== Takeaway ====
| |
| | |
| * The ratio of signal to noise determines how easy it is to distinguish between true signal and the noise that "pretends" to be signal.
| |
| * We could have a poor ratio because the signal is very low or because the noise is very high.
| |
| * Example: Why is it hard to have a conversation at a cocktail party? Either side of the ratio can be improved:
| |
| ** Signal: Raise your voice.
| |
| ** Noise: Go outside to get away from the background conversations.
| |
| | |
| <!-- == Overflow ==
| |
|
| |
|
| <div class="toccolours mw-collapsible mw-collapsed" style="overflow:auto;"> | | <!-- Definitions must be written with the Definition and Subdefinition templates. The first Definition should have the "first=yes" flag at the end. --> |
| <div style="font-weight:bold;line-height:1.6;">Extra content that's not currently part of the official lesson plan.</div>
| | {{Definition|Signal|Aspects of observations or stimuli that provide useful information about the target of interest, as opposed to noise.|first=yes}} |
| <div class="mw-collapsible-content"> | | {{BoxCaution|Please hold off on introducing the concept of false positive/negative or thresholds in detections, as students have previously been overwhelmed and confused. We will properly discuss them in [[5.1 False Positives and Negatives]].}} |
| | {{Definition|Noise|The aspects of observations that get confused with signal but do not provide the same useful information about the target of interest. Noise is frequently, but not always, the result of random measurement fluctuations.}} |
| | {{BoxCaution|Some students mistakenly think that noise is anything that prevents you from detecting the signal, for instance, a law banning the use of ultrasound to detect the sex of the foetus. In fact, noise is something that an instrument detects in the same way it would detect a signal, except that it is not caused by the source of the signal and could be confused with the signal.}} |
| | {{BoxCaution|There is always random background noise. But noise doesn't have to be random.}} |
| | {{BoxCaution|Noise does not have to be sound.}} |
| | {{Definition|Signal-to-noise Ratio|The strength of the signal relative to the strength of the noise in a given context. Obtaining meaningful information from the world requires distinguishing signal from noise. Therefore, human cognition (both scientific and otherwise) relies on techniques and tools to suppress noise and/or amplify signal (i.e., increase the signal-to-noise ratio). If you know where the noise is going to appear, it is possible to design filters that increase the signal-to-noise ratio.}} |
| | <br /> |
|
| |
|
| === Changemaker ===
| | |-|Examples= |
|
| |
|
| # {{Changemaker|Session 7 discusses the research by Julia Rozovsky at Google to learn what makes the perfect team. The authors note that, "the only thing worse than not finding a pattern is finding too many of them". In Rozovsky's work, what was the signal they were looking for and what created noise? }}
| | <!-- Example formatting is still experimental. --> |
| | '''Bajau People''' |
| | : As a member of the Bajau people of Southeast Asia, you are diving to collect shellfish for food. While the shellfish themselves are the signal, there are several sources of noise: rocks and other creatures resembling shellfish, and shifting sunlight patterns on the seafloor. The signal-to-noise ratio may be low if the water is murky (higher noise), the shellfish are camouflaged (lower signal), or the light is dim (lower signal). [https://www.bbc.com/news/science-environment-43823885 (BBC Article)] |
| | {{Line}} |
| | '''Identifying Fish''' |
| | : Detecting fish jumps (signal) on a lake on a day when the wind is causing waves (noise). Some splashing waves may be misidentified as fish jumps. |
| | {{Line}} |
| | '''Radio Static''' |
| | : Getting the words of a radio personality through static. |
| | {{Line}} |
| | '''Loud Party''' |
| | : Hearing your conversational partner at a party where lots of conversations are happening. |
| | {{Line}} |
| | '''Randomized Controlled Trials''' |
| | : Figuring out if there's a meaningful difference between the control condition and experimental condition in an RCT. Random fluctuations in the chosen experimental sample may cause a spurious difference between the two groups; this is a source of noise. |
| | {{BoxCaution|We will cover RCTs in detail in [[6.1 Correlation and Causation]].}} |
| | {{Line}} |
| | '''Online Researching''' |
| | : Finding the facts on a topic where there's a lot of disinformation floating around. |
| | {{Line}} |
| | '''Palate Cleansing''' |
| | : Palate cleansing with water or crackers between tasting different wines. The subtle differences between wines are the signal, while lingering flavours and scents from the previous wine are the noise. |
| | {{Line}} |
| | '''COVID Symptom Screening''' |
| | : The signal is the actual COVID infection, and the noise is all the other illnesses, allergies, etc. causing similar symptoms. |
| | {{Line}} |
| | '''Smoke Detectors''' |
| | : Smoke detectors detect the presence of smoke from a fire (signal) by measuring the opacity of air. Steam is a possible source of noise. |
|
| |
|
| </div></div> --> | | </tabber> |
|
| |
|
| {{NavCard|prev=3.2 Calibration of Credence Levels|next=4.2 Finding Patterns in Random Noise}} | | {{#restricted:{{Private:4.1 Signal and Noise}}}} |
| | {{NavCard|chapter=Lesson plans|text=All lesson plans|prev=3.2 Calibration of Credence Levels|next=4.2 Finding Patterns in Random Noise}} |
| [[Category:Lesson plans]] | | [[Category:Lesson plans]] |