4.1 Signal and Noise

From Sense & Sensibility & Science
Revision as of 15:25, 21 August 2023 by Gpe

The challenges of finding the information we want amidst messy data.



The Lesson in Context

We introduce the concept of signal and noise in "detection problems" and teach students how to identify the signal and various sources of noise in diverse scenarios. This foreshadows the ethical considerations in deciding how strong a signal must be to be counted as a "positive".

Relation to Earlier Lessons

2.2 Systematic and Statistical Uncertainty
  • Both systematic and statistical uncertainties introduce noise to every measurement.
3.1 Probabilistic Reasoning
  • Noise, which sometimes masquerades as signal, is inevitable in any measurement. The identification of a signal always comes with a roughly quantifiable level of confidence.

Relation to Later Lessons

4.2 Finding Patterns in Random Noise
  • In addition to the signal-to-noise ratio, there are other statistical tools (e.g. the p-value) to quantify the strength of the signal amidst all the noise.
5.1 False Positives and Negatives
  • "Positive" and "negative" refer to whether or not we identify what we detect as a signal. Choosing the "threshold" of strength at which a signal counts as positive inevitably involves human value judgments, because it is a trade-off between the rates of false positives and false negatives.
5.2 Scientific Optimism
  • Some signals in nature seem hopelessly weak, such as the tiny fluctuations in the distance between two mirrors caused by gravitational waves from faraway black holes. Yet scientists spend decades developing new instruments to increase the strength of the signal, as well as new analysis techniques to filter out the noise.
6.1 Correlation and Causation
  • The detection of a "statistically significant" difference between conditions in an RCT is the identification of a signal. The random variations that exist between experimental subjects are a source of noise.


Takeaways

After this lesson, students should

  1. Be able to explain what scientists mean by "signal," "noise," and "signal-to-noise ratio."
  2. Be able to identify examples of "signal" and "noise," recognizing that these examples are context-dependent.
  3. Be able to roughly compare measurement techniques in terms of their resultant signal-to-noise ratios.
  4. Be able to describe examples of techniques and tools to suppress noise and/or amplify signal.

Signal

Aspects of observations or stimuli that provide useful information about the target of interest, as opposed to noise.

Please hold off on introducing the concept of false positive/negative or thresholds in detections, as students have previously been overwhelmed and confused. We will properly discuss them in 5.1 False Positives and Negatives.


Noise

The aspects of observations that get confused with signal but do not provide the same useful information about the target of interest. Noise is frequently, but not always, the result of random measurement fluctuations.

Some students mistakenly think that noise is anything that prevents you from detecting the signal, such as a law banning the use of ultrasound to detect the sex of a foetus. In fact, noise is something that an instrument detects in the same way it would detect a signal, except that it is not caused by the source of the signal and could be confused with the signal.

There is always random background noise, but noise doesn't have to be random.

Noise does not have to be sound.


Signal-to-noise Ratio

The strength of the signal relative to the strength of the noise in a given context. Obtaining meaningful information from the world requires distinguishing signal from noise. Therefore, human cognition (both scientific and otherwise) relies on techniques and tools to suppress noise and/or amplify signal (i.e., to increase the signal-to-noise ratio). If you know where the noise is going to appear, it is possible to design filters that increase the signal-to-noise ratio.
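For instructors who want a numerical illustration, the following is a minimal sketch of one simple noise-suppression technique: averaging repeated readings. It is not part of the lesson materials. It assumes the noise is random and Gaussian, and it takes the signal-to-noise ratio to be the signal strength divided by the standard deviation of the noise. The names `TRUE_VALUE`, `NOISE_STD`, `averaged_reading`, and `empirical_snr` are hypothetical and chosen only for this example.

```python
import random
import statistics

TRUE_VALUE = 1.0  # hypothetical signal strength (assumption for this sketch)
NOISE_STD = 2.0   # hypothetical noise level, twice as large as the signal

rng = random.Random(0)  # fixed seed so the run is repeatable


def averaged_reading(n_repeats):
    """Average n_repeats noisy readings of the same underlying value."""
    readings = [TRUE_VALUE + rng.gauss(0.0, NOISE_STD) for _ in range(n_repeats)]
    return statistics.fmean(readings)


def empirical_snr(n_repeats, trials=2000):
    """Estimate the SNR of an averaged reading: signal / spread of the averages."""
    averages = [averaged_reading(n_repeats) for _ in range(trials)]
    return TRUE_VALUE / statistics.stdev(averages)


snr_single = empirical_snr(1)      # one reading per measurement
snr_averaged = empirical_snr(100)  # average of 100 readings per measurement

print(f"SNR with 1 reading per measurement:    {snr_single:.2f}")
print(f"SNR with 100 readings per measurement: {snr_averaged:.2f}")
```

Under these assumptions, averaging 100 readings improves the ratio by roughly a factor of 10 (the square root of 100). This only works for independent random noise; averaging does not remove noise that shifts every reading in the same direction.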

Bajau People

As a member of the Bajau people of Southeast Asia, you are diving to collect shellfish for food. While the shellfish themselves are the signal, there are several sources of noise: rocks and other creatures that resemble shellfish, and shifting sunlight patterns on the seafloor. The signal-to-noise ratio may be low if the water is murky (higher noise), if the shellfish are camouflaged (lower signal), or if the light is dim (lower signal). (BBC Article)

Identifying Fish

Detecting fish jumps (signal) on a lake on a day when the wind is causing waves (noise). Some splashing waves may be misidentified as fish jumps.

Radio Static

Getting the words of a radio personality through static.

Loud Party

Hearing your conversational partner at a party where lots of conversations are happening.

Randomized Controlled Trials

Figuring out if there's a meaningful difference between the control condition and experimental condition in an RCT. Random fluctuations in the chosen experimental sample may cause a spurious difference between the two groups; this is a source of noise.

We will cover RCTs in detail in 6.1 Correlation and Causation.


Online Researching

Finding the facts on a topic where there's a lot of disinformation floating around.

Palate Cleansing

Cleansing the palate with water or crackers between tasting different wines. The subtle differences between wines are the signal, while lingering flavours and scents from the previous wine are the noise.

COVID Symptom Screening

The signal is the actual COVID infection, and the noise is all the other illnesses, allergies, etc. that cause similar symptoms.

Smoke Detectors

Smoke detectors detect the presence of smoke from a fire (signal) by measuring the opacity of air. Steam is a possible source of noise.

<restricted>

Useful Resources




Recommended Outline

Before Class

Print the handouts for Guess the Message Game.

During Class

5 Minutes Introduce the lesson and go over the plan for the day. Make sure people have groups, spokespeople, etc.
5 Minutes Review the definitions for this lesson. Make sure to emphasize signal-to-noise ratio.
10 Minutes Have the students do the warm-up question.
30 Minutes Go through several scenarios in the scenario analysis activity.
30 Minutes Play the guess the message game.

Lesson Content

Warm-up Question

How can we definitely tell if a single stimulus is signal or noise?

  1. By improving the sensitivity of the instrument.
  2. The stimulus is definitely a signal if it is stronger than most of the previous stimuli.
  3. It is impossible to tell for sure if a single stimulus is a signal or noise.

Explanation

Noise can masquerade as signal, and random fluctuations can sometimes produce a single strong stimulus. For a given stimulus, we can only estimate the likelihood that it is signal rather than noise. We then have to decide what level of confidence we need in order to classify stimuli appropriately. Any single supposed signal might be a rare (or not-so-rare) spike in noise.
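The point about rare spikes can be shown with a short simulation. This is an illustrative sketch rather than part of the lesson materials: it assumes Gaussian noise with no signal present at all, and the threshold of three noise standard deviations is a hypothetical choice for what counts as a "strong" stimulus.

```python
import random

NOISE_STD = 1.0      # assumed spread of the random noise
THRESHOLD = 3.0      # hypothetical cutoff for calling a stimulus "strong"
N_SAMPLES = 100_000  # number of pure-noise readings to simulate

rng = random.Random(42)  # fixed seed so the run is repeatable

# Every reading here is noise only: there is no signal in this data.
noise_only = [rng.gauss(0.0, NOISE_STD) for _ in range(N_SAMPLES)]

# Count how many pure-noise readings still look like a strong stimulus.
spikes = sum(1 for x in noise_only if x > THRESHOLD)
fraction = spikes / N_SAMPLES

print(f"{spikes} of {N_SAMPLES} pure-noise readings exceeded the threshold "
      f"({fraction:.3%})")
```

With these assumptions, on the order of one reading in a thousand exceeds the threshold even though no signal exists. A single strong stimulus is therefore never proof of a signal, which is why option 3 is the correct answer.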

Scenario Analysis

For each of several scenarios, have the students answer the following questions about signal and noise.

  1. What is the sense/instrument that you are using?
  2. What does the sense/instrument actually measure?
  3. What is the signal from the sense/instrument you are expecting?
  4. What sources of noise do you anticipate in this measurement? List two or more if you can.
  5. (Optional) How would you reduce these sources of noise?

Example Scenario

Members of the Bajau people of Southeast Asia collecting shellfish for food.

  1. What is the sense/instrument that you are using?

Their eyes.

  2. What does the sense/instrument actually measure?

Visible light reflecting off nearby surfaces.

  3. What is the signal from the sense/instrument you are expecting?

A shell shape/pattern.

  4. What sources of noise do you anticipate in this measurement? List two or more if you can.

Murky water, low light, creatures and rocks that look like shellfish.

  5. (Optional) How would you reduce these sources of noise?

Clean the water, dive during the day, etc.

Scenarios

  1. Catching gossip about you from across the room at a party. What about understanding what the person you're talking to is saying?
  2. Detecting a metal knife in the luggage of someone boarding an airplane.
  3. Detecting an ongoing earthquake in Berkeley.
  4. Determining if your arch nemesis put cyanide in your almond milk.
  5. Determining whether there are birds around you on your weekly birding expedition, then determining whether owls are in the mix.
  6. Is that a creepy crawly on your neck right now?
  7. Identifying a budding new wave of COVID in the US. (Suppose you're a health official provided with daily updates of the following data from hospitals across the country: rates of people coming into the ER with fever, coughs, broken bones, wounds, diarrhea, and cardiac arrest...)

Guess the Message Game

In this game, the students will write a message and corrupt it to varying degrees. Each student will have a partner with whom they share the corrupted messages. Each student will try to decode the partner's messages, working from most to least corrupted. Full instructions are available in the handout.

The 140-page handout for this game is designed to work with up to 35 students (each student gets four pages). The last of the four pages has a different randomized pair of letter and number grids for each student. Hence, the handout is slightly different for each student. If you have more than 35 students, you'll have to print more copies.

Students should share neither the original messages nor the decoding process until the game is done.

Instructions

2 Minutes Hand each student a copy of the worksheet. It has the full instructions in it.
5 Minutes Explain the game as per the instructions in the handout. Make sure the students know not to share their uncorrupted messages with each other until the game is complete.
20 Minutes Have students pair up and play the game.
3 Minutes Ask the discussion questions below.

Guess the Message Questions

  1. What was the highest corruption level at which you could understand the message?
  2. What are the factors affecting the signal-to-noise ratio?

Corrupting more letters increases the noise, which lowers the ratio. The strength of the original message is also important. If the original message is short, this also lowers the signal-to-noise ratio. Furthermore, if the message is very obscure to begin with (one that another student is unlikely to recognize), then the signal is also less clear.
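The effect of the corruption level can be sketched in a few lines of code. This is only an illustration and does not reproduce the handout's letter and number grid procedure: it assumes that "corrupting" means replacing a chosen fraction of the letters with random letters. The function names `corrupt` and `fraction_intact` and the example `MESSAGE` are hypothetical.

```python
import random
import string


def corrupt(message, fraction, rng):
    """Replace about `fraction` of the letters in `message` with random letters.

    Spaces and punctuation are left alone. A replacement can coincide with
    the original letter by chance, so slightly more letters may survive.
    """
    letter_positions = [i for i, ch in enumerate(message) if ch.isalpha()]
    n_corrupt = round(fraction * len(letter_positions))
    chars = list(message)
    for i in rng.sample(letter_positions, n_corrupt):
        chars[i] = rng.choice(string.ascii_uppercase)
    return "".join(chars)


def fraction_intact(original, corrupted):
    """Fraction of the original letters that are unchanged (the surviving signal)."""
    pairs = [(a, b) for a, b in zip(original, corrupted) if a.isalpha()]
    return sum(a == b for a, b in pairs) / len(pairs)


rng = random.Random(7)  # fixed seed so the run is repeatable
MESSAGE = "MEET ME AT THE LIBRARY AT NOON"  # hypothetical example message

for level in (0.0, 0.25, 0.5, 0.75):
    noisy = corrupt(MESSAGE, level, rng)
    print(f"{level:>4.0%} corrupted: {noisy}  "
          f"(intact letters: {fraction_intact(MESSAGE, noisy):.0%})")
```

Trying the same corruption levels on a much shorter message shows why message length matters: with only a few letters, there is little surviving signal to reconstruct the words from.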

Takeaway

  • The ratio of signal to noise determines how easy it is to distinguish between true signal and the noise that "pretends" to be signal.
  • We could have a poor ratio because the signal is very low or because the noise is very high.
  • Example: Why is it hard to have a conversation at a cocktail party, and what can you do about it?
    • Signal: Raise your voice to increase the signal.
    • Noise: Go outside to get away from the background conversations and reduce the noise.

</restricted>