4.2 Finding Patterns in Random Noise

From Sense & Sensibility & Science

Humans are so good at identifying patterns that we often see them even where there is only noise in disguise. When we think we have seen a pattern, how do we correctly quantify our confidence in it? We describe common pitfalls that lead to overconfidence in an apparent pattern, some of which catch even the inattentive scientist!

The Lesson in Context

This lesson continues 4.1 Signal and Noise by elaborating on ways in which random noise can mimic signals (produce apparent patterns) in many different contexts. We introduce the idea of [math]\displaystyle{ p }[/math]-values to quantify the statistical significance of patterns, and we describe several tempting statistical fallacies that laypersons and scientists alike tend to commit, such as the Gambler's Fallacy and [math]\displaystyle{ p }[/math]-hacking. We play a game in which students try to invent a random string of coin tosses in their heads, which reveals that a truly random string contains more apparent patterns than one intuitively expects. Two other activities also illustrate that spurious patterns are expected to arise from random noise.
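The coin-toss game can be previewed with a short simulation. The following is a minimal sketch, not part of the lesson materials: the function names, the sequence length of 100 tosses, and the streak threshold of 6 are all assumptions chosen for illustration. It counts how long the longest streak of identical outcomes is in simulated fair-coin sequences.

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical consecutive outcomes."""
    if not flips:
        return 0
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        # Extend the current streak if the outcome repeats, else start over.
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best

def simulate_longest_runs(n_flips=100, n_trials=5_000, seed=0):
    """Simulate many fair-coin sequences and record each one's longest streak."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable results
    return [
        longest_run([rng.choice("HT") for _ in range(n_flips)])
        for _ in range(n_trials)
    ]

runs = simulate_longest_runs()
share_with_long_streak = sum(r >= 6 for r in runs) / len(runs)
print(f"Average longest streak in 100 tosses: {sum(runs) / len(runs):.1f}")
print(f"Share of sequences with a streak of 6 or more: {share_with_long_streak:.0%}")
```

Applying `longest_run` to a hand-written "random" string and comparing it with the simulated values is one way to see how a human-invented sequence tends to avoid long streaks.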

Takeaways

After this lesson, students should

  1. Understand that people tend to see any regularity as a meaningful pattern (i.e., see more signal than there is), even when "patterns" occur by chance (i.e., are pure noise).
  2. Recognize cases of the Look Elsewhere Effect in daily life, often signaled by phrases such as "what are the odds?"
  3. Recognize and explain the flaw in scenarios in which scientists and other people mistake noise for signal.
  4. Resist the opposing temptations of both the Gambler's Fallacy (the expectation that a run of similar events will soon break and quickly balance out, because of the assumption that small samples resemble large samples) and the Hot-hand Fallacy (the expectation that a run will continue, because runs suggest non-randomness).
  5. (Data Science) Describe the difference between the effect size (strength of pattern) and credence level (probability that the pattern is real), and identify the role each plays in decision making.
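The two fallacies in takeaway 4 can be checked against a simulated fair coin. This is a rough sketch under stated assumptions: the function name, the streak length of 5, and the number of tosses are hypothetical choices, not values from the lesson. It looks at every toss that follows five heads in a row and asks how often that next toss is heads.

```python
import random

def heads_rate_after_streak(n_flips=500_000, streak=5, seed=1):
    """Fraction of heads among tosses that immediately follow `streak` heads in a row."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable results
    current_heads = 0  # length of the heads streak ending just before this toss
    followed = 0       # tosses that came right after a long enough streak
    heads_next = 0     # how many of those tosses were heads
    for _ in range(n_flips):
        is_heads = rng.random() < 0.5
        if current_heads >= streak:
            followed += 1
            heads_next += is_heads
        current_heads = current_heads + 1 if is_heads else 0
    return heads_next / followed

rate = heads_rate_after_streak()
print(f"Heads rate right after 5 heads in a row: {rate:.3f}")
```

For an independent fair coin the rate stays close to one half: the streak neither "owes" a tail (Gambler's Fallacy) nor makes another head more likely (Hot-hand Fallacy).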

People underestimate how often randomness produces apparent patterns, so they perceive spurious signals far more often than they realize. Purely coincidental events are much more likely than most people expect.
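One way to see why coincidences are so common is to count the number of places a person looks. The sketch below is illustrative only and rests on two assumptions not stated in the lesson: each look is independent of the others, and each has a 5% chance of producing a striking pattern by pure chance. The function name is hypothetical.

```python
def chance_of_false_alarm(n_looks, alpha=0.05):
    """Probability that at least one of n independent looks shows a chance 'pattern'.

    alpha is the assumed chance that a single look produces a spurious pattern.
    """
    return 1 - (1 - alpha) ** n_looks

for n_looks in (1, 5, 20, 100):
    print(f"{n_looks:>3} looks: {chance_of_false_alarm(n_looks):.1%} chance of at least one spurious pattern")
```

Under these assumptions, a pattern that is rare in any single look becomes more likely than not once enough looks are taken, which is the core of the Look Elsewhere Effect.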

