As all scientific knowledge may be subject to change in light of new evidence, all claims of fact should only be made or trusted up to a certain degree of confidence. This allows scientists to be open to changing their mind, while still being able to meaningfully compare the validity of factual statements under limited information. This way of thinking is as important in daily life as it is in scientific reasoning.
The Lesson in Context
After introducing the concept of scientific uncertainty in previous lessons, we now teach the students that this uncertainty permeates all discussions of facts. Every factual claim should carry a level of confidence, which can be expressed as a percentage. Attaching a confidence level allows scientists to be open to the possibility that they may be wrong, while still being able to meaningfully discuss and compare the validity of factual statements. We aim to teach students that this way of thinking is important in daily life, often in the context of risk assessment, as well as in common discourse about social issues.
Facts vs. values: Since credence levels can only be assigned to factual statements, it is important to first distinguish between statements of fact and statements of value.
Measurements in the real world are imperfect, and measurement uncertainties/errors can be studied and quantified. This translates to a confidence interval for every measurement result, i.e. "We are [math]\displaystyle{ x }[/math] percent confident that the true value lies within this interval."
This lesson will follow up on the current one by teaching students how to calculate the calibration, or quality, of their credence, noticing and quantifying both underconfidence and overconfidence.
In that lesson, students explore how the signal they are looking for in data can be difficult to find amid the noise (random variation, random error, imperfect measurements, etc.). Because data is a mix of signal and noise, inferences from data tend to have some degree of uncertainty, which may be usefully quantified using credence levels or probabilities.
Since spurious patterns are expected to arise from random noise alone, any claim of an actual pattern must carry with it a level of confidence that it is not due to random noise.
[math]\displaystyle{ p }[/math]-value: The probability of seeing a pattern at least as strong as the observed one if only random noise were at work. A small [math]\displaystyle{ p }[/math]-value means the observed pattern would rarely arise from noise alone; it is not, by itself, the probability that the pattern is due to noise.
Since every binary test has a certain rate of false positives and false negatives, the result of such a test should only be understood as a recommendation of odds or risks, rather than a conclusive determination. Successive test results help one adjust their belief as well as their confidence level in that belief, e.g. whether one is suffering from a disease.
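The way successive test results shift a credence can be sketched with Bayes' rule. The following is a minimal illustration, not a model of any real test: the function name `update_credence` and all the numbers (a 1% starting credence, a test that detects 90% of true cases and gives a false positive 5% of the time) are assumptions chosen for the example.

```python
def update_credence(prior, sensitivity, false_positive_rate, positive):
    """Return the credence that the condition is present after one test result."""
    if positive:
        likelihood_if_present = sensitivity
        likelihood_if_absent = false_positive_rate
    else:
        likelihood_if_present = 1 - sensitivity
        likelihood_if_absent = 1 - false_positive_rate
    # Bayes' rule: weigh each possibility by how well it explains the result.
    weight_present = likelihood_if_present * prior
    weight_absent = likelihood_if_absent * (1 - prior)
    return weight_present / (weight_present + weight_absent)


# Hypothetical numbers, chosen only for illustration.
credence = 0.01                     # starting credence that the disease is present
history = [credence]
for result in [True, True, False]:  # two positive tests, then one negative
    credence = update_credence(credence, sensitivity=0.90,
                               false_positive_rate=0.05, positive=result)
    history.append(credence)
    label = "positive" if result else "negative"
    print(f"After a {label} test: credence = {credence:.3f}")
```

With these made-up numbers, a single positive result raises the credence from 1% to only about 15%, a second positive raises it to about 77%, and a later negative lowers it to about 26%. No single result settles the question; each one adjusts the odds.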
Takeaways
After this lesson, students should
Recognize that every claim comes with some degree of uncertainty.
Learn the function/utility of scientific expressions of uncertainty.
Understand that because every proposition comes with a degree of uncertainty:
Partial and probabilistic information still has value.
Back-up plans are important since no information is absolutely certain.
Expertise and authority should be evaluated by how accurately a person assigns confidence levels, rather than by the assumption that a true expert would be "right" every single time.
Scientific culture primarily uses a language of probabilities, and sometimes even well-confirmed facts turn out to be incomplete or not true in every single case.
Even correctly done science will obtain incorrect results some of the time.
Credence
Level of confidence that a claim is true, from 0 (no chance it is true) to 1 (certain it is true).
Confidence
Essentially a synonym for credence, as in "level of confidence," rather than the colloquial meaning, "state of having a lot of confidence."
Accuracy
How frequently one is correct; proximity to a true value.
Calibration of Confidence
How closely confidence and accuracy correspond; that is, how accurate a person or system is at estimating the probability that they are correct.
[math]\displaystyle{ p }[/math]-value
The statistic used most often as a measure of statistical significance: the probability of getting a result at least as extreme as the one observed, simply through random noise, if in fact the hypothesized effect did not exist (i.e. if the null hypothesis were true). The typical cut-off for a statistically significant [math]\displaystyle{ p }[/math]-value is [math]\displaystyle{ p }[/math] < .05.
Statistical Significance
How unlikely a given set of results would be if the null hypothesis were true (i.e. if the hypothesized effect did not actually exist).
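A common misconception is that [math]\displaystyle{ p }[/math]-values are the probability of the hypothesis being false. This is not quite the same thing: a [math]\displaystyle{ p }[/math]-value starts by assuming there is no real effect and asks how likely the observed data would then be, rather than asking how likely the hypothesis is given the data.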
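The definition of a [math]\displaystyle{ p }[/math]-value can be made concrete with a small simulation. This is a sketch under stated assumptions, not a standard statistical routine: the scenario (a coin that came up heads 60 times in 100 flips), the function name `simulated_p_value`, and the number of simulated trials are all invented for illustration. The code assumes the coin is fair (pure noise) and counts how often noise alone produces a result at least as lopsided as the observed one.

```python
import random


def simulated_p_value(observed_heads, n_flips, n_trials=10_000, seed=0):
    """Estimate how often a fair coin gives a result at least as extreme as observed."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    expected = n_flips / 2
    observed_gap = abs(observed_heads - expected)
    at_least_as_extreme = 0
    for _ in range(n_trials):
        # One simulated experiment in which only random noise is at work.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if abs(heads - expected) >= observed_gap:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_trials


# Hypothetical observation: 60 heads in 100 flips.
p = simulated_p_value(observed_heads=60, n_flips=100)
print(f"Estimated p-value: {p:.3f}")
```

The estimate should come out a little above .05, so this hypothetical result would narrowly miss the typical cut-off. Note what the number describes: how often a fair coin behaves this way, not the probability that the coin is fair.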
Where Credence Levels Come From
Credence levels can come from lots of places.
Past experience (how confident I feel about a statement of fact)
Instrumental uncertainties
Natural statistical variance (e.g. people come in different heights, wind speed is different across a town)
Where People Use Credence Levels
There are lots of familiar places where people already use or encounter credence levels.
Weather forecasts, specifically chances of rain.
Using polls to predict elections, phrased in terms of odds for betting (e.g. 50 to 1).
Making decisions based on a probabilistic risk assessment.
Credence levels predicting the probabilities of natural disasters within specific time frames (earthquakes, floods, wildfires etc.), and using these to make decisions about disaster preparation.
Using credence levels about admission to various colleges to decide which one to treat as a safety school.
Exemplary Quotes about Probabilistic Reasoning
“Uncertainty, in the presence of vivid hopes and fears, is painful, but must be endured if we wish to live without the support of comforting fairy tales. It is not good either to forget the questions that philosophy asks, or to persuade ourselves that we have found indubitable answers to them. To teach how to live without certainty, and yet without being paralyzed by hesitation, is perhaps the chief thing that philosophy, in our age, can still do for those who study it.” - Bertrand Russell, History of Western Philosophy, p. xiv
"But please observe, now, that when as empiricists we give up the doctrine of objective certitude, we do not thereby give up the quest or hope of truth itself. We still pin our faith on its existence, and still believe that we gain an ever better position towards it by systematically continuing to roll up experiences and think." - William James, The Will to Believe
"Credit scores are designed to make decisions easier for lenders. Banks and credit unions want to know whether or not you’re likely to default on your loan, so they look at your borrowing history for clues. For example, they want to know if you have borrowed money before and successfully repaid loans or if you recently have stopped making payments on several loans." https://www.thebalance.com/how-credit-scores-work-315541
"I'm 95% confident that this battery is not going to explode. But anything more than a 1% chance of our robot exploding is too risky, so we shouldn't use this battery until we are more confident it won't explode."
My aunt still caught COVID even after taking the vaccine, therefore it's not true that the vaccine prevents COVID.
When phrased in terms of risks, the effectiveness of a vaccine is understood as the reduced probability that one would contract the disease after taking the vaccine, rather than total protection. Partial, probabilistic, improvements still have tangible causal consequences. This is expanded upon when we go over the distinction between singular and general causation.
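The idea of effectiveness as reduced risk can be illustrated with a short calculation. The counts below are entirely made up for the example (they do not describe any real vaccine or study), and the helper names `risk` and `relative_risk_reduction` are hypothetical.

```python
def risk(cases, group_size):
    """Fraction of a group that contracted the disease."""
    return cases / group_size


def relative_risk_reduction(risk_treated, risk_untreated):
    """How much lower the risk is in the treated group, as a fraction."""
    return 1 - risk_treated / risk_untreated


# Hypothetical counts, invented purely for illustration.
unvaccinated_cases, unvaccinated_total = 100, 1000
vaccinated_cases, vaccinated_total = 20, 1000

risk_unvaccinated = risk(unvaccinated_cases, unvaccinated_total)
risk_vaccinated = risk(vaccinated_cases, vaccinated_total)
reduction = relative_risk_reduction(risk_vaccinated, risk_unvaccinated)

print(f"Risk without vaccine: {risk_unvaccinated:.0%}")
print(f"Risk with vaccine:    {risk_vaccinated:.0%}")
print(f"Relative risk reduction: {reduction:.0%}")
```

In this invented scenario, 20 vaccinated people still get sick, yet the vaccine cuts the risk from 10% to 2%. One person's infection after vaccination is compatible with a large reduction in risk for the group.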
Scientists used to say this particle is massless (its mass equals zero), but now they say it has a slightly non-zero mass. Their original measurement must've been wrong.
Despite how it's commonly described (sometimes even by scientists themselves), scientists don't typically measure the value of a quantity. Instead, they always measure a quantity to within some range of values, and then they say how confident they are that the true value falls within that range (marked with [math]\displaystyle{ \pm }[/math] or error bars).
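A rough sketch of how a range and a confidence level might be derived from repeated measurements is given below. It rests on assumptions worth stating: the readings are invented, the errors are treated as random and roughly normally distributed, and a normal approximation is used even though a small sample would more properly call for a wider interval. The function name `confidence_interval` is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(measurements, confidence=0.95):
    """Return (center, half_width) of an interval, using a normal approximation."""
    n = len(measurements)
    center = mean(measurements)
    # The standard error shrinks as more measurements are averaged.
    standard_error = stdev(measurements) / n ** 0.5
    # Number of standard errors needed to reach the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center, z * standard_error


# Hypothetical repeated readings of the same quantity (arbitrary units).
readings = [9.79, 9.82, 9.80, 9.85, 9.78, 9.81, 9.83, 9.80]

center, half_95 = confidence_interval(readings, confidence=0.95)
_, half_99 = confidence_interval(readings, confidence=0.99)
print(f"95% confident: {center:.2f} ± {half_95:.2f}")
print(f"99% confident: {center:.2f} ± {half_99:.2f}")
```

Asking for more confidence widens the range, and asking for a narrower range lowers the confidence. A later measurement that lands slightly outside an earlier interval refines the estimate; it does not show that the earlier measurement was carelessly done.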
After this lesson, students should
Attitudes
Recognize that every proposition comes with a degree of uncertainty.
Value and defend scientific expressions of uncertainty.
Concept Acquisition
Credence: level of confidence that a claim is true, from 0 to 1.
Confidence: essentially a synonym for credence, as in “level of confidence,” rather than the colloquial meaning, “state of having a lot of confidence.”
Accuracy: How frequently one is correct; proximity to a true value.
Calibration: How closely confidence and accuracy correspond; that is, how accurate a person or system is at estimating the probability that they are correct.
Because every proposition comes with a degree of uncertainty:
Partial and probabilistic information still has value.
Back-up plans are important because no information is absolutely certain.
It is important to invest in calibrating where you are more and less likely to be right, as opposed to being overinvested in being “right.”
Scientific culture primarily uses a language of probabilities, not certain facts.
Even correctly-done science will obtain incorrect results some of the time.
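Calibration, as defined above, can be checked by comparing stated credences with how often the corresponding claims turned out to be true. The sketch below is one simple way to do that, under stated assumptions: the function name `calibration_table` is hypothetical, and the record of predictions is invented for illustration.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Map each stated credence to (fraction of those claims that were correct, count)."""
    outcomes_by_credence = defaultdict(list)
    for credence, was_correct in predictions:
        outcomes_by_credence[credence].append(was_correct)
    return {
        credence: (sum(outcomes) / len(outcomes), len(outcomes))
        for credence, outcomes in sorted(outcomes_by_credence.items())
    }


# Hypothetical record: (stated credence, whether the claim turned out true).
predictions = (
    [(0.9, True)] * 6 + [(0.9, False)] * 4    # said 90%, right 6 times out of 10
    + [(0.6, True)] * 6 + [(0.6, False)] * 4  # said 60%, right 6 times out of 10
)

table = calibration_table(predictions)
for credence, (accuracy, count) in table.items():
    if accuracy < credence:
        verdict = "overconfident"
    elif accuracy > credence:
        verdict = "underconfident"
    else:
        verdict = "well calibrated"
    print(f"Stated {credence:.0%}, correct {accuracy:.0%} of {count} claims: {verdict}")
```

In this made-up record the person is equally accurate in both groups, but only the 60% claims are well calibrated. The 90% claims show overconfidence, even though most of them were right.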
Concept Application
Appropriately weigh uncertainty in decisions involving risk. Identify a reasonable threshold of confidence for a given decision.
Recognize situations where confidence levels can be high enough for risky action (e.g. sometimes confidence is high enough to bet your life or the lives of others, even without perfect certainty).
Explain how the treatment of uncertainty in scientific work allows scientists to follow the truth, even when that means changing their minds.