3.1 Probabilistic Reasoning

As all scientific knowledge may be subject to change in light of new evidence, all claims of fact should only be made or trusted up to a certain degree of confidence. This allows scientists to be open to changing their mind, while still being able to meaningfully compare the validity of factual statements under limited information. This way of thinking is as important in daily life as it is in scientific reasoning.

The Lesson in Context

After introducing the concept of scientific uncertainty in previous lessons, we now teach the students that this uncertainty permeates all discussions of facts. Every factual claim should carry a level of confidence, which can be expressed as a percentage. Attaching a confidence level allows scientists to be open to the possibility that they may be wrong, while still being able to meaningfully discuss and compare the validity of factual statements. We aim to teach students that this way of thinking is important in daily life, often in the context of risk assessment, as well as in common discourse about social issues.

   Earlier Lessons

[[1.1 Introduction and When Is Science Relevant]][[Image:Topic Icon - 1.1 Introduction and When Is Science Relevant.png|32px|frameless|link=1.1 Introduction and When Is Science Relevant]]

Facts vs. values: Since credence levels can only be assigned to factual statements, it is important to first distinguish between statements of fact and statements of value.

[[2.2 Systematic and Statistical Uncertainty]][[Image:Topic Icon - 2.2 Systematic and Statistical Uncertainty.png|32px|frameless|link=2.2 Systematic and Statistical Uncertainty]]

Measurements in the real world are imperfect, and measurement uncertainties/errors can be studied and quantified. This translates to a confidence interval for every measurement result, i.e. "We are [math]\displaystyle{ x }[/math] percent confident that the true value lies within this interval."

   Later Lessons

[[3.2 Calibration of Credence Levels]][[Image:Topic Icon - 3.2 Calibration of Credence Levels.png|32px|frameless|link=3.2 Calibration of Credence Levels]]

This lesson will follow up on the current one by teaching students how to calculate the calibration, or quality, of their credence, noticing and quantifying both underconfidence and overconfidence.

[[4.1 Signal and Noise]][[Image:Topic Icon - 4.1 Signal and Noise.png|32px|frameless|link=4.1 Signal and Noise]]

In that lesson, students explore how the signal they are looking for in data can be difficult to find amid the noise (random variation, random error, imperfect measurements, etc.). Because data is a mix of signal and noise, inferences from data tend to have some degree of uncertainty, which may be usefully quantified using credence levels or probabilities.

[[4.2 Finding Patterns in Random Noise]][[Image:Topic Icon - 4.2 Finding Patterns in Random Noise.png|32px|frameless|link=4.2 Finding Patterns in Random Noise]]

Since spurious patterns are expected to arise from random noise alone, any claim of actual pattern must carry with it a level of confidence that it is not due to random noise.

[math]\displaystyle{ p }[/math]-value: The probability that random noise alone would produce a pattern at least as strong as the one observed. A small [math]\displaystyle{ p }[/math]-value therefore raises our confidence that the observed pattern is not due to random noise, although one minus the [math]\displaystyle{ p }[/math]-value is not itself that confidence level.

[[5.1 False Positives and Negatives]][[Image:Topic Icon - 5.1 False Positives and Negatives.png|32px|frameless|link=5.1 False Positives and Negatives]]

Since every binary test has a certain rate of false positives and false negatives, the result of such a test should only be understood as a recommendation of odds or risks, rather than a conclusive determination. Successive test results help one adjust their belief as well as their confidence level in that belief, e.g. whether one is suffering from a disease.
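As a rough illustration of how successive test results shift a credence, the sketch below applies Bayes' rule once per result. The function name `update_credence`, the 1% prior, the 90% sensitivity, and the 5% false positive rate are all hypothetical values chosen for illustration, and the sketch assumes the test results are independent of each other given the patient's true status.

```python
def update_credence(prior, sensitivity, false_positive_rate, positive):
    """Return the credence that the disease is present after one test result."""
    if positive:
        like_disease = sensitivity                # true positive rate
        like_healthy = false_positive_rate
    else:
        like_disease = 1.0 - sensitivity          # false negative rate
        like_healthy = 1.0 - false_positive_rate  # true negative rate
    numerator = like_disease * prior
    return numerator / (numerator + like_healthy * (1.0 - prior))

# Hypothetical numbers, chosen only for illustration.
credence = 0.01  # prior: 1% of comparable people have the disease
for result in [True, True]:  # two positive results in a row
    credence = update_credence(credence, sensitivity=0.90,
                               false_positive_rate=0.05, positive=result)
    print(f"credence after test: {credence:.3f}")
```

With these made-up numbers, a single positive result raises the credence to only about 15%, and a second positive raises it to about 77%: each result adjusts the odds rather than settling the question.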

Takeaways

After this lesson, students should

Recognize that every claim comes with some degree of uncertainty.
Learn the function/utility of scientific expressions of uncertainty.
Understand that because every proposition comes with a degree of uncertainty:
1. Partial and probabilistic information still has value.
2. Back-up plans are important since no information is absolutely certain.
3. Evaluation of expertise and authority should focus more on how accurately a person assigns confidence levels than on the assumption that a true expert would be "right" every single time.
4. Scientific culture primarily uses a language of probabilities, and sometimes even well-confirmed facts turn out to be incomplete or not true in every single case.
5. Even correctly done science will obtain incorrect results some of the time.

Credence

Level of confidence that a claim is true, from 0 (no chance it is true) to 1 (certain it is true).

Confidence

Essentially a synonym for credence, as in "level of confidence," rather than the colloquial meaning, "state of having a lot of confidence."

Accuracy

How frequently one is correct; proximity to a true value.

Calibration of Confidence

How closely confidence and accuracy correspond; that is, how accurate a person or system is at estimating the probability that they are correct.
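One possible way to check calibration, sketched here under simple assumptions, is to group a person's claims by the credence they stated and compare that credence with the fraction of those claims that turned out true. The `record` data, the function name `calibration_table`, and the 5-percentage-point tolerance are hypothetical choices made for illustration.

```python
from collections import defaultdict

def calibration_table(predictions):
    """Group (credence, was_correct) pairs by stated credence and
    return {credence: (number_of_claims, fraction_correct)}."""
    buckets = defaultdict(list)
    for credence, was_correct in predictions:
        buckets[credence].append(was_correct)
    return {c: (len(v), sum(v) / len(v)) for c, v in sorted(buckets.items())}

# Hypothetical record of one person's claims: (stated credence, turned out true?)
record = [(0.6, True), (0.6, False), (0.6, True), (0.6, False), (0.6, True),
          (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, False)]

for credence, (n, accuracy) in calibration_table(record).items():
    gap = credence - accuracy
    # The 0.05 tolerance is an arbitrary illustrative threshold.
    if gap > 0.05:
        label = "overconfident"
    elif gap < -0.05:
        label = "underconfident"
    else:
        label = "well calibrated"
    print(f"stated {credence:.0%}, correct {accuracy:.0%} of {n} claims: {label}")
```

In this made-up record the person is well calibrated at 60% credence but overconfident at 90%, since only 60% of the claims they rated at 90% were true.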

[math]\displaystyle{ p }[/math]-value

The statistic used most often as a measure of statistical significance: the probability of getting a result at least as extreme as the one observed, simply through random noise, if the hypothesized effect does not actually exist (i.e. if the null hypothesis is true). The typical cut-off for a statistically significant [math]\displaystyle{ p }[/math]-value is [math]\displaystyle{ p }[/math] < 0.05.

Statistical Significance

How unlikely a given set of results would be if the null hypothesis were true (i.e. if the hypothesized effect did not actually exist).

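A common misconception is that the [math]\displaystyle{ p }[/math]-value is the probability that the hypothesis is false. It is not: the [math]\displaystyle{ p }[/math]-value is the probability of results this extreme assuming the null hypothesis is true, not the probability of any hypothesis given the results.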
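The distinction can be made concrete with a coin-flip example. The sketch below computes a one-sided [math]\displaystyle{ p }[/math]-value exactly from the binomial distribution; the scenario (60 heads in 100 flips), the choice of a one-sided test, and the function name `p_value_at_least` are assumptions made for illustration.

```python
from math import comb

def p_value_at_least(heads, flips):
    """Probability of seeing `heads` or more heads in `flips` tosses of a
    fair coin, i.e. assuming the null hypothesis (no bias) is true."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips

p = p_value_at_least(60, 100)
print(f"p-value for 60 or more heads in 100 fair flips: {p:.4f}")
```

The result is a little under 0.03, which falls below the usual 0.05 cut-off. It says how rarely a fair coin would produce such a lopsided count; it does not say there is a 3% chance that the coin is fair.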

Where Credence Levels Come From

Credence levels can come from lots of places.

Past experience (how confident I feel about a statement of fact)
Instrumental uncertainties
Natural statistical variance (e.g. people come in different heights, wind speed is different across a town)

Where People Use Credence Levels

There are lots of familiar places where people already use or encounter credence levels.

Weather forecasts, specifically chances of rain.
Using polls to predict elections, phrased in terms of odds for betting (e.g. 50 to 1).
Making decisions based on a probabilistic risk assessment.
Credence levels predicting the probabilities of natural disasters within specific time frames (earthquakes, floods, wildfires, etc.), and using these to make decisions about disaster preparation.
Using credence levels about getting into various colleges to decide which one to treat as a safety school.
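Betting odds such as the "50 to 1" mentioned in the list above can be translated into a credence level. The minimal sketch below assumes the odds are quoted as odds against the outcome and ignores any bookmaker's margin; the function name is hypothetical.

```python
def odds_against_to_probability(against, in_favor=1):
    """Convert odds quoted as 'against to in_favor' into a probability."""
    return in_favor / (against + in_favor)

# "50 to 1" against: one expected win for every 50 expected losses.
print(f"{odds_against_to_probability(50):.3f}")

# Even odds ("1 to 1") correspond to a credence of one half.
print(f"{odds_against_to_probability(1):.3f}")
```

Under this reading, odds of 50 to 1 correspond to a credence of 1/51, or roughly 2%.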

My aunt still caught COVID even after taking the vaccine, therefore it's not true that the vaccine prevents COVID.

When phrased in terms of risks, the effectiveness of a vaccine is understood as the reduced probability that one would contract the disease after taking the vaccine, rather than total protection. Partial, probabilistic improvements still have tangible causal consequences. This is expanded upon when we go over the distinction between singular and general causation.
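A short calculation shows why a partial reduction in risk still matters even though some vaccinated people get sick. All numbers below are hypothetical and are not data about any real vaccine; "effectiveness" is taken here to mean the fraction of the baseline risk that the vaccine removes.

```python
def expected_cases(population, baseline_risk, effectiveness=0.0):
    """Expected number of infections when each person's risk is
    baseline_risk reduced by the fraction `effectiveness`."""
    return population * baseline_risk * (1.0 - effectiveness)

# Hypothetical numbers, for illustration only.
people = 10_000
risk_unvaccinated = 0.10   # 10% chance of infection without the vaccine
effectiveness = 0.80       # the vaccine removes 80% of that risk

without = expected_cases(people, risk_unvaccinated)
with_vaccine = expected_cases(people, risk_unvaccinated, effectiveness)
print(f"expected cases if nobody is vaccinated: {without:.0f}")
print(f"expected cases if everybody is vaccinated: {with_vaccine:.0f}")
```

With these made-up figures, about 200 vaccinated people would still be expected to catch the disease, yet that is 800 fewer cases than the roughly 1,000 expected without the vaccine. One person's infection is therefore compatible with the vaccine working.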

Scientists used to say this particle is massless (its mass equals zero), but now they say it has a slightly non-zero mass. Their original measurement must've been wrong.

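Despite how it's commonly described (sometimes even by scientists themselves), scientists don't measure a single exact value of a quantity. Instead, they measure a quantity to within some range of values, and then they say how confident they are that the true value falls within that range (marked with [math]\displaystyle{ \pm }[/math] or error bars). An earlier result reported as "zero" may therefore have been a range that included zero along with small non-zero values, in which case a later, more precise measurement refines the original rather than contradicting it.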
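The sketch below illustrates this with two made-up measurements of the same quantity. The values, the units, and the choice of a range two uncertainties wide on each side (roughly 95% confidence if the errors are normally distributed) are assumptions for illustration only.

```python
def interval(value, uncertainty, k=2.0):
    """Range value ± k * uncertainty; k = 2 is roughly 95% confidence
    if the errors are normally distributed (an assumption)."""
    return (value - k * uncertainty, value + k * uncertainty)

def contains(rng, x):
    """True if x lies inside the closed range rng = (low, high)."""
    low, high = rng
    return low <= x <= high

# Hypothetical measurements in arbitrary units.
old = interval(0.00, 0.50)   # early result, reported as "consistent with zero"
new = interval(0.10, 0.02)   # later, more precise result

print(contains(old, 0.0))    # True: zero was inside the early range...
print(contains(old, 0.10))   # True: ...and so was the later central value
print(contains(new, 0.0))    # False: the later measurement excludes zero
```

In this example the early measurement was not wrong: its range was wide enough to include both zero and the value found later.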
