Practice: Why AI runs on statistics

This is an orientation lesson, so the practice is about fixing the map in your head: why AI deals in probabilities, the two directions of statistical reasoning, and the base-rate trap that makes the whole track worth learning. No formulas beyond simple counting.

Six short questions. Answer each in your head before opening the collapsible. Active retrieval is where the learning sticks.

1. Why do AI systems report probabilities instead of plain yes-or-no answers?

Show answer

Because they learn from a limited sample of a noisy world, so certainty is not available to them. A model has seen some of the data, not all of it, and even a perfect model of the underlying pattern cannot make a noisy outcome certain. Reporting a degree of belief is the honest move; a hard yes-or-no would overstate what the system actually knows.

2. What is the difference between probability and statistics, in terms of direction?

Show answer

Probability runs forward: you start with a model of how chance behaves and predict what data to expect (if the coin is fair, how often do I see ten heads?). Statistics runs backward: you start with observed data and infer the model behind it (I saw sixteen heads in twenty flips; is the coin fair?). AI uses both, and the backward direction is what decides whether a system actually works.
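
The two directions can be sketched with plain counting. This is a minimal illustration, not part of the lesson: the function name `prob_heads` is made up here, and reading "ten heads" as ten heads in twenty flips is an assumption. The forward step asks what a fair coin should produce; the second step takes the observed sixteen heads in twenty flips and asks how often a fair coin would do something at least that extreme, which is one common starting point for the backward question rather than a full test.

```python
from math import comb

def prob_heads(k: int, n: int, p: float = 0.5) -> float:
    """Chance of exactly k heads in n independent flips with heads-probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Forward (probability): assume the coin is fair, predict the data.
p_exactly_ten = prob_heads(10, 20)

# Toward backward (statistics): we observed 16 heads in 20 flips.
# How often would a fair coin give 16 or more? A small value makes "fair" hard to believe.
p_sixteen_or_more = sum(prob_heads(k, 20) for k in range(16, 21))

print(f"{p_exactly_ten:.4f}")      # 0.1762
print(f"{p_sixteen_or_more:.4f}")  # 0.0059
```

A fair coin gives sixteen or more heads in twenty flips less than 1% of the time, so the observed data sits uneasily with the "fair" model.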

3. A model outputs “0.97” for an input. Does that mean it is 97% likely to be correct?

Show answer

Not necessarily. The number is how the pattern the model learned scores this particular input. Whether a “0.97” is right about 97% of the time is a separate property called calibration, and a model can be confidently wrong. Confidence is a report from the model, not a guarantee about reality.
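
A rough way to see the gap between confidence and calibration is to collect the predictions a model scored near 0.97 and count how many were actually right. The sketch below uses ten invented (score, was_correct) pairs purely for illustration; they are not results from any real model, and `observed_accuracy` is a hypothetical helper name.

```python
# Invented data for illustration: predictions a model scored near 0.97,
# paired with whether each one turned out to be correct.
predictions = [
    (0.97, True), (0.96, True), (0.98, False), (0.97, True), (0.97, False),
    (0.98, True), (0.96, True), (0.97, False), (0.97, True), (0.98, True),
]

def observed_accuracy(pairs):
    """Fraction of predictions in the group that were actually correct."""
    return sum(1 for _, correct in pairs if correct) / len(pairs)

mean_score = sum(score for score, _ in predictions) / len(predictions)
hit_rate = observed_accuracy(predictions)

print(f"average reported confidence: {mean_score:.2f}")  # 0.97
print(f"fraction actually correct:   {hit_rate:.2f}")    # 0.70
```

In this made-up group the model reports about 0.97 but is right only 70% of the time, which is what "confidently wrong" looks like in numbers. A well-calibrated model would show the two figures roughly matching.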

4. A detector is “95% accurate.” Why might that number tell you almost nothing?

Show answer

Because accuracy depends on the base rate of the thing being detected. If the target happens only 5% of the time, a lazy model that always says “no” is automatically 95% accurate and detects nothing at all. A single accuracy number is meaningless until you know how common the thing is.
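
The lazy model is easy to build and score. A minimal sketch, assuming 1,000 hypothetical cases with the 5% base rate from the answer above; `always_no` is an illustrative name, and "recall" is the usual term for the share of real positives that get caught.

```python
# 1,000 hypothetical cases; the target occurs in 5% of them.
labels = [True] * 50 + [False] * 950

def always_no(_case) -> bool:
    """A 'detector' that ignores its input and never raises a flag."""
    return False

predictions = [always_no(y) for y in labels]

# Accuracy: share of cases where the prediction matches the truth.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall: share of real positives the detector actually caught.
recall = sum(p and y for p, y in zip(predictions, labels)) / sum(labels)

print(accuracy)  # 0.95
print(recall)    # 0.0
```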

5. Name three places inside a real AI system where ideas from this track show up.

Show answer

Any three of: describing data before modeling (center, spread, shape, which features move together); conditional probability and Bayes in classifiers like spam filters and fraud detection; distributions and expected value behind loss functions and rewards; sampling and the central limit theorem behind why a held-out test set predicts future performance; hypothesis tests and confidence intervals behind comparing two models or reading an A/B test.

6. In one sentence, what is statistical thinking really about?

Show answer

Not fooling yourself about uncertainty: stating degrees of belief precisely and checking them honestly against reality, which is exactly what you ask an AI system to do every time it reports a number.

Try it yourself: probability or statistics?

For each statement, decide whether it is probability (running a model of chance forward to predict data) or statistics (running observed data backward to infer the model). Then check.

A. "Our churn model says this customer has a 0.12 chance of cancelling
next month."
B. "We measured a 3% higher click rate on the new design; is that a real
improvement or just noise?"
C. "If the email-spam rate is 40%, how often will three random emails in
a row all be spam?"
D. "Sixteen of twenty test users preferred version B; what does that tell
us about all our users?"
E. "Given a fair six-sided die, what is the chance of rolling two sixes
in a row?"
Show answer
  • A: probability (forward). The fitted model plays the role of the rules, and it outputs the chance of a future outcome (0.12 that this customer cancels). That is running forward from model to expected data.
  • B: statistics (backward). You observed a 3% difference and are inferring whether it reflects a real effect or just noise. Data backward to a model.
  • C: probability (forward). You assume the rule (40% spam) and predict the data (three in a row).
  • D: statistics (backward). You observed 16 of 20 and are inferring something about all users. Data backward to a model.
  • E: probability (forward). Known rules (a fair die), predict the data (two sixes).

The tell: if the rules are given and you are predicting data, it is probability. If the data is given and you are inferring the rules, it is statistics.
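
The two pure forward questions (C and E) can be worked out by multiplication. A small sketch, assuming the events are independent, which is what "random emails" and separate die rolls suggest but the statements do not say outright; `all_in_a_row` is a hypothetical helper name.

```python
from fractions import Fraction

def all_in_a_row(p: Fraction, times: int) -> Fraction:
    """Chance that an event with probability p happens `times` times running,
    assuming each occurrence is independent of the others."""
    return p ** times

three_spam = all_in_a_row(Fraction(40, 100), 3)  # statement C
two_sixes = all_in_a_row(Fraction(1, 6), 2)      # statement E

print(three_spam, float(three_spam))  # 8/125 0.064
print(two_sixes)                      # 1/36
```

In both cases the rule is given and the code only predicts data, which is why no observations appear anywhere in it.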

Try it yourself: make a 99%-accurate detector misfire

A fraud detector is 99% accurate in both directions (it catches 99% of real fraud, and wrongly flags only 1% of legitimate transactions). Fraud is rare: 1 in 1,000 transactions is fraudulent. You run it on 100,000 transactions. A transaction is flagged. What is the chance it is actually fraud? Count it out before checking.

Show answer

Count with the population of 100,000:

  • Actually fraudulent: 1 in 1,000 of 100,000 is 100 fraudulent transactions, which leaves 99,900 legitimate ones.
  • True positives: 99% of the 100 real frauds are caught, so 99 are flagged correctly.
  • False positives: 1% of the 99,900 legitimate transactions are wrongly flagged, and 1% of 99,900 is 999.
  • Total flagged: 99 + 999 = 1,098 transactions get flagged.
  • Chance a flagged one is real fraud: 99 out of 1,098, which is about 9%.

So a flag from a 99%-accurate detector means only about a 9% chance of actual fraud. Because fraud here is rarer than the disease in the lesson (1 in 1,000 versus 1 in 100), the result is even more lopsided: the flood of false positives from the huge legitimate group swamps the handful of true positives. The detector is not broken. The base rate is doing the damage, and any AI system hunting for something rare faces the same arithmetic.
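
The same counting can be written as a few lines of code, which makes it easy to change the base rate and watch the answer move. This is a sketch under simple assumptions: `count_flags` is an illustrative name, the detector is equally accurate in both directions, and the inputs are chosen so the integer divisions come out exact.

```python
def count_flags(total: int, one_in: int, accuracy_pct: int):
    """Count true and false positives for a detector that is accuracy_pct% right
    in both directions, when 1 in `one_in` cases is a real positive.
    Uses integer division, so pick inputs that divide evenly."""
    real = total // one_in                            # real positives
    clean = total - real                              # real negatives
    true_pos = real * accuracy_pct // 100             # real cases correctly flagged
    false_pos = clean * (100 - accuracy_pct) // 100   # clean cases wrongly flagged
    share_real = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, share_real

# The fraud exercise: 1 in 1,000 across 100,000 transactions.
tp, fp, share = count_flags(100_000, 1_000, 99)
print(tp, fp, f"{share:.1%}")  # 99 999 9.0%

# The lesson's 1-in-100 disease, counted over 10,000 people.
tp, fp, share = count_flags(10_000, 100, 99)
print(tp, fp, f"{share:.1%}")  # 99 99 50.0%
```

The detector is identical in both runs; only the base rate changes, and the meaning of a flag drops from about one in two to about one in eleven.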

Nine flashcards for review.

Q. Why do AI systems report probabilities instead of certainties?
A.

They learn from a limited sample of a noisy world, so certainty is not available. Reporting a degree of belief is the honest move; a hard yes-or-no would overstate what the system knows.

Q. Probability vs statistics: which direction does each run?
A.

Probability runs forward (model of chance to expected data). Statistics runs backward (observed data to the model behind it). AI uses both.

Q. Does a model's '0.97' output mean it is 97% likely to be right?
A.

Not necessarily. It is how the learned pattern scores this input. Whether that number is trustworthy is calibration, a separate property; a model can be confidently wrong.

Q. Why can a '95% accurate' detector be worthless?
A.

Because of the base rate. If the target happens 5% of the time, a model that always says ‘no’ is 95% accurate and detects nothing. Accuracy is meaningless without the base rate.

Q. What is the base rate, and why does it matter?
A.

The base rate is how common the thing being detected actually is. When it is rare, even a tiny false-positive percentage on the huge negative group floods the positives, so a flag can mostly be wrong.

Q. A 99%-accurate test for a 1-in-100 disease: a positive means roughly what chance of being sick?
A.

About 50%. The rare disease produces few true positives, and 1% of the large healthy group produces just as many false positives, so half of all positives are false.

Q. Where do conditional probability and Bayes show up in AI?
A.

In classifiers like spam filters, fraud detection, and medical triage: the chance of one thing given another, updated as evidence arrives. It is also where human intuition fails most (the base-rate trap).

Q. Where does expected value show up in machine learning?
A.

Behind loss functions (the average error a model pushes down) and behind the reward an agent tries to maximize. Expected value is the average outcome of a random variable.

Q. In one line, what is statistical thinking about?
A.

Not fooling yourself about uncertainty: stating degrees of belief precisely and checking them honestly against reality.