Practice: Updating beliefs with evidence: Bayes' theorem

The skill is updating a belief correctly: start with the prior, weigh the evidence, never drop the base rate. The security-alert calculation is the exercise to internalize, because it explains most “the detector flagged it but it was fine” surprises.

Self-check

Six short questions. Answer each in your head before revealing the answer.

1. Name the four parts of Bayes’ theorem.

Show answer

The prior P(H), the base rate before the evidence; the likelihood P(E given H), how well the evidence fits the hypothesis; the evidence P(E), the total probability of seeing the evidence from all sources; and the posterior P(H given E), the updated belief. Posterior = (likelihood x prior) / evidence.
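In symbols, the word formula above can be written as follows. This is a notational sketch: the bar means "given", and the expanded form of P(E) assumes a yes/no hypothesis (H or not H).

```latex
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)},
\qquad
P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)
```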

2. What does Bayes’ theorem convert, and why do we need it?

Show answer

It converts P(evidence given hypothesis) into P(hypothesis given evidence), two conditionals that, as the previous lesson showed, are not the same. We need it because the thing we can measure (a test’s hit rate, P(positive given sick)) is usually the reverse of the thing we want (P(sick given positive)).

3. Why can a positive result from a 99%-accurate test for a rare disease still mean only a 50% chance of being sick?

Show answer

Because of the base rate. The disease is rare (say 1 in 100), so true positives are few, while the much larger healthy group produces just as many false positives even at a 1% error rate. The prior (rarity) drags the posterior down; Bayes keeps it in the calculation.
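The answer above can be counted out with natural frequencies. The sketch below assumes a population of 10,000 (an illustrative choice) and reads "99%-accurate" as right 99% of the time for sick and healthy people alike.

```python
# Natural-frequency count for the rare-disease example.
# Assumptions: 10,000 people, a 1-in-100 base rate, and a test that is
# correct 99% of the time for both sick and healthy people.
population = 10_000
sick = population // 100              # 100 people have the disease
healthy = population - sick           # 9,900 do not

true_positives = sick * 99 // 100     # 99 sick people test positive
false_positives = healthy // 100      # 99 healthy people also test positive

posterior = true_positives / (true_positives + false_positives)
print(true_positives, false_positives, posterior)  # 99 99 0.5
```

The two groups contribute the same number of positives, which is why a positive result is a coin flip.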

4. What goes into the denominator P(E), and what happens if you forget part of it?

Show answer

P(E) is the total probability of the evidence: true positives plus false positives, P(E given H)P(H) + P(E given not H)P(not H). If you leave out the false-positive piece (the healthy people who test positive), you shrink the denominator and overstate the posterior.
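To see how much the missing piece matters, here is a small sketch using the rare-disease numbers (1% prior, 99% hit rate, 1% false-positive rate). The function name and the `include_false_positives` switch are hypothetical, added only to reproduce the mistake on purpose.

```python
def posterior(prior, hit_rate, false_positive_rate, include_false_positives=True):
    """Bayes' theorem for a yes/no hypothesis.

    The include_false_positives flag exists only to demonstrate the error
    of leaving the false-positive term out of the denominator.
    """
    true_part = hit_rate * prior                        # P(E given H) P(H)
    false_part = false_positive_rate * (1 - prior)      # P(E given not H) P(not H)
    if not include_false_positives:
        false_part = 0.0                                # the mistake
    return true_part / (true_part + false_part)

correct = posterior(0.01, 0.99, 0.01)
mistaken = posterior(0.01, 0.99, 0.01, include_false_positives=False)
print(round(correct, 3), round(mistaken, 3))  # 0.5 1.0
```

With the false positives dropped, the denominator equals the numerator and the posterior becomes a meaningless 100%.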

5. How do you update on a second piece of evidence?

Show answer

The posterior from the first becomes the prior for the second, then apply Bayes again. Beliefs accumulate: two independent positive tests can move a 1% base rate to 50% to 99%.
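The 1% to 50% to 99% chain can be reproduced with a short loop. This is a sketch that assumes both tests have a 99% hit rate and a 1% false-positive rate, and that their results are independent given the person's true status; `update` is an illustrative helper name.

```python
def update(prior, hit_rate, false_positive_rate):
    """One Bayes update for a positive result on a yes/no hypothesis."""
    evidence = hit_rate * prior + false_positive_rate * (1 - prior)
    return hit_rate * prior / evidence

belief = 0.01                 # base rate before any test
history = [belief]
for _ in range(2):            # two positive tests in a row
    belief = update(belief, hit_rate=0.99, false_positive_rate=0.01)
    history.append(belief)    # each posterior becomes the next prior

print([round(b, 2) for b in history])  # [0.01, 0.5, 0.99]
```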

6. What is base-rate neglect, and where does it bite in AI?

Show answer

Ignoring the prior (how common the thing is) when interpreting evidence. In AI it bites when you read a detector’s hit rate as the chance a flag is real: a high-accuracy fraud or disease detector still produces mostly false alarms when the target is rare. The base rate has to be combined with the hit rate before a flag means anything.

Try it yourself: the security-alert calculation

A login-security model flags logins as malicious. Malicious logins are rare: 0.5% of all logins. The model catches 95% of truly malicious logins and wrongly flags 5% of legitimate ones. Picture 20,000 logins, and suppose the model flags one of them as malicious. What is the chance that login really is malicious? Count it out with natural frequencies before checking the answer.

Show answer

Of 20,000 logins, 0.5% are malicious:
  Malicious = 100        Legitimate = 19,900

True positives:  95% of 100        = 95
False positives: 5% of 19,900      = 995
Total flagged:   95 + 995          = 1,090

P(malicious | flagged) = 95 / 1,090 = 0.087  (about 8.7%)

A flag from a 95%-sensitive model means only about a 9% chance the login is actually malicious, because with a base rate of only 0.5% the 995 false positives swamp the 95 true ones. This is not a broken model; it is the base rate at work. (In Bayes terms: prior 0.005, likelihood 0.95, and the denominator 0.95 x 0.005 + 0.05 x 0.995 = 0.00475 + 0.04975 = 0.0545, giving 0.00475 / 0.0545 = 0.087, the same answer.)
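The count and the Bayes-form calculation above can be cross-checked in a few lines. This is a minimal sketch using the exercise's numbers; the variable names are illustrative.

```python
# Security-alert exercise: formula route and counting route side by side.
prior = 0.005                 # P(malicious)
hit_rate = 0.95               # P(flagged given malicious)
false_positive_rate = 0.05    # P(flagged given legitimate)

# Route 1: Bayes' theorem with the full denominator.
evidence = hit_rate * prior + false_positive_rate * (1 - prior)
posterior = hit_rate * prior / evidence

# Route 2: natural frequencies out of 20,000 logins.
logins = 20_000
malicious = round(logins * prior)                        # 100
legitimate = logins - malicious                          # 19,900
true_positives = round(malicious * hit_rate)             # 95
false_positives = round(legitimate * false_positive_rate)  # 995
counted = true_positives / (true_positives + false_positives)

print(round(posterior, 3), round(counted, 3))  # 0.087 0.087
```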

Try it yourself: name the parts and predict

For the security-alert scenario above, answer these:

1. Which number is the prior? Which is the likelihood?
2. If malicious logins were even rarer (say 0.1% instead of 0.5%), would the
   posterior P(malicious | flagged) go up or down?
3. The team adds a SECOND independent signal that also flags this login.
   Roughly what happens to the belief, and why?

Show answer

1. The prior is 0.5% (the base rate of malicious logins). The likelihood is 95% (P(flagged given malicious), the model’s hit rate). The 5% false-positive rate is the other likelihood, P(flagged given legitimate).
2. It would go down. A rarer event means an even smaller prior, so the false positives dominate even more: at a 0.1% base rate the posterior after one flag falls from about 9% to about 1.9%. Rarer target, more lopsided result.
3. The belief jumps up. The first posterior (about 9%) becomes the prior for the second signal; applying Bayes again with another strong likelihood pushes the probability much higher. If the second signal happened to have the same 95% hit rate and 5% false-positive rate, the belief would rise to roughly 64%. Accumulating independent evidence is how confidence is earned, the same pattern as two positive medical tests.

The takeaway: the prior controls how impressed to be by a single flag, and independent evidence compounds.
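Both predictions can be checked numerically. The sketch below reuses the exercise's 95% hit rate and 5% false-positive rate. It also assumes, purely for illustration, that the second signal has those same rates and is independent of the first given the login's true status; the exercise does not specify the second signal's rates.

```python
def update(prior, hit_rate=0.95, false_positive_rate=0.05):
    """Posterior after one flag, for a yes/no hypothesis."""
    evidence = hit_rate * prior + false_positive_rate * (1 - prior)
    return hit_rate * prior / evidence

# Question 2: a rarer target lowers the posterior after a single flag.
for base_rate in (0.005, 0.001):
    print(base_rate, round(update(base_rate), 3))
# 0.005 0.087
# 0.001 0.019

# Question 3: the first posterior becomes the prior for the second signal.
after_first = update(0.005)
after_second = update(after_first)   # assumed: same rates, independent signal
print(round(after_second, 3))        # 0.645
```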

Flashcards

Eight cards.

Q. State Bayes' theorem in words.

Posterior = (likelihood x prior) / evidence. The updated belief equals the old belief adjusted by how well the evidence fits, divided by the total probability of the evidence.

Q. What are the prior, likelihood, evidence, and posterior?

Prior P(H): base rate beforehand. Likelihood P(E|H): how well evidence fits the hypothesis. Evidence P(E): total chance of the evidence (true + false positives). Posterior P(H|E): the updated belief.

Q. What does Bayes' theorem convert, and why is that useful?

P(evidence | hypothesis) into P(hypothesis | evidence). Useful because we can measure the first (a test’s hit rate) but want the second (the chance you’re sick given a positive).

Q. Why can a 99%-accurate test for a 1-in-100 disease be only 50% right on a positive?

The base rate. Few true positives from the rare sick group; just as many false positives from the large healthy group at a 1% error rate. The prior pulls the posterior down. Per 10,000 people there are 99 true positives and 99 false positives, so P = 99/198 = 0.5.

Q. What belongs in the denominator P(E)?

The total probability of the evidence: true positives plus false positives, P(E|H)P(H) + P(E|not H)P(not H). Omitting the false-positive piece overstates the posterior.

Q. How do you update a belief on a second piece of evidence?

Use the first posterior as the new prior, then apply Bayes again. Evidence accumulates: two independent positives can move 1% to 50% to 99%.

Q. What is base-rate neglect?

Ignoring the prior (how common the thing is) when reading evidence. It makes a rare-event detector look far more conclusive than it is; the base rate has to be combined with the hit rate.

Q. What is a 'naive Bayes' spam filter, and why 'naive'?

A classifier that combines the prior spam rate with each word’s likelihood via Bayes. ‘Naive’ because it assumes the words are independent given the class, a knowingly oversimplified assumption that makes the math tractable.
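As a rough illustration of that last card, here is a toy sketch of the naive Bayes idea. Every number in it is hypothetical (not taken from any real corpus), the function and table names are made up for the example, and it only scores words that appear in the message. A practical filter would also learn the probabilities from data and smooth them.

```python
import math

# Hypothetical values for illustration only.
PRIOR_SPAM = 0.4
WORD_GIVEN_SPAM = {"free": 0.30, "meeting": 0.02, "winner": 0.10}
WORD_GIVEN_HAM = {"free": 0.05, "meeting": 0.20, "winner": 0.01}

def spam_probability(words):
    """P(spam given words), treating words as independent given the class."""
    # Work in logs so that many small probabilities do not underflow.
    log_spam = math.log(PRIOR_SPAM)
    log_ham = math.log(1 - PRIOR_SPAM)
    for word in words:
        if word in WORD_GIVEN_SPAM:          # unknown words are skipped
            log_spam += math.log(WORD_GIVEN_SPAM[word])
            log_ham += math.log(WORD_GIVEN_HAM[word])
    spam = math.exp(log_spam)                # likelihood x prior, spam
    ham = math.exp(log_ham)                  # likelihood x prior, not spam
    return spam / (spam + ham)               # divide by the evidence

print(round(spam_probability(["free", "winner"]), 3))
print(round(spam_probability(["meeting"]), 3))
```

The "naive" step is the loop: each word's likelihood is multiplied in (added in log space) as if the words carried independent evidence.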