Summary: Probability foundations

A probability is a number from 0 to 1, and combining probabilities takes just three rules. The opening lesson argued that AI speaks in probabilities; this lesson teaches the grammar. Before you can update beliefs with evidence (Bayes) or work with distributions (the next phase), you need to know what a probability is and how to combine probabilities. Once you do, a surprising amount of reasoning about uncertainty becomes simple arithmetic. This summary is the scan-in-five-minutes version of the full lesson.

Core ideas

What a probability is. A number between 0 and 1: 0 impossible, 1 certain, 0.5 even odds. Read it as a long-run frequency (the fraction a repeated experiment settles toward) or as a degree of belief (a calibrated confidence). The rules work either way.
Sample space and events. The sample space is all possible outcomes; an event is a subset. When outcomes are equally likely, probability is counting: favorable outcomes over total outcomes (P(even on a die) = 3/6 = 1/2).
Rule one, the complement. P(not A) = 1 - P(A). The go-to for “at least one” questions: P(at least one) = 1 - P(none). At least one head in two flips = 1 - P(no heads) = 1 - 1/4 = 3/4.
Rule two, addition (OR). P(A or B) = P(A) + P(B) - P(A and B). Subtract the overlap so the both-happen cases are not double-counted. King or heart from a deck = 4/52 + 13/52 - 1/52 = 16/52 = 4/13, where the 1/52 is the king of hearts, which would otherwise be counted twice.
Rule three, multiplication (AND). For independent events, P(A and B) = P(A) x P(B). Two heads = 1/2 x 1/2 = 1/4. Five independent 90% steps all succeed only about 59% of the time (0.9^5 ≈ 0.59): chains erode reliability fast.
Independence is the fine print. The simple multiplication rule holds only when events do not influence each other. When one outcome changes the other’s odds (as in drawing without replacement), you need conditional probability, the subject of the next lesson. And independent events have no memory: the gambler’s fallacy is believing otherwise.
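
The worked numbers in the core ideas above can be checked by brute-force counting. The following is a minimal Python sketch, not part of the lesson itself: the helper name prob and the two-aces example are assumptions chosen for illustration. It enumerates each sample space and counts favorable outcomes, using exact fractions so the results match the arithmetic above.

```python
from fractions import Fraction
from itertools import permutations, product


def prob(event, space):
    """P(event) when every outcome in `space` is equally likely:
    favorable outcomes over total outcomes. (Hypothetical helper.)"""
    favorable = sum(1 for outcome in space if event(outcome))
    return Fraction(favorable, len(space))


# Counting: P(even on a die) = 3/6
die = list(range(1, 7))
p_even = prob(lambda n: n % 2 == 0, die)

# Rule one (complement): at least one head in two flips
flips = list(product("HT", repeat=2))  # HH, HT, TH, TT
p_no_heads = prob(lambda f: "H" not in f, flips)
p_at_least_one_head = 1 - p_no_heads

# Rule two (addition): king or heart, with the overlap subtracted once
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


def is_ace(card):
    return card[0] == "A"


p_king_or_heart = (
    prob(is_king, deck)
    + prob(is_heart, deck)
    - prob(lambda c: is_king(c) and is_heart(c), deck)
)

# Rule three (multiplication): two heads in two independent flips
p_two_heads = prob(lambda f: f == ("H", "H"), flips)

# The fine print: two aces drawn WITHOUT replacement are not independent,
# so squaring P(ace) gives the wrong answer.
two_draws = list(permutations(deck, 2))  # ordered pairs of distinct cards
p_two_aces_actual = prob(lambda d: is_ace(d[0]) and is_ace(d[1]), two_draws)
p_two_aces_naive = prob(is_ace, deck) ** 2

print(p_even, p_at_least_one_head, p_king_or_heart, p_two_heads)
# prints: 1/2 3/4 4/13 1/4
print(p_two_aces_actual, p_two_aces_naive)
# prints: 1/221 1/169
```

The last two numbers differ because the first ace changes the odds for the second draw, which is exactly the case where the simple multiplication rule stops applying.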

What changes for you

You gain the ability to reason about combined chances instead of guessing. “What is the chance this whole multi-step process works?” becomes a multiplication; “what is the chance at least one of these catches the problem?” becomes one minus the chance they all miss; “what is the chance of this or that?” becomes an addition with the overlap removed. In an AI setting this is the everyday math behind pipeline reliability, behind why long chains of mostly-reliable steps fail more than expected, and behind how a language model scores a sentence by multiplying word-by-word probabilities. Just as important, you learn to check the fine print: are these events really independent? That single question is what the next lesson, on conditional probability, is built to answer.
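
The three everyday questions above translate almost directly into code. The sketch below is illustrative only: the function names, the 70% catch rate, and the token probabilities are made-up assumptions, and both reliability functions are valid only if the steps or checks really are independent. The sentence score sums logarithms instead of multiplying directly, a common way to keep a product of many small numbers from underflowing; it is the same quantity on a log scale.

```python
import math


def chain_success(step_probs):
    """P(every step succeeds), assuming the steps are independent."""
    return math.prod(step_probs)


def at_least_one(catch_probs):
    """P(at least one check catches the problem) = 1 - P(all miss),
    assuming the checks are independent."""
    return 1 - math.prod(1 - p for p in catch_probs)


def sentence_log_prob(token_probs):
    """Log of the product of word-by-word probabilities.
    Summing logs equals taking the log of the product."""
    return sum(math.log(p) for p in token_probs)


# Five 90%-reliable steps in a row (the example from the lesson).
pipeline = chain_success([0.9] * 5)
print(f"pipeline works: {pipeline:.2f}")  # prints: pipeline works: 0.59

# Three hypothetical checks that each catch a problem 70% of the time.
caught = at_least_one([0.7, 0.7, 0.7])
print(f"at least one catch: {caught:.3f}")  # prints: at least one catch: 0.973

# Hypothetical per-word probabilities for a three-word sentence.
token_probs = [0.2, 0.5, 0.1]
log_score = sentence_log_prob(token_probs)
sentence_prob = math.exp(log_score)  # back to an ordinary probability, about 0.01
```

Before trusting either reliability number for a real pipeline, ask the fine-print question first: if the steps tend to fail together, the independence assumption does not hold and these formulas no longer apply as written.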