Summary: Counts and trials: the binomial distribution

The binomial distribution counts successes in a fixed number of independent yes-or-no trials, and it appears wherever you ask “how many out of N.” How many of 10 emails are spam, how many of 5 predictions are correct, how many of 100 visitors convert: all binomial. It is the discrete counterpart to the previous lesson’s normal bell curve. This summary is the scan-in-five-minutes version of the full lesson.

Core ideas

The four conditions. A fixed number of trials n; two outcomes per trial (success/failure, the single-trial Bernoulli case); a constant success probability p; and independent trials. If any condition fails (for example, p drifts or the trials influence one another), the simple binomial does not apply.
The exactly-k formula. P(exactly k successes) = C(n, k) × p^k × (1 - p)^(n - k): the probability of one specific arrangement (k successes, n - k failures) times C(n, k), the number of such arrangements. Three fair coins give P(2 heads) = 3 × 0.125 = 3/8.
It handles any p. For a model that is 80% accurate over 5 predictions, P(exactly 4 correct) = 5 × 0.8^4 × 0.2, about 0.41. The formula does not care that p is not 0.5.
The expected count shortcut. E[X] = n × p: 5 predictions at 80% gives an expected 4 correct; 100 visitors at 3% gives an expected 3 sign-ups. (The variance is n × p × (1 - p).)
Exactly is not at least. The formula gives the probability of exactly k; “at least k” needs a sum of the exact probabilities for k, k + 1, and so on up to n. “At least one” is easiest as 1 - (1 - p)^n, the complement of zero successes.
In AI. The number of correct predictions a model makes out of n test examples is a binomial count, which is why accuracy measured on a small test set is noisy; conversion and click counts are binomial too; and for large n the binomial smooths toward the normal (the next phase’s central limit theorem).
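
The core ideas above can be checked numerically. What follows is a minimal sketch in Python, not part of the original lesson: the function names binom_pmf and binom_at_least are made up for illustration, and it assumes Python 3.8 or later for math.comb.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials, each with success probability p)."""
    # C(n, k) arrangements, each with probability p^k * (1 - p)^(n - k).
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_at_least(k: int, n: int, p: float) -> float:
    """P(at least k successes): sum the exact probabilities for k, k + 1, ..., n."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


if __name__ == "__main__":
    # Three fair coins, exactly 2 heads: 3 * 0.125 = 3/8.
    print("P(2 heads in 3 flips):", binom_pmf(2, 3, 0.5))

    # 80%-accurate model, exactly 4 of 5 correct: about 0.41.
    print("P(4 of 5 correct):", round(binom_pmf(4, 5, 0.8), 4))

    # Expected count and variance for 100 visitors at a 3% conversion rate.
    n, p = 100, 0.03
    print("expected:", n * p, "variance:", n * p * (1 - p))

    # "At least one" via the complement should agree with the explicit sum.
    print("complement:", 1 - (1 - p) ** n)
    print("explicit sum:", binom_at_least(1, n, p))
```

The complement and the explicit sum agree up to floating-point rounding, but the complement needs one term where the sum needs n.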

What changes for you

You now have the right model for any “how many out of N” question, and the discipline to check that it fits before using it. When you see “the model got 47 of 50 correct” or “3% of visitors signed up,” you recognize a binomial count and can reason about it: the expected number is n times p, the chance of an exact count comes from the formula, and “at least one” is a quick complement. The deeper payoff is the bridge it builds: accuracy measured on a test set is a binomial count divided by n, so it is noisy, and asking how much to trust it is the work of the next phase, where the binomial and the normal meet in the central limit theorem.
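
To get a feel for how noisy a test-set accuracy is, here is a rough sketch built on the variance n × p × (1 - p). It assumes, purely for illustration, that the model's true accuracy is 0.94 (the 47-of-50 figure taken at face value) and that the test examples are independent; accuracy_std is a made-up helper name.

```python
from math import sqrt


def accuracy_std(n: int, p: float) -> float:
    """Standard deviation of observed accuracy (correct / n) when the count is binomial."""
    # The count has variance n * p * (1 - p); dividing the count by n
    # divides its standard deviation by n.
    return sqrt(n * p * (1 - p)) / n


if __name__ == "__main__":
    assumed_p = 0.94  # hypothetical true accuracy, not a measured value
    for n in (50, 500, 5000):
        print(n, round(accuracy_std(n, assumed_p), 4))
```

Under these assumptions the spread at n = 50 is a little over 3 percentage points, and it shrinks by a factor of 10 each time the test set grows by a factor of 100.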