Summary: How sure are we? Confidence intervals
A confidence interval reports an estimate with its uncertainty attached, turning “90% accurate” into “90%, give or take 4 points.” The previous lesson showed every sample estimate has a standard error; a confidence interval puts that to work as a range of plausible values for the truth. The trickiest part is not building it but interpreting it correctly, which almost everyone gets wrong. This summary is the scan-in-five-minutes version of the full lesson.
Core ideas
- Point to interval. A point estimate hides uncertainty; a confidence interval shows it as estimate +/- margin of error.
- The margin. Margin of error = multiplier x standard error. For 95% confidence the multiplier is about 2 (more precisely 1.96, from the normal distribution; the 68-95-99.7 rule rounds it to 2), so a 95% interval is roughly estimate +/- 2 standard errors. Example: 90% accuracy with a 2-point standard error gives roughly [86%, 94%].
- Two width dials. More data shrinks the standard error and narrows the interval (the square-root law); higher confidence (99% vs 95%) uses a bigger multiplier and widens it. A tight, high-confidence interval comes only from more data.
- The correct interpretation. A 95% interval means the procedure is reliable: repeat the sampling many times and about 95% of the intervals built would contain the truth. It is a long-run hit rate of the method.
- The common misreading. It is not “a 95% probability the truth is in this particular interval.” The parameter is fixed (not random); the interval is what varies. It also does not contain 95% of the data.
- In AI. Report metrics with intervals, not bare numbers; when two intervals overlap, treat the results as not clearly distinguishable until you have tested the difference; and a small test set yields a wide interval that honestly signals how little you know.
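The margin formula and the long-run interpretation above can be checked with a short simulation. This is a minimal sketch, not the lesson's own code: the function names are hypothetical, and the standard error of a proportion, sqrt(p(1-p)/n), is an assumption brought in for illustration. The simulation draws many samples from a known truth, builds a 95% interval from each, and counts how often the interval contains that truth.

```python
import math
import random
from statistics import NormalDist


def multiplier(confidence: float) -> float:
    """Two-sided normal multiplier, about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def confidence_interval(estimate, standard_error, confidence=0.95):
    """Interval = estimate +/- multiplier x standard error."""
    margin = multiplier(confidence) * standard_error
    return estimate - margin, estimate + margin


def coverage(true_p=0.9, n=500, trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated intervals that contain the fixed truth true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated test set: n examples, each correct with probability true_p.
        correct = sum(rng.random() < true_p for _ in range(n))
        p_hat = correct / n
        # Assumed standard error of a proportion (not stated in this summary).
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        low, high = confidence_interval(p_hat, se, confidence)
        hits += low <= true_p <= high
    return hits / trials


if __name__ == "__main__":
    print(confidence_interval(0.90, 0.02))        # the 90% +/- 2 SE example
    print(confidence_interval(0.90, 0.02, 0.99))  # higher confidence, wider interval
    print(coverage())                             # long-run hit rate of the method
```

The hit rate from `coverage()` should land near 0.95, though not exactly on it, because the normal approximation is only approximate at finite sample sizes. The truth `true_p` never changes across trials; only the intervals move, which is the point of the correct interpretation.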
What changes for you
You stop trusting bare numbers and start asking for the range. When a model is “90% accurate” or a page “converts at 40%,” your reflex is now “on how much data, and how wide is the interval?” That single question separates a solid result from a noisy one. You also gain immunity to two traps: the seductive misreading that a 95% interval is a 95% chance for this specific range (it is a statement about the method, not this interval), and the leaderboard illusion where a tiny lead between two models vanishes once you notice their intervals overlap. Deciding whether a difference is real, rather than just noting overlap, is the next lesson.
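To make the “on how much data?” reflex concrete, here is a rough sketch of how the interval around a 90% accuracy changes with test-set size. The helper names, the test-set sizes, and the 90% vs 91% model comparison are all hypothetical, and the standard error formula sqrt(p(1-p)/n) is an assumption used for illustration.

```python
import math

Z_95 = 1.96  # 95% multiplier from the normal distribution


def accuracy_interval(accuracy, n, z=Z_95):
    """Approximate 95% interval for an accuracy measured on n test examples."""
    # Assumed standard error of a proportion; shrinks with the square root of n.
    se = math.sqrt(accuracy * (1 - accuracy) / n)
    margin = z * se
    return accuracy - margin, accuracy + margin


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        low, high = accuracy_interval(0.90, n)
        print(f"n={n:>6}: [{low:.1%}, {high:.1%}]  margin = {(high - low) / 2:.1%}")

    # Hypothetical leaderboard: a one-point lead on 1,000 test examples.
    model_a = accuracy_interval(0.90, 1_000)
    model_b = accuracy_interval(0.91, 1_000)
    print("overlap:", intervals_overlap(model_a, model_b))
```

Under these assumptions the margin falls from roughly 5.9 points at 100 examples to about 1.9 at 1,000 and 0.6 at 10,000, and the one-point lead sits well inside the overlap. The overlap check is only a quick screen, not a test of whether the difference is real.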