Summary: The bell curve: the normal distribution
The normal distribution is the bell curve, and the 68-95-99.7 rule plus the z-score make it instantly usable. Named back in the histogram lesson, it describes heights, measurement errors, test scores, and the averages of almost anything. This lesson makes it precise: how a continuous distribution carries probability, what defines a normal, and how to judge how unusual any value is. This summary is the scan-in-five-minutes version of the full lesson.
Core ideas
- Continuous means area. For a continuous distribution, probability is area under a density curve (total area 1); the probability of a range is its area. No single exact value has its own probability.
- The normal, in two numbers. A symmetric bell pinned down by its mean (center) and standard deviation (width). Change the mean to slide it, the standard deviation to widen or narrow it; the shape is always the same.
- The 68-95-99.7 rule. About 68% of values fall within 1 standard deviation of the mean, 95% within 2, 99.7% within 3. For scores with mean 500 and standard deviation 100: about 68% between 400 and 600, 95% between 300 and 700.
- The z-score. z = (value - mean) / standard deviation: how many standard deviations a value sits from the mean. It is the standardization from Phase 1, and its power is comparability: a z of +2 means “near the top” on any scale. Combined with the empirical rule, z = +1 puts about 84% of the distribution below it: the 50% below the mean plus half of the central 68%.
- Why it is everywhere (the next phase’s punchline). The averages and sums of many independent random things tend toward a normal, almost regardless of the distribution they started from, which is why so many real quantities are bell-shaped.
- In AI. The normal underlies feature standardization (z-scores), the default model of noise and weight initialization (Gaussian), and outlier detection (large z-scores). But not all data is normal; skewed data breaks the rule, so check the shape first.
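The z-score and the 68-95-99.7 rule above can be checked numerically. What follows is a minimal sketch, not part of the original lesson: it assumes the score example from the list (mean 500, standard deviation 100) and uses Python's standard-library `statistics.NormalDist`. The helper names `z_score` and `share_within` are made up for illustration.

```python
from statistics import NormalDist

# Assumed example from the list above: scores with mean 500, sd 100.
scores = NormalDist(mu=500, sigma=100)

def z_score(value, mean, sd):
    """How many standard deviations `value` sits from `mean`."""
    return (value - mean) / sd

def share_within(dist, k):
    """Area under the density within k standard deviations of the mean."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

# The empirical rule: roughly 0.68, 0.95 and 0.997 for k = 1, 2, 3.
for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(scores, k):.4f}")

# A score of 600 is one standard deviation above the mean.
z = z_score(600, 500, 100)
# The standard normal (mean 0, sd 1) turns a z-score into a share below.
below = NormalDist().cdf(z)
print(f"z = {z:+.1f}, share below = {below:.4f}")
```

The printed shares should land close to the rounded figures in the rule, and the share below z = +1 should come out near 84%. Because the z-score removes the units, the same `NormalDist().cdf(z)` call works for any normal quantity, whatever its original scale.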
What changes for you
You can now look at any roughly normal quantity and immediately place a value: a z-score tells you how many standard deviations from typical it sits, and the 68-95-99.7 rule turns that into a percentile in your head. That is the everyday move behind reading test percentiles, spotting outliers, and standardizing features for a model. Just as useful is the caution you carry with you: the normal is a powerful default, not a universal truth. Before you trust the bell, you check the histogram, because applying the empirical rule to skewed data is one of the quieter ways to be confidently wrong.
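To make that caution concrete, here is a small sketch under stated assumptions: the data is simulated (a bell-shaped sample and a right-skewed exponential sample, both invented for illustration), and `skewness` and `share_within` are hypothetical helpers, not functions from the lesson. It compares how much of each sample falls within 1 standard deviation of its mean.

```python
import random
from statistics import fmean, pstdev

def skewness(xs):
    """Moment-based skewness: near 0 for symmetric data, positive for a long right tail."""
    m, s = fmean(xs), pstdev(xs)
    return fmean([((x - m) / s) ** 3 for x in xs])

def share_within(xs, k):
    """Fraction of values within k standard deviations of the sample mean."""
    m, s = fmean(xs), pstdev(xs)
    return sum(1 for x in xs if abs(x - m) <= k * s) / len(xs)

random.seed(0)  # fixed seed so the simulated samples are repeatable
bell = [random.gauss(0, 1) for _ in range(50_000)]         # roughly normal
skewed = [random.expovariate(1.0) for _ in range(50_000)]  # long right tail

for name, xs in (("bell", bell), ("skewed", skewed)):
    print(f"{name}: skewness = {skewness(xs):.2f}, "
          f"within 1 sd = {share_within(xs, 1):.3f}")
```

The bell-shaped sample should sit near the 68% the rule predicts, while the skewed sample puts noticeably more of its values within 1 standard deviation, so a percentile read off the rule would be wrong. A skewness well away from 0 is one quick signal to look at the histogram before applying the rule.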