Practice: The bell curve: the normal distribution
Two skills: using the 68-95-99.7 rule to judge how unusual a value is, and computing a z-score to compare values that live on different scales. Keep a scratchpad.
Self-check
Six short questions. Answer each in your head before opening the collapsible.
1. For a continuous distribution, how is probability represented?
Show answer
As area under a density curve. The total area is 1, and the probability of landing in a range is the area over that range. No single exact value has its own probability; you ask about intervals (areas), not points.
2. What two numbers fully describe a normal distribution, and what does each control?
Show answer
The mean (the center, where the peak sits) and the standard deviation (the width). Changing the mean slides the bell left or right; changing the standard deviation makes it wider and flatter or narrower and taller. Same shape, shifted and scaled.
3. State the 68-95-99.7 rule.
Show answer
About 68% of values fall within 1 standard deviation of the mean, about 95% within 2 standard deviations, and about 99.7% within 3 standard deviations. It lets you judge how unusual a value is with quick arithmetic.
4. What is a z-score, and why is it useful?
Show answer
z = (value - mean) / standard deviation: how many standard deviations a value sits above (positive) or below (negative) the mean. It is useful because it strips away the original units, so a z of +2 means “near the top” whether the scale is test scores, heights, or dollars, making different distributions comparable.
5. A value sits at z = +1. Roughly what percent of the distribution is below it?
Show answer
About 84%: the 50% below the mean plus the 34% (half of 68%) between the mean and one standard deviation above. Combining the empirical rule with a z-score gives quick percentile estimates.
6. Is it safe to assume a dataset is normal? What is the risk?
Show answer
No. Many distributions are skewed or bimodal (incomes, response times), and the 68-95-99.7 rule and z-score percentiles only hold for roughly normal data. Applying them to non-normal data gives wrong answers; check the shape (a histogram) first.
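The percentages in answers 3 and 5 can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `share_within` and `z_score` are illustrative choices, not part of the lesson.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def z_score(value: float, mean: float, sd: float) -> float:
    """How many standard deviations `value` sits above (+) or below (-) the mean."""
    return (value - mean) / sd


# The empirical rule: area within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(k):.4f}")

# Question 5: share of the distribution below z = +1.
print(f"below z = +1: {std_normal.cdf(1.0):.4f}")
```

The three areas come out close to 0.6827, 0.9545, and 0.9973, and the share below z = +1 is close to 0.8413, which is where the rounded figures 68, 95, 99.7, and 84% come from.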
Try it yourself: z-scores and the empirical rule
Adult heights in a population are roughly normal with a mean of 170 cm and a standard deviation of 8 cm. Answer each:
1. What is the z-score of a height of 178 cm? Of 186 cm? Of 154 cm?
2. About what percent of people are between 162 cm and 178 cm tall?
3. About what percent are taller than 186 cm?
Show answer
1. z(178) = (178 - 170) / 8 = +1.0 (one sd above)
   z(186) = (186 - 170) / 8 = +2.0 (two sd above)
   z(154) = (154 - 170) / 8 = -2.0 (two sd below)
2. 162 to 178 is mean +/- 1 sd, so about 68% of people.
3. 186 is two sd above the mean. About 95% are within 2 sd (between 154 and 186), leaving 5% outside, split evenly between the two tails. So about 2.5% are taller than 186 cm.
Try it yourself: compare across distributions
A student takes two tests, each roughly normal:
Test A: scored 85, where the class mean was 75 with a standard deviation of 5.
Test B: scored 90, where the class mean was 82 with a standard deviation of 10.
The raw score on Test B is higher. But on which test did the student perform more impressively relative to the class? Use z-scores.
Show answer
z(Test A) = (85 - 75) / 5 = +2.0 (two standard deviations above the class)
z(Test B) = (90 - 82) / 10 = +0.8 (less than one standard deviation above)
Test A is the more impressive performance. Even though 85 is a lower raw score than 90, it sits two standard deviations above its class (roughly the 97.5th percentile by the empirical rule), while the 90 on Test B is only 0.8 standard deviations above (roughly the 79th percentile, which comes from a normal table rather than the empirical rule). The z-score lets you compare performances on different scales fairly, which raw scores cannot.
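Both exercises above can be checked with a few lines of Python. This is a sketch that assumes the heights and test scores are exactly normal, which the exercises only claim approximately; the variable and function names are illustrative.

```python
from statistics import NormalDist


def z_score(value: float, mean: float, sd: float) -> float:
    """How many standard deviations `value` sits from the mean."""
    return (value - mean) / sd


# Heights exercise: mean 170 cm, standard deviation 8 cm.
heights = NormalDist(mu=170, sigma=8)
height_z = {h: z_score(h, 170, 8) for h in (178, 186, 154)}
share_162_to_178 = heights.cdf(178) - heights.cdf(162)
share_above_186 = 1 - heights.cdf(186)

# Test comparison: convert each raw score to a z-score, then to a percentile.
z_a = z_score(85, 75, 5)
z_b = z_score(90, 82, 10)
percentile_a = NormalDist().cdf(z_a) * 100
percentile_b = NormalDist().cdf(z_b) * 100

print(height_z)
print(f"162 to 178 cm: {share_162_to_178:.1%}")
print(f"taller than 186 cm: {share_above_186:.1%}")
print(f"Test A: z = {z_a:+.1f}, percentile about {percentile_a:.1f}")
print(f"Test B: z = {z_b:+.1f}, percentile about {percentile_b:.1f}")
```

The exact normal areas differ slightly from the empirical-rule estimates: the share taller than 186 cm is about 2.3% rather than 2.5%, and z = +2 sits near the 97.7th percentile rather than the 97.5th. The rule rounds 95.45% down to 95%, which accounts for the gap.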
Flashcards
Eight cards. Click any card to reveal the answer. Use the Print flashcards button to lay out the full set as one card per page for offline review.
Q. For a continuous distribution, how is probability represented?
As area under a density curve. Total area = 1; the probability of a range is its area. No single exact value has its own probability.
Q. What two numbers define a normal distribution?
The mean (center, where the peak sits) and the standard deviation (width). Changing them slides and scales the same bell shape.
Q. State the 68-95-99.7 (empirical) rule.
About 68% of values lie within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3. It makes the normal usable without a calculator.
Q. What is a z-score and its formula?
How many standard deviations a value is from the mean: z = (value - mean) / standard deviation. Positive is above the mean, negative below.
Q. Why are z-scores useful for comparison?
They strip away the original units, so a z of +2 means ‘near the top’ on any scale. That lets you compare values from different distributions (test scores vs heights) fairly.
Q. A value is at z = +2. Roughly what percent is above it?
About 2.5%. The empirical rule puts ~95% within 2 sd, leaving ~5% in the two tails, split evenly, so ~2.5% lie above +2 sd.
Q. Where do the normal distribution and z-scores show up in AI?
Feature standardization (a z-score per feature), the Gaussian as the default model of noise and a common choice for neural-net weight initialization, and outlier detection (flagging large z-scores).
Q. Is all data normal? What is the risk of assuming so?
No. Skewed and bimodal data are common; the empirical rule and z-score percentiles only hold for roughly normal data. Assuming normality on non-normal data gives wrong answers. Check the histogram.
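Two of the uses named in the AI flashcard, feature standardization and outlier detection, reduce to the z-score formula applied to a whole column of values. The sketch below uses only the standard library; the sample `readings`, the function names, and the cutoff of 3 standard deviations are assumptions made for illustration, not values from the lesson.

```python
from statistics import mean, pstdev


def standardize(values: list[float]) -> list[float]:
    """Replace each value with its z-score relative to the sample itself."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    return [(v - mu) / sigma for v in values]


def flag_outliers(values: list[float], threshold: float = 3.0) -> list[float]:
    """Return the values whose z-score is larger than `threshold` in magnitude."""
    return [v for v, z in zip(values, standardize(values)) if abs(z) > threshold]


# Made-up sample: fourteen readings near 10 and one far away.
readings = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0,
            11.0, 9.0, 10.0, 10.0, 11.0, 9.0, 10.0, 30.0]

z_scores = standardize(readings)
print(f"mean of z-scores: {mean(z_scores):.3f}")
print(f"sd of z-scores: {pstdev(z_scores):.3f}")
print(f"flagged: {flag_outliers(readings)}")
```

After standardization the column has mean 0 and standard deviation 1 regardless of its original units, which is the point of the technique. The outlier check inherits the caveat from the last card: a z-score cutoff is only meaningful when the bulk of the data is roughly normal, and a large outlier also inflates the mean and standard deviation it is measured against.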