Cheatsheet: The bell curve: the normal distribution

The normal distribution is the bell curve defined by a mean and a standard deviation. The 68-95-99.7 rule and the z-score make it usable at a glance.

Probability = AREA under a density curve. Total area = 1.
P(value in a range) = area over that range. Any single exact value has probability 0 (zero width, so zero area).
Symmetric bell, defined by TWO numbers:
mean = center (where the peak sits)
standard deviation = width (bigger = wider/flatter)
Change mean -> slide left/right. Change sd -> widen/narrow. Same shape always.
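A minimal sketch of "probability = area", assuming Python 3.8+ and its standard-library statistics.NormalDist. The helper area_under_density and the example mean/sd values are illustrative, not part of the original cheatsheet. It adds up thin rectangles under the density curve and compares the result with the exact area from the cumulative distribution function.

```python
from statistics import NormalDist


def area_under_density(dist, lo, hi, steps=10_000):
    """Approximate the area under dist.pdf between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width


bell = NormalDist(mu=0, sigma=1)    # illustrative: mean 0, sd 1
wider = NormalDist(mu=0, sigma=2)   # same mean, larger sd

# Total area: integrate far enough into both tails that the rest is negligible.
total_area = area_under_density(bell, -8, 8)

# P(value between -1 and 1) as an area, versus the exact CDF difference.
range_area = area_under_density(bell, -1, 1)
exact_range = bell.cdf(1) - bell.cdf(-1)

# Curve HEIGHT at the peak is a density, not a probability.
peak_height = bell.pdf(0)
wider_peak_height = wider.pdf(0)    # larger sd -> lower, flatter peak

print(f"total area      ~ {total_area:.6f}")
print(f"area on [-1, 1] ~ {range_area:.4f} (exact {exact_range:.4f})")
print(f"peak height sd=1: {peak_height:.4f}, sd=2: {wider_peak_height:.4f}")
```

Doubling the sd halves the peak height while the total area stays 1, which is the "wider/flatter" behaviour described above.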
within 1 sd of the mean -> about 68%
within 2 sd -> about 95%
within 3 sd -> about 99.7%
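The three percentages can be checked directly. This sketch uses the standard identity P(|Z| < k) = erf(k / sqrt(2)) for a normal variable; the function name within_k_sd is a hypothetical helper for illustration.

```python
import math


def within_k_sd(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sd: {within_k_sd(k):.4%}")
```

The exact figures are slightly off the round numbers (for example, 2 sd covers a bit more than 95%), which is why the rule says "about".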
Scores ~ Normal(mean 500, sd 100):
400 to 600 -> ~68%
300 to 700 -> ~95%
200 to 800 -> ~99.7%
above 700 (z=+2) -> ~2.5%
below 600 (z=+1) -> ~84%
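The score example can be reproduced with the standard library, assuming Python 3.8+ for statistics.NormalDist. The variable names are illustrative. The rule-of-thumb figures come from rounding, so the exact values differ slightly (the tail above 700 is closer to 2.3% than 2.5%).

```python
from statistics import NormalDist

scores = NormalDist(mu=500, sigma=100)

# Area over a range = difference of two cumulative probabilities.
between_400_600 = scores.cdf(600) - scores.cdf(400)
between_300_700 = scores.cdf(700) - scores.cdf(300)

# One-sided areas.
below_600 = scores.cdf(600)         # everything to the left of z = +1
above_700 = 1 - scores.cdf(700)     # right tail beyond z = +2

print(f"400 to 600: {between_400_600:.1%}")
print(f"300 to 700: {between_300_700:.1%}")
print(f"below 600:  {below_600:.1%}")
print(f"above 700:  {above_700:.1%}")
```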
z = (value - mean) / standard deviation (= standardization from Phase 1)
In Normal(mean 500, sd 100):
600 -> z = +1.0
700 -> z = +2.0
450 -> z = -0.5
Use: comparability across scales; z=+1 -> ~84% below; |z| > 2 (or > 3 for a stricter cutoff) -> unusual (possible outlier).
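A short sketch of the z-score formula and of comparing values across scales. The function z_score and the second test (mean 20, sd 5) are hypothetical, added only to illustrate comparability.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd


# The three examples from the cheatsheet, on the 500/100 scale.
z_600 = z_score(600, 500, 100)
z_700 = z_score(700, 500, 100)
z_450 = z_score(450, 500, 100)

# Hypothetical second test on a different scale: mean 20, sd 5.
z_other = z_score(28, 20, 5)

print(z_600, z_700, z_450)
print("28 on the 20/5 scale is further above average than 600 on the 500/100 scale:",
      z_other > z_600)
```

Raw scores of 600 and 28 are not comparable, but their z-scores are: both are now measured in standard deviations from their own mean.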
Where it shows up:
  • Feature standardization = a z-score per feature value.
  • Default noise model and neural-net weight initialization = Gaussian (normal).
  • Outlier detection: flag values beyond 2 to 3 standard deviations.
  • Why so common: averages/sums of many independent things tend to normal (next phase).
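Feature standardization and outlier flagging from the list above can be sketched in a few lines. The data values, the 2.5 threshold (chosen from the 2-to-3 range), and the function names are assumptions for illustration. The sketch uses the population standard deviation; a sample standard deviation would be an equally reasonable choice.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return the z-score of every value, using the data's own mean and sd."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def flag_outliers(values, threshold=2.5):
    """Return the values whose |z| exceeds the threshold."""
    return [v for v, z in zip(values, standardize(values)) if abs(z) > threshold]


# Made-up feature column with one suspicious reading.
feature = [48, 50, 51, 49, 52, 50, 47, 53, 50, 90]

z_values = standardize(feature)
outliers = flag_outliers(feature)

print("standardized mean:", round(mean(z_values), 6))
print("standardized sd:  ", round(pstdev(z_values), 6))
print("flagged:", outliers)
```

One caveat of this approach: an extreme value inflates the mean and sd used to judge it, so in very small samples even a wild point may not reach a cutoff of 3.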
Common mistakes:
  • Assuming everything is normal (skewed/bimodal data breaks the rules; check a histogram).
  • Confusing curve height with probability (probability is area over a range).
  • Forgetting a z-score needs both mean AND standard deviation.
  • Reading 99.7% as “all” (the tails extend forever; extremes are rare, not impossible).
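Two of these mistakes are easy to quantify. The sketch below assumes Python 3.8+ for statistics.NormalDist. The exponential distribution (rate 1, so mean 1 and sd 1) is an illustrative choice of skewed distribution, not something the cheatsheet names.

```python
import math
from statistics import NormalDist

# 1) 99.7% is not "all": the area beyond 3 sd is small but not zero.
std_normal = NormalDist(mu=0, sigma=1)
tail_beyond_3sd = 2 * (1 - std_normal.cdf(3))
one_in_n = 1 / tail_beyond_3sd

# 2) The 68% figure is specific to the normal curve.
# For an exponential distribution with rate 1 (mean 1, sd 1), "within 1 sd"
# is the interval 0 to 2, and P(X < 2) = 1 - e^(-2).
exp_within_1sd = 1 - math.exp(-2)

print(f"beyond 3 sd: {tail_beyond_3sd:.4%} (roughly 1 value in {one_in_n:.0f})")
print(f"skewed example, within 1 sd: {exp_within_1sd:.1%} (not 68%)")
```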
Key terms:
  • Density curve: a curve whose area gives probabilities (total area 1).
  • Normal distribution: the symmetric bell set by a mean and standard deviation.
  • Empirical rule: 68-95-99.7 within 1-2-3 standard deviations.
  • z-score: (value - mean) / standard deviation; standard distance from the mean.