Sampling and the central limit theorem: brief

What you’ll learn

This is lesson 11 of Track 9 (Statistics & Probability for AI) and the opener of Phase 4 (From sample to truth), the inference phase that the whole track has been building toward. The opening lesson noted that AI learns from a sample, never the whole world; this phase makes that precise. You will learn that any number measured on a sample is an estimate with sample-to-sample wobble, how to measure that wobble with the standard error, and how the central limit theorem makes the estimate’s behavior predictable. The source curriculum is Khan Academy’s Statistics & Probability course, by Sal Khan and the Khan Academy team, freely available and cited as further study.

The lesson separates population from sample and parameter from statistic, then introduces the sampling distribution: a statistic is itself a random variable, because its value changes from sample to sample. It gives the standard error of the sample mean, sigma over root n, and the square-root law behind diminishing returns on data (to halve the error you need four times the data). Finally, it states the central limit theorem, the reason sample means are approximately normal for large enough samples, whatever the population’s shape. That pays off the “why is the normal everywhere” preview from the normal-distribution lesson and powers the confidence intervals and tests to come.

Where this fits

This is lesson 11 of 14 and the first lesson of Phase 4. It builds directly on the normal-distribution lesson (the central limit theorem is why the normal applies to estimates) and uses the random-variable idea from earlier in Phase 3. The next two lessons, Confidence intervals and Hypothesis testing, are both built on the standard error and the central limit theorem introduced here; the capstone then ties the whole track to machine learning.

Before you start

Prerequisites: The bell curve: the normal distribution (lesson 9). The central limit theorem is what makes the normal apply to sample means, and z-scores recur here. The random-variable lesson is helpful background. The arithmetic is light, mostly dividing by a square root.

About the math

The only computation is the standard error, sigma divided by the square root of n, worked on a few sample sizes to make the square-root law tangible. The central limit theorem is stated and explained in words and is not derived; the goal is to understand what it guarantees and why it matters, not to prove it.
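
As a preview of that computation, here is a minimal sketch of the standard error worked on a few sample sizes. The population standard deviation of 10 and the sample sizes are illustrative assumptions, not values from the lesson, and standard_error is a hypothetical helper name.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma / math.sqrt(n)

sigma = 10.0                         # assumed population standard deviation (illustrative)
sample_sizes = [25, 100, 400, 1600]  # each size is four times the previous one

errors = {n: standard_error(sigma, n) for n in sample_sizes}
for n, se in errors.items():
    print(f"n = {n:>4}  standard error = {se:.2f}")
```

Each fourfold increase in n cuts the standard error in half, which is the square-root law: precision keeps improving with more data, but each extra unit of precision costs more than the last.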

By the end, you’ll be able to

Distinguish a population parameter from a sample statistic that estimates it
Explain sampling variability and what a sampling distribution is
Use the standard error (sigma over root n) to describe how a sample mean’s precision improves with sample size
State the central limit theorem and explain why it makes the sample mean approximately normal regardless of the population’s shape
Connect sampling to AI (a test-set metric as a sample estimate, why more data tightens an estimate)
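
The sampling-distribution and central-limit-theorem objectives above can be previewed with a small simulation. This is a sketch under stated assumptions, not part of the lesson: the population is taken to be exponential with rate 1 (mean 1, standard deviation 1, strongly right-skewed), and the sample size, number of trials, and seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50                  # size of each sample (assumed)
trials = 5000           # number of repeated samples (assumed)

# Draw many samples from a skewed population and record each sample's mean.
# The collection of means approximates the sampling distribution of the mean.
means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]

mean_of_means = statistics.fmean(means)  # should sit near the population mean, 1
sd_of_means = statistics.stdev(means)    # observed sample-to-sample wobble
predicted_se = 1.0 / math.sqrt(n)        # sigma over root n, with sigma = 1

# For a normal curve, about 68% of values fall within one standard deviation
# of the center, so roughly that share of sample means should land here.
within_one_se = sum(abs(m - 1.0) <= predicted_se for m in means) / trials

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"observed spread:      {sd_of_means:.3f}")
print(f"predicted std error:  {predicted_se:.3f}")
print(f"share within 1 SE:    {within_one_se:.3f}")
```

Even though individual draws are far from bell-shaped, the observed spread of the sample means should come out close to sigma over root n, and the share within one standard error should be close to the normal curve's 68%. A test-set metric behaves the same way: it is one draw from a sampling distribution like this one.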

Time and difficulty

Read time: about 12 minutes
Practice time: about 15 minutes (a self-check, a standard-error and square-root-law computation, a which-estimate-do-you-trust exercise, and flashcards)
Difficulty: standard (one formula and one big idea; the central limit theorem is conceptual, not computational)