Summary: Random variables and expected value
A random variable is a number whose value comes from chance, and its expected value is the long-run average it settles toward. Phase 2 reasoned about whether events happen; this lesson reasons about numbers that depend on chance (a payoff, a count, a loss) and about the single most useful summary of such a number: its expected value. That quantity is the backbone of how machine-learning systems are trained. This summary is the scan-in-five-minutes version of the full lesson.
Core ideas
- Random variable. A number set by a random process: a die roll, a count of tickets, a payoff. Discrete ones have listable values (counts, dice); continuous ones fill a range (heights, times), the next lesson’s focus. Each comes with a probability distribution whose probabilities sum to 1.
- Expected value, the long-run average. E[X] = sum of (value × its probability), the probability-weighted average. A fair die’s expected value is (1+2+3+4+5+6)/6 = 3.5, a number it can never show: the expectation is a long-run average, not a single prediction or an achievable outcome.
- Use it to compare options. A game paying +$10 with probability 0.2 and -$3 with probability 0.8 has expected value 2 - 2.4 = -$0.40 per play, a long-run loser even though you sometimes win. The better expected value is the better long-run bet.
- Variance is risk. Variance is the probability-weighted average squared distance from the expected value; its square root is the standard deviation. A fair $1 coin bet (win or lose $1) has expected value 0 and standard deviation 1; a $100 bet on the same coin also has expected value 0 but standard deviation 100. Same expected value, more variance, more risk in any single outcome.
- It is the core of AI objectives. The loss a model minimizes is an expected error; the reward an agent maximizes is an expected payoff; reported performance is an expected value over the data. “Minimize the loss” and “maximize reward” are expected-value statements.
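The definitions above can be checked with a few lines of Python. This is a minimal sketch, not part of the original lesson: the helper names (`expected_value`, `variance`, `std_dev`) and the dictionary representation of a distribution are assumptions chosen for illustration, and the $100 coin bet is an extra case added here for comparison with the $1 bet.

```python
from math import isclose, sqrt


def expected_value(dist):
    """Probability-weighted average of a {value: probability} mapping."""
    return sum(value * prob for value, prob in dist.items())


def variance(dist):
    """Probability-weighted average squared distance from the expected value."""
    mu = expected_value(dist)
    return sum(prob * (value - mu) ** 2 for value, prob in dist.items())


def std_dev(dist):
    """Square root of the variance."""
    return sqrt(variance(dist))


# Discrete distributions written as {value: probability}.
die = {face: 1 / 6 for face in range(1, 7)}   # fair six-sided die
game = {10: 0.2, -3: 0.8}                     # +$10 w.p. 0.2, -$3 w.p. 0.8
coin_1 = {1: 0.5, -1: 0.5}                    # fair $1 coin bet
coin_100 = {100: 0.5, -100: 0.5}              # hypothetical $100 coin bet

for name, dist in [("die", die), ("game", game),
                   ("$1 coin", coin_1), ("$100 coin", coin_100)]:
    # Every distribution's probabilities must sum to 1.
    assert isclose(sum(dist.values()), 1.0)
    print(f"{name}: E[X] = {expected_value(dist):.2f}, "
          f"SD = {std_dev(dist):.2f}")
```

Up to floating-point rounding, the die's expected value comes out at 3.5 and the game's at -0.40, while the two coin bets share an expected value of 0 and differ only in standard deviation.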
What changes for you
You gain the vocabulary behind the two phrases you will hear most about how AI is trained: minimizing a loss and maximizing a reward. Both are expected values, long-run averages of a number that depends on chance, and now you know exactly what that means and how to compute one. You also carry forward the center-and-spread discipline from Phase 1 into the world of distributions: when you see an expected outcome (an average reward, an expected error, an expected payoff), you automatically ask the second question: how much does it vary? Two options with the same expected value can carry very different risk, and the variance is what tells them apart.
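As a rough illustration of asking that second question, the sketch below treats a model's loss as an average of per-example errors and then looks at the spread. The two error lists are made-up numbers for two hypothetical models, not results from any real system, and the function name `summarize` is an assumption.

```python
from statistics import fmean, pstdev

# Hypothetical per-example squared errors for two models (made-up numbers).
errors_a = [0.9, 1.0, 1.1, 1.0]   # consistently mediocre
errors_b = [0.0, 0.1, 0.0, 3.9]   # usually excellent, occasionally far off


def summarize(errors):
    """Return (average loss, standard deviation) over the examples."""
    return fmean(errors), pstdev(errors)


mean_a, spread_a = summarize(errors_a)
mean_b, spread_b = summarize(errors_b)

print(f"model A: average loss = {mean_a:.2f}, spread = {spread_a:.2f}")
print(f"model B: average loss = {mean_b:.2f}, spread = {spread_b:.2f}")
```

Both hypothetical models have the same average loss of about 1.0, so the average alone cannot separate them; the much larger standard deviation for model B is what reveals the difference in risk.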