Summary: Overfitting and the bias-variance tradeoff

Bias is the error from a model too simple to capture the pattern (underfitting); variance is the error from a model too sensitive to the specific training data (overfitting). The two trade off: total generalization error is U-shaped in model complexity, and you diagnose which side you are on by comparing training error with test error. This summary is the scan version of the full lesson, which opens the final evaluation phase.

Core ideas

Two ways to be wrong. Bias: model is too simple to capture the true pattern (underfits). Variance: model is too sensitive to its specific training sample (overfits, chasing noise).
Total error = bias^2 + variance + irreducible noise. The first two are what you control; the third is the inherent randomness in the data, a hard floor.
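Written out, the decomposition looks roughly like the following. This is a sketch under the usual textbook assumptions, none of which the summary states explicitly: squared-error loss, data generated as y = f(x) + eps with zero-mean noise of variance sigma^2, and an expectation taken over both the random training sets and the noise. The symbols f, \hat{f}, and \sigma are introduced here for illustration.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The first term measures how far the average fitted model is from the truth, the second how much the fitted model moves when the training sample changes, and the third does not depend on the model at all.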
The tradeoff. Making a model more flexible lowers bias but raises variance, and vice versa. Total error is a U-shape over model complexity, with the sweet spot in the middle.
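One way to see the U-shape is to fit polynomials of increasing degree to the same noisy sample and compare training and test error. The sketch below assumes NumPy is available; the sine-shaped ground truth, the noise level, the sample sizes, and the function name poly_errors are all illustrative choices, not part of the lesson.

```python
import numpy as np

def poly_errors(degrees, n_train=30, n_test=200, noise=0.3, seed=0):
    """Return {degree: (train_mse, test_mse)} for polynomial fits to noisy data."""
    rng = np.random.default_rng(seed)

    def true_fn(x):
        # Assumed ground-truth pattern, chosen only for illustration.
        return np.sin(np.pi * x)

    x_train = rng.uniform(-1, 1, n_train)
    y_train = true_fn(x_train) + rng.normal(0, noise, n_train)
    x_test = rng.uniform(-1, 1, n_test)
    y_test = true_fn(x_test) + rng.normal(0, noise, n_test)

    results = {}
    for d in degrees:
        # Higher degree means a more flexible model.
        coeffs = np.polyfit(x_train, y_train, d)
        train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        results[d] = (float(train_mse), float(test_mse))
    return results

errors = poly_errors(degrees=[1, 3, 5, 9, 12])
for d, (tr, te) in errors.items():
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Training error can only fall as the degree grows, because each polynomial family contains the previous one. Test error typically drops sharply once the model can bend enough to follow the pattern, then tends to creep back up as the extra flexibility starts fitting noise; how pronounced the upturn is depends on the sample and the seed.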
The diagnostic (the practical capability):
- Training error HIGH, test error HIGH (similar) -> high bias, underfitting.
- Training error LOW, test error HIGH (big gap) -> high variance, overfitting.
- Training error LOW, test error LOW (small gap) -> good fit.
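The three rows above can be sketched as a small function. The thresholds acceptable_error and max_gap are hypothetical, problem-specific inputs that the summary does not define; what counts as "high" or "a big gap" depends on the task and on the irreducible noise floor.

```python
def diagnose(train_error, test_error, acceptable_error, max_gap):
    """Classify a fit from its training and test error.

    acceptable_error and max_gap are hypothetical thresholds supplied by
    the caller; the lesson summary does not prescribe values for them.
    """
    if train_error > acceptable_error:
        # The model cannot even fit the data it was trained on.
        return "high bias (underfitting)"
    if test_error - train_error > max_gap:
        # Fits the training data well but does not generalize.
        return "high variance (overfitting)"
    return "good fit"

print(diagnose(0.30, 0.32, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.02, 0.25, acceptable_error=0.10, max_gap=0.05))
print(diagnose(0.04, 0.06, acceptable_error=0.10, max_gap=0.05))
```

Checking training error first is a deliberate simplification: a model with high training error is treated as underfitting regardless of the gap, since reducing variance will not help a model that cannot fit its own training data.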
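Regularization (ridge L2, lasso L1) adds a penalty on coefficient size to the training objective: ridge penalizes the sum of squared coefficients, lasso the sum of their absolute values. The penalty pushes the model toward lower variance at the cost of somewhat higher bias.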
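To make the ridge penalty concrete, here is a minimal sketch for the simplest possible case: one feature, no intercept. Minimizing sum((y - w*x)^2) + lam * w^2 has the closed-form solution w = sum(x*y) / (sum(x*x) + lam). The function name ridge_slope, the penalty name lam, and the toy data are illustrative assumptions; lasso has no equally simple closed form in general and is not shown.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate of the slope w for a one-feature model y = w * x."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # lam = 0 recovers ordinary least squares; larger lam shrinks w toward 0.
    return sxy / (sxx + lam)

# Toy data, made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.2, 1.9, 3.2, 3.8]

for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lam = {lam:5.1f}: slope = {ridge_slope(xs, ys, lam):.3f}")
```

The slope shrinks steadily as lam grows. A smaller coefficient reacts less to any one training sample (lower variance) but can no longer match the steepness of the true trend (higher bias), which is the tradeoff in miniature.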
Where the toolbox sits: linear/logistic regression are low-variance, high-bias; deep trees are high-variance, low-bias; random forests cut variance by averaging many trees; boosting cuts bias by fitting models in sequence, each one correcting the errors of those before it, and can overfit if pushed too far.
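The variance-reduction effect of averaging can be illustrated with a small simulation. This is an idealized sketch, not a random forest: each "model" is just an independent noisy guess around the same true value, and the function name spread_of_average and all parameter values are illustrative assumptions.

```python
import random
import statistics

def spread_of_average(n_models, n_trials=2000, noise_sd=1.0, seed=0):
    """Variance of the average of n_models independent noisy predictions."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        # Each prediction is the true value (0.0) plus independent noise.
        preds = [rng.gauss(0.0, noise_sd) for _ in range(n_models)]
        averages.append(sum(preds) / n_models)
    return statistics.pvariance(averages)

single = spread_of_average(1)
ensemble = spread_of_average(25)
print(f"variance of one model:      {single:.3f}")
print(f"variance of 25-model mean:  {ensemble:.3f}")
```

With fully independent predictions, the variance of the average falls roughly in proportion to the number of models. Trees in a real forest are trained on overlapping data and are correlated, so the reduction is smaller in practice, which is why forests also randomize the features each tree may use.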

What changes for you

You now have the right name for everything you have been waving at. “Overfitting” is high variance; “underfitting” is high bias; “the model is too complex” or “too simple” is a guess at where it sits on the U-curve. More valuable is the diagnostic instinct: when you fit a model, look at training and test error together; their levels and the gap between them tell you what to do next. High training error means add complexity; low training error but high test error means reduce complexity, get more data, or regularize. That ability to read two numbers and know which lever to pull is the difference between guessing and engineering. Modern deep learning complicates the picture (double descent at scale), but for the classical toolbox here the U-curve holds. The next lesson is about making the test-error number trustworthy in the first place, through train/test splits and cross-validation.