
Overfitting and the bias-variance tradeoff

This is lesson 13 of Track 10 and the opener of Phase 4 (Knowing whether your model is any good). By the end you will be able to diagnose underfitting versus overfitting by reading a model’s training error and test error together, a foundational skill in applied machine learning. The one capability to walk away with: given a model’s training error and test error, tell which side of the bias-variance U-curve you are on and name the right next move.
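To make "read the two errors together" concrete, here is a minimal sketch of that diagnosis as a function. The function name `diagnose` and both thresholds (`acceptable_error`, `gap_tolerance`) are assumptions chosen for illustration, not values from this lesson; in practice what counts as "acceptable" depends on the problem.

```python
def diagnose(train_error, test_error, acceptable_error=0.10, gap_tolerance=0.05):
    """Classify a fit from a (training error, test error) pair.

    Errors are fractions (0.10 means 10%). Both thresholds are
    illustrative assumptions, not rules from the lesson.
    Returns a (label, suggested_next_move) tuple.
    """
    gap = test_error - train_error

    # High training error: the model cannot even fit the data it has seen.
    if train_error > acceptable_error:
        return ("high bias", "try a more flexible model or better features")

    # Low training error but much worse test error: the model memorized noise.
    if gap > gap_tolerance:
        return ("high variance", "try more data, regularization, or a simpler model")

    # Both errors low and close together.
    return ("good fit", "keep the model and measure it carefully")


for pair in [(0.30, 0.32), (0.01, 0.25), (0.05, 0.07)]:
    label, next_move = diagnose(*pair)
    print(pair, "->", label, "|", next_move)
```

This sketch checks training error first, so a model with both a high training error and a large gap is reported as high bias only; a fuller version might report both problems.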

The track structurally mirrors StatQuest’s intuition-first machine learning videos, with Microsoft’s “ML For Beginners” as the hands-on companion for readers who want to build the models in code. Full attribution is in this lesson’s references.

The first three phases built a substantial toolbox: regression, classification, ensembles, clustering, compression. Phase 4 turns to the question hovering over every one of them: how do you know a model is actually any good? This lesson is the foundation of that phase, because bias and variance are the framework for understanding how a model can be bad. The next lesson, cross-validation, is about measuring it honestly, and the final lesson is about choosing the right metric to measure it with. Together the three lessons make the difference between guessing and engineering.

Prerequisite: Lesson 1, What machine learning actually is. You need the core rule planted there, that a model is judged on data it has never seen, because this lesson formalizes exactly how a model can fail to generalize. The lesson also synthesizes patterns from across the track (linear regression, trees, forests, boosting, SVMs); you do not need to remember every detail of those lessons, but having seen them helps the bias-variance lens land.

  • Define bias (underfitting) and variance (overfitting)
  • Explain the tradeoff and the U-shape of total error vs complexity
  • Diagnose high bias, high variance, or a good fit from training and test error
  • Place each method from the track on the bias-variance spectrum
  • Explain regularization (ridge, lasso) as the standard dial for reducing variance
  • Read time: about 12 minutes
  • Practice time: about 15 minutes (a diagnose-from-numbers exercise, a place-the-method question, and flashcards)
  • Difficulty: standard
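As a preview of the training-versus-test pattern named in the objectives above, the sketch below fits polynomials of increasing degree to noisy data and prints both errors. Everything here is an assumption made for illustration: the synthetic sine data, the noise level, the sample sizes, and the choice of degrees 1, 3, and 9. It uses NumPy's `polyfit` and `polyval` rather than any model from the track.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so runs are repeatable


def make_data(n):
    """Draw n noisy samples from an assumed true curve, y = sin(3x) + noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(3.0 * x) + rng.normal(0.0, 0.3, n)
    return x, y


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


x_train, y_train = make_data(30)    # small training set
x_test, y_test = make_data(200)     # held-out data the model never sees

results = {}
for degree in (1, 3, 9):            # low, medium, and high complexity
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares fit
    results[degree] = (mse(coeffs, x_train, y_train),
                       mse(coeffs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

Training error can only go down as the degree rises, because each higher-degree polynomial includes the lower-degree ones as special cases. Test error is the number to watch: the straight line is too rigid for a sine curve, and a very flexible polynomial typically starts fitting the noise, which is where the gap between the two errors opens up.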