References: Overfitting and the bias-variance tradeoff

Source material (conceptual spine):
• StatQuest with Josh Starmer: "Machine Learning Fundamentals: Bias and Variance"
Creator: Josh Starmer
YouTube: https://www.youtube.com/watch?v=EuBBz3bI-aA
Channel / site: https://statquest.org/
License: as published on StatQuest's public YouTube channel (link-out only)
Source material (hands-on companion):
• Microsoft: "ML For Beginners" (introductory modules; evaluation appears across
the curriculum)
Repository: https://github.com/microsoft/ML-For-Beginners
License: MIT
Clawdemy provides original notes, summaries, and quizzes derived from this material
for educational purposes. All rights to the original videos and curriculum remain
with their creators.
  • StatQuest’s “Bias and Variance” anchors the definitions of the two failure modes, the tradeoff intuition, and the U-curve of total error over model complexity. The explicit train-vs-test diagnostic table, the placement of every method from the track on the spectrum, and the double-descent caveat are developed in this lesson as its own synthesis.
  • Microsoft’s ML-For-Beginners is the hands-on companion for evaluating models in Python, including computing training and test error for the diagnostic in practice.
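The train-vs-test diagnostic mentioned above can be sketched in a few lines. This is a minimal illustration, not code from StatQuest or ML-For-Beginners: it uses plain Python with a hand-written k-nearest-neighbors regressor instead of scikit-learn, and the synthetic sine data, the 60/20 split, and the names `knn_predict`, `mse`, and `results` are all assumptions made for the example.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, data, k):
    """Mean squared error over `data` of the k-NN model built from `train`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


# Synthetic data (an assumption for this sketch): noisy samples of sin(x).
rng = random.Random(0)  # fixed seed so the run is repeatable
points = []
for _ in range(80):
    x = rng.uniform(0.0, 6.0)
    points.append((x, math.sin(x) + rng.gauss(0.0, 0.3)))

train, test = points[:60], points[60:]

# Small k = flexible model (variance-prone); large k = rigid model (bias-prone).
results = {}
for k in (1, 7, len(train)):
    results[k] = (mse(train, train, k), mse(train, test, k))
    print(f"k={k:>2}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

With k = 1 the model memorizes the training set, so training error is exactly zero while test error is not; that gap is the high-variance signature. With k equal to the training-set size, every prediction is the training mean, so training error is no longer small; that is the high-bias signature.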

The placement-of-methods table tying bias-variance back to every model in the track is Clawdemy’s own connective tissue, and the brief regularization callout deliberately folds in ridge and lasso, as noted in this track’s Phase 0.

  • StatQuest with Josh Starmer. The bias-variance fundamentals video, plus StatQuest’s regularization series (ridge regression, lasso, elastic net) for the standard dial that lowers variance.
  • Microsoft ML-For-Beginners. Hands-on evaluation in scikit-learn; the training and test error computations the diagnostic depends on.
  • Cross-validation (the next lesson). Makes the test-error estimate trustworthy by averaging over several splits, so the diagnostic in this lesson is not at the mercy of one lucky or unlucky holdout.
  • Regularization (ridge, lasso, elastic net). The standard tools for moving leftward on the U-curve in linear models, toward lower complexity and lower variance at the cost of some bias. Implicit cousins (max-depth limits, pruning, early stopping in boosting) play the same role in tree-based models.
  • Double descent. A recent finding that very large neural networks can have a second descent of test error past the apparent overfitting peak; the classical U-curve is not the whole story at deep-learning scale. Outside this track’s scope.
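Two of the ideas listed above, averaging held-out error over several splits and using a ridge penalty as the variance dial, can be illustrated together. The following is a rough sketch under stated assumptions rather than the approach of any cited resource: it uses plain Python instead of scikit-learn, a one-feature model with no intercept so that ridge has a one-line closed form, synthetic data, and hypothetical helper names (`ridge_slope`, `k_fold_indices`, `cv_mse`).

```python
import random


def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for the one-feature model y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    bounds = [i * n // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]


def cv_mse(xs, ys, lam, k=5):
    """Average held-out MSE over k folds for a given penalty `lam`."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        w = ridge_slope(train_x, train_y, lam)  # fit on the other k - 1 folds
        fold_errors.append(
            sum((w * xs[i] - ys[i]) ** 2 for i in held_out) / len(held_out)
        )
    return sum(fold_errors) / len(fold_errors)


# Synthetic data (an assumption for this sketch): true slope 2.0 plus noise.
# The samples are already in random order, so contiguous folds are acceptable.
rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

for lam in (0.0, 1.0, 10.0, 100.0):
    slope = ridge_slope(xs, ys, lam)
    print(f"lambda={lam:>5}  slope={slope:.3f}  CV MSE={cv_mse(xs, ys, lam):.3f}")
```

Raising `lam` shrinks the fitted slope toward zero, which is the leftward move on the U-curve; a very large penalty underfits, and the cross-validated error rises accordingly. Because each score is an average over five held-out folds, it depends less on any single lucky or unlucky split.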

None selected for this lesson. The bias-variance framework is well covered by the StatQuest resource above. If a canonical discussion surfaces, it will be added at the next review.