References: Overfitting and the bias-variance tradeoff
Source material
Source material (conceptual spine):
• StatQuest with Josh Starmer: "Machine Learning Fundamentals: Bias and Variance"
  Creator: Josh Starmer
  YouTube: https://www.youtube.com/watch?v=EuBBz3bI-aA
  Channel / site: https://statquest.org/
  License: as published on StatQuest's public YouTube channel (link-out only)
Source material (hands-on companion):
• Microsoft: "ML For Beginners" (introductory modules; evaluation appears across the curriculum)
  Repository: https://github.com/microsoft/ML-For-Beginners
  License: MIT
Clawdemy provides original notes, summaries, and quizzes derived from this material for educational purposes. All rights to the original videos and curriculum remain with their creators.
What this lesson draws from each source
- StatQuest’s “Bias and Variance” anchors the two-failure-modes definitions, the tradeoff intuition, and the U-curve of total error over model complexity. The explicit train-vs-test diagnostic table, the placement of every method from the track on the spectrum, and the double-descent caveat are built out here as the lesson’s synthesis.
- Microsoft’s ML-For-Beginners is the hands-on companion for evaluating models in Python, including computing training and test error for the diagnostic in practice.
The placement-of-methods table tying bias-variance back to every model in the track is Clawdemy’s own connective tissue, and the brief regularization callout is the deliberate folding-in of ridge and lasso noted in this track’s Phase 0.
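The train-versus-test diagnostic mentioned above can be sketched in a few lines. This is a minimal illustration, not code from either source: it uses NumPy polynomial fits on synthetic data instead of the scikit-learn workflow in ML-For-Beginners (where `train_test_split` and `mean_squared_error` would play the same roles). The helper name `train_test_mse`, the sine-plus-noise data, and the chosen degrees are all assumptions made for the example.

```python
import numpy as np

def train_test_mse(degree, x_train, y_train, x_test, y_test):
    """Fit a polynomial of the given degree on the training split only,
    then return (training MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return train_mse, test_mse

# Synthetic data (assumption): a smooth curve plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(3.0 * x) + rng.normal(scale=0.3, size=x.shape)

# One fixed holdout split: first half for training, second half for testing.
x_train, y_train = x[:30], y[:30]
x_test, y_test = x[30:], y[30:]

# Low, moderate, and high model complexity.
results = {d: train_test_mse(d, x_train, y_train, x_test, y_test) for d in (1, 3, 12)}
for degree, (train_mse, test_mse) in results.items():
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Reading the output follows the diagnostic: both errors high suggests high bias (underfitting), while a low training error with a noticeably higher test error suggests high variance (overfitting). Training error can only fall as the degree grows, so it is the gap to test error that carries the signal.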
Going deeper
- StatQuest with Josh Starmer. The bias-variance fundamentals video plus StatQuest’s regularization series (ridge regression, lasso, elastic net) for the standard low-variance dial.
- Microsoft ML-For-Beginners. Hands-on evaluation in scikit-learn; the training and test error computations the diagnostic depends on.
Adjacent topics
- Cross-validation (the next lesson). Makes the test-error estimate trustworthy by averaging over several splits, so the diagnostic in this lesson is not at the mercy of one lucky or unlucky holdout.
- Regularization (ridge, lasso, elastic net). The standard tools for moving leftward on the U-curve in linear models. Implicit cousins (max-depth limits, pruning, early stopping in boosting) play the same role in tree-based models.
- Double descent. A recent finding that very large neural networks can have a second descent of test error past the apparent overfitting peak; the classical U-curve is not the whole story at deep-learning scale. Outside this track’s scope.
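As a rough sketch of how the first two items fit together, the example below averages test error over several splits (k-fold cross-validation) while turning a ridge penalty up and down. It is an illustration under stated assumptions, not the method of either source: the data are synthetic, the helper names (`kfold_indices`, `ridge_fit`, `cross_val_mse`) are hypothetical, and ridge is solved in closed form without an intercept to keep the block NumPy-only. In scikit-learn, `Ridge` and `cross_val_score` cover the same ground.

```python
import numpy as np

def kfold_indices(n_samples, k, rng):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    return np.array_split(rng.permutation(n_samples), k)

def ridge_fit(X, y, alpha):
    """Closed-form ridge weights: solve (X^T X + alpha * I) w = X^T y.
    No intercept term; alpha = 0 reduces to ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cross_val_mse(X, y, alpha, folds):
    """Hold out each fold once, train on the rest, return per-fold test MSEs."""
    scores = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], alpha)
        residual = X[test_idx] @ w - y[test_idx]
        scores.append(float(np.mean(residual ** 2)))
    return np.array(scores)

# Synthetic data (assumption): 15 features, only 3 of which matter.
rng = np.random.default_rng(0)
n_samples, n_features = 40, 15
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]
y = X @ true_w + rng.normal(scale=1.0, size=n_samples)

# The same folds are reused for every alpha so the comparison is like for like.
folds = kfold_indices(n_samples, k=5, rng=rng)
cv_results = {a: cross_val_mse(X, y, a, folds) for a in (0.0, 1.0, 10.0, 100.0)}
for alpha, scores in cv_results.items():
    print(f"alpha={alpha:6.1f}  mean CV MSE={scores.mean():.3f}  std across folds={scores.std():.3f}")
```

The spread across folds shows why a single holdout can mislead, and the mean across folds is the number to compare between penalty strengths. A larger `alpha` shrinks the weights, which trades some bias for lower variance.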
Community discussion
None selected for this lesson. The bias-variance framework is well covered by the StatQuest resource above. If a canonical discussion surfaces, it will be added at the next review.