
References: Train, test, and cross-validation

Source material (conceptual spine):
• StatQuest with Josh Starmer: "Machine Learning Fundamentals: Cross Validation"
Creator: Josh Starmer
YouTube: https://www.youtube.com/watch?v=fSytzGwwBVw
Channel / site: https://statquest.org/
License: as published on StatQuest's public YouTube channel (link-out only)
Source material (hands-on companion):
• Microsoft: "ML For Beginners" (evaluation appears across modules)
Repository: https://github.com/microsoft/ML-For-Beginners
License: MIT
Clawdemy provides original notes, summaries, and quizzes derived from this material
for educational purposes. All rights to the original videos and curriculum remain
with their creators.
  • StatQuest’s “Cross Validation” anchors the k-fold procedure, why a single split is unstable, and the role of averaging across folds. The validation-set-vs-test-set distinction, the worked 5-fold computation, and the explicit data-leakage taxonomy (four common traps) are built out here as the lesson’s practical core.
  • Microsoft’s ML-For-Beginners is the hands-on companion for running cross-validation in scikit-learn, including stratified splits and pipelines that fit preprocessing inside cross-validation correctly.

The “what question to ask of any reported accuracy” framing and the explicit setup of metrics for the next lesson are Clawdemy’s own.

  • StatQuest with Josh Starmer. The cross-validation video plus material on training, testing, and the related fundamentals.
  • Microsoft ML-For-Beginners. Hands-on lessons using scikit-learn’s train_test_split, KFold, and Pipeline (which is the standard tool for fitting preprocessing inside CV correctly).
  • Pipelines (in scikit-learn or similar). The standard way to wrap preprocessing with the model so cross-validation refits each step on each training fold, automatically preventing the preprocessing-leak trap named here.
  • Nested cross-validation. A more careful procedure for when you tune hyperparameters and also want an unbiased generalization estimate: an inner CV tunes, while an outer CV evaluates the tuned model on folds the tuning never saw. Outside this track’s scope, but worth knowing it exists.
  • Confusion matrix, precision, recall, ROC (the next lesson). The metrics to evaluate with, once the evaluation procedure itself is set up correctly.
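
The pipeline and nested cross-validation items above can be sketched in a few lines of scikit-learn. This is a minimal illustration, not code from StatQuest or ML-For-Beginners. The dataset (scikit-learn's bundled breast cancer data), the logistic regression model, the candidate values for C, and the fold counts are all assumptions chosen for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: a small binary classification set shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Wrapping the scaler and the model together means cross-validation refits the
# scaler on each training fold only, so the held-out fold never influences it.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Stratified 5-fold CV: each fold keeps roughly the same class proportions.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=outer_cv)  # accuracy per fold

print("per-fold accuracy:", scores.round(3))
print("mean accuracy:", scores.mean().round(3))

# Nested CV: the inner loop tunes C, the outer loop scores the tuned model
# on folds that the tuning step never saw.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
search = GridSearchCV(model, {"clf__C": [0.1, 1.0, 10.0]}, cv=inner_cv)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)

print("nested CV mean accuracy:", nested_scores.mean().round(3))
```

The leaky alternative would be to call StandardScaler().fit_transform(X) once on the full dataset before cross-validating. That lets statistics from every held-out fold shape the preprocessing, which is the trap the pipeline avoids.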

None selected for this lesson. Cross-validation is well covered by the StatQuest and Microsoft resources above. If a canonical discussion surfaces, it will be added at the next review.