Train, test, and cross-validation

This is lesson 14 of Track 10, in Phase 4 (Knowing whether your model is any good). By the end you will be able to design a cross-validation scheme and explain in plain words why you never report performance on training data. The one capability to walk away with: given a project, lay out a sound evaluation setup (split structure plus cross-validation choice) and recognize the four common data-leakage traps that silently invalidate scores.

The track structurally mirrors StatQuest’s intuition-first machine learning videos, with Microsoft’s “ML For Beginners” as the hands-on companion for readers who want to build the models in code. Full attribution is in this lesson’s references.

The previous lesson handed you the train-vs-test diagnostic for bias and variance, with a quiet assumption: that the test error you read is honest. This lesson backs up that assumption with the standard machinery (splits, validation sets, k-fold cross-validation) and names the leakage mistakes that quietly break it. The final lesson then turns to what to measure, since accuracy is only one metric, and on imbalanced problems it can be badly misleading.

Prerequisite: Lesson 13, Overfitting and the bias-variance tradeoff. You need the idea of training error vs test error and the diagnostic that compares them, because this lesson is about how to get those numbers honestly. No new math; the worked example is a 5-fold cross-validation average.

  • Explain why training accuracy is never a real measure of generalization
  • Use a train/validation/test split correctly
  • Design a k-fold cross-validation scheme and compute its averaged score
  • Recognize when to use stratified, leave-one-out, or time-series CV
  • Identify and prevent the four common data-leakage traps

  • Read time: about 12 minutes
  • Practice time: about 15 minutes (a CV-averaging computation, a spot-the-leakage exercise, and flashcards)
  • Difficulty: standard
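
As a preview of the k-fold averaging named in the objectives, here is a minimal sketch in plain Python. It assumes contiguous, unshuffled folds; the helper name `k_fold_indices` and the five per-fold accuracies are hypothetical, chosen only to illustrate the arithmetic, and are not results from any real model.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs so each sample is validated exactly once.

    Hypothetical helper for illustration: folds are contiguous and unshuffled.
    """
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        # Everything outside the validation fold is used for training.
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


# Split 10 samples into 5 folds of 2 validation samples each.
folds = list(k_fold_indices(10, 5))

# Hypothetical validation accuracies, one per fold (made up for this sketch).
fold_scores = [0.80, 0.85, 0.75, 0.90, 0.80]

# The cross-validation score is the plain average of the per-fold scores.
cv_score = mean(fold_scores)
print(f"5-fold CV accuracy: {cv_score:.2f}")
```

In practice the rows are usually shuffled before splitting, except for time-ordered data, where shuffling would itself leak future information into training.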