Train, test, and cross-validation
What you’ll learn
This is lesson 14 of Track 10, in Phase 4 (Knowing whether your model is any good). By the end you will be able to design a cross-validation scheme and explain in plain words why you never report performance on training data. The one capability to walk away with: given a project, lay out a sound evaluation setup (split structure plus cross-validation choice) and recognize the four common data-leakage traps that silently invalidate scores.
The track structurally mirrors StatQuest’s intuition-first machine learning videos, with Microsoft’s “ML For Beginners” as the hands-on companion for readers who want to build the models in code. Full attribution is in this lesson’s references.
Where this fits
The previous lesson handed you the train-vs-test diagnostic for bias and variance, with a quiet assumption: that the test error you read is honest. This lesson makes good on that assumption with the standard machinery (splits, validation sets, k-fold cross-validation) and names the leakage mistakes that quietly break it. The final lesson then turns to what to measure, since accuracy is one metric and on imbalanced problems it lies.
Before you start
Prerequisite: Lesson 13, Overfitting and the bias-variance tradeoff. You need the idea of training error vs test error and the diagnostic that compares them, because this lesson is about how to get those numbers honestly. No new math; the worked example is a 5-fold cross-validation average.
By the end, you’ll be able to
- Explain why training accuracy is never a real measure of generalization
- Use a train/validation/test split correctly
- Design a k-fold cross-validation scheme and compute its averaged score
- Recognize when to use stratified, leave-one-out, or time-series CV
- Identify and prevent the four common data-leakage traps
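
As a preview of the k-fold objective above, here is a minimal sketch in plain Python of what "design a scheme and compute its averaged score" amounts to. The helper names (`k_fold_indices`, `cross_val_mean`) and the five per-fold accuracies are hypothetical, made up for illustration; they are not the lesson's worked example. The sketch uses contiguous, unshuffled folds to keep the bookkeeping visible, whereas real setups usually shuffle first (time-series data being the exception).

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds.

    Hypothetical helper for illustration: no shuffling, no stratification.
    """
    # Spread any remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        # Everything outside the validation slice is training data.
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def cross_val_mean(fold_scores):
    """Average the per-fold validation scores into one CV score."""
    return mean(fold_scores)


# 10 samples, 5 folds: each sample is held out for validation exactly once.
folds = list(k_fold_indices(n_samples=10, k=5))

# Made-up validation accuracies, one per fold (assumption, not real results).
fold_scores = [0.80, 0.85, 0.75, 0.90, 0.70]
cv_score = cross_val_mean(fold_scores)
print(f"CV accuracy over {len(folds)} folds: {cv_score:.2f}")
```

The point of the structure is that every score in `fold_scores` would come from data the model did not see during that fold's training, which is what makes the average a fairer estimate than training accuracy.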
Time and difficulty
- Read time: about 12 minutes
- Practice time: about 15 minutes (a CV-averaging computation, a spot-the-leakage exercise, and flashcards)
- Difficulty: standard