
Cheatsheet: Train, test, and cross-validation

| Claim | Verdict |
| --- | --- |
| "99% training accuracy" | proves nothing; could be memorization |
| "evaluated on held-out test data" | meaningful |
| "cross-validated" | meaningful and stable |

| Setup | Train | Validation | Test | Use |
| --- | --- | --- | --- | --- |
| Simple | ~80% | none | ~20% | no hyperparameter tuning |
| Three-way | 60-70% | 10-20% | ~20% | tune on validation; one-shot test |

The test set is one-shot: untouched during training and tuning.
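The three-way setup can be sketched with the standard library alone. This is a minimal illustration, not a reference implementation: the function name `three_way_split`, the default fractions (15% validation, 20% test), and the fixed seed are assumptions chosen to fall inside the ranges above.

```python
import random


def three_way_split(items, val_frac=0.15, test_frac=0.20, seed=0):
    """Shuffle once, then cut into train / validation / test lists.

    Hypothetical helper: the name and default fractions are illustrative.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)

    test = shuffled[:n_test]                 # set aside first, touched once at the end
    val = shuffled[n_test:n_test + n_val]    # used for hyperparameter tuning
    train = shuffled[n_test + n_val:]        # everything else
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 65 15 20
```

A random shuffle like this suits independent samples only; time-ordered data needs a chronological split instead.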

| Step | Action |
| --- | --- |
| 1 | split data into k equal folds (commonly 5 or 10) |
| 2 | for each fold i: train on the other k-1; test on i; record score |
| 3 | average the k scores |
| Output | a stable cross-validated estimate of generalization |

Worked: 5 fold scores 0.81, 0.79, 0.83, 0.82, 0.80 -> sum 4.05 / 5 = 0.81.
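The three steps, plus the worked average, might look like the sketch below. It assumes the data is already shuffled and that the caller supplies `fit` and `score` callables; those names, and `k_fold_indices` and `cross_validate`, are hypothetical. When the sample count is not divisible by k, the first few folds get one extra item.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    # The first n % k folds take one extra item when n is not divisible by k.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def cross_validate(data, k, fit, score):
    """Step 2 and 3: train on k-1 folds, score on the held-out fold, average."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = fit([data[i] for i in train_idx])
        scores.append(score(model, [data[i] for i in test_idx]))
    return scores, mean(scores)


# The worked example: five fold scores averaged into one estimate.
worked_scores = [0.81, 0.79, 0.83, 0.82, 0.80]
print(round(mean(worked_scores), 2))  # 0.81
```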

| Variant | Use when |
| --- | --- |
| Stratified k-fold | classification with imbalanced classes (preserves proportions) |
| Leave-one-out (LOOCV) | very small datasets; slow, high-variance |
| Time-series CV | time-ordered data; split chronologically, never random |

| Trap | Effect | Fix |
| --- | --- | --- |
| Tuning on the test set | optimistic, contaminated | use validation or CV |
| Preprocess before splitting | test stats leak into train | split first; fit scaler on train only |
| Random fold on time-series | future used to predict past | chronological splits |
| Duplicates across train/test | model "memorizes"; inflated | de-duplicate first |
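Three of these fixes are small enough to sketch in plain Python. The helper names (`fit_scaler`, `apply_scaler`, `chronological_splits`, `deduplicate`) are hypothetical, and the expanding-window scheme is one common way to split chronologically, assumed here rather than taken from the table.

```python
from statistics import fmean, pstdev


def fit_scaler(train_values):
    """Compute standardization statistics from the training data only."""
    mu = fmean(train_values)
    sigma = pstdev(train_values) or 1.0  # avoid dividing by zero on constant data
    return mu, sigma


def apply_scaler(values, mu, sigma):
    """Apply train-derived statistics to any split (train, validation, or test)."""
    return [(v - mu) / sigma for v in values]


def chronological_splits(n, n_splits):
    """Yield expanding-window (train_idx, test_idx) pairs over time-ordered rows.

    Every training index precedes every test index, so the future is
    never used to predict the past.
    """
    test_size = n // (n_splits + 1)
    if n_splits < 1 or test_size < 1:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n - n_splits * test_size
    for start in range(first_test_start, n, test_size):
        yield list(range(start)), list(range(start, start + test_size))


def deduplicate(rows):
    """Drop exact duplicates (rows must be hashable), keeping first occurrences."""
    return list(dict.fromkeys(rows))
```

Typical order of operations under these assumptions: de-duplicate, split, call `fit_scaler` on the training portion, then pass the same `mu` and `sigma` to `apply_scaler` for every split.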