Summary: Asking the right questions: decision trees
A decision tree classifies by asking a sequence of yes/no questions, like a flowchart, funnelling each example down to a leaf that holds the prediction. It is built greedily, choosing at each step the question that best separates the classes. A single tree is highly interpretable but unstable, which is exactly the flaw the next lesson fixes. This summary is the scan version of the full lesson.
Core ideas
- A tree is a flowchart. Root (first question), internal nodes (follow-up questions), leaves (predictions). To predict, follow the path of answers from root to leaf.
- No single boundary, just questions. Unlike logistic regression’s single straight boundary, a tree carves the space into boxes with a sequence of splits, which lets it capture non-linear patterns.
- Built by reducing impurity. At each node, the algorithm picks the question that best separates the classes into purer groups. Impurity is measured by Gini impurity or entropy: 0 when a group is all one class, maximum when the classes are evenly mixed (50/50 for two classes). The best split reduces impurity the most, then the process repeats on each branch.
- An unrestrained tree overfits. Left alone it keeps splitting until every leaf is pure, often down to a single example, memorizing noise. Trees are reined in by depth limits, minimum leaf sizes, or pruning.
- Strengths: interpretable, non-linear, handles mixed feature types, no rescaling needed.
- Key weakness: instability. A small change in the data can produce a very different tree (high variance). A single tree overfits easily.
- Regression trees apply the same idea to predict a number: each leaf outputs the average of the values that land there.
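The core ideas above can be sketched in a few dozen lines of plain Python. This is a minimal illustration, not the lesson's own implementation: the function names (`gini`, `best_split`, `build`, `predict`), the "feature <= threshold?" form of each question, and the toy loan data are all assumptions made for the example. It uses Gini impurity to score splits, builds greedily, and uses a depth limit as the brake against overfitting.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure group, 0.5 for a 50/50 two-class mix."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Try every 'feature <= threshold?' question and return the
    (feature, threshold) pair with the lowest weighted impurity,
    or None if no question improves on the unsplit node."""
    best = None
    best_score = gini(labels)  # a split must beat leaving the node alone
    n = len(rows)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a question that separates nothing is useless
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best

def build(rows, labels, max_depth):
    """Greedy recursive build; max_depth limits how far the tree can grow."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        # Leaf: predict the majority class of the examples that landed here.
        return Counter(labels).most_common(1)[0][0]
    feature, threshold = split
    yes_idx = [i for i, row in enumerate(rows) if row[feature] <= threshold]
    no_idx = [i for i, row in enumerate(rows) if row[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "yes": build([rows[i] for i in yes_idx], [labels[i] for i in yes_idx], max_depth - 1),
        "no": build([rows[i] for i in no_idx], [labels[i] for i in no_idx], max_depth - 1),
    }

def predict(tree, row):
    """Follow the path of answers from the root down to a leaf."""
    while isinstance(tree, dict):
        answer = "yes" if row[tree["feature"]] <= tree["threshold"] else "no"
        tree = tree[answer]
    return tree

# Hypothetical toy data: [income, debt] -> loan decision.
rows = [[20, 5], [25, 1], [30, 9], [60, 2], [70, 8], [80, 1]]
labels = ["deny", "deny", "deny", "approve", "deny", "approve"]

tree = build(rows, labels, max_depth=2)
print(tree)
print(predict(tree, [65, 1]))  # approve
```

On this toy data the root question is "income <= 30?", and the follow-up on the high-income branch is "debt <= 2?", so the printed tree can be read directly as the path of questions behind each decision. A regression tree could be sketched the same way by having each leaf return the mean of its target values and scoring splits by variance rather than Gini impurity.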
What changes for you
A decision tree is the rare model whose reasoning you can read like a recipe, which is why it matters anywhere a decision must be explained: a tree that denies a loan shows you the exact path of questions that led there, something a neural network does not readily offer. It also reframes a headline you may have heard: the models that actually win on spreadsheet-shaped data (random forests, gradient-boosted trees) are not exotic; they are crowds of these simple trees. And the tree’s one real flaw, its instability, is not a dead end but a setup. The next lesson turns “one tree is unreliable” into “so grow many and let them vote,” which is the surprisingly powerful idea behind the random forest.