Cheatsheet: Wisdom of crowds: random forests
The idea
| Item | Detail |
|---|---|
| What it is | an ensemble of many decision trees |
| Combine (classification) | majority vote |
| Combine (regression) | average of the trees’ predictions |
| Why it beats one tree | independent errors cancel; shared signal survives |
The two sources of diversity
| Source | What it does |
|---|---|
| Bagging | each tree trains on a bootstrap sample (random draw with replacement) |
| Random feature subsets | each split considers only a random subset of the features |
| Combined effect | trees are individually decent and make mistakes in different places |
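
The two sources of diversity can be sketched with the standard library alone. This is a minimal illustration, not a tree learner: `bootstrap_sample`, `random_feature_subset`, the integer stand-in rows, and the feature names are all hypothetical names chosen for this example.

```python
import random


def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(feature_names, k, rng):
    """Pick k distinct features that one split is allowed to consider."""
    return rng.sample(feature_names, k)


rng = random.Random(0)               # fixed seed so the sketch is repeatable
rows = list(range(10))               # stand-in for 10 training rows
features = ["f1", "f2", "f3", "f4"]  # hypothetical feature names

sample = bootstrap_sample(rows, rng)              # some rows repeat, some are missing
subset = random_feature_subset(features, 2, rng)  # candidates for a single split
out_of_bag = sorted(set(rows) - set(sample))      # rows this tree never saw
```

Each tree would get its own `sample`, and each split inside that tree its own `subset`, which is what pushes the trees to make their mistakes in different places.
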
Why averaging lowers error
| Component | Single deep tree | Random forest |
|---|---|---|
| Bias | low | low (roughly unchanged) |
| Variance | high (unstable) | sharply lowered |
| Generalization | overfits | better on new data |
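
The variance row can be made concrete with a standard identity for the average of correlated estimators. The symbols are assumptions introduced here, not taken from the cheatsheet: $B$ trees $T_b$, each with prediction variance $\sigma^2$ at a point $x$, and pairwise correlation $\rho$ between any two trees.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

Under these assumptions the second term shrinks as $B$ grows, while the first term remains. Fully independent trees ($\rho = 0$) would give $\sigma^2 / B$. Bagging and random feature subsets both aim to lower $\rho$, so that more of the variance can be averaged away.
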
Worked vote (5-tree forest, one email)
| Tree | Vote |
|---|---|
| 1 | SPAM |
| 2 | SPAM |
| 3 | NOT SPAM |
| 4 | SPAM |
| 5 | NOT SPAM |
| Result | 3 SPAM vs 2 NOT SPAM -> SPAM |
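
The same tally in code, as a minimal sketch: `majority_vote` is a hypothetical helper, and the regression numbers are invented purely to show the averaging rule.

```python
from collections import Counter
from statistics import mean


def majority_vote(votes):
    """Return the most common label and how many trees voted for it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count


# The five votes from the table above.
votes = ["SPAM", "SPAM", "NOT SPAM", "SPAM", "NOT SPAM"]
label, count = majority_vote(votes)
print(label, count)  # SPAM 3

# Regression combines by averaging instead (made-up tree outputs).
tree_predictions = [2.0, 3.0, 4.0]
print(mean(tree_predictions))  # 3.0
```

With two classes and an odd number of trees a tie cannot occur. With an even number, a real implementation needs an explicit tie-breaking rule.
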
Bonus and trade-offs
| Item | Note |
|---|---|
| Out-of-bag error | free generalization estimate: each tree is scored on the ~37% (about 1/3) of rows left out of its bootstrap sample |
| Lost | interpretability (cannot read hundreds of trees) |
| Gained | accuracy, stability, low tuning, feature importances |
| More trees | help then plateau; do not cause overfitting |
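
The "about 1/3 left out" figure follows from sampling with replacement. A row is missed by all n draws with probability (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows. The sketch below checks this both analytically and by simulation. The function names are hypothetical.

```python
import math
import random


def left_out_fraction(n):
    """Probability that a given row is never drawn in n draws with replacement."""
    return (1 - 1 / n) ** n


def simulated_left_out_fraction(n, rng):
    """Draw one bootstrap sample of size n and measure the share of rows missed."""
    drawn = {rng.randrange(n) for _ in range(n)}
    return 1 - len(drawn) / n


n = 1000
analytic = left_out_fraction(n)                          # close to 1/e
simulated = simulated_left_out_fraction(n, random.Random(0))
limit = math.exp(-1)                                     # about 0.368
```

In a forest, each row is out-of-bag for roughly that share of the trees, and only those trees vote when the row is used for the out-of-bag estimate. In scikit-learn, for instance, passing `oob_score=True` to `RandomForestClassifier` exposes this estimate as the `oob_score_` attribute after fitting.
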