Cheatsheet: Building a hierarchy: hierarchical clustering
The procedure (agglomerative)
Section titled “The procedure (agglomerative)”| Step | Action |
|---|---|
| 1 | every point is its own cluster |
| 2 | merge the two closest clusters |
| 3 | repeat until one cluster remains |
| output | the full merge history (a tree), no k chosen up front |
Reading the dendrogram
Section titled “Reading the dendrogram”| Feature | Meaning |
|---|---|
| Leaves (bottom) | individual points |
| A merge connector | two clusters joining |
| Height of a merge | distance between them when merged |
| Low merge | very similar |
| High merge | quite different |
| Left-right order | meaningless (branches rotate freely) |
Cutting it into clusters
Section titled “Cutting it into clusters”| Cut | Result |
|---|---|
| Horizontal line | branches crossed = number of clusters |
| Cut low | many small, tight clusters |
| Cut high | few broad clusters |
| Best cut | across the tallest gap (no merges) = most natural split |
Linkage (cluster-to-cluster distance)
Section titled “Linkage (cluster-to-cluster distance)”| Type | Distance used |
|---|---|
| Single | the two nearest points (can chain) |
| Complete | the two farthest points (compact) |
| Average | mean over all pairs |
| Ward’s | least increase in within-cluster spread |
Hierarchical vs k-means
Section titled “Hierarchical vs k-means”| K-means | Hierarchical | |
|---|---|---|
| Choose k first | yes | no (cut later) |
| Output | flat groups | a tree (dendrogram) |
| Multi-scale view | no | yes |
| Speed / scale | fast, scalable | slower, poor on large data |