Building a hierarchy: hierarchical clustering
What you’ll learn
This is lesson 10 of Track 10, in Phase 3 (Finding structure without labels). By the end you will be able to read a dendrogram, the tree that hierarchical clustering produces, and choose where to cut it to get the clusters you want. The one capability to walk away with: look at a dendrogram, tell which points are most similar from the merge heights, and pick a cut that yields a sensible grouping.
The track follows the structure of StatQuest’s intuition-first machine learning videos, with Microsoft’s “ML For Beginners” as the hands-on companion for readers who want to build the models in code. Full attribution is in this lesson’s references.
Where this fits
This is the second clustering lesson. K-means (the previous lesson) gave a flat set of groups and made you choose their number up front. Hierarchical clustering builds a full tree instead, so you can see structure at every scale and decide the number of clusters afterward by cutting the tree. Together the two lessons cover the clustering half of unsupervised learning. The next lesson turns to the other half, dimensionality reduction, starting with principal component analysis.
Before you start
Prerequisite: Lesson 9, Grouping without labels: k-means clustering. You need the idea of clustering unlabeled data and of distance between points, because this lesson contrasts hierarchical clustering’s tree with k-means’ flat groups and builds on the same notion of “closeness.” No math is needed beyond comparing distances.
By the end, you’ll be able to
- Describe agglomerative clustering as repeatedly merging the two closest clusters
- Read a dendrogram and interpret merge height as distance
- Cut a dendrogram to produce a chosen number of clusters
- Identify the most natural cut (across the tallest gap)
- Explain linkage methods and compare hierarchical clustering with k-means
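To make these objectives concrete, here is a minimal sketch of agglomerative clustering in plain Python. It is an illustration under stated assumptions, not the lesson's reference implementation: the five one-dimensional points are made up, the function names (`agglomerate`, `cut`, `single_linkage_distance`) are hypothetical, and single linkage (distance between the closest members of two clusters) is just one of several linkage choices. The recorded merge heights are what a dendrogram draws on its vertical axis.

```python
from itertools import combinations


def single_linkage_distance(a, b):
    """Distance between two clusters: the gap between their closest members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerate(points):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merge history as (height, left_cluster, right_cluster) tuples,
    where height is the distance at which the two clusters were joined.
    """
    clusters = [(p,) for p in points]  # every point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: single_linkage_distance(
                clusters[pair[0]], clusters[pair[1]]
            ),
        )
        height = single_linkage_distance(clusters[i], clusters[j])
        merges.append((height, clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


def cut(points, merges, k):
    """Replay merges from the bottom of the tree until k clusters remain."""
    clusters = [(p,) for p in points]
    for _, left, right in merges[: len(points) - k]:
        clusters.remove(left)
        clusters.remove(right)
        clusters.append(left + right)
    return sorted(sorted(c) for c in clusters)


# Made-up example data: two tight pairs and one far-away point.
points = [1.0, 2.0, 9.0, 10.0, 25.0]
merges = agglomerate(points)
heights = [h for h, _, _ in merges]   # [1.0, 1.0, 7.0, 15.0]
two_clusters = cut(points, merges, 2)  # [[1.0, 2.0, 9.0, 10.0], [25.0]]
three_clusters = cut(points, merges, 3)  # [[1.0, 2.0], [9.0, 10.0], [25.0]]
```

In this toy data the low merge heights (1.0 and 1.0) mark the most similar points, and the largest jump between consecutive heights (from 7.0 to 15.0) is the tallest gap, so cutting there yields two clusters. Swapping `min` for `max` inside `single_linkage_distance` would give complete linkage instead.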
Time and difficulty
- Read time: about 11 minutes
- Practice time: about 15 minutes (a read-and-cut-the-dendrogram exercise, a misreading check, and flashcards)
- Difficulty: standard