References: Seeing high-dimensional data: t-SNE
Source material
Source material (conceptual spine):
• StatQuest with Josh Starmer: "t-SNE, clearly explained"
  Creator: Josh Starmer
  YouTube: https://www.youtube.com/watch?v=NEaUSP4YerM
  Channel / site: https://statquest.org/
  License: as published on StatQuest's public YouTube channel (link-out only)
Clawdemy provides original notes, summaries, and quizzes derived from this material for educational purposes. All rights to the original videos remain with the creator.
What this lesson draws from each source
- StatQuest’s “t-SNE, clearly explained” anchors the idea of matching high-dimensional similarities in 2D, the role of perplexity, and the central caveat that t-SNE preserves local structure but not global structure. The three explicit misreadings (between-cluster distance, cluster size, single-run stability) and the practical guidance to vary perplexity and seeds are built out here as the lesson’s central capability.
The framing of t-SNE as visualization-only (not preprocessing), the contrast table with PCA, and the explicit closing into Phase 4’s evaluation question are Clawdemy’s own connective tissue across the track.
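To make the role of perplexity concrete, here is a minimal sketch under stated assumptions. It assumes the standard t-SNE definition, where perplexity is 2 raised to the Shannon entropy (in bits) of one point's Gaussian neighbor probabilities, and it searches for the bandwidth that hits a target perplexity. The function names and the toy squared distances are hypothetical, chosen only for illustration; real implementations differ in details.

```python
import math

def conditional_probs(sq_dists, sigma):
    """Gaussian similarities from one point to its neighbors, normalized to sum to 1."""
    d_min = min(sq_dists)  # shift by the smallest distance to avoid underflow to all zeros
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 ** entropy

def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, steps=100):
    """Bisect (on a log scale) for the bandwidth whose perplexity matches the target."""
    for _ in range(steps):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the current bracket
        if perplexity(conditional_probs(sq_dists, mid)) < target:
            lo = mid  # distribution too peaked: widen the Gaussian
        else:
            hi = mid  # distribution too flat: narrow the Gaussian
    return math.sqrt(lo * hi)

# Hypothetical squared distances from one point to its 8 neighbors.
sq_dists = [1.0, 1.5, 2.0, 9.0, 16.0, 25.0, 36.0, 49.0]
sigma = sigma_for_perplexity(sq_dists, target=3.0)
probs = conditional_probs(sq_dists, sigma)
print(f"sigma={sigma:.3f}, perplexity={perplexity(probs):.3f}")
```

Read loosely, a perplexity of 3 means the point "pays attention" to roughly three effective neighbors, which is why changing perplexity changes how local the resulting picture is.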
Going deeper
- StatQuest with Josh Starmer. The t-SNE explainer plus StatQuest’s dimensionality reduction and clustering material.
- “How to Use t-SNE Effectively” at Distill.pub. A widely shared interactive article that demonstrates exactly the misreadings flagged in this lesson, with live examples you can experiment with. Strongly recommended after this lesson.
Adjacent topics
- UMAP (Uniform Manifold Approximation and Projection). A modern nonlinear dimensionality reduction method, often faster than t-SNE and tending to preserve more global structure. A common alternative; worth trying alongside t-SNE.
- PCA + t-SNE pipeline. A common workflow: PCA first to reduce noise and speed up computation, then t-SNE on the reduced data for the final 2D picture.
- Bias and variance (the next lesson). Phase 4 opens by formalizing the central modeling tension hovering over every choice we have made so far.
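
The PCA + t-SNE pipeline listed above can be sketched in a few lines. This is an illustrative sketch, not a recommended configuration: it assumes scikit-learn and NumPy are installed, uses synthetic Gaussian blobs as stand-in data, and picks 10 PCA components and a perplexity of 30 arbitrarily.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic stand-in data (an assumption): 3 Gaussian blobs in 50 dimensions, 60 points each.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(60, 50)) for c in centers])

# Step 1: PCA down to a moderate number of dimensions to cut noise and cost.
X_reduced = PCA(n_components=10, random_state=0).fit_transform(X)

# Step 2: t-SNE on the reduced data for the final 2D picture.
embedding = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(X_reduced)

print(X_reduced.shape, embedding.shape)
```

The resulting `embedding` is for plotting only. Rerunning with different `perplexity` and `random_state` values is a quick way to check which visual features are stable across runs.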
Community discussion
The Distill article above is the strongest public discussion of how to read t-SNE plots; little additional material adds durable value beyond it. If a canonical discussion surfaces, it will be added at the next review.