Summary: Change of basis

The very first lesson flagged that a vector’s coordinates are a description in a chosen frame, not the vector itself, and promised to come back to it. This lesson makes good on that promise and makes it operational. The whole thing reduces to one line: coordinates are a choice of language, not a fact about the vector; M and M^-1 translate between bases, and M^-1 · A · M re-describes a transformation in a new basis without changing what it does. This is the scan-it-in-five-minutes version.

Core ideas

Coordinates are relative. A vector is a geometric object that exists before any numbers; coordinates appear only once you choose a basis to measure against. The list [3, 4] answers “how many of i-hat and j-hat build this arrow?” Change the basis and the same arrow gets a different list. The standard basis is the default, not the truth.
The basis matrix M has the other basis vectors (written in your coordinates) as its columns. It is the translator from their language to yours: M · [x', y'] = x'·b1 + y'·b2, their coordinates expressed in your system.
M^-1 translates the other way, from your coordinates to theirs. This needs det(M) ≠ 0: the other basis must actually span the space, or there is no clean way back. The 2x2 shortcut: for M = [[a, b], [c, d]] with det = a·d - b·c, M^-1 = (1/det)·[[d, -b], [-c, a]].
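The two translations can be checked by hand or with a few lines of code. The following is a minimal sketch using only the Python standard library; the helper names mat_vec and inverse_2x2 are illustrative, not part of the lesson, and the basis is the one from the worked anchors in this summary (b1 = [2,1], b2 = [-1,1]). Exact fractions are used so that values like 5/3 are not blurred by rounding.

```python
from fractions import Fraction

def mat_vec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse_2x2(m):
    """Invert [[a, b], [c, d]] with the (1/det)·[[d, -b], [-c, a]] shortcut."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        # The columns do not span the plane, so there is no way back.
        raise ValueError("det(M) = 0: M has no inverse")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]

# Columns of M are the other basis vectors written in our coordinates.
M = [[2, -1],
     [1, 1]]
M_inv = inverse_2x2(M)

ours = mat_vec(M, [1, 1])        # their [1, 1] expressed in our coordinates
theirs = mat_vec(M_inv, [3, 2])  # our [3, 2] expressed in their coordinates
round_trip = mat_vec(M, theirs)  # translating back should return our [3, 2]

print(ours, theirs, round_trip)
```

Passing a matrix whose columns are parallel, such as [[1, 2], [2, 4]], raises the ValueError, which is the code-level version of "the other basis must actually span the space".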
A transformation gets a different matrix in a different basis, given by the sandwich A_their_basis = M^-1 · A · M, read right to left: M translates their coordinates into yours, A applies the transformation in your basis, and M^-1 translates the result back into theirs. Same physical operation, different numerical description.
Worked anchors (Jennifer’s basis b1 = [2,1], b2 = [-1,1], M = [[2,-1],[1,1]], det = 3): her [1,1] is our [1,2]; our [3,2] is her [5/3, 1/3] (and translating back returns [3,2]); the 90-degree rotation [[0,-1],[1,0]] becomes the uglier [[1/3,-2/3],[5/3,-1/3]] in her basis, the identical spin in awkward coordinates.
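The rotation anchor above can be reproduced with a short script. This is a sketch under the same assumptions as the worked anchors (M = [[2,-1],[1,1]], A the 90-degree rotation); mat_mul is an illustrative helper, not something the lesson defines. As an extra sanity check it compares the trace and determinant of the two matrices, which do not change under M^-1 · A · M and so give a quick numerical sign that both matrices describe the same operation.

```python
from fractions import Fraction

def mat_mul(p, q):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Jennifer's basis vectors b1 = [2, 1] and b2 = [-1, 1] are the columns of M.
M = [[2, -1],
     [1, 1]]
d = det(M)  # 3, so M is invertible

# 2x2 inverse shortcut: (1/det)·[[d, -b], [-c, a]], kept as exact fractions.
M_inv = [[Fraction(M[1][1], d), Fraction(-M[0][1], d)],
         [Fraction(-M[1][0], d), Fraction(M[0][0], d)]]

# 90-degree rotation, written in our (standard) basis.
A = [[0, -1],
     [1, 0]]

# The sandwich, evaluated right to left: M first, then A, then M_inv.
A_hers = mat_mul(M_inv, mat_mul(A, M))

for row in A_hers:
    print([str(x) for x in row])  # fractions shown as strings such as '1/3'

# Trace and determinant are the same in both descriptions.
print(trace(A) == trace(A_hers), det(A) == det(A_hers))
```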
The lesson ends on a question: is there a best basis for a given transformation, one in which the matrix is as simple as possible (a clean diagonal of stretch factors)? When such a basis exists, it is built from eigenvectors, the subject of the next lesson.
This is why change of basis matters for AI. It is the engine under dimensionality reduction: PCA finds a basis aligned with the data’s directions of greatest variation (keep the first few coordinates to compress); whitening finds a rescaled basis in which the coordinates are uncorrelated and each has unit variance; SVD finds bases that expose a matrix’s structure. Choosing the right basis turns a confusing description into a clear one.
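To make the PCA remark concrete, here is a toy two-dimensional sketch using only the standard library. The data points are made up for illustration, and the closed-form angle is a shortcut that works only for a 2x2 covariance matrix; real PCA code would normally call an eigenvalue or SVD routine instead. The point is only that PCA is a change of basis: the new basis vectors are the columns of an M, and because those columns are orthonormal, M^-1 is just the transpose.

```python
import math

# Hypothetical toy data: points scattered roughly along the line y = x / 2.
points = [(2.0, 1.1), (4.0, 1.9), (6.0, 3.2), (8.0, 3.8), (10.0, 5.0)]

def covariance_2d(pts):
    """Return (var_x, var_y, cov_xy) of 2-D points (population formula)."""
    n = len(pts)
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    var_x = sum((p[0] - mean_x) ** 2 for p in pts) / n
    var_y = sum((p[1] - mean_y) ** 2 for p in pts) / n
    cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts) / n
    return var_x, var_y, cov_xy

var_x, var_y, cov_xy = covariance_2d(points)

# Angle of the direction of greatest variance for a symmetric 2x2 covariance.
theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)

# New basis vectors in standard coordinates; they are the columns of M.
b1 = (math.cos(theta), math.sin(theta))
b2 = (-math.sin(theta), math.cos(theta))

def to_new_basis(p):
    """Apply M^-1 (the transpose of M here): one dot product per basis vector."""
    return (p[0] * b1[0] + p[1] * b1[1],
            p[0] * b2[0] + p[1] * b2[1])

new_points = [to_new_basis(p) for p in points]
new_var_1, new_var_2, new_cov = covariance_2d(new_points)

# In the new basis the coordinates are uncorrelated and the first one
# carries most of the spread, so keeping only it is a crude compression.
print(new_var_1, new_var_2, new_cov)
```

The total variance (new_var_1 + new_var_2) matches var_x + var_y up to rounding: the data has not changed, only the frame used to describe it.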

What changes for you

Before this lesson, “coordinates” probably felt like an intrinsic property of a vector, the numbers it simply has. Now they are a choice of reference frame, and you can translate between frames with M and M^-1 and re-express any transformation with the M^-1 · A · M sandwich. That reframing is the conceptual key to a whole family of techniques (PCA, whitening, SVD, and the eigen-analysis of network layers) that all amount to picking a basis that makes the structure obvious. The next lesson asks the natural follow-up: for a given transformation, which basis is the simplest of all? The answer is its eigenvectors.