Lesson: Coordinates as a choice, change of basis

Twelve lessons ago, in the very first one, we flagged something and promised to come back to it: the coordinates of a vector are “a description of the vector in a particular coordinate system, not the vector itself.” This is the lesson that cashes that promise. By the end, coordinates will feel like what they are, a choice, and you will be able to translate a vector’s coordinates from one basis to another and back.

The mental shift is the real payoff here. The arithmetic is straightforward; the idea that coordinates are relative, not absolute, is the thing worth rewiring.

A vector is a geometric object: an arrow in space, a displacement, a thing you can add and scale. That object exists before any numbers are attached to it. Coordinates appear only once you choose a basis, a set of vectors to measure against. The list 3, 4 does not live inside the arrow; it is the answer to the question “how many of i-hat and how many of j-hat add up to this arrow?” Change the basis, ask the question against different reference vectors, and the same arrow gets a different list of numbers.

The standard basis, i-hat the vector 1, 0 and j-hat the vector 0, 1, is just the default choice, not a privileged truth. Another person could describe the same plane with a completely different pair of basis vectors, and their coordinates for any given arrow would differ from ours, while the arrow itself is unchanged. Neither description is more correct. They are two languages for the same geometry.

Figure: The same arrow has different coordinates in different bases. The standard horizontal-and-vertical grid is shown in faint gray. Overlaid is a tilted second grid for Jennifer's basis, with b1 = [1, 1] and b2 = [-1, 1]. A single arrow runs from the origin to the point (3, 1) in standard coordinates. Its readout in our basis is [3, 1]; in Jennifer's basis it is [2, -1]. The arrow is the same; only the numbers differ.
One arrow, two grids, two readouts. With our standard x and y the arrow's coordinates are [3, 1]. With Jennifer's diagonal grid the same arrow's coordinates are [2, -1]. The picture is identical; only the language changes.

Suppose another observer, call her Jennifer, uses her own pair of basis vectors. We can write her basis vectors in our standard coordinates. For the worked examples that follow, take a pair different from the one drawn in the figure above: say her first basis vector is 2, 1 and her second is negative-1, 1. Build the matrix whose columns are her basis vectors expressed in our system:

M = [ 2  -1 ]
    [ 1   1 ]

This M is the translator from Jennifer’s language to ours. If Jennifer says a vector has coordinates x-prime, y-prime in her basis, she means “x-prime of her first basis vector plus y-prime of her second.” In our coordinates, that is exactly x-prime times her first basis vector plus y-prime times her second, which is M applied to her vector. The columns of M are where her basis vectors land in our world, the same column trick from the transformations lesson, now read as a translation between languages.

Worked translation. Jennifer points to a vector and calls it 1, 1 in her basis. What do we call it? Apply M:

M · [1, 1] = 1 · [2, 1] + 1 · [-1, 1] = [1, 2]

Her 1, 1 is our 1, 2. Same arrow, two names.
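The translation above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library routine: the helper name `to_our_basis` is made up for this example, and the basis vectors are the ones chosen in the text.

```python
# Jennifer's basis vectors, written in our standard coordinates (from the text).
b1 = (2, 1)
b2 = (-1, 1)

def to_our_basis(coords, first, second):
    """Hypothetical helper: apply M by combining the basis vectors.

    coords = (x', y') means "x' of the first basis vector plus y' of the second".
    """
    x, y = coords
    return (x * first[0] + y * second[0],
            x * first[1] + y * second[1])

ours = to_our_basis((1, 1), b1, b2)
print(ours)  # (1, 2)
```

Feeding in 1, 0 or 0, 1 returns her basis vectors themselves, which is the "columns of M" reading in code form.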

To go the other way, from our coordinates into Jennifer’s, we undo M, which means applying M-inverse. (This is where the inverses lesson pays off: the translation is reversible exactly because M is invertible, which is the same as saying the determinant of M is nonzero. If Jennifer’s “basis” vectors were linearly dependent, they would not span the plane, M would collapse the plane onto a line, and there would be no way back.)
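A quick way to test whether a proposed pair of vectors can serve as a basis is to compute the determinant of the matrix they form. The sketch below assumes a helper named `det2` (invented for this example) and compares Jennifer's pair from the text with a hypothetical dependent pair, where the second vector is just twice the first.

```python
def det2(first, second):
    """Determinant of the 2x2 matrix whose columns are `first` and `second`."""
    return first[0] * second[1] - second[0] * first[1]

jennifer = det2((2, 1), (-1, 1))   # the basis used in the text
collapsed = det2((2, 1), (4, 2))   # hypothetical pair: second = 2 * first

print(jennifer)   # 3
print(collapsed)  # 0
```

A nonzero result means the translation can be reversed; zero means the two vectors lie on one line and cannot describe the whole plane.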

For a 2x2 matrix there is a quick formula for the inverse. Write the entries of M as a, b in the first row and c, d in the second, so that det(M) = a·d - b·c. Then:

M^-1 = (1 / det(M)) · [  d  -b ]
                      [ -c   a ]

Compute it for our matrix:

det(M) = (2)(1) - (-1)(1) = 3
M^-1 = (1/3) · [[1, 1], [-1, 2]]

Worked translation back. We have a vector we call 3, 2. What does Jennifer call it? Apply M-inverse:

M^-1 · [3, 2] = (1/3) · [3 + 2, -3 + 4] = (1/3) · [5, 1] = [5/3, 1/3]

Our 3, 2 is her five-thirds, one-third.
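The inverse formula and the translation back can be checked with exact fractions. This is a sketch under the assumption that matrices are stored as tuples of rows; the helper names `inverse_2x2` and `apply` are illustrative only.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 integer matrix via the (1/det) * [[d, -b], [-c, a]] formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible: columns are linearly dependent")
    return ((Fraction(d, det), Fraction(-b, det)),
            (Fraction(-c, det), Fraction(a, det)))

def apply(m, v):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

M = ((2, -1), (1, 1))        # columns are Jennifer's basis vectors
M_inv = inverse_2x2(M)
hers = apply(M_inv, (3, 2))  # our 3, 2 in Jennifer's coordinates
print(hers[0], hers[1])
```

Using `Fraction` rather than floats keeps five-thirds and one-third exact, so the result can be compared directly with the hand computation.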

Check the round trip, to confirm the two translations really are inverses. Take her five-thirds, one-third back to our basis with M:

(5/3) · [2, 1] + (1/3) · [-1, 1] = [10/3, 5/3] + [-1/3, 1/3] = [9/3, 6/3] = [3, 2]

We are back where we started. M and M-inverse translate in opposite directions and cancel.
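The cancellation can also be seen at the level of the matrices themselves: multiplying M by M-inverse in either order should give the identity. A small sketch, with `matmul` as an illustrative helper and the values of M and M-inverse copied from the computation in the text:

```python
from fractions import Fraction as F

def matmul(p, q):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

M = ((F(2), F(-1)), (F(1), F(1)))
M_inv = ((F(1, 3), F(1, 3)), (F(-1, 3), F(2, 3)))
identity = ((1, 0), (0, 1))

forward_then_back = matmul(M_inv, M)   # her basis -> ours -> hers
back_then_forward = matmul(M, M_inv)   # our basis -> hers -> ours
print(forward_then_back == identity, back_then_forward == identity)  # True True
```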

Figure: Round trip between coordinate systems. Three boxes in a horizontal row: the arrow's coordinates in our basis, [3, 1]; an arrow labeled M-inverse pointing to its coordinates in Jennifer's basis, [2, -1]; and an arrow labeled M pointing back to our basis, [3, 1], identical to the start. The round trip is the identity: M · M⁻¹ · [3, 1] = [3, 1]. (The numbers match the basis drawn in the first figure, b1 = [1, 1] and b2 = [-1, 1], not the M of the worked examples.)
The same arrow's coordinates travel from our basis to Jennifer's basis through M-inverse, then back through M. The round trip returns the original numbers: M and M-inverse are perfect translators between coordinate systems, undoing each other.

Transformations look different in different bases

Here is the deeper move, and the one that sets up the next lesson. A linear transformation, say a rotation, is a physical operation: it spins the plane the same way no matter whose coordinates you use. But its matrix depends on the basis. The rotation has one matrix in our basis and a different matrix in Jennifer’s, because the matrix is a description, and descriptions are relative.

To find a transformation’s matrix in Jennifer’s basis, think about what has to happen to one of her vectors. We only know how to apply the transformation in our basis, so the recipe is: take her vector, translate it to our basis, apply the transformation there, then translate the result back to her basis. As a product, reading right to left like every composition since the matrix-multiplication lesson:

A_jennifer = M^-1 · A · M

The rightmost M translates from her basis to ours, the middle A does the transformation in our basis, and M-inverse translates back to hers. The sandwich is the transformation as Jennifer experiences it.

Worked transformation translation. Take the 90-degree counterclockwise rotation you have used since the transformations lesson, R with first column 0, 1 and second column negative-1, 0, and find its matrix in Jennifer’s basis. Compute M-inverse times R times M.

First R times M, applying R to each column of M:

R · [2, 1] = 2 · [0, 1] + 1 · [-1, 0] = [-1, 2]
R · [-1, 1] = -1 · [0, 1] + 1 · [-1, 0] = [-1, -1]

so R times M is the matrix with first column negative-1, 2 and second column negative-1, negative-1. Then apply M-inverse to each of those columns, with M-inverse equal to one-third times the matrix with first row 1, 1 and second row negative-1, 2:

M^-1 · [-1, 2] = (1/3) · [-1 + 2, 1 + 4] = (1/3) · [1, 5] = [1/3, 5/3]
M^-1 · [-1, -1] = (1/3) · [-1 - 1, 1 - 2] = (1/3) · [-2, -1] = [-2/3, -1/3]

So in Jennifer’s basis the same rotation is

R_jennifer = [ 1/3  -2/3 ]
             [ 5/3  -1/3 ]

That matrix is far uglier than the clean rotation with first column 0, 1 and second column negative-1, 0, but it describes the identical physical rotation. Jennifer’s basis just happens to be an awkward one for talking about this particular spin. The operation did not change; only the language did.
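The sandwich product can be reproduced mechanically. The sketch below assumes matrices are tuples of rows and uses an illustrative `matmul` helper; M, M-inverse, and R are the matrices from the text.

```python
from fractions import Fraction as F

def matmul(p, q):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

M = ((F(2), F(-1)), (F(1), F(1)))                  # Jennifer's basis as columns
M_inv = ((F(1, 3), F(1, 3)), (F(-1, 3), F(2, 3)))  # computed earlier in the text
R = ((F(0), F(-1)), (F(1), F(0)))                  # 90-degree counterclockwise rotation

R_times_M = matmul(R, M)                 # rightmost step first: into our basis, then rotate
R_jennifer = matmul(M_inv, R_times_M)    # then translate back to her basis

for row in R_jennifer:
    print([str(entry) for entry in row])
```

The rows come out as 1/3, -2/3 and 5/3, -1/3, matching the column-by-column hand computation.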

To see why the sandwich is the right recipe, follow one of Jennifer’s vectors through it one stage at a time. Take her 1, 1:

M · [1, 1] = [1, 2] (her vector -> our basis)
R · [1, 2] = 1·[0, 1] + 2·[-1, 0] = [-2, 1] (rotate, in our basis)
M^-1 · [-2, 1] = (1/3)·[-2 + 1, 2 + 2] = [-1/3, 4/3] (back to her basis)
R_jennifer · [1, 1] = [1/3 - 2/3, 5/3 - 1/3] = [-1/3, 4/3] (combined, directly)

Same answer. The three-step journey and the single sandwich matrix agree, because the sandwich is precisely those three steps bundled into one matrix.
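The agreement is not special to her 1, 1. A short check, sketched with illustrative helper names and a few arbitrarily chosen test vectors, runs the three-step route and the single matrix side by side:

```python
from fractions import Fraction as F

def apply(m, v):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

M = ((F(2), F(-1)), (F(1), F(1)))
M_inv = ((F(1, 3), F(1, 3)), (F(-1, 3), F(2, 3)))
R = ((F(0), F(-1)), (F(1), F(0)))
R_jennifer = ((F(1, 3), F(-2, 3)), (F(5, 3), F(-1, 3)))  # result from the text

def three_steps(v):
    """Her coordinates -> our basis -> rotate -> back to her coordinates."""
    return apply(M_inv, apply(R, apply(M, v)))

# Arbitrary vectors in Jennifer's coordinates, chosen only for illustration.
samples = [(1, 1), (1, 0), (0, 1), (3, -2)]
all_agree = all(three_steps(v) == apply(R_jennifer, v) for v in samples)
print(all_agree)  # True
```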

The lesson lands on a question it does not yet answer: if a transformation’s matrix depends on the basis, is there a best basis, one in which the matrix is as simple as possible? Sometimes a transformation that looks like a tangle of rotation and shear in our basis becomes pure stretching along the axes in the right basis, and its matrix collapses to a clean diagonal, just stretch factors down the diagonal and zeros elsewhere. Finding that special basis is the entire subject of the next lesson, and the vectors that define it are called eigenvectors.

Change of basis is the engine under most of dimensionality reduction. Principal Component Analysis finds a new basis for a dataset, the principal-component basis, aligned with the directions of greatest variation, so that the first coordinate captures the most spread, the next the second-most, and so on. Keeping only the first few coordinates compresses the data while losing as little as possible. Whitening re-expresses data in a basis where every direction has equal, unit variation. Matrix factorizations like SVD work by finding bases in which a matrix’s structure is laid bare.

None of these techniques would exist without the idea this lesson makes operational: coordinates are a choice, and choosing the right basis can turn a confusing description into a clear one. In neural networks, analyzing a layer’s matrix in a well-chosen basis (often the eigenvector basis of the next lesson) reveals which directions it stretches and which it shrinks, which in turn informs how gradients flow and how stable training is.

Thinking the standard basis is the “real” one. It is the default, not the truth. Every basis describes the same vectors; the standard basis is just the one we usually start in. Jennifer’s coordinates are as valid as ours.

Confusing which matrix translates which way. The basis matrix M (columns are the other basis written in our coordinates) translates from the other basis to ours. M-inverse goes the other way. If you mix them up, you translate in the wrong direction.

Forgetting the sandwich order. The matrix in Jennifer’s basis is M-inverse times A times M, read right to left: into our basis, transform, back to hers. Getting the order or the inverses wrong gives a matrix that describes a different operation.

Assuming a “prettier” matrix means a different transformation. A transformation can look simple in one basis and ugly in another while being the exact same physical operation. The matrix changes with the basis; the operation does not.
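The sandwich-order pitfall is easy to demonstrate numerically. The sketch below, using an illustrative `matmul` helper and the matrices from the worked example, computes the product in the correct order and in the swapped order and confirms they are different matrices.

```python
from fractions import Fraction as F

def matmul(p, q):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

M = ((F(2), F(-1)), (F(1), F(1)))
M_inv = ((F(1, 3), F(1, 3)), (F(-1, 3), F(2, 3)))
R = ((F(0), F(-1)), (F(1), F(0)))

correct = matmul(M_inv, matmul(R, M))   # M^-1 · R · M
swapped = matmul(M, matmul(R, M_inv))   # M · R · M^-1, the inverses mixed up

print(correct == swapped)  # False
```

The swapped product is still a valid matrix, which is what makes the mistake easy to miss; it simply is not the rotation as Jennifer's coordinates describe it.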

  • Coordinates describe a vector relative to a chosen basis, not absolutely. The same arrow has different coordinates in different bases, and none of them is more correct than the others. This is the promise from the first lesson, now made operational.
  • To translate coordinates, use the basis matrix M whose columns are the other basis written in your coordinates: M takes their coordinates to yours, and M-inverse takes yours to theirs (which requires a nonzero determinant of M).
  • A transformation gets a different matrix in a different basis, given by the sandwich M-inverse times A times M. Same physical operation, different numerical description. The right basis can make a transformation’s matrix dramatically simpler, which is exactly what the next lesson chases with eigenvectors.

A vector’s coordinates were never a fact about the vector; they were always a choice of language. Change the basis and the numbers change while the geometry holds still. The next lesson asks the natural follow-up: for a given transformation, which basis makes it simplest? The answer is its eigenvectors.