
Lesson: Matrices between dimensions

Every matrix in this track so far has been square: 2x2 taking the plane to the plane, 3x3 taking space to space. Input and output had the same number of dimensions. That was a convenience, not a law. This lesson drops it.

A rectangular matrix moves between dimensions. It can take a flat 2D input and place it inside 3D space, or take 3D space and squash it down onto a plane. The payoff is the same one as in the jump to 3D: the pattern extends, and almost nothing new is required. The columns are still where the basis vectors land, rank still measures the dimension of the column space, and the rank-nullity accounting still balances. You are reading the same machine, in a new shape.

A rectangular matrix has a row count and a column count, and it maps an input space whose dimension is the number of columns to an output space whose dimension is the number of rows. The cleanest way to keep this straight is to read it through the columns, exactly as before:

  • The column count is the input dimension because the input space has one basis vector per column, and each column is where one of them lands.
  • Each column holds as many numbers as the output dimension because each landing spot is a vector in the output space.

A 3x2 matrix has two columns (2D input) each holding three numbers (3D output): it maps 2D to 3D. A 2x3 matrix has three columns (3D input) each holding two numbers (2D output): it maps 3D to 2D. The shape tells you the direction of the mapping.
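As a small sketch of that reading rule, assuming a matrix is stored as a plain Python list of rows (the helper name `dimensions` is made up for this illustration):

```python
def dimensions(matrix):
    """Return (input_dim, output_dim) for a matrix stored as a list of rows."""
    output_dim = len(matrix)      # one row per output coordinate
    input_dim = len(matrix[0])    # one column per input basis vector
    return input_dim, output_dim

tall = [[1, 0], [0, 1], [1, 1]]   # 3x2
wide = [[1, 0, 0], [0, 1, 0]]     # 2x3

print(dimensions(tall))  # (2, 3): 2D in, 3D out
print(dimensions(wide))  # (3, 2): 3D in, 2D out
```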

Take the 3x2 matrix

[ 1 0 ]
[ 0 1 ]
[ 1 1 ]

Two columns, three rows: it takes a 2D vector in and gives a 3D vector out. The columns are 1, 0, 1 and 0, 1, 1, the landing spots of i-hat and j-hat, now living in 3D. Apply it to the vector 3, 4:

3 · [1, 0, 1] + 4 · [0, 1, 1] = [3, 0, 3] + [0, 4, 4] = [3, 4, 7]

A 2D input came out as a 3D point. Geometrically, the transformation takes the flat input plane and lays it down as a tilted plane sitting inside 3D space. The two columns are independent, so they span a genuine 2D plane (not a line): that plane, through the origin, is the column space, and the rank is 2. Nothing in the 2D input gets crushed, so the null space is just the origin. This is an embedding: a small space placed intact inside a bigger one.
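The same computation can be sketched in a few lines of Python. This is only an illustration: the helper names `columns` and `apply` are invented here, and the matrix is stored as a list of rows. The point is that the output is built as a weighted sum of the columns, exactly as in the hand calculation.

```python
def columns(matrix):
    """Split a list-of-rows matrix into its column vectors."""
    return [list(col) for col in zip(*matrix)]

def apply(matrix, vector):
    """Compute matrix times vector as a weighted sum of the columns."""
    result = [0] * len(matrix)
    for weight, column in zip(vector, columns(matrix)):
        for i, entry in enumerate(column):
            result[i] += weight * entry
    return result

embed = [[1, 0],
         [0, 1],
         [1, 1]]

print(columns(embed))        # [[1, 0, 1], [0, 1, 1]]
print(apply(embed, [3, 4]))  # [3, 4, 7]
```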

[Figure: two panels. Left: the flat 2D source plane with î and ĵ. Right: 3D output space (x, y, z) containing a tilted plane spanned by the column vectors L(î) = [1, 0, 0.5] and L(ĵ) = [0.4, 1, 0.7], labeled "image of L = a 2D plane inside 3D".]
A 3 by 2 matrix takes 2D input vectors and produces 3D output vectors. The image of the input plane is still a flat 2D plane; it just sits tilted inside the 3D output space, spanned by the matrix's two 3D column vectors.

Now the other direction. Take the 2x3 matrix

[ 1 0 0 ]
[ 0 1 0 ]

Three columns, two rows: it takes a 3D vector in and gives a 2D vector out. The columns are 1, 0, then 0, 1, then 0, 0. Apply it to the vector 3, 4, 5:

3 · [1, 0] + 4 · [0, 1] + 5 · [0, 0] = [3, 0] + [0, 4] + [0, 0] = [3, 4]

The x-component and y-component came straight through, and the z-component vanished. This matrix is a projection: it drops 3D space flat onto the 2D plane, throwing away height. The first two columns already span the full 2D output, so the rank is 2 (full, for a matrix that can output at most 2D).

But something did get destroyed, and the null space names it. Any input of the form 0, 0, z, anywhere along the z-axis, gets sent to the point 0, 0. The entire z-axis is crushed to the origin, so the null space is that z-axis, a 1D line, and the nullity is 1. The transformation cannot tell two points apart if they differ only in height.
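A quick way to see the crushing is to run it. The sketch below assumes a list-of-rows matrix and a hypothetical helper `apply`. It checks three things from the paragraphs above: the worked input, a point on the z-axis, and two inputs that differ only in height.

```python
def apply(matrix, vector):
    """Matrix times vector as a weighted sum of the matrix's columns."""
    result = [0] * len(matrix)
    for j, weight in enumerate(vector):
        for i, row in enumerate(matrix):
            result[i] += weight * row[j]
    return result

project = [[1, 0, 0],
           [0, 1, 0]]

print(apply(project, [3, 4, 5]))  # [3, 4]
print(apply(project, [0, 0, 7]))  # [0, 0]: a z-axis input is crushed

# Two inputs that differ only in height are indistinguishable afterward.
low, high = [3, 4, 0], [3, 4, 100]
print(apply(project, low) == apply(project, high))  # True
```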

[Figure: two panels showing the projection M = [[1, 0, 0], [0, 1, 0]]. Left: 3D input space (x, y, z) with the z-axis highlighted as the null space ("all of z gets crushed to 0") and an input vector v = [1.4, 0.8, 1.5] dropped onto the xy-plane. Right: the 2D output plane with M·v = [1.4, 0.8]. Label: "M ignores z; only (x, y) survive".]
A 2 by 3 matrix takes 3D inputs and produces 2D outputs. The projection [[1, 0, 0], [0, 1, 0]] is the simplest example: drop the z. Every vector along the z-axis lands at the origin, making the z-axis the null space; the surviving xy data is the output.

That last example is the moment to restate the conservation law from the previous lesson, because rectangular matrices make it more interesting:

rank + nullity = number of columns (the input dimension)

For the 2x3 projection: rank 2 plus nullity 1 equals 3, the input dimension. Notice that this matrix is full rank and still has a nontrivial null space, because the input (3D) is bigger than the output (2D) can hold. Whenever a matrix maps from a higher dimension to a lower one, something must be crushed; there is simply not enough room in the output to keep every input distinct. The 3x2 embedding had the opposite situation: input 2D, output 3D, plenty of room, and because its two columns were independent, nothing was lost.

One more case to round out the classification. Take

[ 1 2 ]
[ 2 4 ]
[ 3 6 ]

A 3x2 matrix again, so 2D in, 3D out. But its columns are 1, 2, 3 and 2, 4, 6, and the second column is 2 times the first: dependent. They do not span a plane, only the line through 1, 2, 3. So the rank is 1, and the column space is a 1D line in 3D output space. For the null space, solve for the inputs sent to zero: every row reduces to x plus 2y equals 0, so x equals negative 2y, giving the line through negative-2, 1 in the 2D input. Nullity 1. Rank 1 plus nullity 1 equals 2, the input dimension, and the books balance again. This matrix collapses its 2D input down to a single line in 3D, the rectangular version of the determinant-zero collapse from earlier.
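Both claims about this matrix can be checked numerically. In the sketch below (the helper `apply` is a hypothetical name, and the matrix is a list of rows), every input on the line through -2, 1 is sent to zero, and an arbitrary input lands on a multiple of 1, 2, 3.

```python
def apply(matrix, vector):
    """Matrix times vector as a weighted sum of the matrix's columns."""
    result = [0] * len(matrix)
    for j, weight in enumerate(vector):
        for i, row in enumerate(matrix):
            result[i] += weight * row[j]
    return result

collapse = [[1, 2],
            [2, 4],
            [3, 6]]

# Inputs of the form (-2t, t) lie on the null-space line.
null_outputs = [apply(collapse, [-2 * t, t]) for t in (-1, 1, 5)]
print(null_outputs)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

# Any other input lands on the line through [1, 2, 3]:
# the output is (x + 2y) times [1, 2, 3].
print(apply(collapse, [3, 4]))  # [11, 22, 33]
```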

The capability: classify any rectangular matrix


Put it together. Given any rectangular matrix, you can read off its full character:

  1. Input dimension is the number of columns.
  2. Output dimension is the number of rows.
  3. Rank is the dimension of the column space (the span of the columns), capped by the smaller of the two dimensions.
  4. Direction and meaning: more rows than columns and full rank is an embedding (small space into big); more columns than rows is a projection (big space into small, always with a null space); rank below the smaller of the two dimensions means rank-deficient, a collapse onto something smaller than even the shape allows. For a matrix with more rows than columns, that is exactly the case of dependent columns.

Run it once on a fresh matrix. Take the matrix with first row 2, 0, 1 and second row 0, 2, 1: three columns, two rows, so it maps 3D input to 2D output, a projection. Its columns 2, 0, then 0, 2, then 1, 1 include two that already span the full 2D output, so the rank is 2, full. Being a projection, it has to crush something: solving for the inputs sent to zero gives two conditions, twice the x-coordinate plus the z-coordinate equals zero, and twice the y-coordinate plus the z-coordinate equals zero, so the x-coordinate equals the y-coordinate and the z-coordinate is negative twice the x-coordinate, the line through 1, 1, negative-2. Nullity 1, and rank 2 plus nullity 1 is 3, the input dimension. Four readings, one matrix.
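The four readings can be mechanized. The following is a minimal sketch, not a production routine: it assumes small matrices with integer entries stored as lists of rows, finds the rank by counting pivots in Gaussian elimination with exact fractions, and takes the nullity from rank + nullity = number of columns. The names `rank` and `classify` are invented for this illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Count pivots after Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = 0
    for col in range(n_cols):
        # Look for a nonzero entry in this column, at or below the pivot row.
        pivot = next((r for r in range(pivots, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot here; this column depends on earlier ones
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, n_rows):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == n_rows:
            break
    return pivots

def classify(matrix):
    """Read off input dimension, output dimension, rank, and nullity."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    r = rank(matrix)
    return {
        "input_dim": n_cols,
        "output_dim": n_rows,
        "rank": r,
        "nullity": n_cols - r,  # rank + nullity = number of columns
        "full_rank": r == min(n_rows, n_cols),
    }

fresh = [[2, 0, 1],
         [0, 2, 1]]
print(classify(fresh))
# {'input_dim': 3, 'output_dim': 2, 'rank': 2, 'nullity': 1, 'full_rank': True}

# The three earlier matrices from this lesson, for comparison.
embed = [[1, 0], [0, 1], [1, 1]]
project = [[1, 0, 0], [0, 1, 0]]
collapse = [[1, 2], [2, 4], [3, 6]]
for name, m in [("embed", embed), ("project", project), ("collapse", collapse)]:
    print(name, classify(m))
```

Exact fractions are used here so that the pivot test `!= 0` is reliable. With floating-point entries a tolerance would be needed instead.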

Column space always lives in the output space; null space always lives in the input space. Keeping those two on the correct side is most of what makes rectangular matrices feel clear.

This is the shape of nearly every layer in a neural network. A linear layer that takes a 768-dimensional embedding and produces a 256-dimensional hidden state is a 256x768 matrix: more columns than rows, a projection that compresses 768 dimensions down to 256. A layer going the other way, 256 up to 768, is a 768x256 matrix: an embedding that expands. The matrices that compress are doing exactly the 3D-to-2D projection you just saw, with a null space of directions the layer chooses to discard; the ones that expand are doing the 2D-to-3D embedding, placing a smaller representation inside a larger space. Dimension reduction, the workhorse of autoencoders, attention projections, and model compression, is rectangular matrices earning their keep.

Reading the shape backward. A rectangular matrix maps input to output: the column count is the input dimension, the row count is the output dimension. When in doubt, count the columns: that is how many basis vectors the input has.

Putting column space and null space on the wrong sides. Column space is in the output (it is the set of reachable outputs). Null space is in the input (it is the set of inputs crushed to zero). A 3x2 matrix has its column space in 3D and its null space in 2D.

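Expecting full rank to mean no null space. For a projection (more columns than rows), full rank still leaves a null space, because the output is too small to keep every input distinct. Full rank forces a trivial null space only when the rank equals the number of columns, which can happen for square matrices and for tall ones like the 3x2 embedding, but never for a matrix with more columns than rows.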

Trying to take a determinant of a rectangular matrix. Determinants are defined only for square matrices; there is no area or volume scaling factor when the input and output dimensions differ. Rank is the right tool for rectangular matrices.

  • A rectangular matrix maps input to output: the number of columns is the input dimension (one column per input basis vector), and each column has as many entries as the output dimension. The shape tells you the direction of the mapping.
  • More rows than columns embeds a small space into a bigger one (provided the columns are independent); more columns than rows projects a big space into a smaller one (and a projection always crushes something, so it always has a null space). Rank below the smaller of the two dimensions means rank-deficient, a collapse.
  • The rules carry over unchanged. Rank is the dimension of the column space (in the output); the null space lives in the input; and rank plus nullity equals the number of columns still balances exactly.

A matrix does not have to return space to itself. It can carry one dimension count to another, embedding or projecting, and the same columns, rank, and null space describe what it does. The next lesson looks at the most extreme rectangular case of all: a matrix with a single row, which turns a vector into one number, and turns out to be the dot product in disguise.