Summary: Abstract vector spaces

The very first lesson said a vector is anything you can add and scale coherently, even if it is not an arrow or a list, and called that the deepest of the three views. This final lesson cashes that promise. The whole thing reduces to one line: a vector is anything you can add and scale, so functions and polynomials are vectors, the derivative is a matrix, and every tool from this track works on any space that follows the two rules. This is the scan-it-in-five-minutes version.

  • The math view is the real definition: a vector is anything you can add and scale coherently. Arrows and lists were the teaching model; the algebra is the definition, and it runs on anything obeying the two rules.
  • Functions are vectors. You can add two functions pointwise ((f+g)(x) = f(x)+g(x)) and scale one pointwise ((c·f)(x) = c·f(x)), and both results stay inside the set of functions. So the set of functions is a vector space, though no function is an arrow.
  • Polynomials have clean coordinates. Degree-3-or-less polynomials have basis {1, x, x^2, x^3} and dimension 4. Listing coefficients in basis order (constant term first), 2x^2 + 5x + 7 becomes the coordinate vector [7, 5, 2, 0], and adding polynomials is adding coordinate vectors: (3x^2+1) + (x+2) is [1,0,3,0] + [2,1,0,0] = [3,1,3,0], which reads back as 3x^2 + x + 3.
  • The derivative is a matrix. Differentiation is linear, so it is captured by where it sends each basis polynomial, and each column of the matrix holds the coordinates of one basis polynomial's derivative: D = [[0,1,0,0],[0,0,2,0],[0,0,0,3],[0,0,0,0]]. Then D · [7,5,2,0] = [5,4,0,0] = 4x + 5, the derivative of 2x^2+5x+7. Calculus by matrix multiplication.
  • A vector space is any set whose addition and scaling obey the standard axioms (commutativity, distributivity, a zero, and so on). The takeaway is not the axiom list but its consequence: satisfy them, and every tool from the entire track applies: spans, bases, dimension, transformations, determinants, eigenvectors, change of basis. Two quick illustrations: a polynomial's coordinates change with the basis (in the basis {1, (1+x), (1+x)^2} for degree-2-or-less polynomials, 2x^2+5x+7 has the coordinates [4,1,2], since 4 + (1+x) + 2(1+x)^2 = 2x^2+5x+7); and on the space of differentiable functions, the eigenvectors of the derivative are the exponentials, since d/dx(e^(kx)) = k·e^(kx), with eigenvalue k.
  • This is where the track pays off for AI, which lives in abstract vector spaces, not 2D arrows. Embeddings (word, sentence, image) are high-dimensional vectors, typically compared by cosine similarity and compressed by PCA. A typical neural network layer is a matrix (a linear transformation) followed by a nonlinearity, in a space too big to draw. Function spaces appear directly in the theory (signals, kernels). “Latent space,” “embedding space,” and “function space” all name a vector space in this sense.
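
The polynomial claims in the bullets above are small enough to check by hand, or with a few lines of code. What follows is a minimal sketch in plain Python, not a definitive implementation. It assumes coordinates are listed constant term first, relative to the basis {1, x, x^2, x^3}. The helper names matvec and add_vectors, and the sample values k, x and h in the eigenvector check, are made up for this illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a coordinate vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def add_vectors(u, v):
    """Add two coordinate vectors entry by entry."""
    return [a + b for a, b in zip(u, v)]


# 2x^2 + 5x + 7 in the basis {1, x, x^2, x^3}, constant term first.
p = [7, 5, 2, 0]

# Adding polynomials is adding coordinate vectors:
# (3x^2 + 1) + (x + 2)
poly_sum = add_vectors([1, 0, 3, 0], [2, 1, 0, 0])

# Derivative matrix: column j holds the coordinates of d/dx applied
# to the j-th basis polynomial (1 -> 0, x -> 1, x^2 -> 2x, x^3 -> 3x^2).
D = [
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
dp = matvec(D, p)

# Change of basis for degree-2-or-less polynomials: the columns of B are
# 1, (1+x), (1+x)^2 written in the standard basis {1, x, x^2}.
B = [
    [1, 1, 1],
    [0, 1, 2],
    [0, 0, 1],
]
new_coords = [4, 1, 2]
back_to_standard = matvec(B, new_coords)

# Eigenvector check for the derivative: a central-difference estimate of
# d/dx e^(kx) should be close to k * e^(kx). k, x, h are arbitrary samples.
k, x, h = 3.0, 0.5, 1e-5
numeric_derivative = (math.exp(k * (x + h)) - math.exp(k * (x - h))) / (2 * h)
expected_derivative = k * math.exp(k * x)

print(poly_sum)          # [3, 1, 3, 0], i.e. 3x^2 + x + 3
print(dp)                # [5, 4, 0, 0], i.e. 4x + 5
print(back_to_standard)  # [7, 5, 2], i.e. 2x^2 + 5x + 7
print(math.isclose(numeric_derivative, expected_derivative, rel_tol=1e-6))
```

Multiplying B by [4,1,2] and landing back on [7,5,2] confirms the change-of-basis coordinates without solving a system; going the other way (from [7,5,2] to [4,1,2]) would mean solving B·c = [7,5,2], which a linear algebra library can do for you.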

You started the track with three definitions of a vector that seemed to describe different things, and you end knowing they were always one thing: an object you can add and scale, whether an arrow, a list, a polynomial, or the internal state of a model with thousands of dimensions. The arrows were the scaffolding; the algebra was the building. The practical upshot is that the next time you read about an embedding space, a latent space, or a function space, you will know it is a vector space in exactly this sense, and the geometric intuition you built on a flat grid still holds there. That was the entire point of the track, and you now have it: the matrix manipulations in a machine learning paper read as moves in space, not opaque symbols.