What vectors actually are

Open a NumPy tutorial and you will see an array of two numbers, 3 and 4, introduced as a vector. Sit in a physics lecture and a vector is an arrow with a little hat on the letter, something with a magnitude and a direction. Open a linear algebra textbook and a vector is “an element of a vector space,” a phrase that sounds engineered to tell you nothing. Three fields, three definitions, one word.

If you have ever quietly suspected these were three different things wearing the same name, this lesson is the one that connects them. They are not three things. They are one object seen from three angles, and once you can move between the angles at will, most of the math you have been avoiding stops being intimidating.

This matters more than it looks. The vector is the atom of everything that comes later: word embeddings are vectors, and a language model’s internal state is vectors being added and scaled. Get this one idea solid and the rest of the series has something to stand on. Leave it fuzzy and every later topic inherits the fuzziness.

Three views of the same thing

Start with the three perspectives, because each one is correct and each one is incomplete on its own.

The physics view: a vector is an arrow. A vector describes a movement, a displacement: how far and which way. The arrow has a length (how far) and a direction (which way). An arrow pointing 3 units right and 4 units up, which we will write as the coordinate pair 3, 4, has length 5 and points off toward the upper right. Where you draw it on the page does not matter; that same arrow is the same vector whether you draw it at the origin or off in the corner, because what defines it is the displacement, not the starting point. By convention we usually root it at the origin so we have a single canonical place to draw it, but the arrow is fundamentally about “how far and which way,” not “where it sits.”

The computer science view: a vector is an ordered list of numbers. The pair [3, 4] is a vector. The order is load-bearing: [3, 4] and [4, 3] are different vectors, the same way the points (3, 4) and (4, 3) are different points. The length of the list is the dimension. A two-number list is a 2D vector, a three-number list is 3D, and a list of three hundred numbers is a 300-dimensional vector, which you cannot draw but can absolutely compute with.

The math view: a vector is anything you can add and scale coherently. This is the one that sounds like a non-answer, and it is actually the deepest of the three. The mathematician does not care whether your vector is an arrow or a list. They care about two questions: can you add two of them and get another of the same kind, and can you scale one by a number and stay the same kind? If both operations stay inside the system and follow the familiar rules of arithmetic (the order of addition does not matter, scaling a sum gives the same result as summing the scaled pieces, and so on), the thing is a vector, full stop. This is why functions, polynomials, and quantum states can all be treated as vectors, even though none of them look like arrows.

The reason the textbook definition feels empty is that it is deliberately refusing to commit to arrows or lists. It is naming the only thing the two views have in common: the operations. Hold onto that, because it is the punchline of the whole lesson.
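The math view can feel abstract, so here is a minimal sketch that treats ordinary Python functions as the vectors. The helper names add_functions and scale_function are invented for this illustration, and the sketch checks only that the results stay inside the system, not every rule of arithmetic. The point is that adding two functions, or scaling one, gives back another function.

```python
# Hypothetical helpers: treat functions of one number as "vectors".

def add_functions(f, g):
    """Return the function x -> f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def scale_function(c, f):
    """Return the function x -> c * f(x)."""
    return lambda x: c * f(x)


def square(x):
    return x * x


def double(x):
    return 2 * x


# Adding and scaling never leave the world of functions:
# the result is again something you can call, add, and scale.
combined = add_functions(square, scale_function(3, double))

print(combined(2))  # square(2) + 3 * double(2) = 4 + 12 = 16
```

Nothing here looks like an arrow or a list, yet both operations work and the result is the same kind of object as the inputs.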

The bridge: why all three views matter

Here is the move that makes linear algebra useful instead of merely true. The arrow view and the list view are connected by a coordinate system. Lay down two perpendicular axes, agree on a unit length, and every arrow gets a unique list of numbers: the arrow that goes 3 units right and 4 units up becomes the pair 3, 4.

A vector is two numbers, [3, 4]. On a grid that means an arrow from the origin to the point three over and four up. The two numbers are exactly the coordinates of the tip; they are also exactly the steps you take to get there.

Run it the other way and every list becomes an arrow. The dictionary between geometry and numbers is exact.

That dictionary is the entire trick that lets computers do geometry and lets geometry organize data. A computer cannot manipulate an arrow, but it can manipulate the pair 3, 4 all day. A human cannot picture a 300-number list, but they can reason about “two points that are close together” or “two arrows pointing the same way.” The list view gives you the machine; the arrow view gives you the intuition; the math view tells you exactly which operations are allowed to travel between them. You will spend the rest of this series moving back and forth across this bridge, so it is worth naming explicitly now: numbers for the computer, arrows for your head, the same object underneath.
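The dictionary between the two views can be sketched in a few lines, assuming a 2D vector in the standard frame. The function names list_to_arrow and arrow_to_list are made up for this example; the arrow is described by its length and by the angle it makes with the horizontal axis.

```python
import math


def list_to_arrow(v):
    """Turn a 2D list [x, y] into (length, angle in radians)."""
    x, y = v
    return math.hypot(x, y), math.atan2(y, x)


def arrow_to_list(length, angle):
    """Turn (length, angle in radians) back into [x, y]."""
    return [length * math.cos(angle), length * math.sin(angle)]


length, angle = list_to_arrow([3, 4])

print(length)                        # 5.0
print(arrow_to_list(length, angle))  # about [3.0, 4.0], up to floating-point rounding
```

Going from list to arrow and back returns the numbers you started with, up to rounding, which is what it means for the dictionary to be exact.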

Notation: how we write them down

The standard way to write a vector is as a column of numbers inside square brackets:

v = [ 3 ]
    [ 4 ]

Written inline, that is the pair 3, 4. The numbers are called the components or coordinates of the vector. Reading them top to bottom (or left to right inline), each number says how far to move along one axis: the first number along the horizontal axis, the second along the vertical, and so on into higher dimensions. Mathematically, a column vector is a matrix of size N by 1, which in code looks like a nested array. The column 3, 4 becomes a list of one-entry rows:

[[3], [4]]

Knowing this prevents shape-mismatch surprises when we reach matrix multiplication later in the track.
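As a rough illustration of the shape point, here is a sketch using plain nested lists rather than any particular array library. The helpers shape, to_column, and to_flat are hypothetical names chosen for this example.

```python
def shape(matrix):
    """Return (rows, columns) of a rectangular nested list."""
    return len(matrix), len(matrix[0])


def to_column(values):
    """Wrap a flat list like [3, 4] as a column: [[3], [4]]."""
    return [[value] for value in values]


def to_flat(column):
    """Unwrap a column like [[3], [4]] back into [3, 4]."""
    return [row[0] for row in column]


column = to_column([3, 4])
row = [[3, 4]]  # the same numbers laid out as a single row

print(column)          # [[3], [4]]
print(shape(column))   # (2, 1): N rows, 1 column
print(shape(row))      # (1, 2): 1 row, N columns
print(to_flat(column)) # [3, 4]
```

The numbers are identical in all three layouts; only the shape differs, and the shape is what matrix multiplication checks.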

One thing to flag early, because it causes more confusion than almost anything else in linear algebra: the coordinates are a description of the vector in a chosen coordinate system, not the vector itself. Choose different axes and the same arrow gets a different list of numbers. We return to this at change of basis later in the track. For now, file away that the pair 3, 4 is what this arrow looks like in the standard frame, the way “third house on the left” locates a house relative to where you stand.
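To make “different axes, different numbers” concrete, here is a small sketch that measures the same arrow in two frames. It assumes both frames use perpendicular, unit-length axes, and it quietly borrows the dot product, which is covered properly later in the track. The names coordinates_in_frame and rotated_frame are invented for this example.

```python
def dot(u, v):
    """Multiply matching components and add the results."""
    return sum(a * b for a, b in zip(u, v))


def coordinates_in_frame(v, axes):
    """Coordinates of v measured along each axis.

    Assumes the axes are perpendicular to each other and of unit length.
    """
    return [dot(v, axis) for axis in axes]


v = [3, 4]
standard_frame = [[1, 0], [0, 1]]
rotated_frame = [[0, 1], [-1, 0]]  # the standard axes after a quarter turn counterclockwise

print(coordinates_in_frame(v, standard_frame))  # [3, 4]
print(coordinates_in_frame(v, rotated_frame))   # [4, -3]
```

The arrow did not move. Only the description changed, and its length is still 5 in either frame, since 4² + (-3)² = 25 as well.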

The two operations that define everything

Remember the math view: a vector is anything you can add and scale. So the two operations are not side topics. They are the definition. Get comfortable with exactly these two and you have the foundation for everything built on vectors.

Adding vectors

To add two vectors, add them component by component:

[ 1 ]   [ 3 ]   [ 1 + 3 ]   [ 4 ]
[ 2 ] + [ 1 ] = [ 2 + 1 ] = [ 3 ]

To add two vectors, walk along the first one, then walk along the second one starting from where you ended. The arrow straight from your start to your finish is the sum. The arithmetic (add the components) and the geometry (walk tip-to-tail) are the same picture.

Adding component by component is the numeric rule, and it is almost too simple to trust. The geometry is what makes it meaningful. Picture the first arrow, going 1 right and 2 up. Now, starting from the tip of that arrow, draw the second one, going 3 right and 1 up. The place you end up, measured from the original origin, is the sum: 4 right and 3 up. This is the tip-to-tail picture, and it is the reason vector addition is worth a name. Adding the coordinates and walking the arrows tip to tail give the same answer, every time. The number rule and the picture are two descriptions of one move.
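The tip-to-tail walk can be sketched directly. The walk helper below is a hypothetical name; it records every stop along the way, so you can see that the final stop matches the component-by-component sum and does not depend on which arrow you walk first.

```python
def add(u, v):
    """Component-by-component sum of two same-length vectors."""
    return [a + b for a, b in zip(u, v)]


def walk(arrows, start=(0, 0)):
    """Follow each arrow from the tip of the previous one; return every stop."""
    position = list(start)
    stops = [position]
    for arrow in arrows:
        position = add(position, arrow)
        stops.append(position)
    return stops


first, second = [1, 2], [3, 1]

print(walk([first, second]))  # [[0, 0], [1, 2], [4, 3]]
print(walk([second, first]))  # [[0, 0], [3, 1], [4, 3]]
print(add(first, second))     # [4, 3]
```

The two walks pass through different intermediate points but finish in the same place, which is the picture behind the fact that the order of addition does not matter.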

A warning that catches almost every programmer at least once: vector addition is not list concatenation. Adding the vectors [1, 2] and [3, 4] gives [4, 6], not the four-number list [1, 2, 3, 4]. You can only add vectors of the same dimension, and the result keeps that dimension.
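This trap is easy to reproduce in Python, where the + operator on plain lists concatenates. The add function below is a minimal sketch of vector addition that also refuses mismatched dimensions; raising ValueError is a choice made for this example, not a standard.

```python
def add(u, v):
    """Component-by-component sum; refuses vectors of different dimension."""
    if len(u) != len(v):
        raise ValueError(f"cannot add a {len(u)}D vector to a {len(v)}D vector")
    return [a + b for a, b in zip(u, v)]


print([1, 2] + [3, 4])      # [1, 2, 3, 4]  (Python's + on lists concatenates)
print(add([1, 2], [3, 4]))  # [4, 6]        (vector addition)

try:
    add([1, 2], [1, 2, 3])
except ValueError as error:
    print(error)            # cannot add a 2D vector to a 3D vector
```

Array libraries such as NumPy define + on their arrays as element-by-element addition, which is one reason numeric code rarely uses plain lists for vectors.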

Scaling vectors

To scale a vector, multiply every component by the same number. That number is called a scalar, precisely because its job is to scale:

        [ 3 ]   [ 6 ]
    2 · [ 1 ] = [ 2 ]

Multiplying a vector by a number is geometric: 2 stretches it to twice as long, 0.5 squishes it to half, and -1 flips it to point the other way. The result keeps the same direction with a new length, or takes the opposite direction if the number is negative.

Geometrically, multiplying by 2 stretches the arrow to twice its length while keeping its direction. Multiplying by 0.5 squishes it to half length. Multiplying by a negative number flips it to point the opposite way: negative 1 times a vector is the same arrow pointing backward. The word “scalar” earns its name here; a scalar is a number whose entire role is to stretch, squish, or flip a vector without rotating it off its line (or to collapse it to the origin entirely, if you multiply by zero).
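Here is a short sketch of scaling, using the vector from the example above. The scale and length helpers are illustrative names; length uses the usual square root of the sum of squares.

```python
import math


def scale(c, v):
    """Multiply every component of v by the same number c."""
    return [c * x for x in v]


def length(v):
    """Length of the arrow: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


v = [3, 1]

print(scale(2, v))    # [6, 2]      twice as long, same direction
print(scale(0.5, v))  # [1.5, 0.5]  half as long, same direction
print(scale(-1, v))   # [-3, -1]    same length, opposite direction
print(scale(0, v))    # [0, 0]      collapsed to the origin
```

In every case the length changes by the absolute value of the scalar: scaling by -2 gives an arrow twice as long, pointing backward.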

That is the complete toolkit. Addition combines vectors; scalar multiplication resizes them. Every other operation you will meet in this series, linear combinations, spans, matrix transformations, is assembled out of these two primitive moves. There is nothing else underneath.
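To show how later operations are assembled from these two moves, here is a minimal sketch of a linear combination and of subtraction, both written using only add and scale. The function names are chosen for this example.

```python
def add(u, v):
    """Component-by-component sum of two same-length vectors."""
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Multiply every component of v by the same number c."""
    return [c * x for x in v]


def linear_combination(a, u, b, v):
    """Scale each vector, then add the results: a*u + b*v."""
    return add(scale(a, u), scale(b, v))


def subtract(u, v):
    """Subtraction is just adding the flipped vector: u + (-1)*v."""
    return add(u, scale(-1, v))


print(linear_combination(2, [1, 2], -1, [3, 1]))  # [2 - 3, 4 - 1] = [-1, 3]
print(subtract([1, 2], [3, 1]))                   # [-2, 1]
```

Neither function introduces a new primitive; each is a particular arrangement of the two operations already defined.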

Why this matters when you use AI

The inside of a modern AI model is almost nothing but vectors being added and scaled. That is why arrows on a grid are worth your time: the arrows are what is inside.

AI systems represent words as vectors, lists of hundreds of numbers, arranged so that distance and direction carry meaning. Early word embedding systems famously showed patterns like this: take the vector for “king,” subtract the vector for “man” (subtraction is just scaling by negative 1 and then adding), add the vector for “woman,” and you land in the immediate neighborhood of the vector for “queen.” That calculation is exactly the component-by-component subtraction and addition you just learned, run in a few hundred dimensions instead of two. (The result does not always land exactly on the coordinates of “queen”; a nearest-neighbor search by similarity is what finds it.) The arithmetic is identical; only the length of the lists changed. When a model weighs how much one word should attend to another, it compares vectors derived from them using the dot product, an operation you will meet later in this track that is itself built from multiplying and adding numbers. When the model learns, it nudges its vectors by small scaled amounts, over and over, which is scalar multiplication and addition doing the work.
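The analogy arithmetic can be sketched with toy numbers. Everything in toy_embeddings below is invented by hand for illustration: these are not real embeddings from any model, and three dimensions stand in for the hundreds a real system would use. Cosine similarity is assumed here as the similarity measure for the nearest-neighbor step.

```python
import math


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    return [c * x for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Near 1.0 when two arrows point the same way, regardless of length."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hand-made toy values, NOT taken from any real model.
toy_embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "queen": [0.9, 0.1, 0.8],
    "apple": [0.0, 0.2, 0.2],
}

# king - man + woman, using only scaling and addition.
target = add(
    add(toy_embeddings["king"], scale(-1, toy_embeddings["man"])),
    toy_embeddings["woman"],
)

# Nearest neighbor by similarity, skipping the three query words themselves.
candidates = [w for w in toy_embeddings if w not in ("king", "man", "woman")]
nearest = max(candidates, key=lambda w: cosine_similarity(target, toy_embeddings[w]))

print(nearest)  # queen
```

The computed target does not equal the toy “queen” vector exactly; it only points in nearly the same direction, which is why the search step is needed.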

You do not need to picture three hundred dimensions to use any of this. In machine learning, those dimensions often represent features or learned properties rather than physical directions in space. The arrow picture stops being literally drawable past three dimensions, but the algebra does not change at all: add component by component, scale every component by the same number. The rules you learned on a flat grid are the exact rules running inside the model, just with longer lists. That is the payoff of the list view: it scales to any dimension without asking your imagination for anything.
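As a quick check that nothing changes with dimension, here is a sketch that runs the same two functions on 300-number lists. The particular values are arbitrary placeholders chosen so the result is easy to verify by hand.

```python
def add(u, v):
    """Component-by-component sum of two same-length vectors."""
    return [a + b for a, b in zip(u, v)]


def scale(c, v):
    """Multiply every component of v by the same number c."""
    return [c * x for x in v]


# Arbitrary 300-dimensional vectors: 0.0, 1.0, ..., 299.0 and all ones.
u = [float(i) for i in range(300)]
v = [1.0] * 300

w = add(u, scale(2, v))

print(len(w))        # 300
print(w[0], w[299])  # 2.0 301.0
```

The functions are the ones you would write for a 2D grid; neither mentions the dimension anywhere.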

Common pitfalls

A few mistakes are common enough to name directly.

Thinking the arrow has to start in a specific place. A vector is a displacement, “how far and which way,” not a fixed location. We root arrows at the origin by convention so there is one tidy place to draw them, but two arrows of the same length and direction are the same vector wherever they sit.

Confusing the vector with its coordinates. The list 3, 4 is how the vector looks in one chosen coordinate system. Pick different axes and the same arrow gets different numbers. The vector is the underlying object; the coordinates are one description of it.

Treating addition as concatenation. Adding [1, 2] and [3, 4] gives [4, 6]. You add matching components; you do not glue the lists together. Vectors of different dimensions cannot be added at all.

Assuming higher dimensions are mystical. A 300-dimensional vector is a list of 300 numbers. You cannot draw it, but you do not need to. Every rule from the 2D grid applies unchanged. “High-dimensional” means “long list,” not “magic.”

Believing a vector must be an arrow or a list. The most general view is the math one: anything you can add and scale coherently is a vector. Arrows and lists are the two most common examples, not the definition.

What you should remember

A vector is one object with three faces: an arrow you can picture, a list of numbers you can compute with, and, most generally, anything you can add and scale coherently. A coordinate system translates exactly between the arrow and the list.
Two operations define a vector: addition and scalar multiplication. Add component by component (tip to tail in the picture); scale by multiplying every component by one number (stretch, squish, or flip). Everything later in this series is built from these two moves and nothing else.
Coordinates are a description, not the thing. Change the frame and the numbers change while the vector stays put. Keep the object and its coordinates separate in your mind and half of the usual confusion never starts.

A vector is an arrow you can compute with, a list of numbers you can picture, and anything you can add and scale coherently. Three faces of one object, and the smallest unit everything later in this series builds on.

Next: the cheatsheet puts the rules and worked numbers on one page, and the references link Grant Sanderson’s video if you want to watch the arrows move.