Summary: When two things move together: correlation
Correlation measures how tightly two quantities move together, and the warning that rides along with it is the most misused idea in data analysis: correlation is not causation. Ice cream sales and drownings rise together every summer, but eating ice cream does not drown anyone; hot weather drives both. This lesson covers both the tool (the scatterplot and the correlation coefficient) and the discipline (refusing to read cause into co-variation). Both matter because machine-learning models are built almost entirely from correlations. This summary is the scan-in-five-minutes version of the full lesson.
Core ideas
- Start with the scatterplot. One dot per observation, placed by its two values. It shows the direction (up-together, or one-up-one-down) and the strength of a relationship before any number summarizes it.
- The correlation coefficient puts a number on it. Written r, it runs from -1 to +1. The sign is the direction (positive = rise together, negative = move oppositely); the magnitude is the strength (near the ends = tight straight line, near 0 = no linear drift). It is built from the same standardized z-scores met earlier: roughly the average of the products of the two variables’ z-scores.
- It measures only straight lines. A strong curved relationship (a U-shape) can have r near zero. A near-zero coefficient means no linear relationship, not no relationship, so always look at the picture.
- Correlation is not causation. A correlation can come from X causing Y, Y causing X, a hidden third variable (a confounder) causing both, or coincidence. Observing the correlation alone cannot tell you which; establishing cause usually takes a controlled experiment.
- In machine learning, both halves bite. Highly correlated features are largely redundant (much the same information twice). And because a model chases correlation, it can latch onto a spurious signal (one produced by a confounder or by coincidence) that holds in the training data and fails in the world.
- Correlation describes; regression predicts. Measuring how tightly two variables move together is correlation. Fitting a line or curve to predict one from another is regression, taught in the Classical Machine Learning track, not here. This track stays on the descriptive side.
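As a minimal sketch of the coefficient described in the list above, the Python below computes r as the average product of paired z-scores. The data values and the helper names `z_scores` and `pearson_r` are made up for illustration and are not from the lesson. It assumes the population standard deviation (divide by n), which makes the "average of products" exact rather than approximate.

```python
import math

def z_scores(values):
    """Standardize: subtract the mean, divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(xs, ys):
    """Correlation coefficient as the average product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Illustrative data only (an assumption, not from the lesson).
xs = [-3, -2, -1, 0, 1, 2, 3]
rising = [2 * x + 1 for x in xs]   # perfect straight line, sloping up
falling = [-x for x in xs]         # perfect straight line, sloping down
u_shape = [x ** 2 for x in xs]     # strong relationship, but curved

r_rising = pearson_r(xs, rising)    # essentially +1
r_falling = pearson_r(xs, falling)  # essentially -1
r_u = pearson_r(xs, u_shape)        # essentially 0

print(round(r_rising, 3), round(r_falling, 3), round(abs(r_u), 3))
```

In all three cases y is completely determined by x, yet only the two straight lines score at the ends of the scale. The U-shape scores essentially zero, which is why the picture has to be checked alongside the number.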
What changes for you
You gain one tool and one reflex. The tool: when two quantities are involved, you reach for a scatterplot and a correlation coefficient, and you read both direction and strength from them, while remembering the coefficient is blind to curves. The reflex, the more valuable of the two: every time you hear “X is linked to Y” or “X is associated with Y,” you stop and run through the alternatives (reverse cause, a hidden confounder, coincidence) before letting yourself believe X causes Y. In an AI context that reflex is protective: models surface correlations by the thousand, and the practitioner who treats every one as a cause will trust signals that quietly break the moment the world shifts.
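The hidden-confounder alternative can be made concrete with a small simulation of the ice cream and drownings example. This is a sketch under stated assumptions: every number (temperature range, slopes, noise levels, sample size) is invented for illustration, and by construction neither series depends on the other. Only temperature drives them.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(0)  # fixed seed so reruns give the same numbers

# Hypothetical daily data: temperature is the hidden confounder.
temps = [rng.uniform(10, 35) for _ in range(20000)]
ice_cream = [20 * t + rng.gauss(0, 60) for t in temps]     # driven by temperature only
drownings = [0.3 * t + rng.gauss(0, 1.0) for t in temps]   # driven by temperature only

# Across all days the two series move together.
r_all = pearson_r(ice_cream, drownings)

# Hold the confounder roughly fixed: keep only days in a one-degree band.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 24 <= t < 25]
r_band = pearson_r([i for i, _ in band], [d for _, d in band])

print(f"all days: r = {r_all:.2f}")
print(f"24-25 degree days only: r = {r_band:.2f}")
```

With these assumed settings the overall coefficient should come out strongly positive (around 0.8), while inside the narrow temperature band it should fall close to zero. Once the confounder is held roughly constant, the apparent link between ice cream and drownings largely disappears, which is the pattern to suspect whenever a correlation is offered as a cause.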