Summary: Higher-order derivatives
A derivative is itself a function, so you can differentiate it again. The second derivative f'' measures how the slope is changing, which means acceleration in physics and curvature on a graph. It is the tool that tells a maximum from a minimum, and it underwrites the second-order methods that improve on plain gradient descent. This is the scan-it-in-five-minutes version.
Core ideas
- Higher derivatives are derivatives of derivatives. f -> f' -> f'' -> f''' -> ...; in Leibniz notation, d²f/dx², d³f/dx³, and so on. Differentiating x⁴ repeatedly gives 4x³, 12x², 24x, 24, 0: every polynomial eventually flattens to zero, which the Taylor series in the next lesson exploits.
- In physics, the second derivative of position is acceleration. s(t) -> s'(t) = velocity -> s''(t) = acceleration. Newton's F = ma is mass times this second derivative; classical mechanics is written in second derivatives. (The third is jerk.) For free fall, s(t) = -16t² + 64t + 32, velocity is -32t + 64, acceleration is the constant -32 (gravity), and at the top of the arc (t = 2, height 96 ft) the velocity is zero but the acceleration is still -32.
- On a graph, the second derivative is curvature. f'' > 0: graph cups upward (smiling). f'' < 0: cups downward (frowning). f'' = 0 with a sign change: inflection point. The picture: positive curvature is a cup that holds water; negative curvature spills it.
- The second-derivative test classifies critical points. At f'(x) = 0: f'' > 0 is a local minimum (cup holds water), f'' < 0 is a local maximum, f'' = 0 is inconclusive. For f(x) = x³ - 3x: critical points at x = ±1; f'' = 6x gives a min at (1, -2) and a max at (-1, 2), with inflection at the origin.
- Two signature cases. sin'' = -sin (the oscillation equation f'' = -f that governs springs, pendulums, sound, AC, light); and every derivative of e^x, to any order, is e^x (the relentless self-reproduction that makes its Taylor series clean).
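
The polynomial examples in the list above can be checked mechanically. The following is a minimal sketch, not a general-purpose tool: it assumes polynomials are stored as coefficient lists with the lowest power first, and the helper names `derivative`, `evaluate`, and `classify` are hypothetical, chosen only for this illustration.

```python
def derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...], where c_k multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x with Horner's rule (an empty list is the zero polynomial)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify(coeffs, x):
    """Apply the second-derivative test at x, assuming x is already a critical point."""
    second = evaluate(derivative(derivative(coeffs)), x)
    if second > 0:
        return "local minimum"
    if second < 0:
        return "local maximum"
    return "inconclusive"


# Tower of derivatives of x^4: 4x^3, 12x^2, 24x, 24, then the zero polynomial.
tower = [[0, 0, 0, 0, 1]]
while tower[-1]:
    tower.append(derivative(tower[-1]))

# Free fall: s(t) = -16t^2 + 64t + 32.
position = [32, 64, -16]
velocity = derivative(position)
acceleration = derivative(velocity)

# Second-derivative test on f(x) = x^3 - 3x at its critical points x = 1 and x = -1.
cubic = [0, -3, 0, 1]

print(tower)
print(evaluate(position, 2), evaluate(velocity, 2), evaluate(acceleration, 2))
print(classify(cubic, 1), classify(cubic, -1))
```

Calling `classify(tower[0], 0)` returns "inconclusive": x⁴ does have a minimum at 0, but its second derivative vanishes there, so the test alone cannot say so.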
What changes for you
The second derivative stops being a curiosity ("the derivative of the derivative") and becomes the precise object behind two intuitive ideas: acceleration in time, curvature in space. With it you can map the full shape of a curve from two derivatives, sort the hills from the valleys without plotting, and recognize that the oscillation equation f'' = -f is why sine governs everything that swings or waves.

In machine learning, the second derivative is the engine of second-order optimization. Newton's method uses the Hessian (the matrix of all second partial derivatives) to take better-informed steps than plain gradient descent. Adam keeps running averages of the gradient and of its square, and the per-parameter step scaling that results is sometimes read as an informal curvature signal. K-FAC builds a Kronecker-factored approximation of the Fisher information matrix, a close relative of the Hessian, for efficient training. And the analysis of a model's loss landscape (saddle points, flat basins, sharp ridges) is largely a story about second derivatives.

The track's final lesson takes higher derivatives to their natural limit: using the whole tower of them at a single point to rebuild a function as a polynomial, the Taylor series.
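
To make the contrast between gradient descent and Newton's method concrete, here is a one-dimensional sketch in which the Hessian reduces to the single number f''(x). The test function f(x) = e^x - 2x (minimum at x = ln 2), the learning rate 0.1, the tolerance, and the starting point 0 are all assumptions picked for illustration; real second-order optimizers work with matrices and need safeguards this sketch omits.

```python
import math


def grad(x):
    """First derivative of the assumed test function f(x) = e^x - 2x."""
    return math.exp(x) - 2.0


def curvature(x):
    """Second derivative of f; in one dimension this is the entire Hessian."""
    return math.exp(x)


def gradient_descent(x, lr=0.1, tol=1e-10, max_steps=10_000):
    """Fixed-step descent: each update uses only the slope."""
    steps = 0
    while abs(grad(x)) > tol and steps < max_steps:
        x -= lr * grad(x)
        steps += 1
    return x, steps


def newton(x, tol=1e-10, max_steps=10_000):
    """Newton's method for minimization: each update divides the slope by the curvature."""
    steps = 0
    while abs(grad(x)) > tol and steps < max_steps:
        x -= grad(x) / curvature(x)
        steps += 1
    return x, steps


x_gd, steps_gd = gradient_descent(0.0)
x_newton, steps_newton = newton(0.0)

print(f"target          x = {math.log(2):.10f}")
print(f"gradient descent x = {x_gd:.10f} after {steps_gd} steps")
print(f"Newton           x = {x_newton:.10f} after {steps_newton} steps")
```

On this function Newton's method needs far fewer iterations because dividing by f'' rescales each step to the local curvature. That advantage depends on f'' being positive along the path, which holds here because e^x > 0 everywhere but is not guaranteed in general.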