Cheatsheet: The derivative as a rate

  • Paradox: a derivative is the “rate of change at an instant,” but nothing changes in zero time.
  • Fix: the derivative is the value the average rate approaches as the measuring interval shrinks toward zero. Not “rate at an instant”; “the limit of the rate as the span vanishes.”
average rate over [t, t+dt] = (change in quantity) / dt
derivative = limit of that ratio as dt -> 0
Example: position s(t) = 16t^2 (s in feet, t in seconds)
avg velocity = (16(t+dt)^2 - 16t^2) / dt
= (32t·dt + 16·dt^2) / dt
= 32t + 16·dt
as dt -> 0: instantaneous velocity = 32t

At t = 2: 32·2 = 64 ft/s.
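A quick numerical check of this limit, as a minimal sketch. It assumes the position function is s(t) = 16t^2 (feet, seconds), which is the function the ratio above differentiates; the names `s` and `average_velocity` are illustrative, not from the original.

```python
def s(t):
    """Position in feet after t seconds (assumed: s(t) = 16 t^2)."""
    return 16 * t**2


def average_velocity(t, dt):
    """Average rate over [t, t + dt]: change in position divided by dt."""
    return (s(t + dt) - s(t)) / dt


# Shrink the interval and watch the average rate settle toward 32*t = 64 at t = 2.
spans = (1, 0.1, 0.01, 0.001)
rates = [average_velocity(2, dt) for dt in spans]
for dt, rate in zip(spans, rates):
    print(f"dt = {dt:<6} average velocity = {rate:.4f} ft/s")
```

Each printed rate should sit about 16·dt above 64, matching the simplified form 32t + 16·dt; dt never actually reaches zero in the loop.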

Worked: derivative of s(t) = t^3 from scratch

(t+dt)^3 - t^3 = 3t^2·dt + 3t·dt^2 + dt^3
divide by dt: 3t^2 + 3t·dt + dt^2
as dt -> 0: s'(t) = 3t^2
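The algebra above can be checked with exact arithmetic. This is a sketch using Python's standard `fractions` module so that no floating-point rounding hides a mistake; the helper name `average_rate_cubic` and the choice t = 2 are assumptions made for illustration.

```python
from fractions import Fraction


def average_rate_cubic(t, dt):
    """Average rate of s(t) = t^3 over [t, t + dt], computed directly."""
    return ((t + dt) ** 3 - t**3) / dt


t = Fraction(2)
for dt in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    direct = average_rate_cubic(t, dt)
    simplified = 3 * t**2 + 3 * t * dt + dt**2  # the "divide by dt" line
    assert direct == simplified  # exact equality: Fractions do not round
    gap = direct - 3 * t**2      # what is left over before taking dt -> 0
    print(f"dt = {dt}: average rate = {float(direct):.6f}, gap from 3t^2 = {float(gap):.6f}")
```

The gap is exactly 3t·dt + dt^2, so it shrinks along with dt, which is why only 3t^2 survives the limit.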

Shrinking dt does not complicate the calculation; it cleans it up (every term still containing dt vanishes). The cubic t^3 becomes the quadratic 3t^2. (This is the power rule, named next lesson.)

Average rate over [t, t+dt] = slope of the secant line through two points. As dt -> 0 the two points merge and the secant rotates into the tangent line. The derivative is the slope of the tangent at a point.

dy/dx is shorthand for the limit, not a fraction of infinitesimals. Read it as “the rate at which y changes with x,” computed as rise-over-run with the run shrinking to zero. The derivative is itself a function: position -> velocity at every instant.
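The "derivative is itself a function" point can be sketched as a function that takes a function and returns another. This is a numerical stand-in that uses a small but nonzero dt rather than the true limit; `derivative`, `position`, and the default `dt=1e-6` are illustrative assumptions.

```python
def derivative(f, dt=1e-6):
    """Return a new function approximating f' with a small, nonzero dt.

    This approximates the limit numerically; it does not compute it exactly.
    """
    def f_prime(x):
        return (f(x + dt) - f(x)) / dt  # rise over run with a short run
    return f_prime


def position(t):
    """Assumed position function: s(t) = 16 t^2."""
    return 16 * t**2


velocity = derivative(position)  # a function: one slope per instant
print(velocity(1.0))             # close to 32
print(velocity(2.0))             # close to 64
```

Here `velocity` is the whole function, while `velocity(2.0)` is a single number, the slope at t = 2.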

Training follows the derivative of the loss downhill: “if I nudge this parameter a little, how much does the loss change” is rise-over-run as the run shrinks. The gradient is a vector of such derivatives (one per parameter), recomputed each step; automatic differentiation computes these derivatives exactly (up to floating-point rounding) by applying derivative rules through the network, rather than by estimating them with small nudges.
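The "nudge a parameter, measure the change" idea can be sketched with plain finite differences. This is not automatic differentiation and not how real frameworks compute gradients; it only illustrates the rise-over-run reading. The toy `loss`, the `nudge` size, and the learning rate are hypothetical choices for the example.

```python
def loss(params):
    """Toy loss with its minimum at (3, -1); hypothetical, for illustration only."""
    w, b = params
    return (w - 3) ** 2 + (b + 1) ** 2


def gradient(f, params, nudge=1e-6):
    """One partial derivative per parameter: nudge it, measure rise over run."""
    base = f(params)
    grads = []
    for i in range(len(params)):
        nudged = list(params)
        nudged[i] += nudge                      # change one parameter a little
        grads.append((f(nudged) - base) / nudge)
    return grads


params = [0.0, 0.0]
learning_rate = 0.1
for step in range(200):
    grads = gradient(loss, params)              # recomputed each step
    params = [p - learning_rate * g for p, g in zip(params, grads)]

print(params)  # close to [3, -1]
```

Each step moves every parameter against its own derivative, so the loss goes downhill; the finite nudge leaves a tiny bias that an exact derivative would not have.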

  • Reading dy/dx as a fraction of infinitesimals. It is limit notation.
  • Taking “rate at an instant” literally. It is the value the average rate approaches as the span vanishes.
  • Forgetting it is a function. s'(t) gives a slope at every point; “at t=2” gives a number.
  • Plugging dt = 0 directly. That is 0/0. Simplify first, then take dt -> 0.
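The last mistake in the list is easy to reproduce. A small sketch, assuming the same s(t) = 16t^2 example; the function names are illustrative.

```python
def average_rate(t, dt):
    """Average rate of s(t) = 16 t^2 over [t, t + dt], before any simplification."""
    return (16 * (t + dt) ** 2 - 16 * t**2) / dt


def simplified_rate(t, dt):
    """The same ratio after cancelling dt algebraically: 32t + 16*dt."""
    return 32 * t + 16 * dt


try:
    average_rate(2, 0)           # 0 / 0: the unsimplified ratio is undefined here
    plugged_in = "no error"
except ZeroDivisionError:
    plugged_in = "undefined (0/0)"

print(plugged_in)                # undefined (0/0)
print(simplified_rate(2, 0))     # 64
```

Cancelling dt first is only valid for dt not equal to zero; the limit then asks what value the simplified expression approaches, which is 32t.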

A derivative is the limit of rise over run as the run shrinks to zero, which is the slope of the tangent line, and dy/dx is shorthand for that limit, not a fraction.