Practice: The derivative as a rate

Six short questions. Answer each one in your head (or on paper) before opening the collapsible. Trying to retrieve the answer is where the learning sticks; rereading feels productive but does much less.

1. State the paradox of the derivative and its one-sentence resolution.

Show answer

Paradox: a derivative is the “rate of change at a single instant,” but over an instant of zero duration nothing changes, so how can there be a rate? Resolution: stop asking for the rate at an instant and instead take the value the average rate approaches as the measuring interval shrinks toward zero. “Rate at an instant” becomes “the limit of the average rate as the span vanishes,” which is perfectly well defined.

2. How do you compute a derivative as “rise over run,” and what makes it instantaneous?

Show answer

Over an interval from t to t + dt, the average rate is (change in the quantity) divided by dt, ordinary rise over run. To make it instantaneous, let dt shrink toward zero and take the value the ratio approaches. The derivative is that limit, not any single measurement over a real interval.
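As a rough illustration of the answer above, here is a minimal Python sketch of rise over run with a shrinking run. The helper name `average_rate` is made up for this example, and s(t) = 16t² is borrowed from question 5.

```python
def average_rate(f, t, dt):
    """Rise over run of f across the interval [t, t + dt]."""
    return (f(t + dt) - f(t)) / dt


def s(t):
    # Position function used in question 5: s(t) = 16t^2.
    return 16 * t ** 2


# Shrink the run and watch the ratio settle near 64, the value of 32t at t = 2.
for dt in (1.0, 0.1, 0.01, 0.001):
    print(dt, average_rate(s, 2.0, dt))
```

Every printed value is a genuine measurement over a real interval; the derivative is the number they close in on, not any one of them.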

3. What is the geometric picture, secant to tangent?

Show answer

The average rate over [t, t + dt] is the slope of the secant line connecting two points on the curve. As dt shrinks, the second point slides toward the first and the secant pivots; in the limit, when the points merge, the secant becomes the tangent line. So the derivative is the slope of the tangent at that point, the steepness of the curve exactly where you stand.

4. What does dy/dx actually mean?

Show answer

It is shorthand for the whole limiting process: the value that rise over run approaches as the run shrinks to zero. It is not a fraction of two tiny infinitesimal numbers. Read dy/dx as “the rate at which y changes with x” (and likewise ds/dt as “the rate at which s changes with t”), computed as a limit, not as a literal division.

5. Why is a derivative “a function,” not just a number?

Show answer

Differentiating gives a new function with a value at every point, not a single number. Differentiating position s(t) = 16t² gives velocity 32t, which has a value at every instant t. Asking “what is the derivative” yields a function; asking “what is the derivative at t = 2” yields a number (here, 64).
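One way to see “function in, function out” is the sketch below. It assumes a hypothetical helper `approx_derivative` that only estimates the derivative with a small nonzero dt; it is not the exact limit.

```python
def approx_derivative(f, dt=1e-6):
    """Return a new function that estimates f' using a small, nonzero dt."""
    def f_prime(t):
        return (f(t + dt) - f(t)) / dt
    return f_prime


def s(t):
    # Position from the answer above: s(t) = 16t^2.
    return 16 * t ** 2


velocity = approx_derivative(s)   # a function, defined at every t
print(velocity(2.0))              # a number, close to 32 * 2 = 64
print(velocity(5.0))              # another number, close to 32 * 5 = 160
```

Calling `approx_derivative(s)` answers “what is the derivative,” and calling `velocity(2.0)` answers “what is the derivative at t = 2.”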

6. Why can’t you just set dt = 0 in the rise-over-run ratio?

Show answer

Because that makes it 0/0, which is undefined: the rise also goes to zero as dt does. You must simplify first (cancel the dt in the denominator), and only then let dt shrink toward zero. The order matters; simplifying before taking the limit is what avoids the 0/0.
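The order-of-operations point can be checked in a few lines of Python. This is only a sketch: the function names are invented here, and the simplified form 32t + 16·dt comes from expanding s(t) = 16t² by hand.

```python
def s(t):
    return 16 * t ** 2


def raw_ratio(t, dt):
    # Unsimplified rise over run; at dt = 0 this is 0/0.
    return (s(t + dt) - s(t)) / dt


def simplified_ratio(t, dt):
    # After expanding and cancelling dt by hand: 32t + 16*dt.
    return 32 * t + 16 * dt


try:
    raw_ratio(2, 0)
except ZeroDivisionError as err:
    print("raw ratio at dt = 0 fails:", err)

print(simplified_ratio(2, 0))  # 64
```

For any nonzero dt the two functions agree; only the simplified one still makes sense as dt goes to zero.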

Try it yourself, part 1: compute a derivative from scratch

Pen and paper, about 7 minutes. Find the derivative of s(t) = 5t² using only the limit of rise over run (no power-rule shortcut; that is the next lesson).

Steps. (1) Write the change over [t, t + dt]: s(t + dt) - s(t). (2) Divide by the run dt and simplify. (3) Let dt shrink to zero. (4) Evaluate the result at t = 3.

Show answer
change = 5(t + dt)² - 5t²
= 5(t² + 2t·dt + dt²) - 5t²
= 5(2t·dt + dt²) = 10t·dt + 5·dt²
rise / run = (10t·dt + 5·dt²) / dt = 10t + 5·dt
as dt -> 0: s'(t) = 10t

So the derivative of 5t² is 10t. At t = 3: 10 · 3 = 30. Notice the 5·dt term vanished cleanly as dt shrank, exactly as the 16·dt and dt² terms did in the lesson; shrinking dt simplifies rather than complicates.
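If you want to double-check the algebra, the sketch below compares the raw rise over run with the simplified form 10t + 5·dt at a few sample points. It uses Python's `fractions.Fraction` so the comparison is exact; the sample values of t and dt are arbitrary choices for illustration.

```python
from fractions import Fraction


def s(t):
    return 5 * t ** 2


def average_rate(t, dt):
    return (s(t + dt) - s(t)) / dt


# Exact rational arithmetic, so the check is not blurred by float rounding.
for t in (Fraction(0), Fraction(3), Fraction(-7, 2)):
    for dt in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
        assert average_rate(t, dt) == 10 * t + 5 * dt

print(average_rate(Fraction(3), Fraction(1, 1000)))  # 6001/200, i.e. 30.005
```

A spot check like this does not prove the identity for every t and dt, but it catches most slips in the expansion.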

Try it yourself, part 2: watch the averages converge

About 4 minutes. Same function s(t) = 5t², fixed at the instant t = 3. The average rate over [3, 3 + dt] works out to 30 + 5·dt (you can get this from part 1). Fill in the average rate for each dt, then say what value the averages are marching toward and why that value is the derivative.

dt = 1.0 -> ?
dt = 0.5 -> ?
dt = 0.1 -> ?
dt = 0.01 -> ?
Show answer
dt = 1.0 -> 30 + 5·1.0 = 35
dt = 0.5 -> 30 + 5·0.5 = 32.5
dt = 0.1 -> 30 + 5·0.1 = 30.5
dt = 0.01 -> 30 + 5·0.01 = 30.05

The averages march toward 30 and keep closing the gap as dt shrinks, but they never need dt to actually reach zero. That target value, 30, is the instantaneous rate at t = 3, the derivative s'(3) = 10·3 = 30. This is the “approaches” idea as a concrete fact: the derivative is the number the averages home in on, read off from how they behave, not from any single measurement over a real interval (and not from setting dt = 0, which would give 0/0).
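The same table can be generated directly from s(t), without using the simplified formula. This is a small sketch; the extra row for dt = 0.001 is an addition for illustration, and `Fraction` is used so the decimal inputs are handled exactly.

```python
from fractions import Fraction


def s(t):
    return 5 * t ** 2


t = Fraction(3)
rows = []
for dt_text in ("1.0", "0.5", "0.1", "0.01", "0.001"):
    dt = Fraction(dt_text)                 # exact value of the decimal string
    avg = (s(t + dt) - s(t)) / dt          # raw rise over run, no shortcut
    rows.append((dt_text, float(avg), float(avg - 30)))

for dt_text, avg, gap in rows:
    print(f"dt = {dt_text} -> average = {avg}, gap to 30 = {gap}")
```

The gap column is exactly 5·dt, so it shrinks in step with dt but is never zero for any dt in the list.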

Nine flashcards for review.

Q. What is the paradox of the derivative?
A.

A derivative is called the “rate of change at a single instant,” but over an instant of zero duration nothing changes, so it seems there can be no rate. The fix is to take the value the average rate approaches as the interval shrinks, not the rate “at” the instant.

Q. How is a derivative computed as rise over run?
A.

Over [t, t + dt], average rate = (change in the quantity) / dt. The derivative is the value this ratio approaches as dt shrinks toward zero, the limit of rise over run, not any single measurement over a real interval.

Q. What is the secant-to-tangent picture?
A.

The average rate is the slope of the secant line through two points on the curve. As dt shrinks the points merge and the secant pivots into the tangent line. The derivative is the slope of the tangent at that point.

Q. What does dy/dx mean (and not mean)?
A.

It is shorthand for the limit of rise over run as the run shrinks to zero, “the rate at which y changes with x.” It is not a fraction of two tiny infinitesimal numbers you divide.

Q. Why is a derivative a function, not a number?
A.

Differentiating produces a new function with a value at every point. s(t) = 16t² differentiates to 32t (velocity at every instant). “The derivative” is a function; “the derivative at t = 2” is a number.

Q. How do you compute the derivative of t³ from scratch?
A.

Form ((t+dt)³ - t³)/dt = (3t²·dt + 3t·dt² + dt³)/dt = 3t² + 3t·dt + dt², then let dt -> 0. The dt terms vanish, leaving 3t². (Likewise, t² gives 2t.)

Q. Why can't you set dt = 0 directly in rise/run?
A.

Because it gives 0/0 (the rise vanishes too). Simplify first by cancelling the dt in the denominator, then let dt shrink to zero. The order matters.

Q. For s(t) = 5t², what is the derivative and the value at t = 3?
A.

s'(t) = 10t (from (10t·dt + 5·dt²)/dt = 10t + 5·dt, then dt -> 0). At t = 3, the instantaneous rate is 10·3 = 30.

Q. How does this lesson connect to training a model?
A.

Training nudges each parameter by asking “if I change this a tiny amount, how much does the loss change,” which is rise-over-run as the run shrinks. The gradient is a vector of such derivatives, one per parameter, recomputed every step.
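To make the connection concrete, here is a toy sketch of that idea in Python. Everything in it is an assumption for illustration: the two-parameter `loss`, the helper `numerical_gradient`, the step size 0.1, and the nudge size 1e-6. Real training libraries generally obtain gradients by automatic differentiation rather than by nudging each parameter, so treat this as the concept, not the method.

```python
def loss(params):
    # Hypothetical toy loss, chosen only for illustration: (w - 3)^2 + (b + 1)^2.
    w, b = params
    return (w - 3) ** 2 + (b + 1) ** 2


def numerical_gradient(f, params, dt=1e-6):
    """Estimate one rise-over-run slope per parameter by nudging each in turn."""
    base = f(params)
    grad = []
    for i in range(len(params)):
        nudged = list(params)
        nudged[i] += dt                      # change one parameter a tiny amount
        grad.append((f(nudged) - base) / dt) # how much did the loss change?
    return grad


params = [0.0, 0.0]
for step in range(200):
    grad = numerical_gradient(loss, params)  # recomputed every step
    # Move each parameter a small step against its slope.
    params = [p - 0.1 * g for p, g in zip(params, grad)]

print(params)  # close to [3, -1], where this toy loss is smallest
```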