Lesson: Why area equals slope
The last lesson handed you the fundamental theorem of calculus as a tool: to compute the integral of the function from the lower limit to the upper limit, find an antiderivative (a function whose derivative is the original function) and compute its value at the upper limit minus its value at the lower limit. It worked, and we used it. But it should bother you a little. Why would the area under a curve equal the difference of values of some related function? Area and slope look like completely unrelated ideas: one is about how much region sits under a graph, the other about how steeply the graph rises. This lesson shows they are secretly the same thing, and the proof is a single picture.
Define the running area
Fix the left end at the lower limit and let the right end slide. Define the area function:
A(x) = ∫_a^x f(t) dt
In words: the area function is the area under the curve, accumulated from the fixed start at the lower limit up to a movable position, the input. It is genuinely a function: feed it a stopping point, and it returns the total area built up by then. As you slide the input rightward, the area function grows.
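To make "feed it a stopping point, and it returns the total area" concrete, here is a minimal sketch that approximates the area function with a midpoint sum. The curve t squared, the start at 0, the step count, and the names `f` and `area` are all illustrative assumptions, not part of the lesson.

```python
def f(t):
    """The curve whose area is accumulated; t squared is only an example."""
    return t * t


def area(x, a=0.0, n=10_000):
    """Approximate A(x), the area under f from a to x, using n thin midpoint strips."""
    width = (x - a) / n
    # Each strip contributes (height at its midpoint) * (its width).
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Slide the right end rightward and watch the accumulated area grow.
for x in (0.5, 1.0, 2.0):
    print(x, area(x))
```

For this example curve the value at 2 should land very close to 8 over 3, the exact area quoted later in the lesson.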
The question that unlocks everything is: how fast does the area function grow? In other words, what is the derivative of the area function?
The one geometric move
Slide the right end out by a tiny step in the input, from the input to the input plus that tiny step. How much new area does the area function pick up? Just a thin sliver against the right edge of the region, a strip whose width is that tiny step in the input and whose height equals the curve’s value there, the function at the input. So the extra area is approximately a rectangle:
A(x + dx) - A(x) ≈ f(x) · dx
Divide both sides by that tiny step in the input:
[ A(x + dx) - A(x) ] / dx ≈ f(x)
The left side is exactly the difference-quotient that defines a derivative (from the rate lesson). Take the limit as that tiny step in the input shrinks to zero, the sliver becoming a perfect rectangle, and the approximation becomes exact:
A'(x) = f(x)
There it is. The derivative of the area function is the original curve. The rate at which accumulated area grows, at any point, is just the height of the curve at that point, because that height is what each new sliver of area is made of.
Put numbers on it for the function x squared, whose area function (from the last lesson) is x cubed over 3. At the input equal to 2, the curve’s height is 4, so the area should be growing at rate 4 there. Check with a small step: the area function at 2 is 8 over 3, about 2.6667, and the area function at 2.01 is 2.01 cubed over 3, about 2.7069, so the area grew by about 0.0402 over a step of 0.01, a rate of 0.0402 divided by 0.01, about 4.02. That is essentially 4, the function at 2, and it closes in on exactly 4 as the step shrinks. The area function really does grow at the curve’s height.
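The same check can be repeated with smaller and smaller steps. This is a small sketch using the exact area function x cubed over 3 from the paragraph above; the names `area_fn` and `growth_rate` and the particular step sizes are illustrative choices.

```python
def area_fn(x):
    """Exact area under t squared from 0 to x, as quoted in the lesson."""
    return x ** 3 / 3


def growth_rate(x, dx):
    """Difference quotient: new area picked up over a step, divided by the step."""
    return (area_fn(x + dx) - area_fn(x)) / dx


# At x = 2 the curve's height is 4; the rate should close in on 4 as dx shrinks.
for dx in (0.1, 0.01, 0.001, 0.0001):
    print(dx, growth_rate(2.0, dx))
```

Working the algebra by hand, the quotient at 2 is 4 + 2·dx + dx²/3, so each tenfold cut in the step cuts the gap to 4 roughly tenfold.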
That single fact is the fundamental theorem
Watch the theorem fall out. We just showed that the derivative of the area function equals the original function, which means the area function is an antiderivative of the original function. Any other antiderivative differs from the area function only by a constant, and that constant cancels in a difference. (Why must they differ by only a constant? If the other antiderivative and the area function have the same derivative, then the derivative of their difference is zero everywhere, and a function whose rate of change is zero at every point never changes; it is a constant.) So:
F(b) - F(a) = A(b) - A(a) = (area up to b) - (area up to a) = ∫_a^b f(x) dx
That is the fundamental theorem, and it came entirely from one observation: extending an area by a sliver adds the function at the input times a tiny step in the input, so the area grows at the rate given by the function at the input. The reason the antiderivative trick works is that the area function is an antiderivative. Integration and differentiation are inverse operations because the slope of an accumulated area is the very thing being accumulated.
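A quick numerical sanity check of that chain of equalities, as a sketch under assumptions: the curve is cosine, the antiderivative is sine plus an arbitrary constant 7, and the limits 0.5 and 2.0 are picked only for illustration. The names `riemann` and `F` are hypothetical.

```python
import math


def riemann(f, a, b, n=10_000):
    """Approximate the area under f from a to b with n midpoint strips."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def F(x):
    """An antiderivative of cosine; the + 7.0 is an arbitrary constant."""
    return math.sin(x) + 7.0


a, b = 0.5, 2.0
print("area by summing strips:", riemann(math.cos, a, b))
print("F(b) - F(a):           ", F(b) - F(a))
```

The two printed values should agree to many decimal places, and the constant 7 has no effect because it cancels in the subtraction.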
See it on familiar functions
A polynomial. For the function x squared, the area function is the integral of t squared from 0 to the input, which equals x cubed over 3 (from the last lesson). Differentiate it: the derivative of the area function is x squared, the original function. The area function’s slope is the curve, exactly as promised.
An exponential. For the function Euler’s number raised to the input, the area function is the integral of Euler’s number raised to t, from 0 to the input, which equals Euler’s number raised to the input minus 1. Differentiate: the derivative of the area function is Euler’s number raised to the input, the original function. The constant negative 1 differentiates away, and the slope of the accumulated area is again the original curve.
A trig function. For the sine function, the area function is the integral of sine from 0 to the input, which equals 1 minus cosine of the input (the antiderivative of sine is negative cosine, evaluated from 0). Differentiate: the derivative of the area function is sine of the input, the original function, since the derivative of 1 minus cosine of the input is sine of the input. The pattern holds for the trig functions too.
The circle, one last time. For the circle’s circumference, 2 pi times the radius, the accumulated area is the integral of 2 pi times the radius from 0 to the outer radius, which equals pi times the outer radius squared, the disk’s area. Differentiate: the derivative of the area function is 2 pi times the outer radius, the circumference at that radius. This is precisely the observation the very first lesson made by hand, “the rate of change of the accumulated area is exactly the circumference being accumulated,” now revealed as a special case of the general fact that the derivative of the area function equals the original function. The track opened by noticing this on a circle; it closes the loop here by proving it for every curve.
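The four examples above can be spot-checked numerically. This is a sketch, not a proof: it compares a central-difference slope of each area function with the original curve at one sample point. The names `examples` and `slope`, the sample point 1.5, and the step size are illustrative assumptions.

```python
import math

# (label, area function, original curve) for the four examples above.
examples = [
    ("x squared", lambda x: x ** 3 / 3, lambda x: x * x),
    ("e to the x", lambda x: math.exp(x) - 1, math.exp),
    ("sine", lambda x: 1 - math.cos(x), math.sin),
    ("circle", lambda r: math.pi * r * r, lambda r: 2 * math.pi * r),
]


def slope(g, x, h=1e-6):
    """Central-difference estimate of the derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)


for label, area_fn, curve in examples:
    # The two numbers on each line should match closely.
    print(label, slope(area_fn, 1.5), curve(1.5))
```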
The everyday version
Strip away the symbols and the idea is something you already know. Picture a bucket under a tap. The amount of water in the bucket is an accumulated total; the tap’s flow rate is how fast that total grows. The two are locked together: the rate at which the bucket fills is the flow coming out of the tap at that moment. Open the tap wider and the water level climbs faster; the level’s rate of change tracks the flow exactly.
That is the fundamental theorem in a sentence: the rate at which a total accumulates is the thing being accumulated. The bucket’s water is the integral; the tap’s flow is the function; the fact that the fill-rate equals the flow is the statement that the derivative of the area function equals the original function. Calculus’s two halves, rates and accumulations, are not separate subjects. They are the water level and the flow, two descriptions of one process.
A car’s two dashboard gauges tell the same story. The odometer reads accumulated distance, a running total; the speedometer reads current speed, a rate. They form the same pair as the area function and its curve: the speed is how fast the odometer’s number climbs, so distance is the integral of speed and speed is the derivative of distance. You have been reading a live demonstration of the fundamental theorem every time you drive.
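The odometer and speedometer pairing has a discrete analogue that is easy to run. In this sketch the speed readings and the half-second sampling interval are made-up numbers; accumulating speed times time gives an odometer trace, and differencing that trace recovers the speeds.

```python
from itertools import accumulate

dt = 0.5                                  # seconds between readings (hypothetical)
speeds = [0.0, 2.0, 4.0, 6.0, 6.0, 3.0]   # metres per second (hypothetical)

# Odometer: running total of distance covered, starting from zero.
odometer = list(accumulate((v * dt for v in speeds), initial=0.0))

# Speedometer recovered from the odometer: change in distance per unit time.
recovered = [(later - earlier) / dt for earlier, later in zip(odometer, odometer[1:])]

print(odometer)    # [0.0, 0.0, 1.0, 3.0, 6.0, 9.0, 10.5]
print(recovered)   # [0.0, 2.0, 4.0, 6.0, 6.0, 3.0]
```

Summing and then differencing returns the original readings, which is the discrete version of integration and differentiation undoing each other.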
Why this matters when you use AI
The pairing this lesson proves, “the slope of an accumulation is the thing accumulated,” is the exact relationship between two functions you meet constantly in machine learning.
In continuous probability, the cumulative distribution function is the accumulated probability up to the input, an integral of the probability density. By this lesson, the derivative of the cumulative function is the density, so each can be recovered from the other. Every PDF-and-CDF pair you encounter is an instance of the derivative of the area function equaling the original function. The same pairing appears whenever you track a running total against the rate that feeds it: a cumulative loss over training time whose rate of change is the current loss, a cumulative count whose slope is the instantaneous rate. Reading a loss curve and reading its accumulated version are two views of the same data, linked by exactly the theorem proved here, which is why a practitioner can move between a rate and its running total: differentiate the total to get the rate, or accumulate the rate to get the total.
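As one concrete PDF-and-CDF pair, the sketch below uses the standard normal distribution, which is an example chosen here rather than one named in the lesson. The CDF is written with `math.erf` from the Python standard library, and its numerical slope is compared with the density at a few arbitrary points. The names `cdf`, `pdf`, and `slope` are illustrative.

```python
import math


def cdf(x):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def slope(g, x, h=1e-5):
    """Central-difference estimate of the derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)


# The slope of the accumulated probability should match the density.
for x in (-1.0, 0.0, 0.7, 2.0):
    print(x, slope(cdf, x), pdf(x))
```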
Common pitfalls
Treating the FTC as a coincidence to memorize. It is not a lucky formula; it follows from one fact, that extending an area by a tiny step in the input adds a rectangle of height equal to the function at the input. If you remember the sliver, you can reconstruct why the theorem holds.
Forgetting the area function depends on a choice of start. The area function, the integral of the function from the lower limit to the input, is measured from a fixed lower limit. Different choices of lower limit give area functions differing by a constant, which is exactly why antiderivatives carry a plus C and why the constant cancels when you subtract the antiderivative’s value at the lower limit from its value at the upper limit.
Confusing the curve with its area function. The original function is the height of the graph; the area function is the area accumulated under it. They are different functions, related by the fact that the derivative of the area function is the original function. The curve is the rate; the area is the total.
Thinking area and slope are different worlds. That is the intuition this lesson overturns. The slope of the area function is the curve’s height, so accumulation and rate are inverses, not strangers.
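The pitfall about the choice of start can be seen numerically. In this sketch the curve t squared, the two starting points 0 and 1, and the name `area_from` are illustrative assumptions; the gap between the two area functions should come out the same at every stopping point.

```python
def f(t):
    """Example curve; t squared is only an illustration."""
    return t * t


def area_from(a, x, n=10_000):
    """Approximate the area under f from the start a to the stopping point x."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Two area functions with different starts, compared at several stopping points.
for x in (1.5, 2.0, 3.0):
    gap = area_from(0.0, x) - area_from(1.0, x)
    print(x, gap)
```

For this curve the gap is the area from 0 to 1, which is 1 over 3, no matter where the right end sits. That fixed offset is the constant that cancels when two values of the same antiderivative are subtracted.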
What you should remember
- Define the area function as the integral of the function from the lower limit to the input, the area accumulated up to a moving right end. Extending it by a tiny step in the input adds a thin rectangle of height equal to the function at the input, so the change in the area function is approximately the function at the input times that tiny step, and in the limit the derivative of the area function is the original curve.
- That single fact is the fundamental theorem. Because the area function is an antiderivative of the original function, and any two antiderivatives differ only by a constant, the value of any antiderivative at the upper limit minus its value at the lower limit equals the integral of the function from the lower limit to the upper limit. Integration and differentiation are inverse operations because the slope of an accumulated area is the thing being accumulated.
- In everyday terms, the rate a total accumulates is the thing accumulating: the bucket fills at the tap’s flow rate. In machine learning this is the PDF-and-CDF pairing (the derivative of the cumulative function is the density) and every running-total-versus-its-rate relationship you read off a curve.
Area and slope turned out to be one idea seen twice: the slope of the accumulated area is the curve itself. The first lesson noticed this on a circle; this lesson proved it for everything. With both halves of calculus built and bound together, the final two lessons go deeper into rates, first higher-order derivatives, then the Taylor series that approximates any function from them.