Summary: Integration and the fundamental theorem

The very first lesson found a circle’s area by slicing it into rings and adding them up: integration done informally. This lesson makes accumulation precise. It defines the integral as a limit of sums of thin rectangles and states the fundamental theorem of calculus, which ties accumulation to differentiation. The headline, almost too convenient to believe: to add up a quantity over a range, find a function whose rate of change is that quantity and subtract its endpoint values. This is the scan-it-in-five-minutes version.

Core ideas

The definite integral is signed area under a curve (area below the axis counts as negative). ∫_a^b f(x) dx is defined as the limit of Riemann sums: chop [a, b] into thin strips, approximate each by a rectangle f(x_i)·Δx, add, and let the width shrink to zero. For ∫_0^1 x² dx the right-endpoint sums 0.469, 0.385, 0.338 (n = 4, 10, 100) march toward 1/3. The ∫ is an elongated S for “sum”; dx is the strip width.
The fundamental theorem: if f is continuous on [a, b], then ∫_a^b f(x) dx = F(b) - F(a), where F is any antiderivative of f (F' = f). To accumulate, you do not sum rectangles; you find a function whose rate of change is f and subtract its endpoint values. Differentiation and integration are inverse operations.
Antiderivatives are derivative rules reversed. ∫ x^n dx = x^(n+1)/(n+1) + C (except n = -1, which gives ln|x| + C); ∫ e^x dx = e^x + C; ∫ sin x dx = -cos x + C. Your differentiation knowledge is your integration knowledge, read backward.
Worked. ∫_0^1 x² dx = 1/3; ∫_0^1 e^x dx = e - 1 ≈ 1.718; ∫_1^2 (1/x) dx = ln 2 ≈ 0.693; ∫_0^π sin x dx = 2. And ∫_0^R 2πr dr = [πr²]_0^R = πR², closing the circle from lesson 1 in a single line.
Definite vs indefinite. Definite (∫_a^b) gives a number; indefinite (∫) gives a function F(x) + C. The + C exists because a constant’s derivative is zero, so f has a whole family of antiderivatives; the C cancels in a definite integral.
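
The two routes to a definite integral described above can be compared directly. The sketch below is illustrative only: it assumes right-endpoint rectangles of equal width, and the names riemann_sum, worked, and comparisons are hypothetical helpers, not part of the lesson. It recomputes the x² sums for n = 4, 10, 100 and then checks each worked integral against F(b) - F(a).

```python
import math

def riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f over [a, b] using n equal strips."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx

def square(x):
    return x * x

# Sums for the integral of x^2 over [0, 1] at the strip counts quoted above.
sums = {n: riemann_sum(square, 0.0, 1.0, n) for n in (4, 10, 100)}
for n, value in sums.items():
    print(f"n = {n:>3}: {value:.3f}")

# Each entry: (integrand f, an antiderivative F, lower limit a, upper limit b).
worked = [
    (square, lambda x: x**3 / 3, 0.0, 1.0),
    (math.exp, math.exp, 0.0, 1.0),
    (lambda x: 1 / x, math.log, 1.0, 2.0),
    (math.sin, lambda x: -math.cos(x), 0.0, math.pi),
]

# Pair a fine Riemann sum with the fundamental-theorem value F(b) - F(a).
comparisons = []
for f, F, a, b in worked:
    approx = riemann_sum(f, a, b, 100_000)
    exact = F(b) - F(a)
    comparisons.append((approx, exact))
    print(f"sum of rectangles: {approx:.5f}   F(b) - F(a): {exact:.5f}")
```

The rectangle sums need many thousands of function evaluations to agree with the antiderivative route to a few decimal places, while F(b) - F(a) needs only two. That gap is the practical payoff of the theorem.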

What changes for you

You now have the formal machinery behind the slice-and-add that opened the track, and the theorem that makes accumulation the exact inverse of finding a rate. You can compute an area by reversing differentiation instead of summing infinitely many rectangles. That single fact, the fundamental theorem, is what links the two halves of calculus into one subject. In machine learning, integration is the mathematics of continuous probability, which is everywhere: a probability density integrates to 1, an expected value is ∫ x·f(x) dx, and entropy and KL divergence (in the loss functions of generative models and variational methods) are integrals over distributions. Continuous-time models like neural differential equations and diffusion compute their forward pass by numerically integrating a differential equation. In practice these integrals are usually computed numerically rather than with antiderivatives, but the fundamental theorem is what connects the rates a model learns to the accumulations it optimizes. The next lesson dwells on why the fundamental theorem is true, unpacking geometrically how an area can equal a difference of antiderivative values.
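
The probability statements above (a density integrates to 1, an expected value is ∫ x·f(x) dx) can be checked numerically in a few lines. This is a minimal sketch under stated assumptions: the exponential density with rate 2, the midpoint rule, the truncation of [0, ∞) at 40, and the names midpoint_integral, RATE, and UPPER are all illustrative choices, not anything the lesson prescribes.

```python
import math

def midpoint_integral(f, a, b, n):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

RATE = 2.0    # assumed rate parameter of an exponential distribution
UPPER = 40.0  # assumed cutoff for [0, infinity); the tail beyond it is negligible

def density(x):
    """Exponential probability density f(x) = RATE * exp(-RATE * x) for x >= 0."""
    return RATE * math.exp(-RATE * x)

# A probability density should integrate to 1.
total_probability = midpoint_integral(density, 0.0, UPPER, 200_000)

# The expected value is the integral of x * f(x); for this density it is 1 / RATE.
expected_value = midpoint_integral(lambda x: x * density(x), 0.0, UPPER, 200_000)

print(f"total probability: {total_probability:.6f}")
print(f"expected value:    {expected_value:.6f}")
```

Here the integrals also have closed forms, so the numerical values can be compared with 1 and 1/RATE; for the densities that arise in real models, a numerical estimate like this is often the only option.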