Taylor series
What you’ll learn
This is the track’s finale. The single capability it builds: approximate a function near a point with a polynomial built from its derivatives (the Taylor series), and recognize this as the calculus idea machine learning leans on most.
Near a point a, a function f is approximated by f(x) ≈ f(a) + f'(a)·(x-a) + (f''(a)/2!)·(x-a)² + ..., where the k-th term is (f^(k)(a)/k!)·(x-a)^k. You will see why the factorials are required (the matching property: differentiating (x-a)^k k times leaves k!, so dividing by k! makes the polynomial and f agree on value, slope, concavity, and every higher derivative at a), and watch the polynomial wrap one more order tighter around the curve with each new term (the first two terms are the tangent line; add the third and the line bends into a parabola that matches concavity).
You will build the canonical series for e^x, sin x, and cos x, and watch their partial sums converge: e^x at x = 1 reaches 2.717 ≈ e with the terms through x⁵/5!, and the partial sums for sin(1) march 1, 0.833, 0.842, 0.84147. You will recognize the small-angle approximation and L’Hôpital’s rule as Taylor’s first-order results in disguise. You will then apply Newton’s method, the first-order Taylor approximation used to find a root, to compute √2 (from x_0 = 1.5: 1.5 → 1.41667 → 1.41422 in two steps, against √2 ≈ 1.41421).
And you will see why Taylor is structural in machine learning: gradient descent is a first-order Taylor step on the loss, Newton’s method for optimization is a second-order step, the neural tangent kernel comes from a first-order Taylor expansion of a network around its initialization, and hardware computes sin, exp, and log as truncated Taylor-style polynomials.
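The partial sums quoted above can be reproduced with a few lines of Python. This is an optional sketch, not part of the lesson's required work: the helper name `taylor_partial_sums` and its argument layout are illustrative assumptions, and the series are centered at a = 0, where every derivative of e^x is 1 and the derivatives of sin x cycle through 0, 1, 0, -1.

```python
import math

def taylor_partial_sums(derivs_at_center, x, center=0.0):
    """Return the running Taylor partial sums at x.

    derivs_at_center[k] is the k-th derivative of f at the center
    (k = 0 is the function value itself). Hypothetical helper for illustration.
    """
    total = 0.0
    sums = []
    for k, d in enumerate(derivs_at_center):
        # k-th Taylor term: f^(k)(a) / k! * (x - a)^k
        total += d / math.factorial(k) * (x - center) ** k
        sums.append(total)
    return sums

# e^x centered at 0: every derivative equals 1. Six terms, through x^5/5!.
exp_sums = taylor_partial_sums([1.0] * 6, x=1.0)

# sin x centered at 0: derivatives cycle 0, 1, 0, -1. Eight terms, through x^7/7!.
sin_sums = taylor_partial_sums([0.0, 1.0, 0.0, -1.0] * 2, x=1.0)

print("e^1 partial sums:   ", [round(s, 3) for s in exp_sums])
# The even-order terms of sin vanish, so show the sum after each odd-order term.
print("sin(1) partial sums:", [round(s, 5) for s in sin_sums[1::2]])
print("math.e, math.sin(1):", round(math.e, 5), round(math.sin(1.0), 5))
```

Passing the derivatives as a plain list keeps the 1/k! factor visible in one place, which is the part of the formula the matching property explains.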
Where this fits
This is lesson 13 of Phase 3 and the finale of the track. It is the synthesis: it needs the power rule (the polynomial structure), the trig and exponential derivatives (the canonical series), the chain rule (composing series), higher-order derivatives (the ingredients), and the limit concept (convergence). It is also the calculus-side foundation for first- and second-order optimization across the AI tracks: gradient descent, Newton’s method, the neural tangent kernel, and loss-landscape analysis can all be read as Taylor approximations once you have this lesson. (This lesson bundles two short 3Blue1Brown chapters, Ch14 Taylor series and Ch15 Taylor series (geometric view), into one capstone; both develop the same single capability.)
Before you start
Prerequisite (within this track): lesson 12, Higher-order derivatives, since the k-th Taylor term uses f^(k)(a), the k-th derivative at the center. You also lean on the derivative rules from Phases 1 and 2 (power, trig, exponential, chain), and on the limit concept (lesson 9) for convergence. The arithmetic is multiplying small numbers and dividing by small factorials; a calculator helps, and no coding or installation is required.
By the end, you’ll be able to
- Build a Taylor series of a function at a point from its derivatives, including the canonical series for e^x, sin x, and cos x
- Explain the matching property and why each term carries a 1/k! factorial
- Recognize the small-angle approximation and L’Hôpital’s rule as Taylor’s first-order results
- Apply Newton’s method as the first-order Taylor approximation for finding a root, and use a Taylor truncation to approximate a transcendental value
- Connect Taylor to gradient descent (first-order), Newton’s method for optimization (second-order), the neural tangent kernel, and how hardware computes sine, exp, and log
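The Newton’s-method objective above can be sketched in a few lines. This is an optional illustration under stated assumptions: the function name `newton_root` is hypothetical, √2 is found as the positive root of f(x) = x² - 2, and each step replaces f by its first-order Taylor approximation at the current x and jumps to where that tangent line crosses zero.

```python
def newton_root(f, f_prime, x0, steps):
    """Run a fixed number of Newton steps and return every iterate.

    Each step solves f(x) + f'(x) * (x_next - x) = 0 for x_next,
    i.e. it finds the root of the tangent line at x.
    Hypothetical helper for illustration; it does not guard against f'(x) == 0.
    """
    xs = [x0]
    x = x0
    for _ in range(steps):
        x = x - f(x) / f_prime(x)
        xs.append(x)
    return xs

# sqrt(2) is the positive root of f(x) = x^2 - 2, with f'(x) = 2x.
iterates = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.5, steps=2)

for n, x in enumerate(iterates):
    print(f"x_{n} = {x:.6f}")
```

Starting from 1.5, the two iterates are exactly 17/12 and 577/408; the second is within about 2·10⁻⁶ of √2.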
Time and difficulty
- Read time: about 12 minutes
- Practice time: about 14 minutes (building cosine’s series and watching it converge, a Newton’s-method root-finder, and whole-finale flashcards)
- Difficulty: standard