Cheatsheet: What's next
Six directions in the LLM landscape

| Direction | What it changes | Builder watch |
|---|---|---|
| Longer context | More fits per call (8K → 1M+ tokens) | Doesn’t eliminate retrieval; cost/latency still scale with input |
| Multimodality | Text + images + audio + (video) inputs and outputs | Lesson 1-7 patterns generalize; new categories open |
| Smaller specialized models | Narrow tasks at a fraction of frontier cost | Mix architecture (small models for inner sub-tasks + frontier model for outer synthesis) |
| Build-vs-buy spectrum | Hosted (start) → fine-tune → train from scratch | Bar for “train” rises; bar for “fine-tune small” falls |
| Agents | Multi-step tool-use loops | Latency stretches, cost compounds, eval gets harder, new failure modes |
| Reasoning models | Better on multi-step problems | 5-20x thinking tokens per answer; use deliberately, not as default |
How each direction interacts with the three productive limits (L2)

| Direction | Context | Cost | Latency |
|---|---|---|---|
| Longer context | Bigger budget | Input cost scales with what you put in | TTFT scales with prefill |
| Multimodality | New token types (images/audio) | Often higher per-multimodal-input | Higher TTFT for multimodal prefill |
| Smaller specialized | Same budget | Lower per-call for inner sub-tasks | Often lower TTFT and higher tokens/sec |
| Build-vs-buy | Same | Hosted has provider pricing; fine-tune-then-serve has serving cost | Self-serve can be lower latency |
| Agents | Each step has its own budget | Compounds with steps | Multiplies with step count |
| Reasoning | Big “thinking” budget burned per task | Higher per-task (many invisible tokens) | Higher TTFT-to-final-answer |
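
The "compounds with steps" entry for agents can be made concrete with a little arithmetic. The sketch below assumes a simple loop in which every step re-sends the whole history so far as input. The function name, prices, and token counts are hypothetical placeholders, not any provider's actual rates.

```python
def agent_task_cost(
    steps: int,
    base_prompt_tokens: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    usd_per_1k_input: float,
    usd_per_1k_output: float,
) -> float:
    """Estimate the cost of one agent task where every step re-reads the full history."""
    total = 0.0
    for step in range(steps):
        # Input grows each step: the original prompt plus everything appended so far.
        input_tokens = base_prompt_tokens + step * tokens_added_per_step
        total += input_tokens / 1000 * usd_per_1k_input
        total += output_tokens_per_step / 1000 * usd_per_1k_output
    return total


# Hypothetical prices and sizes, chosen only to show the shape of the curve.
one_step = agent_task_cost(1, 2000, 1500, 300, 0.01, 0.03)
ten_steps = agent_task_cost(10, 2000, 1500, 300, 0.01, 0.03)
print(f"1 step: ${one_step:.3f}, 10 steps: ${ten_steps:.3f}")
```

Under these assumptions, ten steps cost well over ten times one step, because the input side grows with every step while the output side only adds up linearly.
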
The build-vs-buy decision tree (T21 spec)
Start: hosted API (almost always correct)

- Prompting fails consistently on a specific recurring task at scale? -> fine-tune an open model (lesson 9)
- Research / structural data advantage no hosted model can match? -> train from scratch (Track 15 territory; rare for app teams)

The “mix” architecture (worth knowing by name)
- [Router] -> small specialized model
- [Retriever-rewriter] -> small specialized model
- [Re-ranker / classifier] -> small specialized model
- [User-facing synthesis] -> frontier model
- [LLMOps wrapping everything] -> per lesson 7

Lowers per-request cost without sacrificing user-facing quality.
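
A minimal sketch of the mix architecture as a stage-to-model dispatch table. The two model functions are hypothetical stand-ins for real model clients, and the stage names are assumptions made for illustration. The LLMOps wrapper (logging, versioning) is left out to keep the example short.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for real model clients; each takes a prompt and returns text.
ModelFn = Callable[[str], str]


def small_model(prompt: str) -> str:
    return f"[small] {prompt}"


def frontier_model(prompt: str) -> str:
    return f"[frontier] {prompt}"


# Inner sub-tasks go to the small model; only user-facing synthesis uses the frontier model.
STAGE_TO_MODEL: Dict[str, ModelFn] = {
    "route": small_model,
    "rewrite_query": small_model,
    "rerank": small_model,
    "synthesize": frontier_model,
}


def run_stage(stage: str, prompt: str) -> str:
    """Dispatch one pipeline stage to the model tier assigned to it."""
    try:
        model = STAGE_TO_MODEL[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage!r}") from None
    return model(prompt)


def answer(question: str) -> str:
    """Run the inner stages on the small model, then synthesize with the frontier model."""
    intent = run_stage("route", question)
    query = run_stage("rewrite_query", question)
    ranked = run_stage("rerank", query)
    return run_stage("synthesize", f"{intent}\n{ranked}")
```

Keeping the assignment in one table makes it cheap to move a stage between tiers later, for example after an A/B test shows the small model is good enough for it.
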
When to reach for a reasoning model
- Multi-step math, code with constraints, logic puzzles, tasks where intermediate steps matter.
- NOT a default. Per-task cost includes many “thinking” tokens (5-20x the visible response).
- Lesson-7 A/B test on real traffic decides per-task whether the quality lift earns the extra cost.
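
The cost side of that A/B decision can be sketched as follows. The first helper applies the 5-20x thinking ratio from the list above. The second encodes one possible decision rule, a cap on dollars spent per additional successful task, which is an assumption made for illustration rather than a rule from the source. All names and numbers are hypothetical.

```python
def output_tokens_with_thinking(visible_tokens: int, thinking_ratio: float) -> float:
    """Total generated tokens when thinking tokens are `thinking_ratio` times the visible answer."""
    return visible_tokens * (1 + thinking_ratio)


def worth_reasoning_model(
    baseline_pass_rate: float,
    reasoning_pass_rate: float,
    baseline_cost_per_task: float,
    reasoning_cost_per_task: float,
    max_usd_per_extra_success: float,
) -> bool:
    """Return True if each additional successful task costs no more than the cap allows."""
    lift = reasoning_pass_rate - baseline_pass_rate
    if lift <= 0:
        return False  # no quality gain, so the extra thinking tokens buy nothing
    extra_cost = reasoning_cost_per_task - baseline_cost_per_task
    return extra_cost / lift <= max_usd_per_extra_success


# Hypothetical A/B results: pass rate 0.70 -> 0.85, cost per task $0.01 -> $0.12.
print(output_tokens_with_thinking(200, 5))
print(worth_reasoning_model(0.70, 0.85, 0.01, 0.12, max_usd_per_extra_success=1.00))
```

The inputs are meant to come from the A/B test on real traffic, per task type, so the same model can pass for one task and fail for another.
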
Adopting a new capability safely
1. Read it through the three productive limits.
2. Place it on the build-vs-buy spectrum.
3. Identify which lesson 1-7 patterns generalize and which need new techniques.
4. Run the lesson-7 regression suite on the new model BEFORE switching.
5. A/B test on real traffic if quality looks comparable.
6. Adopt with versioned prompts + logged comparison.

The builder’s instinct (the durable takeaway)
Specific models go stale within months. The way you READ each new release should not. Three productive limits + build-vs-buy spectrum + LLMOps discipline + lesson 1-7 patterns = the lens that outlasts the churn.
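
Steps 4 and 5 of the adoption checklist above (run the regression suite before switching, then A/B test only if quality looks comparable) amount to a gate. A minimal sketch, assuming a suite of prompt/check pairs and a hypothetical tolerance for what "comparable" means; neither the suite format nor the 0.02 default comes from the source.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]
# Each case pairs a prompt with a check on the model's output (hypothetical format).
Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Model, suite: List[Case]) -> float:
    """Fraction of regression cases whose check accepts the model's output."""
    if not suite:
        raise ValueError("regression suite is empty")
    passed = sum(1 for prompt, check in suite if check(model(prompt)))
    return passed / len(suite)


def ready_for_ab_test(
    current: Model, candidate: Model, suite: List[Case], tolerance: float = 0.02
) -> bool:
    """The candidate proceeds to an A/B test only if it is not meaningfully worse."""
    return pass_rate(candidate, suite) >= pass_rate(current, suite) - tolerance


# Toy models and suite, purely for illustration.
suite: List[Case] = [
    ("hello", lambda out: out == "HELLO"),
    ("abc", lambda out: out.isupper()),
]
print(ready_for_ab_test(str.upper, lambda p: p.upper(), suite))
print(ready_for_ab_test(str.upper, lambda p: p, suite))
```

The gate only decides whether the candidate earns real traffic; the A/B test and the logged comparison still make the final call.
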
Words to use precisely
- Survey lesson: lighter pedagogy, breadth-over-depth, points forward to deeper lessons.
- Mix architecture: small specialized models for inner sub-tasks + frontier model for outer synthesis.
- Build-vs-buy spectrum: hosted API → fine-tune → train from scratch.
- Builder’s instinct: read new capability through productive limits + build-vs-buy + LLMOps + lesson 1-7 patterns.
Source
- Full Stack Deep Learning, LLM Bootcamp (Spring 2023): What’s Next?
fullstackdeeplearning.com/llm-bootcamp. Independent structural mirror in original prose; see references.