Cheatsheet: Industry perspective

Phase 1 (L1-3): SHIP a minimum app + PROMPTS as engineering
L1 Launch an LLM app -> 5 components, working app in an hour
L2 LLM foundations -> 3 properties + 3 productive limits (the lens)
L3 Prompt engineering -> Learn to Spell + held-out test set + triage
Phase 2 (L4-7): PUSH past closed corpus + DESIGN interface + DISCIPLINE
L4 Augmented LMs -> RAG (7 parts) + tool use (4 steps)
L5 Project walkthrough -> read askFSDL with a production eye
L6 UX for LUIs -> 5 patterns (stream/cite/regen/hedge/recover)
L7 LLMOps -> 5 pillars (the floor for every production app)
Phase 3 (L8-11): DIRECTIONS + DEEP DIVES + CAPSTONE
L8 What's next -> directions + build-vs-buy + mix architecture
L9 Train your own LLM -> 3-things-true test + staged pipeline
L10 Agents -> L4 loop + 3 tests + 5 failure modes
L11 Industry perspective -> synthesis + fireside reading rules (this lesson)

Through-line: L2’s three productive limits (context, cost, latency) + L7’s five engineering pillars (observability, eval-in-production, prompt versioning, cost-and-latency monitoring, regression testing). Both carry forward into every future LLM-application decision.

Three rules:

| Rule | What it means |
| --- | --- |
| 1. Attribute, do not absorb | “I think X” stays “the speaker thinks X.” Not “X is canon.” |
| 2. Separate durable bets from speaker bets | Field-converged vs one-practitioner view. Act on the first; ask the second of your product. |
| 3. Use the chat as a question generator | Three questions to bring back to your product. Not predictions. |

Five durable bets:

| Bet | Implication |
| --- | --- |
| 1. Models keep getting better / cheaper per token | Don’t over-optimize for current model; build to swap. |
| 2. Evaluation is the moat, not the model | Build a real held-out eval set before clever prompts. |
| 3. Interaction surface keeps expanding | Design taste is now the bottleneck, not raw capability. |
| 4. Most teams should not train their own model | Hosted for user-facing; fine-tune inner sub-tasks; train-from-scratch is rare. |
| 5. Operational discipline beats clever architecture | Ship L7 before L4 / L10 cleverness. |
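
One way to read "build to swap" from bet 1 is to keep application code behind a thin interface so the model behind it can change without touching callers. The sketch below is a minimal illustration under stated assumptions: `ModelRegistry`, `fake_model_a`, and `fake_model_b` are hypothetical names invented here, and the two stub functions stand in for real provider calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumption: a "model" is any function from prompt text to completion text.
CompletionFn = Callable[[str], str]


@dataclass
class ModelRegistry:
    """Hypothetical wrapper: callers use complete(), never a provider directly."""
    backends: Dict[str, CompletionFn]
    active: str

    def complete(self, prompt: str) -> str:
        # Route the call to whichever backend is currently active.
        return self.backends[self.active](prompt)

    def swap(self, name: str) -> None:
        # Changing models is a one-line change, not a refactor.
        if name not in self.backends:
            raise KeyError(f"unknown backend: {name}")
        self.active = name


# Stub backends standing in for real provider calls (illustrative only).
def fake_model_a(prompt: str) -> str:
    return f"[model-a] {prompt}"


def fake_model_b(prompt: str) -> str:
    return f"[model-b] {prompt}"


registry = ModelRegistry(
    backends={"model-a": fake_model_a, "model-b": fake_model_b},
    active="model-a",
)
first = registry.complete("hello")
registry.swap("model-b")
second = registry.complete("hello")
```

Because callers only see `complete()`, the same held-out eval set can be rerun against each backend before switching, which ties this bet to bet 2.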

Speaker views (ask these; don’t adopt them)

These are real questions where the field has NOT converged. Bring the questions home, not the positions.

• "What does an LLM-first product feel like?"
• "Where in the stack does the moat actually live?"
• "How fast should you build before the next model release surpasses your scaffolding?"
• "What is the right level of agent autonomy for production?"

Different practitioners take different positions on each. The call is yours, after asking the question of your own product.

1. SHIP the smallest version of your application with L7 discipline.
(prompt versioning + held-out evals + basic logs + visible cost/latency)
That bar is the difference between a demo and an application.
2. PICK one durable bet, act on it this month.
(For most builders: "evaluation is the moat". Spend a focused week on a
real held-out eval set; chart pass rate over 4 weeks.)
3. READ the fireside + 1 other industry source.
Write down the 3 questions it raises about your product.
Ask them in next planning.
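
The first two actions above (prompt versioning, held-out evals, visible cost and latency, a pass rate to chart) can be sketched as a small harness. This is a minimal sketch under stated assumptions, not the course's implementation: `EvalCase`, `EvalResult`, `run_eval`, and `stub_model` are hypothetical names, the substring check is the simplest possible grader, and `usd_per_1k_chars` is a made-up placeholder price rather than any provider's rate.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    prompt_input: str
    must_contain: str  # Assumption: a substring match counts as a pass.


@dataclass
class EvalResult:
    prompt_version: str
    passed: int
    total: int
    total_latency_s: float
    est_cost_usd: float

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0


def run_eval(
    prompt_version: str,
    template: str,
    model: Callable[[str], str],
    cases: List[EvalCase],
    usd_per_1k_chars: float = 0.002,  # Placeholder price, not a real rate.
) -> EvalResult:
    passed, latency, chars = 0, 0.0, 0
    for case in cases:
        prompt = template.format(input=case.prompt_input)
        start = time.perf_counter()
        output = model(prompt)
        latency += time.perf_counter() - start  # Visible latency.
        chars += len(prompt) + len(output)      # Crude proxy for cost.
        if case.must_contain.lower() in output.lower():
            passed += 1
    cost = chars / 1000 * usd_per_1k_chars
    return EvalResult(prompt_version, passed, len(cases), latency, cost)


# Stub model standing in for a real LLM call (illustrative only).
def stub_model(prompt: str) -> str:
    return "Paris" if "France" in prompt else "I don't know"


held_out = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is the capital of Japan?", "Tokyo"),
]
result = run_eval("v1", "Answer briefly: {input}", stub_model, held_out)
print(f"{result.prompt_version}: pass rate {result.pass_rate:.0%}")
```

Recording one `EvalResult` per prompt version each week gives the pass-rate series the second action asks you to chart.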

| Read | Build | Try |
| --- | --- | --- |
| Fireside chat (this lesson’s source) | L7 discipline on what you shipped | “Three tests” (L10) on your next “let’s make it an agent” idea |
| One other practitioner talk near your domain | One field-direction experiment from L8 closest to your product | “Three-things-true” test (L9) on your next “let’s fine-tune” idea |
| Track 14 (using-side LLM library work) | Multimodal extension if text-only; small agent if single-turn | “Where prompts run out” triage (L3) on next misbehaving feature |
| Track 15 (build-the-model deep dive) | Fine-tune one inner sub-task per L9 if volume justifies | |
| Track 20 (full agents track) | | |

Synthesis + careful read of a primary source.
NOT a forecast.

Forecasts age fast (“models will do X by date Y” → useless when model Z ships). Synthesis ages well (“here is the arc, here are the durable bets, here are the questions”).

Out of scope:

  • Any framing that treats fireside opinions as canon
  • Predictions of specific model capabilities or release dates
  • Contested debates about agent autonomy, alignment, safety, or wider AI policy

The contested debates are real and important, but they require their own framing, in their own forum, with the right stakeholders. The fireside does not retroactively license absorbing those debates as canon.

  • Durable bet: a field-converged observation many sources confirm; act on directly.
  • Speaker view: one practitioner’s position on a question the field has not converged on; ask of your product, don’t adopt.
  • Track arc: the journey from demo (L1) to production-grade application (L7) to deep dives + industry perspective (L8-L11).
  • Through-line: L2 productive limits + L7 engineering pillars; what carries forward into every future LLM-application decision.
  • Full Stack Deep Learning, LLM Bootcamp (Spring 2023): Fireside Chat with Peter Welinder (OpenAI). fullstackdeeplearning.com/llm-bootcamp. Independent structural mirror in original prose; see references.