Cheatsheet: Project walkthrough

askFSDL = Q&A over the Full Stack Deep Learning (FSDL) course materials: scoped, source-citing, streamed. Many production LLM apps share this shape.

user input
-> prompt template
-> RAG retrieval over scoped corpus (with source labels carried)
-> hosted-model API call (streaming, citations asked)
-> render with citations
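
A minimal Python sketch of the first three steps of this flow, under stated assumptions: `Chunk`, `SYSTEM_PROMPT`, `retrieve`, and `build_user_message` are hypothetical names rather than askFSDL's actual code, and the word-overlap ranking is a toy stand-in for real vector search. The hosted-model call is left out on purpose. The sketch only aims to show that source labels stay attached to every chunk and that the user question is placed last.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str   # label that travels with the chunk, e.g. a lecture title
    section: str


# Hypothetical system prompt: scope-honest, citation-asking, refuses out-of-scope.
SYSTEM_PROMPT = (
    "You answer questions about the FSDL course materials only. "
    "Use only the numbered sources provided and cite them as [1], [2], ... "
    "If the sources do not cover the question, say so and decline to answer."
)


def retrieve(query: str, corpus: list[Chunk], k: int = 3) -> list[Chunk]:
    """Toy stand-in for vector search: rank chunks by words shared with the query."""
    query_words = set(query.lower().split())

    def overlap(chunk: Chunk) -> int:
        return len(query_words & set(chunk.text.lower().split()))

    return sorted(corpus, key=overlap, reverse=True)[:k]


def build_user_message(question: str, chunks: list[Chunk]) -> str:
    """Number each chunk, keep its source label, and place the question last."""
    blocks = [
        f"[{i}] source: {c.source} / {c.section}\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    ]
    return "\n\n".join(blocks) + f"\n\nQuestion: {question}"
```

The numbered `[i]` prefixes give the model something concrete to cite, and the UI can map each number back to `source` when rendering.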
| Stage | Decision | Principle |
| --- | --- | --- |
| Knowledge source | Narrow, well-known corpus | Narrow that works > broad that doesn’t |
| Chunking | Tutorial-shaped chunks + metadata (source, section) | Chunk for content’s natural unit; tag for citations + filters |
| Retrieval | top-k with source labels kept on every chunk | Citations are carried, not bolted on |
| System prompt | Scope-honest, citation-asking, refuses out-of-scope | Refusing OOS is part of the spec, not a failure |
| User message | Retrieved chunks (with sources) + user question at end | End-place critical instructions (lesson 3) |
| Generation | Stream + cite + capped max_tokens | UX + cost + latency decisions at the call site |
| Logging | Q + retrieval IDs + prompt version + model + params + response + user feedback | Seed of LLMOps; near-impossible to backfill later |
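
The logging fields named above can be sketched as one record per interaction. This is an illustrative shape only: `InteractionLog` and `to_jsonl` are hypothetical names, the `timestamp` field is an added assumption, and one-JSON-object-per-line is just one reasonable storage choice.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class InteractionLog:
    question: str
    retrieval_ids: list[str]             # IDs of the chunks placed in the prompt
    prompt_version: str                  # which prompt template produced the call
    model: str
    params: dict                         # e.g. {"max_tokens": 512, "stream": True}
    response: str
    user_feedback: Optional[str] = None  # filled in later, if the user gives any
    timestamp: float = field(default_factory=time.time)  # assumption, not in the list


def to_jsonl(record: InteractionLog) -> str:
    """Serialize one interaction as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Capturing `prompt_version` and `retrieval_ids` at call time is the part that cannot be reconstructed afterwards, which is why these fields belong in the first version of the app.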

What the walkthrough defers (and where it goes)

| Deferred | Lesson |
| --- | --- |
| Sophisticated UX (regeneration, hedging, recoverable failure) | 6 |
| Production observability + evaluation pipelines | 7 |
| Multiple-tool / agentic flow | 10 |

Honest scoping: the walkthrough covers the core well and names the pieces it leaves out.

The “five hours, not five weeks” reframing

A real LLM app of this shape is small:
a few hundred lines of Python pipeline
+ prompts (the spec)
+ indexed corpus
+ a hosted model someone else trained.
The complexity is in the DECISIONS, not the line count.

Read-this-design checklist (use on any LLM app)

  • Is the knowledge source scoped enough to retrieve over well?
  • Are chunks sized for the content’s natural unit, with metadata?
  • Does the source label travel through retrieval into the prompt?
  • Is the system prompt scope-honest (refuses out-of-scope) and citation-asking?
  • Are streaming + citations decided at the call site and rendered in the UI?
  • Are the 5-10 logging fields in place (Q, retrieval IDs, prompt version, model+params, response, user feedback)?
  • Is max_tokens capped deliberately?
  • Has retrieval been evaluated on a held-out set (separate from end-to-end answer quality)?
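
The held-out retrieval check in the last item can be sketched with recall@k, one common retrieval metric. The checklist does not name a metric, so this choice is an assumption, as are the function names `recall_at_k` and `mean_recall_at_k`. Each held-out question is assumed to come with a hand-labelled set of relevant chunk IDs.

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant chunks that appear in the top-k retrieved IDs."""
    if not relevant_ids:
        raise ValueError("each held-out question needs at least one relevant chunk")
    hits = relevant_ids & set(retrieved_ids[:k])
    return len(hits) / len(relevant_ids)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved_ids, relevant_ids) pairs from a held-out set."""
    return sum(recall_at_k(got, want, k) for got, want in runs) / len(runs)
```

Scoring retrieval on its own makes it possible to tell whether a bad answer came from missing context or from the generation step.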

  • Scope-honest system prompt: answers in-scope, cites sources, refuses out-of-scope plainly.
  • Source-carrying retrieval: every chunk keeps its source label through the entire pipeline.
  • Citation discipline: the model is asked to cite which chunks it used; UI renders citations as links back to source.
  • Production-decision eye: the ability to see the deliberate choices baked into a real app’s pipeline stages.

  • Full Stack Deep Learning, LLM Bootcamp (Spring 2023): Project Walkthrough (askFSDL). fullstackdeeplearning.com/llm-bootcamp. Independent structural mirror in original prose; see references.