Cheatsheet: Launch an LLM app in one hour
The five components
| # | Component | What it does | Don’t |
|---|---|---|---|
| 1 | Hosted model (provider API) | Inference; the provider serves it | Host your own first |
| 2 | API key | Authenticates your calls | Commit it to source; hard-code it |
| 3 | Prompt template | The spec for behavior | Treat it as an afterthought |
| 4 | Application code | Wires input -> prompt -> API -> response | Hide the wiring in magic |
| 5 | UI + deployment | How users reach it | Skip deployment; it’s the same app on a server |
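
Row 2's advice (keep the key out of source) can be sketched as a small startup check. This is a minimal sketch, not part of the original lesson: the helper name `load_api_key` and its error message are hypothetical; only the `ANTHROPIC_API_KEY` variable name comes from the minimal app.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ,
                 name: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    `load_api_key` is a hypothetical helper; the key itself never appears in code.
    """
    key = env.get(name, "").strip()
    if not key:
        # Fail at startup rather than on the first user request.
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or configure it "
            "in your deployment target's secrets settings."
        )
    return key
```

Passing the environment as a parameter keeps the check testable without touching the real process environment.
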
The pipeline shape (every LLM app)

    user input -> prompt template -> hosted-model API call -> render

Every later lesson refines one stage. The shape does not change.
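
The four stages can be written as plain functions, with the model call passed in so the shape is visible and runs offline. This is a sketch under assumptions: `PROMPT_TEMPLATE`, `fill_prompt`, `render`, `run_pipeline`, and `fake_model` are hypothetical names, and `fake_model` stands in for a real hosted-model call.

```python
from typing import Callable

# Hypothetical template; the {question} slot is filled with the user's input.
PROMPT_TEMPLATE = "Answer in two to four sentences.\n\nQuestion: {question}"


def fill_prompt(user_input: str) -> str:
    """Stage 2: insert the user's input into the prompt template."""
    return PROMPT_TEMPLATE.format(question=user_input.strip())


def render(model_output: str) -> str:
    """Stage 4: prepare the model's text for display."""
    return model_output.strip()


def run_pipeline(user_input: str, call_model: Callable[[str], str]) -> str:
    """user input -> prompt template -> hosted-model API call -> render."""
    prompt = fill_prompt(user_input)
    return render(call_model(prompt))


def fake_model(prompt: str) -> str:
    """Stand-in for the hosted-model call, so the sketch runs offline."""
    return f"  (model reply to: {prompt!r})  "
```

Swapping `fake_model` for a function that calls a real provider changes one stage and leaves the other three untouched.
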
Minimal app: ~30 lines

    import os

    import streamlit as st
    from anthropic import Anthropic

    client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])  # env var, never committed

    SYSTEM_PROMPT = (
        "You are a helpful, careful assistant. Answer the user's question "
        "in two to four sentences. If you do not know, say so plainly."
    )

    st.title("Ask me anything")
    question = st.text_input("Your question:")

    if question:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=400,
            system=SYSTEM_PROMPT,
            messages=[{"role": "user", "content": question}],
        )
        st.write(response.content[0].text)

Run: `streamlit run app.py` (with the env var set). The same shape works with other hosted-model APIs by swapping the client and the model name.
Why “in one hour” is honest
- The provider does: training, inference serving, scaling.
- You do: orchestration (prompt + wiring + deploy).

Orchestration fits in an hour even on a first attempt.
What the minimum app is NOT yet
| Gap | Fixed in |
|---|---|
| No knowledge beyond pretraining + prompt | Lesson 4 (augmented LMs, RAG, tools) |
| No rigorous prompt engineering | Lesson 3 (prompt engineering toolkit) |
| No real UX (streaming, citations, hedging) | Lesson 6 (UX for LUIs) |
| No observability / evaluation in production | Lesson 7 (LLMOps) |
| Thin understanding of the model’s behavior | Lesson 2 (LLM foundations) |
Each gap is a later lesson; the minimum app is the template they all refine.
Common pitfalls (and the fix)
| Pitfall | Fix |
|---|---|
| API key in code / committed | Env var; .gitignore; secrets manager |
| Picking max_tokens too low | Set it high enough for answers to finish; keep it bounded so a runaway reply cannot inflate cost |
| No system prompt | Use one; it is the spec |
| Hard-coding the model name in many places | One constant; one place to swap providers |
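
Two of these fixes (one constant for the model name, and a sensible `max_tokens`) can be sketched together. This is a sketch under assumptions: `ModelConfig`, `CONFIG`, and `was_truncated` are hypothetical names, and the check assumes an Anthropic Messages API response, whose `stop_reason` is `"max_tokens"` when the cap cut the reply short. The `SimpleNamespace` objects are offline stand-ins, not real SDK responses.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class ModelConfig:
    """The one place that names the model and its output cap."""
    model: str = "claude-sonnet-4-6"  # same model name as the minimal app
    max_tokens: int = 400             # same cap as the minimal app


CONFIG = ModelConfig()


def was_truncated(response) -> bool:
    """True if the reply stopped because it hit the max_tokens cap."""
    return getattr(response, "stop_reason", None) == "max_tokens"


# Offline stand-ins for API responses (assumed shape, not real SDK objects).
cut_off = SimpleNamespace(stop_reason="max_tokens")
finished = SimpleNamespace(stop_reason="end_turn")
```

In the app, the call would read `client.messages.create(model=CONFIG.model, max_tokens=CONFIG.max_tokens, ...)`. If `was_truncated(response)` is true often, the cap is probably too low.
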
The reframing
The model is the easy part now. The production work is application engineering on top of someone else’s model: orchestration, retrieval, prompts, UX, evaluation, monitoring.
Words to use precisely
- Hosted model: a provider’s API you call (Anthropic’s Claude API, or another); not a model you host.
- System prompt: the prompt that defines the assistant’s behavior, separate from the user message.
- Application code: the wiring (input -> prompt fill -> API call -> render). Small.
- Deployment target: where the application runs (cloud function, Space, server, etc.).
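
The distinction between system prompt and user message shows up directly in the request: the two travel in separate fields, and user text is never pasted into the system prompt. A minimal sketch, assuming the same request fields the minimal app passes to `client.messages.create`; `build_request` is a hypothetical helper.

```python
def build_request(system_prompt: str, user_message: str,
                  model: str, max_tokens: int) -> dict:
    """Build request arguments with system prompt and user message kept apart."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system_prompt,  # the spec for behavior
        "messages": [{"role": "user", "content": user_message}],  # the user's input
    }
```

The result can be passed along as `client.messages.create(**build_request(...))`.
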
Source
- Full Stack Deep Learning, LLM Bootcamp (Spring 2023): Launch an LLM App in One Hour.
fullstackdeeplearning.com/llm-bootcamp. Independent structural mirror in original prose; see references.