Lesson: Launch an LLM app in one hour

Most material about LLMs starts at the model and works outward. This track does it the other way around. You will ship a working application in this lesson, and then learn what makes it actually good across the rest of the track. The bootcamp source opens this way for a reason: building an LLM application is much more accessible than it sounds, and the fastest way to demystify it is to put one in your own hands.

Keep a notebook or a code editor open. You will need a free or paid account with one of the hosted LLM providers and an API key.

Strip a production application down to its essentials and you find a small pipeline:

user input -> prompt template -> hosted model API call -> render the response

Five components show up in every LLM app, and naming them now makes the rest of the track legible:

  1. A hosted model. For the first version, do not host your own; use a provider’s API (Anthropic’s Claude API, or another hosted model). The provider’s API does the hard work: you are renting model inference, not running it.
  2. An API key, managed safely. Stored as an environment variable, never committed to source control. This sounds boring; getting it wrong is the most common way an MVP becomes a credentials incident.
  3. A prompt. Often a template with placeholders that get filled with user input. The prompt is the spec; everything else is plumbing.
  4. Application code. A small Python (or other language) script that takes input, fills the template, calls the model, and returns the response. Usually a couple dozen lines.
  5. A UI and a deployment target. The UI is how a user talks to it: a Streamlit or Gradio app, a web frontend, a chat panel, sometimes just a CLI. The deployment target is where it runs: a cloud function, a hosted notebook, a Hugging Face Space, a small server, whatever fits.

Five components, in that order. Every later lesson refines one of them.
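
Components 2 and 3 are the easiest to get subtly wrong, so here is a minimal sketch of both, assuming the key lives in an environment variable named ANTHROPIC_API_KEY as in the app below. The helper names load_api_key and build_prompt and the template text are hypothetical, chosen for illustration; only the standard library is used.

```python
import os
from string import Template

# Hypothetical template for illustration; $question is the placeholder
# that gets filled with user input.
PROMPT_TEMPLATE = Template(
    "Answer the question below in two to four sentences.\n\n"
    "Question: $question"
)


def load_api_key(var_name: str = "ANTHROPIC_API_KEY") -> str:
    """Read the key from the environment; fail with a clear message if absent."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell; do not hard-code it."
        )
    return key


def build_prompt(question: str) -> str:
    """Fill the template with the user's (whitespace-trimmed) question."""
    return PROMPT_TEMPLATE.substitute(question=question.strip())
```

Failing early with a readable error is a design choice: a missing key then surfaces at startup rather than as an opaque authentication failure on the first request.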

To make the shape concrete, here is a working LLM application in about thirty lines, using Anthropic’s Claude API (the same shape works for any hosted-model API). It is a simple Q&A interface: the user types a question, the app sends it to the model with a system prompt, and the model’s response is shown.

app.py

import os

import streamlit as st
from anthropic import Anthropic

# 1. API key from environment, never committed
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# 2. Prompt template (the spec)
SYSTEM_PROMPT = (
    "You are a helpful, careful assistant. Answer the user's question "
    "in two to four sentences. If you do not know, say so plainly."
)

# 3. UI (Streamlit)
st.title("Ask me anything")
question = st.text_input("Your question:")

# 4. The pipeline
if question:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=400,
        system=SYSTEM_PROMPT,
        messages=[{"role": "user", "content": question}],
    )
    st.write(response.content[0].text)

Save the file as app.py, set the ANTHROPIC_API_KEY environment variable, and run streamlit run app.py. You now have a working LLM application reachable in a browser, in under thirty lines. Streamlit’s widget functions (st.title, st.text_input, st.write) handle the UI; the Anthropic client handles the API call; your prompt is the spec; your code is the wiring.

This same shape works with other hosted-model APIs by swapping the client and the model name. Pick whichever provider fits your situation; the shape does not change.
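
One way to keep the swap cheap is to hide the provider behind a single function signature. The sketch below assumes a backend is any callable taking a system prompt and a user message and returning text; that signature, and the names Backend, echo_backend, claude_backend, and answer, are assumptions made for this illustration, not part of any provider’s SDK. The Claude backend reuses the same call as app.py.

```python
import os
from typing import Callable

# Assumed signature for this sketch: (system_prompt, user_message) -> reply text.
Backend = Callable[[str, str], str]

SYSTEM_PROMPT = "You are a helpful, careful assistant."


def echo_backend(system: str, user: str) -> str:
    """Stand-in backend for local testing; makes no network call."""
    return f"[echo] {user}"


def claude_backend(system: str, user: str) -> str:
    """Backend wrapping the same Anthropic call used in app.py."""
    from anthropic import Anthropic  # imported lazily so the sketch runs without it

    client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = client.messages.create(
        model="claude-sonnet-4-6",  # model name copied from app.py
        max_tokens=400,
        system=system,
        messages=[{"role": "user", "content": user}],
    )
    return response.content[0].text


def answer(question: str, backend: Backend) -> str:
    """The pipeline: input -> prompt -> model call -> response text."""
    return backend(SYSTEM_PROMPT, question.strip())


print(answer("What is a token?", echo_backend))  # prints: [echo] What is a token?
```

Switching providers then means writing one more backend function; answer and the UI code stay untouched.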

The bootcamp’s “in one hour” claim sounds like marketing. It is not. The hosted model does the hard part for you. You are not training, you are not standing up GPUs, you are not running an inference server. The provider has done all of that, and you are calling an HTTPS endpoint with a JSON payload. The work you do is orchestration: write a prompt, wire input to output, deploy somewhere. Those are tractable software-engineering tasks that fit comfortably in an hour even if you have never built an LLM app before.
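
To see how thin the layer under the SDK is, the sketch below builds (but does not send) the HTTPS request by hand with the standard library. The endpoint URL, header names, and version string reflect Anthropic’s Messages API as commonly documented; treat them as assumptions and check the provider’s current API reference before relying on them. The function name build_request is hypothetical.

```python
import json
import urllib.request

# Assumed endpoint for Anthropic's Messages API; verify against current docs.
API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key: str, question: str) -> urllib.request.Request:
    """Build the POST request the SDK would otherwise construct for you."""
    payload = {
        "model": "claude-sonnet-4-6",  # model name copied from app.py
        "max_tokens": 400,
        "system": "You are a helpful, careful assistant.",
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # assumed version header value
            "content-type": "application/json",
        },
        method="POST",
    )


# Sending it would be a single call:
#   with urllib.request.urlopen(build_request(key, "Hello")) as resp:
#       body = json.load(resp)
```

In practice the SDK is still the better choice, since it handles retries, errors, and response parsing; the point here is only that nothing exotic sits between your code and the model.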

The shift from “I want to use AI in my product” to “I am using AI in my product” is, mechanically, that small. The depth that follows is in making it good, not in making it exist.

The whole point of the rest of the track is to refine each of the five components. Be honest about what the minimal app is missing:

  • Knowledge. It only knows what is in the model’s pretrained weights plus whatever you put in the prompt. It cannot look anything up. Lesson 4 introduces augmented language models, retrieval, and tool use.
  • Reliable prompting. The system prompt above works for a demo, not a product. Lesson 3 is the prompt-engineering toolkit (clarity, format constraints, few-shot, chain-of-thought, system prompts as specs).
  • A real UX. Streamlit’s defaults are fine for a prototype, not for a product. Lesson 6 is language-user-interface UX: streaming, citations, regeneration, recoverable failure.
  • Observability and evaluation. When this app misbehaves in front of a user you will find out from a support email. Lesson 7 is LLMOps: logging, evaluation in production, prompt versioning, cost and latency monitoring.
  • A real understanding of what is happening inside. “It works” is not “I understand it.” Lesson 2 gives the working picture of how an LLM actually generates text, and of the practical limits (context, cost, latency) that bound everything else.
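
As a taste of the observability gap in the list above, here is a minimal sketch of a wrapper that logs latency and input/output sizes around any model call. It is a first step, not the LLMOps practice a later lesson covers; the names with_logging and call_model are hypothetical, and the wrapped function is assumed to take a question string and return the reply text.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("llm_app")


def with_logging(call_model: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a question -> reply function so every call leaves a log record."""

    def wrapped(question: str) -> str:
        start = time.perf_counter()
        try:
            reply = call_model(question)
        except Exception:
            # Log the failure with a traceback, then re-raise for the caller.
            logger.exception("model call failed (question length=%d)", len(question))
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            "model call ok: %d chars in, %d chars out, %.0f ms",
            len(question),
            len(reply),
            elapsed_ms,
        )
        return reply

    return wrapped
```

Wrapping a stub such as with_logging(lambda q: q.upper()) is enough to try it locally; in the app, the wrapped function would be whatever calls the provider. Logging lengths rather than the raw text is a deliberate choice here, since user questions may contain sensitive data.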

Each future lesson is a refinement of the minimal app you built here.

There is a recurring pattern in technical education where learners stall on foundations and never ship anything. This track’s first lesson refuses that. Shipping a small thing first is a forcing function: the minimal app surfaces every practical question (How do I keep the key safe? Where does this run? Which model do I call? What does the prompt look like?) that the rest of the track teaches you to answer well. It is also the cheapest way to build the confidence that LLM applications are accessible, and it gives the most accurate map of what production work actually is: mostly application engineering on top of a model someone else trained.

  • An LLM application has five components: a hosted model, an API key (safely managed), a prompt template, application code, and a UI plus deployment target. Every later lesson refines one of these.
  • A minimum-viable app is small. About thirty lines of Python (Streamlit + a model client) is a complete working LLM application in a browser.
  • “In one hour” is honest because the hosted model does the hard part. You orchestrate; the provider serves.
  • The minimum app is intentionally not production-grade: it lacks retrieval (lesson 4), proper prompt engineering (lesson 3), real UX (lesson 6), and observability (lesson 7). Those gaps are the rest of the track.
  • Ship first, then refine. Shipping a small thing surfaces every practical question the rest of the track answers, and is the cheapest way to build confidence that LLM applications are accessible engineering, not magic.
  • The model is the easy part now. Most of the production work is application engineering on top of someone else’s model: orchestration, retrieval, UX, evaluation, monitoring.

You can put a working LLM application in someone’s hands in an afternoon. The rest of this track is how to make it actually good. Ship the minimal version, then refine the five components one at a time, with the production discipline the bootcamp source teaches.