Summary: UX for language user interfaces

A language user interface (LUI) is a fundamentally different interaction surface from forms and buttons, and the conventional UX toolkit applies poorly. Five core patterns separate the LLM apps people use from the ones that merely work: streaming (render tokens as they arrive, with a thinking indicator and a Stop button; the cheapest and largest single UX win); citations (sourced answers become evidence that users can verify, use to calibrate trust, and use to debug); regeneration (treat the model’s non-determinism as a feature; offer a Regenerate button and, ideally, branching from edited prior messages); hedging (prompt the model to acknowledge uncertainty when context is thin and render that uncertainty visibly; uncomfortable for designers, but exactly the discipline that builds long-term trust); and recoverable failure (legible failure messages, recovery actions, preserved input, and full logs; failing well rather than showing a red banner and losing the user’s input). Supporting details that lift quality: strong input affordances, markdown rendering, code blocks with a copy button, conversation persistence, and latency-masking status lines. The lesson treats all of this as interaction design; content-policy and moderation questions are out of scope.

Core ideas

LUIs are a new interaction surface. Streaming, citations, regeneration, hedging, recoverable failure are core; form-and-button UX applies poorly.
Streaming is the biggest, cheapest UX win. Render tokens as they arrive; show a thinking indicator until the first token arrives (time to first token, TTFT); provide a Stop button.
Citations turn claims into evidence. Verifiable, debuggable, trust-calibrating. How citations are rendered is a brand decision; that they are real and traceable is not optional.
Regeneration uses non-determinism. A Regenerate button (and ideally branching from edited prior messages) handles “first answer wasn’t quite right” without forcing the user to rephrase.
Hedging is uncomfortable and right. Prompt the model to hedge on thin context; render uncertainty visibly; over time, honest uncertainty beats confident but wrong answers.
Recoverable failure fails well. Legible failure messages, recovery actions, preserved input, full logs. Distinct failure modes (timeout vs retrieval miss vs refusal vs malformed output) each need their own message and action.
Supporting details: input affordances + example questions, markdown rendering, code blocks with copy, conversation persistence, latency-masking status lines, skeleton UI.
Out of scope: content policy, moderation, labeling AI-generated content, and similar policy questions. These are real, but they need their own framing in their own forum.
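
The streaming idea in the list above (thinking indicator before the first token, tokens rendered as they arrive, a Stop button) can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `fake_token_stream`, `render_stream`, the `emit` callback, and the bracketed indicator strings are all hypothetical names invented for the example, and the token source is a hard-coded stand-in for a real model client.

```python
import threading
from typing import Callable, Iterable, Iterator


def fake_token_stream() -> Iterator[str]:
    """Hypothetical stand-in for a model client that yields tokens one by one."""
    for token in ["Streaming ", "is ", "the ", "cheapest ", "UX ", "win."]:
        yield token


def render_stream(
    tokens: Iterable[str],
    stop_requested: threading.Event,
    emit: Callable[[str], None],
) -> str:
    """Show a thinking indicator, then emit tokens until done or stopped.

    `emit` is an assumed UI callback; `stop_requested` is the flag that a
    Stop button would set.
    """
    emit("[thinking...]")  # visible while waiting for the first token
    received = []
    for token in tokens:
        if stop_requested.is_set():
            emit("[stopped]")
            break
        if not received:
            emit("[clear-indicator]")  # first token: replace the indicator
        received.append(token)
        emit(token)
    # Whatever was streamed before Stop is returned, not discarded.
    return "".join(received)
```

Calling `render_stream(fake_token_stream(), threading.Event(), print)` prints the indicator and then each token in order. Setting the event partway through ends the stream at the next token boundary and keeps the partial text.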

What changes for you

Most of what distinguishes “I shipped an LLM app” from “people actually use the LLM app I shipped” comes down to the patterns in this lesson. Streaming alone takes a five-second response from “slow” to “fast” without changing the model or the code that calls it. Citations move the user from “trust me” to “verify it yourself”, and the perceived quality difference is large. Hedging makes users trust the application more because it tells the truth when it does not know. Recoverable failure separates an application a user returns to from one they uninstall after the first error. None of this requires more model capability or compute; it requires interaction-design discipline applied at the application layer. The next lesson, LLMOps, covers the operational discipline that keeps all of this working over time.
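
As one illustration of what "failing well" might look like in the application layer, the sketch below maps each failure mode named in this lesson (timeout, retrieval miss, refusal, malformed output) to its own message and recovery action, preserves the user's input, and writes a log entry. Everything here is an assumption made for the example: the names `FailureKind`, `FailureView`, and `build_failure_view`, the message wording, and the action labels are hypothetical, and real copy is a product decision.

```python
from dataclasses import dataclass
from enum import Enum


class FailureKind(Enum):
    TIMEOUT = "timeout"
    RETRIEVAL_MISS = "retrieval_miss"
    REFUSAL = "refusal"
    MALFORMED = "malformed"


@dataclass(frozen=True)
class FailureView:
    message: str          # legible, user-facing explanation
    action: str           # label for the recovery button
    preserved_input: str  # the user's text, kept so nothing is retyped


# Illustrative copy only (hypothetical wording and actions).
_COPY = {
    FailureKind.TIMEOUT: ("The model took too long to respond.", "Retry"),
    FailureKind.RETRIEVAL_MISS: ("No matching sources were found.", "Broaden search"),
    FailureKind.REFUSAL: ("The model declined to answer this request.", "Rephrase"),
    FailureKind.MALFORMED: ("The response could not be displayed.", "Regenerate"),
}


def build_failure_view(kind: FailureKind, user_input: str, log: list) -> FailureView:
    """Return what the UI should show, and record the failure for debugging."""
    message, action = _COPY[kind]
    log.append({"kind": kind.value, "input": user_input})
    return FailureView(message, action, user_input)
```

For example, `build_failure_view(FailureKind.TIMEOUT, "summarize this report", log)` returns a view whose action is "Retry" and whose `preserved_input` is the original question, so the input box can be refilled instead of cleared.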

A LUI is a new surface, and the discipline that makes one usable is interaction design applied at the application layer, not more model capability. Streaming, citations, regeneration, hedging, and recoverable failure separate the LLM apps people return to from the ones they uninstall.