AI Streaming Chat UI
If you're building a B2B SaaS in 2026 with an LLM-powered chat interface — copilot, support agent, code assistant, customer-facing AI — the chat UI is one of the most visible parts of your product. Users will compare it to ChatGPT, Claude, and Cursor, so the bar is high.
1. Pick chat library / framework
Pick AI chat stack.
Vercel AI SDK (recommended for Next.js):
- @ai-sdk/react: useChat hook
- @ai-sdk/* providers
- Streaming + tool calls + structured output
- Best-in-class for Next.js
- See vercel-ai-sdk skill
Vercel Chat SDK:
- Higher-level: full chat experience
- Multi-platform (Slack, Telegram, Teams, Discord, GitHub, Linear)
- Built on AI SDK
- See vercel:chat-sdk skill
LangChain.js:
- Generic JS LLM framework
- Chat-specific helpers
- Heavier; more options
Custom (DIY):
- fetch + ReadableStream + SSE / WebSocket
- Full control
- Most work
Components:
- shadcn-ui Chat (preview / community)
- assistant-ui (open-source chat components)
- llamaindex Chat UI
For 2026 React stack:
- Vercel AI SDK + assistant-ui OR shadcn-chat for components
- Best balance of control + speed
Output:
1. Stack recommendation
2. Library choices
3. Custom components vs library
4. Bundle size
5. SSR / streaming considerations
The 2026 default for Next.js: Vercel AI SDK + assistant-ui. Streaming, tool calls, attachments, message persistence — all handled.
2. Token streaming — the table-stakes UX
Without streaming, AI chat feels broken in 2026.
Implement token streaming.
Streaming protocols:
Server-Sent Events (SSE):
- HTTP-based; one-way (server → client)
- Works through HTTP/2
- Simple; widely supported
- Vercel AI SDK default
WebSocket:
- Full duplex
- More overhead
- Better for bidirectional (rare for chat)
Client implementation (with AI SDK):
```tsx
import { useChat } from '@ai-sdk/react';

function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    api: '/api/chat',
  });
  return (
    <>
      {messages.map((m) => (
        <Message key={m.id} role={m.role} content={m.content} />
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} disabled={isLoading} />
      </form>
    </>
  );
}
```
Server (Next.js Route Handler):
```ts
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({
    model: 'anthropic/claude-sonnet-4-6', // via Vercel AI Gateway
    messages,
  });
  return result.toDataStreamResponse();
}
```
Cursor / typing indicator:
- During streaming: show blinking cursor at end of last message
- "▋" character after streaming text
- CSS animation: 1s blink
Smooth scrolling:
- Auto-scroll to bottom as tokens arrive
- Pause if user scrolled up (don't fight them)
- Resume on new message
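The pause-when-the-user-scrolls-up rule reduces to a small predicate. A minimal sketch — the helper name and the 100px threshold are illustrative, not from any library:

```typescript
// Decide whether to follow the stream as new tokens arrive.
interface ScrollState {
  scrollTop: number;    // current scroll offset
  clientHeight: number; // visible viewport height
  scrollHeight: number; // total scrollable height
}

// "Near bottom" means the user is not reading history,
// so it is safe to auto-scroll on each chunk.
function shouldAutoScroll(s: ScrollState, thresholdPx = 100): boolean {
  const distanceFromBottom = s.scrollHeight - (s.scrollTop + s.clientHeight);
  return distanceFromBottom <= thresholdPx;
}
```

Call it in the scroll handler and cache the result; on new tokens, scroll only when it returned true.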
Render markdown progressively:
- See markdown-rendering-sanitization-chat
- Auto-close incomplete syntax (** without close)
- Re-render on each chunk (memo previous messages)
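Auto-closing incomplete syntax can be as simple as balancing markers before each render. A sketch covering only unbalanced ``` and ** — real renderers handle many more cases (nested emphasis, links, tables):

```typescript
// Close dangling markdown so a partially streamed chunk renders cleanly.
function closeDangling(md: string): string {
  let out = md;
  // An odd number of ``` fences means a code block is still open.
  const fences = (out.match(/```/g) ?? []).length;
  if (fences % 2 === 1) out += '\n```';
  // An odd number of ** markers means bold is still open.
  const bolds = (out.match(/\*\*/g) ?? []).length;
  if (bolds % 2 === 1) out += '**';
  return out;
}
```

Run it on the streaming message only; completed messages are already balanced.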
Performance:
- Memoize all but actively-streaming message
- Avoid re-rendering message list on each token
Output:
1. Streaming protocol (SSE recommended)
2. Library setup
3. Cursor / typing indicator
4. Smooth scroll handling
5. Markdown rendering
The blinking-cursor detail: a small touch that signals "actively generating." Without it, users can't tell whether the response is finished. Take the 5 minutes to add it.
3. Message types — beyond plain text
Modern AI chat has many message types.
Render message types.
Text messages:
- User: right-aligned bubble
- Assistant: left-aligned; markdown rendered
- System: special styling (rare to show)
Code blocks:
- Syntax highlighting (Prism / Shiki)
- Language detection
- Copy button (top-right of block)
- Possibly: inline run (Pyodide, etc.)
Tool calls:
- "Calling tool: search('rate limiting')"
- Show with icon + status (running / done / error)
- Collapsible to see args + output
- Useful for transparency
Tool results:
- "Found 5 results"
- Structured display (table, list, etc.)
Function calls (legacy):
- Similar to tool calls
Images:
- User-uploaded: show inline
- AI-generated: render with caption
- Modal on click for fullscreen
Files:
- File card: name + size + type
- Click to download or preview
Charts / visualizations:
- AI-generated chart
- Embedded as image or interactive component
Citations:
- "Source: [Doc Name]" with links
- Hover for excerpt
- See AI customer support agent
Suggestions / quick replies:
- 2-3 button suggestions below assistant message
- "Tell me more" / "Show example" / "Done"
Errors:
- "Failed to generate" with retry button
- Error type badge
Streaming-status:
- "Searching docs..." (during tool call)
- "Generating response..."
For [USE CASE], output:
1. Message types you'll render
2. Per-type component
3. Tool-call rendering pattern
4. Citation display
5. Mobile considerations
The tool-call transparency rule: show users when AI is using a tool. Hidden tool calls feel magical when they work; mysterious when they fail. Transparency builds trust.
4. Message list — virtualization + memoization
Chat history can grow large. Plan performance.
Optimize message list rendering.
Virtualization:
- Don't render off-screen messages
- Use react-virtuoso or react-window
- Especially important for 100+ message threads
Memoization:
- Each message component memoized
- Re-render only the actively-streaming message
- Avoid layout thrash
Smooth scroll:
Anchor scroll to bottom:
- New message → scroll to bottom
- User scrolled up → don't auto-scroll (let them read)
- "New messages" indicator if scrolled up
Scroll restore:
- Returning to chat: restore scroll position
- Or: scroll to bottom
Lazy load history:
- Initial load: latest 50 messages
- Scroll up → fetch older 50
- Cursor pagination
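The cursor-pagination shape above can be sketched in memory — the same logic maps to a `WHERE id < cursor ORDER BY id DESC LIMIT n` query in production. Names and the numeric-id assumption are illustrative:

```typescript
// Cursor pagination over a message array, ascending by id.
interface Msg { id: number; content: string }

function pageBefore(
  all: Msg[],            // full history, ascending by id
  cursor: number | null, // fetch messages older than this id; null = latest page
  limit: number
): { items: Msg[]; nextCursor: number | null } {
  const older = cursor === null ? all : all.filter((m) => m.id < cursor);
  const items = older.slice(-limit); // newest `limit` of what remains
  const nextCursor = older.length > limit ? items[0].id : null;
  return { items, nextCursor };
}
```

The client prepends each older page to the list and passes `nextCursor` on the next scroll-up fetch; `null` means history is exhausted.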
Persistence:
- Messages saved to DB per conversation
- Conversation list in sidebar
- See multi-tenancy
Optimistic UI:
- User sends message → appears immediately
- AI response streams in
- On error: revert + show retry
Output:
1. Virtualization setup
2. Memoization strategy
3. Smooth-scroll behavior
4. Lazy-load history
5. Optimistic update
react-virtuoso is the 2026 default for chat UIs. It handles auto-scroll, bottom anchoring, virtualization, and lazy loading gracefully.
5. Input UX — composing messages
The input matters as much as the output.
Design chat input.
Input behavior:
Auto-resize textarea:
- Grows with content (1-N lines)
- react-textarea-autosize library
- Max-height before scroll
Multi-line:
- Enter sends (default for chat)
- Shift+Enter for newline
- Cmd+Enter alternative
Send button:
- Right side; disabled when empty / loading
- Replaces with stop button during streaming
Stop / cancel:
- Stop streaming response
- AI SDK: stop() function
- User regret: "Wait, that's not right"
Keyboard shortcuts:
- Up arrow: edit last message
- Esc: clear input
Attachments:
- File picker button (paperclip icon)
- Drag-drop onto chat
- Show attachment thumbnails above input
- Remove (X) on attachment
Slash commands (advanced):
- /clear, /summarize, /search
- Show menu when "/" typed
Mentions:
- @user (in team chat)
- @doc / @file (Notion-style references)
Persona / model picker:
- Dropdown to choose model (GPT-4o vs Claude vs custom persona)
- For products with multiple options
Voice input:
- Microphone icon
- Web Speech API or Whisper
Mobile:
- Auto-focus on tap
- Native keyboard
- Send button enlarged for thumb
- File picker uses native sheet
Output:
1. Input component
2. Send / stop logic
3. Attachment handling
4. Keyboard shortcuts
5. Mobile UX
The Cmd+Enter alternative: some products send on Enter and insert a newline on Shift+Enter; others do the opposite, sending only on Cmd+Enter. Power users have strong preferences either way. Pick a default; allow a toggle.
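The send-key behavior, including the toggle, collapses to one predicate. A hypothetical helper — the mode names are illustrative:

```typescript
// Which keystroke sends, given the configured send mode.
type SendMode = 'enter' | 'mod-enter';

function shouldSend(
  key: string,
  shiftKey: boolean,
  modKey: boolean, // metaKey on macOS, ctrlKey elsewhere
  mode: SendMode
): boolean {
  if (key !== 'Enter') return false;
  if (mode === 'enter') return !shiftKey; // Shift+Enter inserts a newline
  return modKey;                          // 'mod-enter': plain Enter inserts a newline
}
```

Wire it into the textarea's keydown handler; when it returns true, prevent default and submit.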
6. Regenerate, edit, and branching
Beyond send-and-receive, modern chat supports:
Implement regenerate / edit / branch.
Regenerate:
- Button below assistant message
- Generates new response with same input
- Replaces current OR appends as variant
- Useful when response is bad
Edit user message:
- Click pencil on user message
- Edit + resubmit
- Re-runs from that point (deletes downstream)
Branching (advanced):
- Edit creates new branch
- Original branch preserved
- UI: dropdown to switch branches
- Used by: ChatGPT, Claude
Implementation:
Server side:
- POST /chat with messages array including edited message
- Streaming response from edit point
- Old branch optionally archived
UI:
- Per-message hover actions (edit / regenerate / copy)
- Branch indicator if multiple variants
- Switch between variants
State management:
- conversation has many branches
- Each branch is a path through messages
- Tree structure or linear with parent_id
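The "linear with parent_id" model means the active branch is just the path from a selected leaf back to the root. A sketch of that reconstruction — field names are illustrative:

```typescript
// Messages stored flat; branching is expressed via parentId links.
interface StoredMsg { id: string; parentId: string | null; content: string }

// Walk parent links from the selected leaf to the root, then reverse,
// yielding the message sequence to send to the model for that branch.
function pathToLeaf(byId: Map<string, StoredMsg>, leafId: string): StoredMsg[] {
  const path: StoredMsg[] = [];
  let cur: StoredMsg | undefined = byId.get(leafId);
  while (cur) {
    path.push(cur);
    cur = cur.parentId === null ? undefined : byId.get(cur.parentId);
  }
  return path.reverse();
}
```

Editing a message creates a sibling (same parentId); switching branches is just picking a different leaf.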
Anti-patterns:
- No regenerate (users stuck with bad response)
- Edit without proper history
- Confusing branching UX
Output:
1. Regenerate implementation
2. Edit-and-resubmit flow
3. Branching (if applicable)
4. State model
5. UI for switching branches
The regenerate pattern is now table-stakes. Every modern chat UI has it. Without it, users feel stuck with bad responses.
7. Attachments + file handling
Users upload files; AI processes them.
Handle file attachments.
Upload UX:
- Drag-drop on chat area
- Paperclip icon for file picker
- Paste image from clipboard
- Show thumbnails before send
File types:
- Images: thumbnail + full view on click
- PDFs: icon + filename + size
- Other: icon + filename + size
Preview:
- Hover over thumbnail: tooltip with name
- Click: modal for fullsize image / PDF preview
- See file-preview-document-viewer-chat
Server processing:
- Image: send to vision LLM (Claude / GPT-4o)
- PDF: extract text first (see document-parsing-ocr)
- Other: embed via RAG
Limits:
- File size (5-50 MB depending on context)
- File type allowlist
- Per-message limit (e.g., 5 attachments)
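The three limits above make a simple client-side gate before upload. A sketch — the policy values and helper name are example assumptions, not product requirements:

```typescript
// Attachment policy: size cap, MIME allowlist, per-message count.
interface AttachmentPolicy {
  maxBytes: number;
  allowedTypes: string[]; // MIME types
  maxPerMessage: number;
}

function validateAttachment(
  file: { size: number; type: string },
  alreadyAttached: number,
  policy: AttachmentPolicy
): { ok: true } | { ok: false; reason: string } {
  if (alreadyAttached >= policy.maxPerMessage)
    return { ok: false, reason: `Max ${policy.maxPerMessage} attachments per message` };
  if (file.size > policy.maxBytes)
    return { ok: false, reason: `File too large (limit ${policy.maxBytes / 1_000_000} MB)` };
  if (!policy.allowedTypes.includes(file.type))
    return { ok: false, reason: `Unsupported file type: ${file.type}` };
  return { ok: true };
}
```

Run the same checks server-side; the client version exists only for fast, clear error messages.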
Inline rendering:
- AI can reference: "I see in the image..."
- Show attachment inline in user message
Errors:
- Upload failed: retry
- File too large: clear error
- Unsupported type: clear message
Multi-file:
- Allow selecting multiple
- Show all thumbnails
- Bulk remove
Mobile:
- Camera capture for images
- File picker for documents
- Native sheet UX
Output:
1. Upload UX
2. File-type handling
3. Preview component
4. Server processing pipeline
5. Mobile considerations
The image-paste-from-clipboard: power users love it. Cmd+V → screenshot pastes into chat. Easy to implement; high satisfaction.
8. Error states + recovery
AI fails. Your UX should fail gracefully.
Handle AI chat errors.
Error types:
Rate limit:
- "Too many requests. Wait 30s."
- Auto-retry after delay
- Show countdown
Model unavailable:
- "Service temporarily unavailable"
- Fallback to alternative model (if configured)
- Retry button
Context too long:
- "Conversation too long; start fresh"
- Suggest: summarize + new chat
- Or auto-truncate older messages (with notice)
Inappropriate content (filter triggered):
- "Can't help with that request"
- Per-policy explanation
- Don't be hostile
Network failure:
- "Network error; retry"
- Auto-retry with backoff
- Cache user message; don't lose
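The auto-retry-with-backoff can be sketched as a delay schedule plus a generic retry wrapper. The base and cap values are illustrative defaults; production code usually adds jitter:

```typescript
// Exponential backoff with a cap: 500ms, 1s, 2s, ... up to 15s.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 15_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry an async operation (e.g. the chat POST) with backoff between attempts.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err; // out of retries: surface the error
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
}
```

Keep the user's message in state across all attempts; only the request retries, never the composition.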
Streaming interrupted:
- "Response cut off"
- Continue / regenerate options
Generic 500:
- Apology + retry
- Log to error monitoring (Sentry)
- Don't expose internal errors
UI patterns:
- Error in-message (replaces "..." indicator)
- Or: toast for transient
- Retry button visible
User input preservation:
- Failure doesn't lose user's message
- Repopulate input or keep visible
Output:
1. Error type taxonomy
2. Per-type handling
3. Retry / fallback logic
4. Input preservation
5. Logging / observability
The "preserve user message on error" rule: if the AI errors, the user's message stays in the input or remains visible. Otherwise they retype the whole thing. Frustrating.
9. Performance + cost
AI chat is expensive. Optimize.
Optimize chat performance + cost.
Server-side:
Stream from start:
- Don't buffer; pipe LLM stream to client
- First-token latency matters
Caching:
- Cache identical prompts (rare for chat)
- Cache tool results (search queries)
- Use Vercel Runtime Cache
Context management:
- Don't send unbounded history
- Truncate old messages OR
- Summarize older context (memory)
- Token-count aware
Tool calls:
- Parallel where possible
- Timeout long-running tools
- Stream partial results if possible
Provider routing:
- Vercel AI Gateway: failover across providers
- Cheap model for simple queries; expensive for complex
Cost optimization:
- Per-user quotas (see quotas-limits-plan-enforcement)
- Track tokens per request
- Alert on cost spikes
Caching at API level:
- Anthropic prompt caching: save 90% on repeated context
- 5-min TTL; ideal for system prompts
- See claude-api skill
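For reference, prompt caching is enabled by marking the system block with `cache_control` in the Messages API request body. A sketch of the body shape only — the model id is a placeholder, and exact field support varies by model, so check the current Anthropic docs:

```typescript
// Build an Anthropic Messages API request with a cacheable system prompt.
function buildRequest(systemPrompt: string, userText: string) {
  return {
    model: 'claude-sonnet-4-5', // placeholder model id
    max_tokens: 1024,
    system: [
      {
        type: 'text',
        text: systemPrompt,
        // Marks this block as cacheable; repeated requests reuse it.
        cache_control: { type: 'ephemeral' },
      },
    ],
    messages: [{ role: 'user', content: userText }],
  };
}
```

The cache keys on the exact prefix, so keep the system prompt byte-identical across requests.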
Frontend:
Bundle size:
- AI SDK + assistant-ui ≈ 60KB
- Reasonable
Memoization:
- Memo all messages except streaming
- useMemo on message list
Render budget:
- 60fps during streaming
- Profile if slow
Network:
- Compression on stream
- HTTP/2 multiplexing
Output:
1. Server streaming setup
2. Context management
3. Provider routing
4. Cost monitoring
5. Frontend perf
The Anthropic prompt caching insight: huge savings on repeated system prompts. If you have a 5K-token system prompt, prompt caching can drop its cost by 90%. Use it.
10. Accessibility
Chat UIs are notoriously inaccessible. Don't let yours be.
Make AI chat accessible.
Required:
Screen reader:
- Each message has role (user / assistant)
- aria-live="polite" on chat region for new messages
- Streaming text: chunk announcements (don't spam SR with every token)
Keyboard:
- Tab through messages
- Send on Enter
- Esc to close
- Arrow keys to navigate messages
Focus:
- After send: focus stays in input
- After AI response: focus stays in input (most users)
- Or: focus on response (some prefer)
Color independence:
- Don't rely on color alone (user vs assistant)
- Use position + role labels
Motion:
- prefers-reduced-motion: disable cursor blink
- Or: keep but slow
Voice (optional):
- Read responses aloud
- Web Speech API or Cloud TTS
Errors:
- Announce via aria-live="assertive"
- Clear actionable message
Test:
- VoiceOver / NVDA
- Keyboard-only
- Lighthouse / axe-core
Common failures:
- Streaming text: SR announces every token (overwhelming)
- No way to skip past long messages
- Focus lost after send
Output:
1. ARIA pattern
2. Keyboard navigation
3. Streaming announcements (debounce)
4. Reduced motion
5. Test plan
The streaming-announcement throttle: announce every 50-200 chars to screen readers, not every token. Otherwise overwhelming.
What Done Looks Like
A v1 AI streaming chat UI for B2B SaaS in 2026:
- Vercel AI SDK + assistant-ui or shadcn-chat
- Token streaming with cursor indicator
- Markdown rendering (sanitized)
- Tool-call display (transparency)
- Code block with syntax highlight + copy
- Auto-resize input with send + stop
- Regenerate + edit user message
- File attachments with previews
- Error states with retry + input preservation
- Message list virtualization
- Smooth scroll (don't fight user)
- Accessibility (ARIA + keyboard + reduced motion)
- Mobile-friendly UX
Add later when product is mature:
- Branching conversations
- Voice input / output
- Slash commands
- @mentions for docs / users
- Multi-model picker
- Citations + sources
- Inline charts / visualizations
- Real-time collaboration
The mistake to avoid: non-streaming responses. Feels broken in 2026; users wait 30s for full response.
The second mistake: no regenerate button. Users stuck with bad responses.
The third mistake: losing user input on error. Frustrating; users retype everything.
See Also
- AI Features Implementation — strategy (companion)
- RAG Implementation — RAG-backed chat
- Markdown Rendering & Sanitization — render LLM output
- LLM Cost Optimization — cost
- LLM Quality Monitoring — quality
- Comments, Threading & @Mentions — adjacent UX
- File Preview & Document Viewer — attachment preview
- File Uploads — upload pipeline
- Empty States, Loading & Error States — error states
- Toast Notifications UI — error toasts
- Real-Time Collaboration — adjacent realtime
- WebSocket / SSE Implementation — streaming protocols
- Performance Optimization — perf
- Quotas, Limits & Plan Enforcement — usage limits
- VibeReference: AI SDK — Vercel AI SDK
- VibeReference: AI SDK UI — UI hooks
- VibeReference: AI SDK Core — Core
- VibeReference: Anthropic Claude — Claude
- VibeReference: OpenAI GPT — GPT
- VibeReference: AI Gateways — AI gateway
- VibeReference: Vercel AI Gateway — Vercel AI Gateway
- VibeReference: AI Customer Support Agents — support-agent integration
- VibeReference: shadcn/ui — components