Streaming is a core feature of the Vercel AI SDK: it turns slow LLM responses into output that appears immediately and keeps updating as tokens arrive. This lesson covers server-side streaming, React Server Components, and Generative UI.
The AI SDK uses the Web Streams API standard:
Client (useChat) → HTTP POST → Route Handler → AI SDK → LLM Provider
← ReadableStream ← streamText ← Token Stream
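The flow in the diagram rests on two standard primitives: a `ReadableStream` that produces chunks and a reader that consumes them as they arrive. The following is a minimal sketch of that mechanism using only the Web Streams API, without the AI SDK. The names `tokenStream` and `readAll` are illustrative assumptions, and the hard-coded token list stands in for a real model.

```typescript
// Producer: emits one token per pull, standing in for an LLM token stream.
function tokenStream(tokens: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder()
  let index = 0
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (index < tokens.length) {
        controller.enqueue(encoder.encode(tokens[index++]))
      } else {
        controller.close()
      }
    },
  })
}

// Consumer: reads chunks as they arrive, the way a client reads a streamed HTTP body.
async function readAll(
  stream: ReadableStream<Uint8Array>,
  onToken: (token: string) => void,
): Promise<string> {
  const reader = stream.getReader()
  const decoder = new TextDecoder()
  let text = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    // stream: true keeps multi-byte characters intact across chunk boundaries
    const token = decoder.decode(value, { stream: true })
    onToken(token)
    text += token
  }
  return text
}
```

A UI would update inside `onToken` instead of waiting for the returned string, which is what makes the response feel immediate.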
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = streamText({
    model: anthropic('claude-sonnet-4-20250514'),
    system: 'You are a helpful assistant.',
    messages,
    maxTokens: 2000,
    temperature: 0.7,
    // Runs once, after the full response has been generated
    onFinish: async ({ text, usage }) => {
      // Logging, database storage, analytics
      console.log(`Tokens: ${usage.totalTokens}`)
      // saveToDatabase is a placeholder for your own persistence function
      await saveToDatabase(text)
    },
  })

  // Stream the result back to the client as an HTTP response
  return result.toDataStreamResponse()
}
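The AI SDK provides granular callbacks for the stream lifecycle, passed as options to `streamText`: for example `onChunk` for each streamed part, `onFinish` once the response is complete, and `onError` for failures.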
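As a rough mental model of how lifecycle callbacks attach to a stream, the sketch below wraps a plain Web Stream in a `TransformStream` that notifies callbacks while passing every chunk through unchanged. `withCallbacks` and `StreamCallbacks` are hypothetical names used for illustration only. This is not the SDK's implementation; with the SDK, callbacks are passed as options to `streamText`, as `onFinish` is in the route handler above.

```typescript
// Hypothetical callback shape, loosely modeled on per-chunk and on-finish hooks.
interface StreamCallbacks {
  onChunk?: (chunk: string) => void
  onFinish?: (result: { text: string; chunkCount: number }) => void | Promise<void>
}

// Wraps a text stream so callbacks can observe it without changing what the consumer receives.
function withCallbacks(
  source: ReadableStream<string>,
  callbacks: StreamCallbacks,
): ReadableStream<string> {
  let text = ''
  let chunkCount = 0
  return source.pipeThrough(
    new TransformStream<string, string>({
      transform(chunk, controller) {
        text += chunk
        chunkCount++
        callbacks.onChunk?.(chunk)
        controller.enqueue(chunk) // pass the chunk through unchanged
      },
      async flush() {
        // Runs once after the last chunk: the place for logging or persistence.
        await callbacks.onFinish?.({ text, chunkCount })
      },
    }),
  )
}
```

Because the finish hook runs in `flush`, it sees the full accumulated text, which is why storage and analytics usually belong there and not in the per-chunk hook.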
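React Server Components can call the AI SDK directly on the server, so no API keys or model calls reach the client. Note that `generateText` waits for the complete response before the component renders, so this example is server-rendered but not streamed: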
// app/[locale]/ai/page.tsx
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'

export default async function AIPage() {
  const { text } = await generateText({
    model: openai('gpt-4.1'),
    prompt: 'Create a summary of the latest AI trends.',
  })

  return (
    <article>
      <h1>AI Trends 2026</h1>
      <div>{text}</div>
    </article>
  )
}
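For progressive loading, wrap the slow AI component in `Suspense` so the rest of the page renders first and the AI content streams in when it is ready:
import { Suspense } from 'react'

// AIInsights (an async Server Component that awaits the AI call) and
// LoadingSkeleton (a placeholder layout) are assumed to be defined elsewhere.
export default function Page() {
  return (
    <div>
      <h1>Dashboard</h1>
      <Suspense fallback={<LoadingSkeleton />}>
        <AIInsights />
      </Suspense>
    </div>
  )
}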
Generative UI goes beyond streaming text: the model calls a tool, and the tool's `generate` function streams React components as the response:
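import { streamUI } from 'ai/rsc'
import { openai } from '@ai-sdk/openai'
import { z } from 'zod' // needed for the tool's parameter schema

// Typically called inside a Server Action; result.value holds the streamed UI.
// WeatherSkeleton, WeatherCard, and fetchWeather are assumed to be defined elsewhere.
const result = await streamUI({
  model: openai('gpt-4.1'),
  prompt: 'Show me the current weather in Berlin.',
  tools: {
    getWeather: {
      description: 'Shows the current weather for a city',
      parameters: z.object({ city: z.string() }),
      generate: async function* ({ city }) {
        yield <WeatherSkeleton /> // shown immediately while data loads
        const weather = await fetchWeather(city)
        return <WeatherCard data={weather} /> // replaces the skeleton
      },
    },
  },
})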
The user first sees a loading skeleton, then a complete weather card: an interactive UI component instead of plain text such as "It's 15°C in Berlin".
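The skeleton-then-card behavior comes from the async generator in `generate`: each yielded value is an intermediate UI, and the returned value is the final one. The sketch below imitates that pattern with plain strings in place of React nodes. `renderProgressively`, `weatherUI`, and the stubbed `fetchWeather` are assumptions for illustration, not part of the SDK.

```typescript
// A UI state is a plain string here; in streamUI it would be a React node.
type UI = string

// Drives an async generator: every yielded value replaces the previous UI,
// and the returned value becomes the final UI.
async function renderProgressively(
  generate: () => AsyncGenerator<UI, UI, void>,
  show: (ui: UI) => void,
): Promise<UI> {
  const iterator = generate()
  while (true) {
    const step = await iterator.next()
    show(step.value)
    if (step.done) return step.value
  }
}

// Hypothetical stand-in for a real weather API call.
async function fetchWeather(city: string): Promise<{ city: string; tempC: number }> {
  return { city, tempC: 15 }
}

async function* weatherUI(city: string): AsyncGenerator<UI, UI, void> {
  yield 'WeatherSkeleton' // shown immediately
  const weather = await fetchWeather(city)
  return `WeatherCard(${weather.city}, ${weather.tempC}°C)` // final UI
}
```

Calling `renderProgressively(() => weatherUI('Berlin'), show)` invokes `show` twice: once with the skeleton and once with the finished card.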
| Strategy | UX | Implementation |
|---|---|---|
| Skeleton | Placeholder layout in the shape of the final content | CSS animation or Tailwind `animate-pulse` |
| Typing indicator | "AI is typing..." | Simple indicator shown until the first token arrives |
| Progressive rendering | Text appears word by word | Stream-based (default) |
| Optimistic UI | User input shown immediately, updated on response | React `useOptimistic` |
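'use client'
// Recent AI SDK versions export useChat from '@ai-sdk/react';
// older releases exposed it as 'ai/react'.
import { useChat } from '@ai-sdk/react'

export function Chat() {
  const { messages, error, reload } = useChat()

  if (error) {
    return (
      <div className="text-red-500">
        <p>Error: {error.message}</p>
        {/* reload re-sends the last message */}
        <button onClick={() => reload()}>Try again</button>
      </div>
    )
  }
  // ... Chat UI
}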
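Wrap AI components in Error Boundaries as well, so that a rendering failure in one component shows a fallback instead of taking down the whole page.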
UX rule: The user should never see a blank page. Every state — loading, error, empty, success — needs a dedicated UI. Streaming makes the loading phase pleasant, but you need to design error states yourself.
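One way to enforce this rule is to model the four states as a discriminated union and let the compiler check that each one maps to a view. This is a minimal sketch, not an SDK feature: `ChatViewState`, `deriveState`, and `describeView` are hypothetical names, and the returned strings stand in for real components.

```typescript
type ChatViewState =
  | { kind: 'loading' }
  | { kind: 'error'; message: string }
  | { kind: 'empty' }
  | { kind: 'success'; messages: string[] }

// Derives exactly one state from the raw flags a chat hook typically exposes.
function deriveState(input: {
  isLoading: boolean
  error?: Error
  messages: string[]
}): ChatViewState {
  if (input.error) return { kind: 'error', message: input.error.message }
  if (input.messages.length > 0) return { kind: 'success', messages: input.messages }
  return input.isLoading ? { kind: 'loading' } : { kind: 'empty' }
}

// Exhaustive switch: adding a new state without a view becomes a compile error.
function describeView(state: ChatViewState): string {
  switch (state.kind) {
    case 'loading':
      return 'skeleton'
    case 'error':
      return `error: ${state.message}`
    case 'empty':
      return 'empty-state prompt'
    case 'success':
      return `${state.messages.length} message(s)`
    default: {
      const unreachable: never = state
      return unreachable
    }
  }
}
```

In this sketch, messages that already exist stay visible while a new response streams in, and the skeleton appears only when there is nothing to show yet.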