After our first look into the Vercel AI SDK v5 and its new `UIMessage` structure, it's time to pull back the curtain on how these rich, multi-part messages actually make their way from your server to the client. If you thought streaming was just about slinging text deltas, v5 is here to expand your horizons quite a bit.
This time, we're getting into the nitty-gritty of the v5 UI Message Streaming Protocol. This isn't just a minor update; it's a fundamental rewrite designed to handle the complexity of modern AI interactions – think tool calls, file attachments, structured metadata, and those crucial reasoning steps, all streaming in real-time.
As a quick heads-up, we're still sailing in v5 canary waters. This means APIs can shift, and things might look a tad different by the time it hits stable. But the core concepts we'll cover here are foundational to how v5 aims to revolutionize building conversational UIs.
🖖🏿 A Note on Process & Curation: While I didn't personally write every word, this piece is a product of my dedicated curation. It's a new concept in content creation, where I've guided powerful AI tools (like Gemini Pro 2.5 for synthesis, working over a git diff of main vs. the v5 canary branch, informed by extensive research including OpenAI's Deep Research, with 10M+ tokens spent) to explore and articulate complex ideas. This method, inclusive of my fact-checking and refinement, aims to deliver depth and accuracy efficiently. I encourage you to see this as a potent blend of human oversight and AI capability. I also use these tools for my own LLM chats on Thinkbuddy, where I draft and refine material like this before publishing.
Let's dive in.
1. Prelude: Why Streaming Needed a Rewrite
TL;DR: V4's general data stream wasn't cut out for the rich, structured, multi-part `UIMessage`s of v5, necessitating a new, purpose-built streaming protocol to handle the diverse content types efficiently and robustly.
Why this matters?
Streaming is the lifeblood of any good chat application. Nobody wants to stare at a loading spinner for ages while the AI crafts its magnum opus. We want to see those words appear token by token, giving that illusion of a live conversation. Vercel AI SDK v4 did a decent job with its "Data Stream" (you might remember seeing the `X-Vercel-AI-Data-Stream: v1` header). It was good for, well, data – primarily text deltas and some JSON for annotations.
But as we discussed in Post 1, AI SDK v5 introduces the rich `UIMessage` structure. This isn't just a simple string; a single `UIMessage` can be composed of various `UIMessagePart`s – `TextUIPart`, `ToolInvocationUIPart`, `FileUIPart`, `SourceUIPart`, `ReasoningUIPart`, and even `StepStartUIPart` markers. Trying to shoehorn this kind of structured, multi-faceted information into V4's generic data stream would have been a nightmare. Imagine trying to clearly delineate a text chunk, then a tool call initiation, then the arguments for that tool call, then maybe a file reference, all while ensuring type safety and efficient parsing on the client. You'd end up with complex custom JSON payloads within that generic stream, effectively reinventing a structured protocol on top of it. It would be cumbersome, error-prone, and just not scalable for the kind of generative UI experiences v5 is aiming for.
How it’s solved in v5? (The New Protocol)
To tackle this head-on, AI SDK v5 introduces a brand new, dedicated UI Message Streaming Protocol. This isn't just an iteration; it's a ground-up redesign specifically engineered to transport `UIMessage` updates and their constituent parts efficiently and robustly. Think of it as a specialized data conduit purpose-built to carry these structured messages. This protocol is what allows your client to reconstruct those rich `UIMessage` objects, part by part, as the AI generates its response.
Take-aways / Migration Checklist Bullets
- V4's "Data Stream" was too generic for v5's rich, multi-part `UIMessage`s.
- Streaming structured data like tool calls, files, and distinct reasoning steps reliably was a challenge with V4's stream.
- AI SDK v5 introduces a new, dedicated UI Message Streaming Protocol to solve this.
- The new protocol is designed to transport `UIMessage` updates and their `parts` efficiently.
2. Header Primer: `x-vercel-ai-ui-message-stream: v1`
TL;DR: Responses using the new v5 UI Message Streaming Protocol are identified by the `x-vercel-ai-ui-message-stream: v1` HTTP header and are built on Server-Sent Events (SSE).
Why this matters?
When your client makes a request to your backend chat API, how does it know what kind of stream to expect? With different protocols potentially in play (especially during transitions or in complex systems), having a clear identifier is crucial for correct parsing and handling. Relying solely on `Content-Type` might not be enough to distinguish between different SSE-based protocols.
How it’s solved in v5? (The Identifying Header)
Server responses that use the new v5 UI Message Streaming Protocol will include a specific HTTP header:
`x-vercel-ai-ui-message-stream: v1`
This header serves a clear purpose:
- Identification: It allows clients (like `useChat` in v5), and potentially any intermediate proxies or gateways, to unambiguously identify the format of the incoming stream. This ensures that the client uses the correct parsing logic (`processUIMessageStream` in this case) for the v5 protocol.
- Versioning: The `v1` suffix indicates the version of the UI Message Streaming Protocol. This is smart future-proofing, allowing for potential revisions to the protocol down the line (e.g., `v2`) without breaking existing implementations. Clients could then negotiate or adapt based on the version they understand.
This protocol is built on Server-Sent Events (SSE), which is a standard way for a server to push data to a client over a single HTTP connection. As such, you'll also typically see the standard SSE content type:
`Content-Type: text/event-stream; charset=utf-8`
So, when you're debugging network requests, look out for both of these headers. They are your signposts indicating a v5 UI Message Stream.
```text
+-------------------------------------------------+
| Browser Network Tab - Response Headers          |
+-------------------------------------------------+
| ...                                             |
| Content-Type: text/event-stream; charset=utf-8  |
| x-vercel-ai-ui-message-stream: v1               |
| Cache-Control: no-cache                         |
| Connection: keep-alive                          |
| ...                                             |
+-------------------------------------------------+
```
[FIGURE 1: Screenshot of browser network tab showing response headers with 'x-vercel-ai-ui-message-stream: v1' and 'Content-Type: text/event-stream']
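If you're rolling your own client (or a proxy) rather than using `useChat`, this header makes a handy guard before you commit to v5 parsing logic. A minimal sketch using plain `fetch` against your own chat endpoint:

```ts
// Minimal sketch: confirm the response really is a v5 UI Message Stream
// before handing it to v5 parsing logic.
const res = await fetch('/api/v5/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ messages: [] }), // your UIMessage[] payload goes here
});

if (res.headers.get('x-vercel-ai-ui-message-stream') !== 'v1') {
  throw new Error(`Unexpected stream format: ${res.headers.get('content-type')}`);
}
// res.body is now safe to treat as SSE carrying UIMessageStreamParts.
```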
Take-aways / Migration Checklist Bullets
- v5 UI Message Streams are identified by the `x-vercel-ai-ui-message-stream: v1` HTTP header.
- The protocol uses Server-Sent Events (SSE).
- Expect `Content-Type: text/event-stream; charset=utf-8`.
- This header is crucial for clients to correctly parse the stream.
3. Catalogue of `UIMessageStreamPart` Events
TL;DR: The v5 UI Message Stream consists of a sequence of JSON objects, each being a typed `UIMessageStreamPart`, categorized into lifecycle events (`start`, `finish`, `error`) and content delivery events (`text`, `reasoning`, the `tool-call` family, `file`, `source`, `metadata`), all designed to incrementally build or update `UIMessage` objects on the client.
Why this matters?
If we're moving beyond simple text streams, we need a well-defined vocabulary for the different kinds of information that can come down the pipe. How does the client know when a new message begins? How does it distinguish between a chunk of text, a tool call, or a piece of metadata? Without a clear structure for these events, the client-side parsing logic would be a chaotic mess.
How it’s solved in v5? (The Stream Part Catalogue)
The v5 UI Message Stream is essentially a sequence of JSON objects sent as individual SSE `data` events. Each of these JSON objects is a `UIMessageStreamPart` (a new v5 term, primarily defined in `packages/ai/src/ui-message-stream/ui-message-stream-parts.ts`). These parts are the atomic units of information in the v5 stream.
Each `UIMessageStreamPart` object has a `type` field, which acts as a discriminator, telling the client what kind of information the part contains and what its payload structure will be. Most content-related parts also include a `messageId` field. This `messageId` is crucial, as it allows the client-side processing logic (like `processUIMessageStream`) to associate the incoming part with the specific `UIMessage` object it's intended to update or contribute to. This is how the SDK can handle multiple messages being updated or streamed concurrently, though typically it's one assistant message at a time.
Let's break down the different types of `UIMessageStreamPart`s. They generally fall into two categories: lifecycle parts and content parts.
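Before cataloguing each part, it helps to picture the whole stream as one discriminated union keyed on `type`. The sketch below is a simplified stand-in for the SDK's actual types (it covers only a few variants), but it shows why client-side handling can be an exhaustive, type-safe `switch`:

```ts
// Simplified sketch of the stream-part union (not the SDK's actual export).
type StreamPartSketch =
  | { type: 'start'; messageId: string; createdAt?: string }
  | { type: 'text'; messageId: string; value: string }
  | { type: 'tool-call'; messageId: string; toolCallId: string; toolName: string; args: string }
  | { type: 'finish'; messageId: string; finishReason: string }
  | { type: 'error'; value: string };

function handlePart(part: StreamPartSketch) {
  switch (part.type) {
    case 'text':
      // TypeScript narrows `part` here, so `part.value` is known to exist.
      console.log(`delta for ${part.messageId}:`, part.value);
      break;
    case 'finish':
      console.log(`message ${part.messageId} done:`, part.finishReason);
      break;
    // ...the remaining variants are handled the same way
  }
}
```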
3.1 Lifecycle parts – `start`, `finish`, `error`
These parts manage the overall flow of a message stream and its lifecycle.
- `'start'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageStartStreamPart {
      type: 'start';
      messageId: string;
      createdAt?: string; // ISO 8601 date string
    }
    ```
  - Purpose: Signals the beginning of a new `UIMessage` being streamed, or indicates that updates are about to commence for an existing `UIMessage` (though typically it's for a new assistant message).
  - `messageId: string`: The unique identifier for the `UIMessage` that will be constructed or updated by subsequent parts. The client uses this ID to map incoming content parts to the correct message object in its state, and it should remain stable for the lifetime of that UI message.
  - `createdAt?: string`: An optional, server-authoritative timestamp (as an ISO 8601 string) for when the message effectively began. The client can use this to populate the `UIMessage.createdAt` field.
- `'finish'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageFinishStreamPart {
      type: 'finish';
      messageId: string;
      finishReason: LanguageModelV2FinishReason; // e.g., 'stop', 'length', 'tool-calls', 'content-filter', 'error'
      usage?: LanguageModelV2Usage; // { promptTokens: number; completionTokens: number; totalTokens?: number }
      providerMetadata?: Record<string, any>;
    }
    ```
  - Purpose: Indicates that all content parts for the `UIMessage` identified by `messageId` have been successfully streamed. This is the signal that the message is now complete from the server's perspective.
  - `messageId: string`: Identifies the `UIMessage` that has finished streaming.
  - `finishReason: LanguageModelV2FinishReason`: A string indicating why generation concluded (e.g., `'stop'` for natural completion, `'length'` if the max token limit was hit, `'tool-calls'` if the model stopped to invoke tools, `'content-filter'` if generation was halted by a content filter, or `'error'` if an unrecoverable model error occurred during generation for this message). This is a crucial piece of metadata.
  - `usage?: LanguageModelV2Usage`: Optional. Provides token usage information (prompt tokens, completion tokens) for the generation of this message.
  - `providerMetadata?: Record<string, any>`: Optional. Any final, provider-specific metadata associated with the completion of this message.
- `'error'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageErrorStreamPart {
      type: 'error';
      value: string; // The error message
    }
    ```
  - Purpose: Signals a stream-level or general processing error on the server that isn't specific to a single message's `finishReason` (though a model error could also manifest as `finishReason: 'error'` in a `'finish'` part). This is for errors that might affect the entire stream or the server's ability to continue.
  - `value: string`: Contains the error message string.
  - Client Handling: When `useChat` (via `processUIMessageStream`) receives this, it typically sets its `error` state to `new Error(value)`, making the error accessible to your UI for display.
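On the consuming side, surfacing a stream-level error usually reduces to rendering `useChat`'s `error` state. A trivial sketch:

```tsx
// Sketch: display the Error that useChat sets when an 'error' stream part arrives.
function ChatErrorBanner({ error }: { error?: Error }) {
  if (!error) return null;
  return <div role="alert">Something went wrong: {error.message}</div>;
}
```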
3.2 Content parts – `text`, `reasoning`, `tool-call` family, `file`, `source`, `metadata`
These parts deliver the actual content that will populate the `UIMessage.parts` array or `UIMessage.metadata` on the client. Each of them includes a `messageId` to link it to the correct `UIMessage`.
- `'text'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageTextStreamPart {
      type: 'text';
      messageId: string;
      value: string; // A delta of text
    }
    ```
  - Purpose: Carries a delta (chunk) of text content.
  - Client Handling: The client appends this `value` to the `text` property of a `TextUIPart` within the `UIMessage.parts` array for the given `messageId`. If no `TextUIPart` exists yet for this stream segment, one is created. This is how text appears to stream token by token (a hand-rolled sketch of this accumulation follows the catalogue below).
- `'reasoning'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageReasoningStreamPart {
      type: 'reasoning';
      messageId: string;
      value: string; // Text for the reasoning step
      providerMetadata?: Record<string, any>;
    }
    ```
  - Purpose: Streams text content intended for a `ReasoningUIPart` in the `UIMessage`.
  - Client Handling: Similar to `'text'`, this `value` populates the `text` field of a `ReasoningUIPart`, with `providerMetadata` also being set if provided.
- `'tool-call'` family (related parts for `ToolInvocationUIPart`):
  Managing tool calls is complex, so there are several stream parts dedicated to updating the state of a `ToolInvocationUIPart` on the client (a small state-machine sketch follows the catalogue below):
  - `'tool-call-delta'`:
    - TypeScript Interface (Conceptual):

      ```ts
      interface UIMessageToolCallDeltaStreamPart {
        type: 'tool-call-delta';
        messageId: string;
        toolCallId: string;    // ID for this specific tool call
        toolName: string;      // Name of the tool being called
        argsTextDelta: string; // A delta of the stringified JSON arguments
      }
      ```
    - Purpose: Streams the arguments for a tool call progressively. This is useful if the arguments are long or generated token by token by the model.
    - Client Handling: This part typically creates or updates a `ToolInvocationUIPart` (identified by `toolCallId` within the `UIMessage` for `messageId`). The `toolInvocation.state` is set to `'partial-call'`, and `argsTextDelta` is appended to accumulate the full argument string.
  - `'tool-call'`:
    - TypeScript Interface (Conceptual):

      ```ts
      interface UIMessageToolCallStreamPart {
        type: 'tool-call';
        messageId: string;
        toolCallId: string;
        toolName: string;
        args: string; // Complete stringified JSON arguments
      }
      ```
    - Purpose: Signals a complete tool call specification from the model. The `args` field contains the complete stringified JSON arguments for the tool.
    - Client Handling: This finalizes the argument accumulation for the `ToolInvocationUIPart`. The `toolInvocation.state` transitions to `'call'`, and the `args` string is parsed into a `JSONValue` and stored.
  - `'tool-result'`:
    - TypeScript Interface (Conceptual):

      ```ts
      interface UIMessageToolResultStreamPart {
        type: 'tool-result';
        messageId: string;
        toolCallId: string;
        toolName: string; // Often included for context, though toolCallId is primary
        result: string;   // Stringified JSON result from the tool execution
      }
      ```
    - Purpose: Provides the result of a tool execution (after the application has run the tool).
    - Client Handling: Updates the `ToolInvocationUIPart` (identified by `toolCallId`). The `toolInvocation.state` transitions to `'result'`, and the `result` string is parsed into a `JSONValue` and stored.
  - `'tool-error'`:
    - TypeScript Interface (Conceptual):

      ```ts
      interface UIMessageToolErrorStreamPart {
        type: 'tool-error';
        messageId: string;
        toolCallId: string;
        toolName: string;     // Context
        errorMessage: string; // The error message from tool execution
      }
      ```
    - Purpose: Communicates an error that occurred during a tool's invocation or execution.
    - Client Handling: Updates the `ToolInvocationUIPart`. The `toolInvocation.state` transitions to `'error'`, and `errorMessage` is stored.
- `'file'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageFileStreamPart {
      type: 'file';
      messageId: string;
      file: {
        mediaType: string; // IANA media type
        filename?: string;
        url: string;       // Remote URL or Data URL
      };
    }
    ```
  - Purpose: Provides the necessary information to construct or update a `FileUIPart` within a `UIMessage`.
  - Client Handling: A `FileUIPart` is added to `UIMessage.parts` with the provided `file` details.
- `'source'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageSourceStreamPart {
      type: 'source';
      messageId: string;
      source: LanguageModelV2Source; // e.g., { sourceType: 'url', id: string, url: string, title?: string, ... }
    }
    ```
  - Purpose: Delivers data for a `SourceUIPart`, typically used in RAG systems for citations.
  - Client Handling: A `SourceUIPart` is added to `UIMessage.parts` with the provided `source` data.
- `'metadata'`:
  - TypeScript Interface (Conceptual):

    ```ts
    interface UIMessageMetadataStreamPart {
      type: 'metadata';
      messageId: string;
      metadata: unknown; // The custom metadata object or value
    }
    ```
  - Purpose: Allows for streaming updates or additions to a `UIMessage`'s custom `metadata` field.
  - Client Handling: The client should take this `metadata` payload (which is `unknown` from the stream's perspective) and validate it against its application-defined `messageMetadataSchema` (if provided to `useChat`). If valid, it's merged into the `UIMessage.metadata` object for the given `messageId`.
This catalogue of `UIMessageStreamPart` events forms the backbone of real-time, structured communication in v5, enabling the client to dynamically build those rich `UIMessage` objects we explored in Post 1.
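As promised above, here's what folding `'text'` deltas into a message can look like. This is a hand-rolled, immutable-update sketch for intuition, not the SDK's internal code:

```ts
// Hand-rolled sketch: fold a 'text' delta into a message's parts array.
interface TextUIPartSketch { type: 'text'; text: string }

function applyTextDelta(parts: TextUIPartSketch[], delta: string): TextUIPartSketch[] {
  const last = parts[parts.length - 1];
  if (last?.type === 'text') {
    // Append to the existing text part so the UI sees one growing string.
    return [...parts.slice(0, -1), { ...last, text: last.text + delta }];
  }
  // No text part yet for this segment: create one.
  return [...parts, { type: 'text', text: delta }];
}
```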
Server -> Client Stream (for a single message M1 with tool call T1):

```text
SSE Event: data: {"type":"start", "messageId":"M1", "createdAt":"..."}
        |
        v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"Thinking... "}
        |
        v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"about "}
        |
        v
SSE Event: data: {"type":"tool-call-delta", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "argsTextDelta":"{\"q\":\"AI"}
        |
        v
SSE Event: data: {"type":"tool-call-delta", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "argsTextDelta":" news\"}"}
        |
        v
SSE Event: data: {"type":"tool-call", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "args":"{\"q\":\"AI news\"}"}
        |
        v
(Client/App executes tool T1 based on the 'tool-call' event above)
        |
        v  (Server is informed of tool T1 result by app, then streams it back)
SSE Event: data: {"type":"tool-result", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "result":"[{\"title\":\"...\"}]"}
        |
        v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"\nHere is what I found."}
        |
        v
SSE Event: data: {"type":"finish", "messageId":"M1", "finishReason":"stop", "usage":{...}}
```
[FIGURE 2: Sequence diagram illustrating a typical flow of UIMessageStreamParts for a message with text and a tool call: 'start' -> 'text' (delta) -> 'text' (delta) -> 'tool-call-delta' -> 'tool-call' -> (app executes tool) -> server sends 'tool-result' -> 'text' (delta) -> 'finish']
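And to make the tool lifecycle concrete, here's a sketch of the state machine those four tool parts drive on the client. The types are simplified stand-ins for `ToolInvocationUIPart`'s real shape:

```ts
// Sketch of the client-side tool-invocation state machine (simplified types).
type ToolState = 'partial-call' | 'call' | 'result' | 'error';

interface ToolInvocationSketch {
  toolCallId: string;
  toolName: string;
  state: ToolState;
  argsText: string;      // accumulated while state === 'partial-call'
  args?: unknown;        // parsed once the complete 'tool-call' part arrives
  result?: unknown;      // set by 'tool-result'
  errorMessage?: string; // set by 'tool-error'
}

// 'tool-call-delta' -> keep accumulating argument text.
function onToolCallDelta(inv: ToolInvocationSketch, argsTextDelta: string): ToolInvocationSketch {
  return { ...inv, state: 'partial-call', argsText: inv.argsText + argsTextDelta };
}

// 'tool-call' -> the arguments are complete; parse and move to 'call'.
function onToolCall(inv: ToolInvocationSketch, args: string): ToolInvocationSketch {
  return { ...inv, state: 'call', args: JSON.parse(args) };
}

// 'tool-result' -> the app ran the tool; store the parsed result.
function onToolResult(inv: ToolInvocationSketch, result: string): ToolInvocationSketch {
  return { ...inv, state: 'result', result: JSON.parse(result) };
}

// 'tool-error' -> something went wrong during execution.
function onToolError(inv: ToolInvocationSketch, errorMessage: string): ToolInvocationSketch {
  return { ...inv, state: 'error', errorMessage };
}
```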
Take-aways / Migration Checklist Bullets
- The v5 UI Message Stream is a sequence of JSON objects, each a typed `UIMessageStreamPart`.
- `messageId` links content parts to the correct `UIMessage`.
- Lifecycle parts (`'start'`, `'finish'`, `'error'`) manage the stream.
- Content parts (`'text'`, `'reasoning'`, the `'tool-call'` family, `'file'`, `'source'`, `'metadata'`) deliver data for `UIMessage.parts` or `UIMessage.metadata`.
- Tool interactions have a family of stream parts (`'tool-call-delta'`, `'tool-call'`, `'tool-result'`, `'tool-error'`) to manage their lifecycle.
4. Server Emission Patterns
TL;DR: Servers can emit the v5 UI Message Stream either automatically using `result.toUIMessageStreamResponse()` after a V2 `streamText` call (recommended), or manually by constructing the stream with `UIMessageStreamWriter` for more fine-grained control or integration with custom AI logic.
Why this matters?
Knowing what the stream parts are is one thing; knowing how to generate them from your server is another. Developers need clear patterns for producing this v5 UI Message Stream, whether they're using standard SDK functions or integrating with more complex, custom AI pipelines.
How it’s solved in v5? (Emission Helpers)
AI SDK v5 provides two main ways for your server-side code (e.g., in a Next.js API route) to produce this new stream.
4.1 Auto-generation via `streamText().toUIMessageStreamResponse()`
This is the primary and recommended method for most chat scenarios, especially when you're directly using a V2 model provider with `streamText()`.
- How it works:
  1. You make a call to a V2 core function like `streamText()` (exported from the main `ai` package) using a V2 model instance (e.g., `openai('gpt-4o-mini')`):

     ```ts
     // Simplified server route
     import { streamText, convertToModelMessages, type UIMessage } from 'ai'; // v5 core + utilities
     import { openai } from '@ai-sdk/openai'; // V2 provider

     export async function POST(req: Request) {
       const { messages: uiMessages }: { messages: UIMessage[] } = await req.json();

       // Convert incoming UIMessages to ModelMessages for the LLM
       const modelMessages = convertToModelMessages(uiMessages);

       // streamText returns its result synchronously; streaming happens lazily
       const result = streamText({
         model: openai('gpt-4o-mini'), // Using a V2 model instance
         messages: modelMessages,
         // ... other options like tools, system prompt, etc.
       });

       // Now, the magic step:
       return result.toUIMessageStreamResponse();
     }
     ```
  2. The `result` object returned by `streamText()` (a `StreamTextResult` or similar V2 result type) has a handy method: `toUIMessageStreamResponse()`.
  3. Calling `result.toUIMessageStreamResponse()` does the heavy lifting:
     * It takes the underlying stream of V2 core parts (raw text deltas, tool call information, file data, etc.) coming from the LLM provider.
     * It transforms these V2 core parts into the corresponding v5 `UIMessageStreamPart`s (like `'text'`, `'tool-call'`, `'file'`, etc.).
     * It wraps this transformed stream in a standard `Response` object, correctly setting the SSE headers:
       * `Content-Type: text/event-stream; charset=utf-8`
       * `x-vercel-ai-ui-message-stream: v1`
- Benefits: This is incredibly convenient. You don't have to manually construct SSE events or worry about the intricacies of mapping different V2 core parts to v5 stream parts. The SDK handles it for you.
- `onFinish` for Persistence: As mentioned before, `toUIMessageStreamResponse()` can also take an `onFinish` callback in its options. This callback is invoked on the server after all `UIMessageStreamPart`s for the current AI turn have been generated and queued to be sent to the client. This is the ideal place to persist the final, fully formed `UIMessage`(s) from the assistant's turn (see the sketch after this list).
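To make that persistence hook concrete, here's a sketch. The exact `onFinish` payload shape is still canary-era, so treat the destructured `messages` argument and the `saveChat` helper as assumptions to verify against the SDK source:

```ts
// Sketch: persist the assistant's completed UIMessage(s) once the turn finishes.
// Inside your POST handler, after calling streamText(...).
// `saveChat` is a hypothetical persistence helper; `chatId` comes from your request.
return result.toUIMessageStreamResponse({
  onFinish: async ({ messages }) => {
    // Assumed shape: the final UIMessage(s) produced during this turn.
    await saveChat(chatId, messages);
  },
});
```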
4.2 Manual generation via `UIMessageStreamWriter`
There are scenarios where `streamText().toUIMessageStreamResponse()` might not be a direct fit:
- Integrating with AI logic that doesn't come from a standard V2 `streamText` call (e.g., some LangChain setups before robust `@ai-sdk/langchain` v5 adapters mature, custom rule-based systems, or streaming data from a non-LLM source that you want to present in the chat UI).
- Needing extremely fine-grained control over exactly which `UIMessageStreamPart`s are streamed and when they are sent, perhaps for complex multi-stage agentic behaviors.

For these cases, AI SDK v5 provides a lower-level utility: `createUIMessageStream` (found in `packages/ai/src/ui-message-stream/create-ui-message-stream.ts`).
- How it works:
  1. Call `createUIMessageStream()`:

     ```ts
     import { createUIMessageStream, UIMessageStreamWriter } from 'ai'; // v5 utility

     // In your API route:
     // const { stream, writer } = createUIMessageStream();
     ```
  2. This function returns an object containing:
     * `stream: ReadableStream`: The `ReadableStream` that will eventually carry your SSE data. You'll return this in your `Response` object.
     * `writer: UIMessageStreamWriter`: An object with methods that let you manually write individual `UIMessageStreamPart`s into the `stream`.
- `UIMessageStreamWriter` Methods: The `writer` object provides methods along these lines. These signatures are conceptual; since this is canary, verify the exact names and shapes against the `UIMessageStreamWriter` source before relying on them:
  - `writer.writeStart({ messageId: string; createdAt?: string })`
  - `writer.writeTextDelta(messageId: string, value: string)`
  - `writer.writeReasoning(messageId: string, value: string, providerMetadata?: Record<string, any>)`
  - `writer.writeToolCallDelta(messageId: string, { toolCallId, toolName, argsTextDelta })`
  - `writer.writeToolCall(messageId: string, { toolCallId, toolName, args })`
  - `writer.writeToolResult(messageId: string, { toolCallId, toolName, result })`
  - `writer.writeToolError(messageId: string, { toolCallId, toolName, errorMessage })`
  - `writer.writeFile(messageId: string, fileData: { mediaType, filename?, url })`
  - `writer.writeSource(messageId: string, sourceData: LanguageModelV2Source)`
  - `writer.writeMetadata(messageId: string, metadata: unknown)`
  - `writer.writeError(errorMessage: string)` (for stream-level errors)
  - `writer.writeFinish({ messageId: string; finishReason: LanguageModelV2FinishReason; ... })`
  - `writer.close()`: Crucially, you must call `writer.close()` when you are done writing all parts. This signals the end of the stream.
  - `writer.abort(error?: Error)`: Terminates the stream due to an error.
- Conceptual Code Snippet (Manual Emission): The snippet below sticks to the conceptual writer API listed above; double-check each method against the canary source before copying it into production code.

  ```ts
  import { createUIMessageStream, UIMessageStreamWriter, LanguageModelV2FinishReason } from 'ai';
  import { NextRequest, NextResponse } from 'next/server';

  export const runtime = 'edge';

  // Example function that simulates a custom AI process and writes to the stream
  async function handleCustomAIProcess(writer: UIMessageStreamWriter, userInput: string) {
    const assistantMessageId = 'assistant-' + Date.now();

    // Always start a message
    writer.writeStart({ messageId: assistantMessageId, createdAt: new Date().toISOString() });

    writer.writeTextDelta(assistantMessageId, "Okay, I will process your input: '");
    await new Promise(r => setTimeout(r, 200)); // Simulate work
    writer.writeTextDelta(assistantMessageId, userInput);
    writer.writeTextDelta(assistantMessageId, "'. ");

    // Simulate some reasoning
    await new Promise(r => setTimeout(r, 300));
    writer.writeReasoning(assistantMessageId, 'First, I need to analyze the sentiment.');
    writer.writeTextDelta(assistantMessageId, '\nSentiment analysis complete. ');

    // Simulate a tool call, streaming the arguments in two deltas
    await new Promise(r => setTimeout(r, 500));
    const toolCallId = 'tool-' + Date.now();
    writer.writeToolCallDelta(assistantMessageId, { toolCallId, toolName: 'myCustomTool', argsTextDelta: '{"param":"val' });
    await new Promise(r => setTimeout(r, 100));
    writer.writeToolCallDelta(assistantMessageId, { toolCallId, toolName: 'myCustomTool', argsTextDelta: 'ue"}' });
    writer.writeToolCall(assistantMessageId, { toolCallId, toolName: 'myCustomTool', args: JSON.stringify({ param: 'value' }) });

    // Simulate tool execution and result
    await new Promise(r => setTimeout(r, 700)); // Simulate tool work
    const toolResult = { status: 'success', data: 'Tool executed successfully.' };
    writer.writeToolResult(assistantMessageId, { toolCallId, toolName: 'myCustomTool', result: JSON.stringify(toolResult) });

    writer.writeTextDelta(assistantMessageId, '\nAll processing is now complete.');

    // Always finish the message
    writer.writeFinish({
      messageId: assistantMessageId,
      finishReason: 'stop' as LanguageModelV2FinishReason, // Cast for type safety
      usage: { promptTokens: 10, completionTokens: 50 },   // Example usage
    });

    // CRITICAL: Close the writer to end the stream properly
    writer.close();
  }

  export async function POST(req: NextRequest) {
    try {
      const { input } = await req.json(); // Assuming the client sends { input: "user's text" }
      const { stream, writer } = createUIMessageStream();

      // Call your custom logic. IMPORTANT: Do NOT await handleCustomAIProcess if you want to stream.
      // Let it run in the background and write to the stream.
      handleCustomAIProcess(writer, input as string).catch(err => {
        console.error('Error in custom AI process:', err);
        // If an error occurs in the async process, try to write a stream error part.
        try {
          writer.writeError(err instanceof Error ? err.message : 'Unknown error in background process');
          writer.close();
        } catch (writeErrorErr) {
          console.error('Error writing stream error:', writeErrorErr);
        }
      });

      // Return the stream in a Response object with correct headers
      return new NextResponse(stream, {
        headers: {
          'Content-Type': 'text/event-stream; charset=utf-8',
          'x-vercel-ai-ui-message-stream': 'v1',
          'Cache-Control': 'no-cache', // Ensure no caching for SSE
          'Connection': 'keep-alive',  // Keep the connection open for SSE
        },
      });
    } catch (error) {
      // Handle errors in setting up the stream itself
      console.error('[Manual Stream API Error]', error);
      return NextResponse.json({ error: (error as Error).message }, { status: 500 });
    }
  }
  ```
Remember to handle errors robustly when manually managing streams, especially in the async processing function.
Choosing between these two patterns depends on your needs. For standard LLM interactions, `result.toUIMessageStreamResponse()` is simpler and safer. For more complex or custom scenarios, `UIMessageStreamWriter` gives you the power and control.
Take-aways / Migration Checklist Bullets
- Use `result.toUIMessageStreamResponse()` for the easiest way to generate v5 UI Message Streams from V2 `streamText` calls.
- This method handles transformation and SSE header setting automatically.
- For manual stream generation, use `createUIMessageStream()` to get a `stream` and `writer`.
- Use `UIMessageStreamWriter` methods to write individual `UIMessageStreamPart`s.
- Always call `writer.close()` when done with manual streaming to properly terminate the SSE stream.
5. Client Consumption
TL;DR: On the client, `useChat` internally uses `processUIMessageStream` to consume the v5 UI Message Stream, parse `UIMessageStreamPart`s, and intelligently reconstruct/update `UIMessage` objects in its state, triggering reactive UI updates.
Why this matters?
Okay, so the server is dutifully sending this stream of finely crafted `UIMessageStreamPart`s. How does the client make sense of it all? If you were to manually parse an SSE stream, manage message states, accumulate text deltas, handle tool lifecycle updates, and validate metadata, your client-side code would become incredibly complex and error-prone. We need a robust client-side utility to handle this consumption.
How it’s solved in v5? (The Client-Side Processor)
The Vercel AI SDK provides a core client-side utility for this: `processUIMessageStream` (located in `packages/ai/src/ui/process-ui-message-stream.ts`). This function is the workhorse that `useChat` (and potentially other v5 UI hooks) uses under the hood to consume and interpret the v5 UI Message Stream.
5.1 `processUIMessageStream()` algorithm
You generally won't call `processUIMessageStream` directly if you're using `useChat`, but understanding what it does is key to understanding how `useChat` works its magic.
- Role and Inputs:
  - It takes a `ReadableStream` (typically obtained from the `body` of a `fetch` response) as its primary input. This stream is expected to be a v5 UI Message Stream (SSE of `UIMessageStreamPart` JSON objects).
  - It also takes an options object, which includes crucial callbacks.
- Core Logic:
  - SSE Parsing: It reads the `ReadableStream`, decodes the SSE events, and parses the `data:` field of each event as a JSON `UIMessageStreamPart`.
  - `UIMessage` Reconstruction/Update: This is where the intelligence lies.
    - When a `'start'` part arrives, `processUIMessageStream` uses its `messageId` to either identify an existing `UIMessage` to update or, more commonly, to create a new `UIMessage` instance in memory (e.g., for an assistant's reply). The `id` of this `UIMessage` is set from the `messageId` in the `'start'` part, ensuring stability.
    - For content parts like `'text'`, `'reasoning'`, `'file'`, and `'source'`, it uses the `messageId` to find the corresponding `UIMessage` object.
    - It appends text deltas from `'text'` parts to the appropriate `TextUIPart` (creating one if necessary).
    - It assembles `ToolInvocationUIPart`s through their various states using `'tool-call-delta'`, `'tool-call'`, `'tool-result'`, and `'tool-error'` stream parts, carefully managing the `toolCallId` and the `toolInvocation.state`.
    - It adds new `FileUIPart`s, `SourceUIPart`s, or `ReasoningUIPart`s as they arrive.
    - For `'metadata'` parts, it takes the `metadata` payload, validates it against the `messageMetadataSchema` (if provided in options), and merges it into the `UIMessage.metadata` field.
  - Callback Invocation: This is how `processUIMessageStream` communicates changes back to its consumer (e.g., `useChat`).
- Key Callbacks for `processUIMessageStream` (Options):
  - `onUpdate(message: UIMessage<METADATA>, updateReason: 'initial' | 'update' | 'finish')`: This is arguably the most important callback. It's called whenever a `UIMessage` object is significantly constructed or updated by incoming stream parts.
    - The `message` argument is the current state of the `UIMessage` object being processed.
    - The `updateReason` (or similar parameter) might indicate why `onUpdate` was called (e.g., a new message started, an existing part was updated, or the message just finished).
    - This is the hook that `useChat` uses to reactively update its internal `messages` state array, which in turn triggers UI re-renders in your React components.
  - `onDone()`: Called when the entire stream has been successfully processed and closed (typically after the last `'finish'` part for the last message in the stream, or if the stream ends for other reasons without an error).
  - `onClientError(error: Error)`: Called if `processUIMessageStream` itself encounters an error during its processing (e.g., an error parsing an SSE event, a network error during consumption after the initial fetch, or if a required callback throws). This is distinct from server-sent `'error'` stream parts, which are handled as data.
  - `messageMetadataSchema`: As mentioned, an optional schema (e.g., a Zod schema) for validating custom `metadata` payloads received via `'metadata'` stream parts.
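If you're curious what the SSE-parsing half of this boils down to, here's a stripped-down, hand-rolled version using only web-standard APIs (a real parser also handles multi-line `data:` fields, comments, and event ids):

```ts
// Hand-rolled sketch of reading a v5 UI Message Stream without the SDK.
async function readStreamParts(
  body: ReadableStream<Uint8Array>,
  onPart: (part: unknown) => void,
): Promise<void> {
  const reader = body.pipeThrough(new TextDecoderStream()).getReader();
  let buffer = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any trailing partial line for the next chunk
    for (const line of lines) {
      if (line.startsWith('data: ')) {
        onPart(JSON.parse(line.slice('data: '.length))); // one UIMessageStreamPart per event
      }
    }
  }
}
```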
```text
+-----------------+     +--------------------------+     +--------------------+     +------------------------+
| ReadableStream  | --> | processUIMessageStream() | --> | onUpdate(message,  | --> | Update UIMessage[] in  |
| (SSE from API)  |     | (Parses StreamParts,     |     |   updateReason)    |     | useChat's state        |
+-----------------+     |  builds UIMessages)      |     | callback           |     | (triggers UI rerender) |
                        +--------------------------+     +--------------------+     +------------------------+
                                     |
                                     | (metadata validation)
                                     v
                        +--------------------------+
                        | messageMetadataSchema    |
                        | (Optional Zod schema)    |
                        +--------------------------+
```
[FIGURE 3: Diagram showing ReadableStream -> processUIMessageStream -> onUpdate callback -> updates UIMessage array]
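Speaking of `messageMetadataSchema`, here's roughly what that validation step looks like with Zod. The schema shape is entirely application-defined; only the validate-then-merge pattern is the point:

```ts
import { z } from 'zod';

// Application-defined metadata schema (example shape, not prescribed by the SDK).
const messageMetadataSchema = z.object({
  model: z.string().optional(),
  latencyMs: z.number().optional(),
});

// Validate an incoming 'metadata' payload before merging it into UIMessage.metadata.
function mergeMetadata(current: Record<string, unknown>, incoming: unknown) {
  const parsed = messageMetadataSchema.safeParse(incoming);
  if (!parsed.success) {
    console.warn('Dropping invalid metadata payload:', parsed.error.flatten());
    return current;
  }
  return { ...current, ...parsed.data };
}
```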
5.2 React integration inside `useChat`
So, how does `useChat` tie into all this?
- When you call an action that triggers an API call (e.g., `handleSubmit`, `reload`, `experimental_resume`), `useChat` internally uses a helper function (like `callChatApi`) to make the actual `fetch` request to your backend API endpoint.
- `callChatApi` gets the `Response` object from `fetch`.
- If the response is successful and has the `x-vercel-ai-ui-message-stream: v1` header, `callChatApi` takes the `response.body` (which is a `ReadableStream`) and pipes it to `processUIMessageStream`.
- The `onUpdate` callback that `useChat` (via `callChatApi`) provides to `processUIMessageStream` is a function that knows how to update `useChat`'s internal `messages` state. When `onUpdate` is called by `processUIMessageStream` with an updated `UIMessage`, `useChat` updates its state, which causes your React component to re-render with the new message data.
- Callbacks like `onDone` and `onClientError` from `processUIMessageStream` are also wired up to update `useChat`'s `status` and `error` states.
This layered approach means that as a `useChat` user, you're abstracted away from the raw stream processing. You simply see the `messages` array update magically as data streams in, but now you know `processUIMessageStream` is the engine making it happen.
Take-aways / Migration Checklist Bullets
- `processUIMessageStream` is the core client-side utility for consuming v5 UI Message Streams.
- It parses SSE events containing `UIMessageStreamPart`s.
- It intelligently reconstructs/updates `UIMessage` objects using `messageId`.
- Its `onUpdate` callback is key for reactive UI updates in hooks like `useChat`.
- `useChat` uses `processUIMessageStream` internally, abstracting the raw processing from the developer.
6. Debugging Streams (curl, Browser Dev Tools)
TL;DR: Inspect v5 UI Message Streams using `curl` to see raw SSE events, or use your browser's Developer Tools Network tab to examine the EventStream response and individual messages when `useChat` makes requests.
Why this matters?
When things go wrong with streaming, or when you're just trying to understand what data your server is actually sending, being able to inspect the raw stream is invaluable. Blindly trusting that the stream is correct can lead to a lot of frustration.
How it’s solved in v5? (Inspection Techniques)
Here are a few ways to peek under the hood of your v5 UI Message Streams:
- Using `curl` (Command Line):
  This is great for hitting your backend API endpoint directly from your terminal and seeing the raw SSE stream without any client-side processing.

  ```bash
  # Example: POSTing to a local Next.js API route
  # Make sure your server is running (e.g., npm run dev)
  curl -N -X POST -H "Content-Type: application/json" \
    -d '{"messages": [{"id":"user-1","role":"user","parts":[{"type":"text","text":"Hello, AI!"}]}]}' \
    http://localhost:3000/api/v5/chat # Replace /api/v5/chat with your actual endpoint
  ```
  - `-N`: Tells `curl` not to buffer the output, so you see events as they arrive.
  - `-X POST -H "Content-Type: application/json" -d '{...}'`: Standard POST request with a JSON body. Adjust the body to match what your `useChat` client would send (an array of `UIMessage`s).
  - What to look for: You'll see a sequence of lines, each starting with `data:`, followed by a JSON object. Each of these JSON objects is one of your `UIMessageStreamPart`s.

    ```text
    data: {"type":"start","messageId":"ai-123","createdAt":"2023-10-27T10:00:00Z"}

    data: {"type":"text","messageId":"ai-123","value":"Hello"}

    data: {"type":"text","messageId":"ai-123","value":" there"}

    data: {"type":"text","messageId":"ai-123","value":"!"}

    data: {"type":"finish","messageId":"ai-123","finishReason":"stop"}
    ```
    This is super helpful for verifying that your server is sending the parts you expect, in the correct order, and with the correct `messageId`.
- Using `npx sse-cat` (A More User-Friendly CLI for SSE):
  While `curl` is powerful, `sse-cat` is a small utility specifically for inspecting SSE streams, and it can sometimes format the output more nicely if the JSON is complex.

  ```bash
  # sse-cat is primarily for GET requests. For POST, it's trickier.
  # If your endpoint supports GET for streaming (e.g., for a fixed demo response):
  # npx sse-cat http://localhost:3000/api/v5/chat-demo-stream

  # For POST requests with SSE, tools like Postman or Insomnia are often better
  # than trying to wrangle sse-cat with POST bodies.
  ```
  For most v5 chat endpoints that expect POST, `curl` or a GUI API client is usually more practical. If you have a simple GET endpoint that streams SSE for testing, `sse-cat` can be handy.
- Browser Developer Tools (Your Best Friend on the Client):
  When your application is running in the browser and `useChat` is making requests:
  - Open your browser's Developer Tools (usually F12 or Right-click -> Inspect).
  - Go to the Network tab.
  - Trigger a chat message submission in your UI.
  - You'll see a `fetch` (or XHR) request made to your API endpoint (e.g., `/api/v5/chat`). Click on it.
  - Look at the Headers tab to confirm the request/response headers (e.g., `x-vercel-ai-ui-message-stream: v1`).
  - The most useful tab here is often EventStream (Chrome), Response (Firefox, which sometimes shows the raw stream), or a similar tab that specifically decodes SSE.
  - In Chrome's EventStream tab, you'll see each individual SSE message received from the server, neatly displayed with its `id`, `event` (usually `message`), and `data` (your `UIMessageStreamPart` JSON).

    ```text
    +--------------------------------------------------------------------+
    | Browser DevTools - Network Tab - EventStream View                  |
    +--------------------------------------------------------------------+
    | Request: /api/v5/chat                                              |
    +--------------------------------------------------------------------+
    | Events:                                                            |
    |                                                                    |
    | > Time: 10:00:01.100 | Event: message                              |
    |   data: {"type":"start","messageId":"ai-xyz", ...}                 |
    |                                                                    |
    | > Time: 10:00:01.200 | Event: message                              |
    |   data: {"type":"text","messageId":"ai-xyz","value":"Hello"}       |
    |                                                                    |
    | > Time: 10:00:01.350 | Event: message                              |
    |   data: {"type":"tool-call-delta","messageId":"ai-xyz", ...}       |
    |                                                                    |
    | ... (more stream parts as they arrive) ...                         |
    |                                                                    |
    | > Time: 10:00:02.500 | Event: message                              |
    |   data: {"type":"finish","messageId":"ai-xyz", ...}                |
    +--------------------------------------------------------------------+
    ```
    [FIGURE 4: Screenshot of Chrome DevTools Network tab, EventStream view, showing individual UIMessageStreamPart JSON objects being received]
    This allows you to see exactly what `processUIMessageStream` is receiving from the server, which is invaluable for debugging client-side processing issues or discrepancies between what the server thinks it's sending and what the client actually gets.
These debugging techniques are essential for building and troubleshooting robust streaming applications with v5.
Take-aways / Migration Checklist Bullets
- Use `curl -N` to inspect raw v5 UI Message Stream events from your server endpoint.
- Leverage browser Developer Tools (Network tab -> EventStream/Response) to see what `useChat` is receiving.
- Look for `data: {...}` lines containing your `UIMessageStreamPart` JSON objects.
- Verify headers like `x-vercel-ai-ui-message-stream: v1` and `Content-Type: text/event-stream`.
7. Performance: Throttling & Back-pressure
TL;DR: AI SDK v5 offers client-side UI update throttling via `experimental_throttleTimeMilliseconds` in `useChat` to prevent jank from rapid token arrival, while server-side core functions like `streamText` are designed with back-pressure in mind to avoid overwhelming the client.
Why this matters?
Streaming is great for perceived performance, but if not handled carefully, it can also cause performance issues. Ultra-rapid token arrival from a fast LLM could lead to too many state updates and re-renders on the client, causing UI jank or even React errors like "maximum update depth exceeded." Conversely, if the server generates data much faster than the client or network can handle, resources can be wasted.
How it’s solved in v5? (Performance Considerations)
- Client-Side Throttling (`experimental_throttleTimeMilliseconds`):
  - This is an option you can pass to `useChat` (and other UI hooks like `useCompletion`):

    ```ts
    const { messages /* ... */ } = useChat({
      // ... other options
      experimental_throttleTimeMilliseconds: 50, // e.g., update UI at most every 50ms (20fps)
    });
    ```
  - How it helps: When tokens are arriving very quickly, `useChat` (via `processUIMessageStream` and its internal update logic) buffers these rapid updates. Instead of triggering a React re-render for every single token, it batches them and applies them to the `messages` state at most once per the specified interval (e.g., every 50 milliseconds). A generic sketch of this batching idea follows this list.
  - Benefit: This significantly reduces the number of re-renders for very fast streams, leading to smoother UI animations (like the text appearing), lower CPU usage on the client, and a more stable UI. It's a trade-off between a tiny bit of added latency for individual token display and overall UI responsiveness. You can tune this value based on your application's needs.
- Server-Side Back-pressure (Conceptual with `streamText`):
  - Core SDK functions like `streamText` (when interacting with compliant V2 model providers) are generally designed with back-pressure in mind. This is a fundamental concept in stream processing.
  - What it means: Ideally, the LLM provider (or the SDK's handling of its stream) only generates new tokens or data as fast as the consuming end (your server route, and ultimately the client connection) can accept them. If the network connection to the client is slow, or if the client is busy, back-pressure signals propagate back, causing the LLM to pause or slow down its generation.
  - Benefit: This helps prevent the server from generating a massive amount of data that just gets buffered and potentially discarded if the client disconnects. It saves resources and can reduce costs if your LLM provider charges per token generated (even if not delivered).
  - Contrast with `result.consumeStream()`: You might have seen the `result.consumeStream()` pattern on the server, often used with `onFinish` for persistence (especially in V4 docs for "Chatbot Message Persistence - Handling client disconnects"). Calling `consumeStream()` on the server explicitly removes back-pressure from the client connection for that server-side processing. The server will then try to read the entire LLM stream as fast as possible, regardless of whether the client is still connected or keeping up. This is a deliberate choice when you want to ensure the full AI response is generated and saved on the server even if the client disconnects mid-stream. It's a trade-off: you ensure full generation for persistence, but you lose the client-driven back-pressure for that specific consumption.
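As promised, here's the batching idea behind throttled UI updates in a generic, framework-free sketch (this is for intuition, not the SDK's internal implementation):

```ts
// Generic batching sketch: apply at most one state update per interval.
function createThrottledApplier<T>(apply: (latest: T) => void, intervalMs: number) {
  let pending: T | undefined;
  let timer: ReturnType<typeof setTimeout> | undefined;

  return (update: T) => {
    pending = update; // always keep only the latest snapshot
    if (timer !== undefined) return; // a flush is already scheduled
    timer = setTimeout(() => {
      timer = undefined;
      if (pending !== undefined) apply(pending);
      pending = undefined;
    }, intervalMs);
  };
}

// Usage: rapid per-token calls collapse into one render per 50ms.
declare function renderMessages(messages: unknown): void; // hypothetical state setter
const pushMessages = createThrottledApplier(renderMessages, 50);
```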
Understanding these performance aspects helps you build chat applications that are not only interactive but also efficient and robust under various network conditions and generation speeds.
Take-aways / Migration Checklist Bullets
- Use `experimental_throttleTimeMilliseconds` in `useChat` to batch UI updates from rapid streams and prevent jank.
- V2 `streamText` and compliant providers should inherently support back-pressure, preventing the LLM from overwhelming the client.
- Be aware that `result.consumeStream()` on the server bypasses client back-pressure to ensure full generation for persistence.
8. Summary & Checklist for Implementors
TL;DR: The v5 UI Message Streaming Protocol, identified by `x-vercel-ai-ui-message-stream: v1`, uses SSE to transport typed `UIMessageStreamPart`s, enabling robust streaming of rich, multi-part `UIMessage`s. Implementors need to ensure correct server-side emission and client-side rendering of these structured parts.
We've been on quite a journey through the internals of Vercel AI SDK v5's new UI Message Streaming Protocol! Let's wrap up with a quick summary and a checklist to keep in mind.
Recap of the v5 UI Message Stream:
- It's an SSE-based protocol specifically designed for real-time chat updates.
- Server responses are identified by the `x-vercel-ai-ui-message-stream: v1` HTTP header.
- The stream consists of a sequence of JSON objects, each being a typed `UIMessageStreamPart`.
- These parts are categorized into lifecycle events (`'start'`, `'finish'`, `'error'`) and content delivery events (`'text'`, `'reasoning'`, the `'tool-call'` family, `'file'`, `'source'`, `'metadata'`).
- The protocol enables the efficient and robust streaming of rich, multi-part `UIMessage` objects, which are then reconstructed on the client by utilities like `processUIMessageStream` (used by `useChat`).
This new protocol is a huge step up from V4's generic data stream, providing the structured foundation needed for the advanced, generative UI experiences that v5 is targeting.
Checklist for Server-Side Implementors:
- [ ] SSE Endpoint: Ensure your API endpoint that handles chat requests emits Server-Sent Events (SSE).
- [ ] Correct Headers: Your SSE response must include:
  - `Content-Type: text/event-stream; charset=utf-8`
  - `x-vercel-ai-ui-message-stream: v1`
- [ ] Stream Generation Method:
  - Recommended: If using a V2 core function like `streamText()`, use `result.toUIMessageStreamResponse()` to automatically generate the v5 UI Message Stream.
  - Manual: If you need custom stream generation, use `createUIMessageStream()` to get a `stream` and `writer`, then use `UIMessageStreamWriter` methods to write each `UIMessageStreamPart`.
- [ ] Complete Message Lifecycle: Ensure you stream all necessary `UIMessageStreamPart`s for each message:
  - Always start with a `'start'` part.
  - Stream all relevant content parts (e.g., `'text'`, the `'tool-call'` family, `'file'`, etc.).
  - Always end a successful message stream with a `'finish'` part (containing the `finishReason`).
  - If a stream-level error occurs, send an `'error'` part.
- [ ] Persistence: Implement persistence logic (e.g., saving `UIMessage` arrays to your database), typically in the `onFinish` callback of `toUIMessageStreamResponse()` or after manually constructing and closing your stream with `UIMessageStreamWriter`.
Checklist for Client-Side Implementors (using `useChat`):
- [ ] `streamProtocol` Configuration: Ensure `useChat` is configured with `streamProtocol: 'ui-message'`. This is the default in v5 Canary, so you might not need to set it explicitly, but be aware of it.
- [ ] Render `message.parts`: This is critical. Update your message rendering components to iterate over `message.parts` (an array on each `UIMessage` object) and render each part according to its `type` (e.g., `TextUIPart`, `ToolInvocationUIPart`, `FileUIPart`). Do not try to render a top-level `message.content` string, as it's no longer the primary content holder. (A minimal rendering sketch follows this list.)
- [ ] Handle Different Part Types: Your rendering logic should be able to handle all the `UIMessagePart` types your application expects to receive from the server.
- [ ] Typed Metadata: If you're using custom metadata with your messages, provide a `messageMetadataSchema` (e.g., a Zod schema) to `useChat` for validation and type safety.
- [ ] Error and Status Handling: Use the `error` object and `status` string returned by `useChat` to provide appropriate UI feedback to the user (e.g., display error messages, show loading indicators).
Tease Post 3: What's Next?
We've now seen how v5 structures messages (`UIMessage` and `UIMessagePart`) and how it streams them (`UIMessageStreamPart`s via the UI Message Streaming Protocol). But where does the data for these streams originate on the server? How does the SDK interact with different LLM providers to get the text, tool calls, and other rich data in the first place?
Next, we'll explore the V2 Model Interfaces in detail. We'll understand how AI SDK 5 standardizes interactions with diverse LLM providers (OpenAI, Anthropic, Google, etc.) and enables the rich multi-modal capabilities that ultimately flow through these structured streams we've just dissected. This is where the SDK's power to abstract away provider differences really shines!