Vercel AI SDK v5 Internals - Part 2 — Streaming the Richness: Inside the UI Message Protocol & UIMessageStreamParts
Yigit Konur


Publish Date: May 13

After our first look into the Vercel AI SDK v5 and its new UIMessage structure, it's time to pull back the curtain on how these rich, multi-part messages actually make their way from your server to the client. If you thought streaming was just about slinging text deltas, v5 is here to expand your horizons quite a bit.

This time, we're getting into the nitty-gritty of the v5 UI Message Streaming Protocol. This isn't just a minor update; it's a fundamental rewrite designed to handle the complexity of modern AI interactions – think tool calls, file attachments, structured metadata, and those crucial reasoning steps, all streaming in real-time.

As a quick heads-up, we're still sailing in v5 canary waters. This means APIs can shift, and things might look a tad different by the time it hits stable. But the core concepts we'll cover here are foundational to how v5 aims to revolutionize building conversational UIs.

🖖🏿 A Note on Process & Curation: While I didn't personally write every word, this piece is a product of my dedicated curation. I guided powerful AI tools (Gemini Pro 2.5 for synthesis, working from a git diff of main vs the canary v5 branch, backed by extensive research including OpenAI's Deep Research, and over 10M tokens spent) to explore and articulate these complex ideas. This method, combined with my own fact-checking and refinement, aims to deliver depth and accuracy efficiently. I encourage you to see it as a potent blend of human oversight and AI capability. I use the same tools for my own LLM chats on Thinkbuddy, and I'm polishing these write-ups and publishing them there too.

Let's dive in.

1. Prelude: Why Streaming Needed a Rewrite

TL;DR: V4's general data stream wasn't cut out for the rich, structured, multi-part UIMessages of v5, necessitating a new, purpose-built streaming protocol to handle the diverse content types efficiently and robustly.

Why this matters?

Streaming is the lifeblood of any good chat application. Nobody wants to stare at a loading spinner for ages while the AI crafts its magnum opus. We want to see those words appear token by token, giving that illusion of a live conversation. Vercel AI SDK v4 did a decent job with its "Data Stream" (you might remember seeing the X-Vercel-AI-Data-Stream: v1 header). It was good for, well, data – primarily text deltas and some JSON for annotations.

But as we discussed in Post 1, AI SDK v5 introduces the rich UIMessage structure. This isn't just a simple string; a single UIMessage can be composed of various UIMessageParts – TextUIPart, ToolInvocationUIPart, FileUIPart, SourceUIPart, ReasoningUIPart, and even StepStartUIPart markers. Trying to shoehorn this kind of structured, multi-faceted information into V4's generic data stream would have been a nightmare. Imagine trying to clearly delineate a text chunk, then a tool call initiation, then the arguments for that tool call, then maybe a file reference, all while ensuring type safety and efficient parsing on the client. You'd end up with complex custom JSON payloads within that generic stream, effectively reinventing a structured protocol on top of it. It would be cumbersome, error-prone, and just not scalable for the kind of generative UI experiences v5 is aiming for.

How it’s solved in v5? (The New Protocol)

To tackle this head-on, AI SDK v5 introduces a brand new, dedicated UI Message Streaming Protocol. This isn't just an iteration; it's a ground-up redesign specifically engineered to transport UIMessage updates and their constituent parts efficiently and robustly. Think of it as a specialized data conduit, built from the ground up to carry these structured messages. This protocol is the magic that allows your client to reconstruct those rich UIMessage objects, part by part, as the AI generates its response.

Take-aways / Migration Checklist Bullets

  • V4's "Data Stream" was too generic for v5's rich, multi-part UIMessages.
  • Streaming structured data like tool calls, files, and distinct reasoning steps reliably was a challenge with V4's stream.
  • AI SDK v5 introduces a new, dedicated UI Message Streaming Protocol to solve this.
  • This new protocol is designed to transport UIMessage updates and their parts efficiently.

2. Header Primer: x-vercel-ai-ui-message-stream: v1

TL;DR: Responses using the new v5 UI Message Streaming Protocol are identified by the x-vercel-ai-ui-message-stream: v1 HTTP header and are built on Server-Sent Events (SSE).

Why this matters?

When your client makes a request to your backend chat API, how does it know what kind of stream to expect? With different protocols potentially in play (especially during transitions or in complex systems), having a clear identifier is crucial for correct parsing and handling. Relying solely on Content-Type might not be enough to distinguish between different SSE-based protocols.

How it’s solved in v5? (The Identifying Header)

Server responses that use the new v5 UI Message Streaming Protocol will include a specific HTTP header:
x-vercel-ai-ui-message-stream: v1

This header serves a clear purpose:

  • Identification: It allows clients (like useChat in v5), and potentially any intermediate proxies or gateways, to unambiguously identify the format of the incoming stream. This ensures that the client uses the correct parsing logic (processUIMessageStream in this case) for the v5 protocol.
  • Versioning: The v1 suffix indicates the version of the UI Message Streaming Protocol. This is smart future-proofing, allowing for potential revisions to the protocol down the line (e.g., v2) without breaking existing implementations. Clients could then negotiate or adapt based on the version they understand.

This protocol is built on Server-Sent Events (SSE), which is a standard way for a server to push data to a client over a single HTTP connection. As such, you'll also typically see the standard SSE content type:
Content-Type: text/event-stream; charset=utf-8

So, when you're debugging network requests, look out for both of these headers. They are your signposts indicating a v5 UI Message Stream.

+-------------------------------------------------+
| Browser Network Tab - Response Headers          |
+-------------------------------------------------+
| ...                                             |
| Content-Type: text/event-stream; charset=utf-8  |
| x-vercel-ai-ui-message-stream: v1               |
| Cache-Control: no-cache                         |
| Connection: keep-alive                          |
| ...                                             |
+-------------------------------------------------+

[FIGURE 1: Screenshot of browser network tab showing response headers with 'x-vercel-ai-ui-message-stream: v1' and 'Content-Type: text/event-stream']
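If you're writing a custom client or proxy rather than relying on useChat, a quick defensive check of this header before picking a parser might look like the sketch below. This is a minimal illustration using plain fetch and Web APIs; the endpoint path and the messages variable are just examples from this post, not SDK requirements:

    // Hypothetical client-side check before choosing a stream parser.
    // '/api/v5/chat' is an example endpoint; `messages` is assumed to be your UIMessage[].
    const response = await fetch('/api/v5/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages }),
    });

    if (response.headers.get('x-vercel-ai-ui-message-stream') === 'v1') {
      // Safe to hand response.body to v5 UI Message Stream parsing logic.
    } else {
      // Fall back to other handling (plain JSON, a legacy data stream, etc.).
    }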

Take-aways / Migration Checklist Bullets

  • v5 UI Message Streams are identified by the x-vercel-ai-ui-message-stream: v1 HTTP header.
  • The protocol uses Server-Sent Events (SSE).
  • Expect Content-Type: text/event-stream; charset=utf-8.
  • This header is crucial for clients to correctly parse the stream.

3. Catalogue of UIMessageStreamPart Events

TL;DR: The v5 UI Message Stream consists of a sequence of JSON objects, each being a typed UIMessageStreamPart, categorized into lifecycle events (start, finish, error) and content delivery events (text, reasoning, the tool-call family, file, source, metadata), all designed to incrementally build or update UIMessage objects on the client.

Why this matters?

If we're moving beyond simple text streams, we need a well-defined vocabulary for the different kinds of information that can come down the pipe. How does the client know when a new message begins? How does it distinguish between a chunk of text, a tool call, or a piece of metadata? Without a clear structure for these events, the client-side parsing logic would be a chaotic mess.

How it’s solved in v5? (The Stream Part Catalogue)

The v5 UI Message Stream is essentially a sequence of JSON objects sent as individual SSE data events. Each of these JSON objects is a UIMessageStreamPart (a new v5 term, primarily defined in packages/ai/src/ui-message-stream/ui-message-stream-parts.ts). These parts are the atomic units of information in the v5 stream.

Each UIMessageStreamPart object has a type field, which acts as a discriminator, telling the client what kind of information the part contains and what its payload structure will be. Most content-related parts also include a messageId field. This messageId is crucial, as it allows the client-side processing logic (like processUIMessageStream) to associate the incoming part with the specific UIMessage object it's intended to update or contribute to. This is how the SDK can handle multiple messages being updated or streamed concurrently, though typically it's one assistant message at a time.
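Conceptually, you can picture the stream parts as a TypeScript discriminated union keyed on type. The sketch below is purely illustrative (the SDK's actual definitions in ui-message-stream-parts.ts cover more variants and richer payloads), but it shows how a consumer can narrow on the type field:

    // Illustrative sketch only; the SDK's real union has more variants and fields.
    type UIMessageStreamPart =
      | { type: 'start'; messageId: string; createdAt?: string }
      | { type: 'text'; messageId: string; value: string }
      // ...plus the 'tool-call' family, 'file', 'source', 'reasoning', 'metadata', etc.
      | { type: 'finish'; messageId: string; finishReason: string }
      | { type: 'error'; value: string };

    function describePart(part: UIMessageStreamPart): string {
      switch (part.type) {
        case 'start':
          return `new message ${part.messageId}`;
        case 'text':
          return `text delta for ${part.messageId}: "${part.value}"`;
        case 'finish':
          return `message ${part.messageId} finished (${part.finishReason})`;
        case 'error':
          return `stream error: ${part.value}`;
      }
    }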

Let's break down the different types of UIMessageStreamParts. They generally fall into two categories: lifecycle parts and content parts.

3.1 Lifecycle parts – start, finish, error

These parts manage the overall flow of a message stream and its lifecycle.

  • 'start':

    • TypeScript Interface (Conceptual):

      interface UIMessageStartStreamPart {
        type: 'start';
        messageId: string;
        createdAt?: string; // ISO 8601 date string
      }
      
    • Purpose: This part signals the beginning of a new UIMessage being streamed or indicates that updates are about to commence for an existing UIMessage (though typically, it's for a new assistant message).

    • messageId: string: This is the unique identifier for the UIMessage that will be constructed or updated by subsequent parts. The client uses this ID to map incoming content parts to the correct message object in its state. This ID should remain stable for the lifetime of that UI message.

    • createdAt?: string: An optional, server-authoritative timestamp (as an ISO 8601 string) for when the message effectively began. The client can use this to instantiate the UIMessage.createdAt field.

  • 'finish':

    • TypeScript Interface (Conceptual):

      interface UIMessageFinishStreamPart {
        type: 'finish';
        messageId: string;
        finishReason: LanguageModelV2FinishReason; // e.g., 'stop', 'length', 'tool-calls', 'content-filter', 'error'
        usage?: LanguageModelV2Usage; // { promptTokens: number; completionTokens: number; totalTokens?: number }
        providerMetadata?: Record<string, any>;
      }
      
    • Purpose: Indicates that all content parts for the UIMessage identified by messageId have been successfully streamed. This is the signal that the message is now complete from the server's perspective.

    • messageId: string: Identifies the UIMessage that has finished streaming.

    • finishReason: LanguageModelV2FinishReason: A string indicating why the generation concluded (e.g., 'stop' for natural completion, 'length' if max tokens hit, 'tool-calls' if the model stopped to invoke tools, 'content-filter' if generation was halted by a content filter, or 'error' if an unrecoverable model error occurred during generation for this message). This is a crucial piece of metadata.

    • usage?: LanguageModelV2Usage: Optional. Provides token usage information (prompt tokens, completion tokens) for the generation of this message.

    • providerMetadata?: Record<string, any>: Optional. Any final, provider-specific metadata associated with the completion of this message.

  • 'error':

    • TypeScript Interface (Conceptual):

      interface UIMessageErrorStreamPart {
        type: 'error';
        value: string; // The error message
      }
      
    • Purpose: Signals a stream-level or general processing error on the server that isn't specific to a single message's finishReason (though a model error could also manifest as a finishReason: 'error' in a 'finish' part). This is for errors that might affect the entire stream or the server's ability to continue.

    • value: string: Contains the error message string.

    • Client Handling: When useChat (via processUIMessageStream) receives this, it typically sets its error state to a new Error(value), making the error accessible to your UI for display.

3.2 Content parts – text, reasoning, tool-call family, file, source, metadata

These parts deliver the actual content that will populate the UIMessage.parts array or UIMessage.metadata on the client. Each of these will include a messageId to link it to the correct UIMessage.

  • 'text':

    • TypeScript Interface (Conceptual):

      interface UIMessageTextStreamPart {
        type: 'text';
        messageId: string;
        value: string; // A delta of text
      }
      
    • Purpose: Carries a delta (chunk) of text content.

    • Client Handling: The client appends this value to the text property of a TextUIPart within the UIMessage.parts array for the given messageId. If no TextUIPart exists yet for this stream segment, one is created. This is how text appears to stream token by token.

  • 'reasoning':

    • TypeScript Interface (Conceptual):

      interface UIMessageReasoningStreamPart {
        type: 'reasoning';
        messageId: string;
        value: string; // Text for the reasoning step
        providerMetadata?: Record<string, any>;
      }
      
    • Purpose: Streams text content intended for a ReasoningUIPart in the UIMessage.

    • Client Handling: Similar to 'text', this value populates the text field of a ReasoningUIPart, with providerMetadata also being set if provided.

  • 'tool-call' family (related parts for ToolInvocationUIPart):
    Managing tool calls is complex, so there are several stream parts dedicated to updating the state of a ToolInvocationUIPart on the client:

    • 'tool-call-delta':

      • TypeScript Interface (Conceptual):

        interface UIMessageToolCallDeltaStreamPart {
          type: 'tool-call-delta';
          messageId: string;
          toolCallId: string; // ID for this specific tool call
          toolName: string;   // Name of the tool being called
          argsTextDelta: string; // A delta of the stringified JSON arguments
        }
        
      • Purpose: Streams the arguments for a tool call progressively. This is useful if the arguments are long or generated token by token by the model.

      • Client Handling: This part typically creates or updates a ToolInvocationUIPart (identified by toolCallId within the UIMessage for messageId). The toolInvocation.state is set to 'partial-call', and argsTextDelta is appended to accumulate the full argument string.

    • 'tool-call':

      • TypeScript Interface (Conceptual):

        interface UIMessageToolCallStreamPart {
          type: 'tool-call';
          messageId: string;
          toolCallId: string;
          toolName: string;
          args: string; // Complete stringified JSON arguments
        }
        
      • Purpose: Signals a complete tool call specification from the model. The args field contains the complete stringified JSON arguments for the tool.

      • Client Handling: This finalizes the argument accumulation for the ToolInvocationUIPart. The toolInvocation.state transitions to 'call', and the args string is parsed into a JSONValue and stored.

    • 'tool-result':

      • TypeScript Interface (Conceptual):

        interface UIMessageToolResultStreamPart {
          type: 'tool-result';
          messageId: string;
          toolCallId: string;
          toolName: string; // Often included for context, though toolCallId is primary
          result: string;   // Stringified JSON result from the tool execution
        }
        
      • Purpose: Provides the result of a tool execution (after the application has run the tool).

      • Client Handling: Updates the ToolInvocationUIPart (identified by toolCallId). The toolInvocation.state transitions to 'result', and the result string is parsed into a JSONValue and stored.

    • 'tool-error':

      • TypeScript Interface (Conceptual):

        interface UIMessageToolErrorStreamPart {
          type: 'tool-error';
          messageId: string;
          toolCallId: string;
          toolName: string; // Context
          errorMessage: string; // The error message from tool execution
        }
        
      • Purpose: Communicates an error that occurred during a tool's invocation or execution.

      • Client Handling: Updates the ToolInvocationUIPart. The toolInvocation.state transitions to 'error', and errorMessage is stored.

  • 'file':

    • TypeScript Interface (Conceptual):

      interface UIMessageFileStreamPart {
        type: 'file';
        messageId: string;
        file: {
          mediaType: string; // IANA media type
          filename?: string;
          url: string;       // Remote URL or Data URL
        };
      }
      
    • Purpose: Provides the necessary information to construct or update a FileUIPart within a UIMessage.

    • Client Handling: A FileUIPart is added to UIMessage.parts with the provided file details.

  • 'source':

    • TypeScript Interface (Conceptual):

      interface UIMessageSourceStreamPart {
        type: 'source';
        messageId: string;
        source: LanguageModelV2Source; // e.g., { sourceType: 'url', id: string, url: string, title?: string, ... }
      }
      
    • Purpose: Delivers data for a SourceUIPart, typically used in RAG systems for citations.

    • Client Handling: A SourceUIPart is added to UIMessage.parts with the provided source data.

  • 'metadata':

    • TypeScript Interface (Conceptual):

      interface UIMessageMetadataStreamPart {
        type: 'metadata';
        messageId: string;
        metadata: unknown; // The custom metadata object or value
      }
      
    • Purpose: Allows for streaming updates or additions to a UIMessage's custom metadata field.

    • Client Handling: The client should take this metadata payload (which is unknown from the stream's perspective) and validate it against its application-defined messageMetadataSchema (if provided to useChat). If valid, it's merged into the UIMessage.metadata object for the given messageId.

This catalogue of UIMessageStreamPart events forms the backbone of real-time, structured communication in v5, enabling the client to dynamically build those rich UIMessage objects we explored in Post 1.
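To make the accumulation concrete, here is a deliberately simplified sketch of how incoming parts could be folded into a message's parts array. This is not the SDK's implementation (processUIMessageStream, covered in section 5, handles many more cases); it only illustrates the text-delta appending and the 'partial-call' to 'call' transition described above, using simplified part shapes:

    // Simplified illustration only; the real accumulation lives in processUIMessageStream.
    type SimpleToolPart = { type: 'tool-invocation'; toolCallId: string; state: string; argsText: string };
    type SimplePart = { type: 'text'; text: string } | SimpleToolPart;

    function applyStreamPart(parts: SimplePart[], part: any): void {
      if (part.type === 'text') {
        const last = parts[parts.length - 1];
        if (last?.type === 'text') last.text += part.value;          // append delta to current text part
        else parts.push({ type: 'text', text: part.value });          // or start a new text part
      } else if (part.type === 'tool-call-delta' || part.type === 'tool-call') {
        let inv = parts.find(
          (p): p is SimpleToolPart => p.type === 'tool-invocation' && p.toolCallId === part.toolCallId,
        );
        if (!inv) {
          inv = { type: 'tool-invocation', toolCallId: part.toolCallId, state: 'partial-call', argsText: '' };
          parts.push(inv);
        }
        if (part.type === 'tool-call-delta') {
          inv.argsText += part.argsTextDelta;                          // accumulate stringified JSON args
        } else {
          inv.state = 'call';
          inv.argsText = part.args;                                    // complete args from the server
        }
      }
      // 'tool-result', 'tool-error', 'file', 'source', 'metadata', etc. would be handled analogously.
    }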

Server -> Client Stream (for a single message M1 with tool call T1):
----------------------------------------------------------------------------------------------
SSE Event: data: {"type":"start", "messageId":"M1", "createdAt":"..."}
   |
   v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"Thinking... "}
   |
   v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"about "}
   |
   v
SSE Event: data: {"type":"tool-call-delta", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "argsTextDelta":"{\"q\":\"AI"}
   |
   v
SSE Event: data: {"type":"tool-call-delta", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "argsTextDelta":" news\"}"}
   |
   v
SSE Event: data: {"type":"tool-call", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "args":"{\"q\":\"AI news\"}"}
   |
   v
(Client/App executes tool T1 based on the 'tool-call' event above)
   |
   v (Server is informed of tool T1 result by app, then streams it back)
SSE Event: data: {"type":"tool-result", "messageId":"M1", "toolCallId":"T1", "toolName":"search", "result":"[{\"title\":\"...\"}]"}
   |
   v
SSE Event: data: {"type":"text", "messageId":"M1", "value":"\nHere is what I found."}
   |
   v
SSE Event: data: {"type":"finish", "messageId":"M1", "finishReason":"stop", "usage":{...}}
----------------------------------------------------------------------------------------------

[FIGURE 2: Sequence diagram illustrating a typical flow of UIMessageStreamParts for a message with text and a tool call: 'start' -> 'text' (delta) -> 'text' (delta) -> 'tool-call-delta' -> 'tool-call' -> (app executes tool) -> server sends 'tool-result' -> 'text' (delta) -> 'finish']

Take-aways / Migration Checklist Bullets

  • The v5 UI Message Stream is a sequence of JSON objects, each a typed UIMessageStreamPart.
  • messageId links content parts to the correct UIMessage.
  • Lifecycle parts ('start', 'finish', 'error') manage the stream.
  • Content parts ('text', 'reasoning', the 'tool-call' family, 'file', 'source', 'metadata') deliver data for UIMessage.parts or UIMessage.metadata.
  • Tool interactions have a family of stream parts ('tool-call-delta', 'tool-call', 'tool-result', 'tool-error') to manage their lifecycle.

4. Server Emission Patterns

TL;DR: Servers can emit the v5 UI Message Stream either automatically using result.toUIMessageStreamResponse() after a V2 streamText call (recommended), or manually by constructing the stream with UIMessageStreamWriter for more fine-grained control or integration with custom AI logic.

Why this matters?

Knowing what the stream parts are is one thing; knowing how to generate them from your server is another. Developers need clear patterns for producing this v5 UI Message Stream, whether they're using standard SDK functions or integrating with more complex, custom AI pipelines.

How it’s solved in v5? (Emission Helpers)

AI SDK v5 provides two main ways for your server-side code (e.g., in a Next.js API route) to produce this new stream.

4.1 Auto-generation via streamText().toUIMessageStreamResponse()

This is the primary and recommended method for most chat scenarios, especially when you're directly using a V2 model provider with streamText().

  • How it works:

    1. You make a call to a V2 core function like streamText() using a V2 model instance (e.g., openai('gpt-4o-mini')).

      // Simplified server route
      import { openai } from '@ai-sdk/openai'; // V2 provider
      import { streamText, UIMessage, convertToModelMessages } from 'ai'; // v5 core function + utilities
      
      export async function POST(req: Request) {
        const { messages: uiMessages }: { messages: UIMessage[] } = await req.json();
      
        // Convert incoming UIMessages to ModelMessages for the LLM
        const modelMessages = convertToModelMessages(uiMessages);
      
        const result = await streamText({
          model: openai('gpt-4o-mini'), // Using a V2 model instance
          messages: modelMessages,
          // ... other options like tools, system prompt, etc.
        });
      
        // Now, the magic step:
        return result.toUIMessageStreamResponse();
      }
      
    2. The result object returned by streamText() (a StreamTextResult or similar V2 result type) has a handy method: toUIMessageStreamResponse().

    3. Calling result.toUIMessageStreamResponse() does the heavy lifting:

      • It takes the underlying stream of V2 core parts (raw text deltas, tool call information, file data, etc.) coming from the LLM provider.
      • It transforms these V2 core parts into the corresponding v5 UIMessageStreamParts (like 'text', 'tool-call', 'file', etc.).
      • It wraps this transformed stream in a standard Response object, correctly setting the SSE headers:
        • Content-Type: text/event-stream; charset=utf-8
        • x-vercel-ai-ui-message-stream: v1
  • Benefits: This is incredibly convenient. You don't have to manually construct SSE events or worry about the intricacies of mapping different V2 core parts to v5 stream parts. The SDK handles it for you.
  • onFinish for Persistence: As mentioned before, toUIMessageStreamResponse() can also take an onFinish callback in its options. This callback is invoked on the server after all UIMessageStreamParts for the current AI turn have been generated and queued to be sent to the client. This is the ideal place to persist the final, fully formed UIMessage(s) from the assistant's turn.
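In practice, a persistence hook might look roughly like the sketch below. Treat it as a hedged example: the exact shape of the onFinish payload may differ in the canary you're on, and saveChatToDb / chatId are placeholders for your own storage layer:

    // Hedged sketch: persisting the assistant's messages once the stream completes.
    return result.toUIMessageStreamResponse({
      onFinish: async ({ messages }) => {
        // `messages` is assumed here to be the final UIMessage[] for this turn.
        await saveChatToDb(chatId, messages); // placeholder persistence call
      },
    });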

4.2 Manual generation via UIMessageStreamWriter

There are scenarios where streamText().toUIMessageStreamResponse() might not be a direct fit:

  • Integrating with AI logic that doesn't come from a standard V2 streamText call (e.g., some LangChain setups before robust @ai-sdk/langchain v5 adapters mature, custom rule-based systems, or streaming data from a non-LLM source that you want to present in the chat UI).
  • Needing extremely fine-grained control over exactly which UIMessageStreamParts are streamed and when they are sent, perhaps for complex multi-stage agentic behaviors.

For these cases, AI SDK v5 provides a lower-level utility: createUIMessageStream (found in packages/ai/src/ui-message-stream/create-ui-message-stream.ts).

  • How it works:

    1. Call createUIMessageStream():

      import { createUIMessageStream, UIMessageStreamWriter } from 'ai'; // v5 utility
      // In your API route:
      // const { stream, writer } = createUIMessageStream();
      
    2. This function returns an object containing:

      • stream: ReadableStream: This is the ReadableStream that will eventually carry your SSE data. You'll return this in your Response object.
      • writer: UIMessageStreamWriter: This is an object with methods that allow you to manually write different UIMessageStreamParts into the stream.
  • UIMessageStreamWriter Methods (Conceptual): The writer object exposes methods along these lines; treat the exact names and signatures below as illustrative of the canary API:

    • writer.writeStart({ messageId: string; createdAt?: string; })
    • writer.writeTextDelta(messageId: string, value: string)
    • writer.writeReasoning(messageId: string, value: string, providerMetadata?: Record<string, any>)
    • writer.writeToolCallDelta(messageId: string, { toolCallId, toolName, argsTextDelta })
    • writer.writeToolCall(messageId: string, { toolCallId, toolName, args })
    • writer.writeToolResult(messageId: string, { toolCallId, toolName, result })
    • writer.writeToolError(messageId: string, { toolCallId, toolName, errorMessage })
    • writer.writeFile(messageId: string, fileData: { mediaType, filename?, url })
    • writer.writeSource(messageId: string, sourceData: LanguageModelV2Source)
    • writer.writeMetadata(messageId: string, metadata: unknown)
    • writer.writeError(errorMessage: string) (for stream-level errors)
    • writer.writeFinish({ messageId: string; finishReason: LanguageModelV2FinishReason; ... })
    • writer.close(): Crucially, you must call writer.close() when you are done writing all parts. This signals the end of the stream.
    • writer.abort(error?: Error): To terminate the stream due to an error.
  • Conceptual Code Snippet (Manual Emission):

    import { createUIMessageStream, UIMessageStreamWriter, LanguageModelV2FinishReason } from 'ai';
    import { NextRequest, NextResponse } from 'next/server';
    
    export const runtime = 'edge';
    
    // Example function that simulates a custom AI process and writes to the stream
    async function handleCustomAIProcess(writer: UIMessageStreamWriter, userInput: string) {
      const assistantMessageId = "assistant-" + Date.now();
    
      // Always start a message
      writer.writeStart({ messageId: assistantMessageId, createdAt: new Date().toISOString() });
    
      writer.writeTextDelta(assistantMessageId, "Okay, I will process your input: '");
      await new Promise(r => setTimeout(r, 200)); // Simulate work
      writer.writeTextDelta(assistantMessageId, userInput);
      writer.writeTextDelta(assistantMessageId, "'. ");
    
      // Simulate some reasoning
      await new Promise(r => setTimeout(r, 300));
      writer.writeReasoning(assistantMessageId, "First, I need to analyze the sentiment.");
      writer.writeTextDelta(assistantMessageId, "\nSentiment analysis complete. ");
    
      // Simulate a tool call
      await new Promise(r => setTimeout(r, 500));
      const toolCallId = "tool-" + Date.now();
      writer.writeToolCallDelta(assistantMessageId, { toolCallId, toolName: 'myCustomTool', argsTextDelta: '{"param":"val' });
      await new Promise(r => setTimeout(r, 100));
      writer.writeToolCallDelta(assistantMessageId, { toolCallId, toolName: 'myCustomTool', argsTextDelta: 'ue"}' });
      writer.writeToolCall(assistantMessageId, { toolCallId, toolName: 'myCustomTool', args: JSON.stringify({param: "value"}) });
    
      // Simulate tool execution and result
      await new Promise(r => setTimeout(r, 700)); // Simulate tool work
      const toolResult = { status: "success", data: "Tool executed successfully." };
      writer.writeToolResult(assistantMessageId, { toolCallId, toolName: 'myCustomTool', result: JSON.stringify(toolResult) });
    
      writer.writeTextDelta(assistantMessageId, "\nAll processing is now complete.");
    
      // Always finish the message
      writer.writeFinish({
        messageId: assistantMessageId,
        finishReason: 'stop' as LanguageModelV2FinishReason, // 'stop' indicates natural completion
        usage: { promptTokens: 10, completionTokens: 50 } // Example usage
      });
    
      // CRITICAL: Close the writer to end the stream properly
      writer.close();
    }
    
    export async function POST(req: NextRequest) {
      try {
        const { input } = await req.json(); // Assuming client sends { input: "user's text" }
        const { stream, writer } = createUIMessageStream();
    
        // Call your custom logic. IMPORTANT: Do NOT await handleCustomAIProcess if you want to stream.
        // Let it run in the background and write to the stream.
        handleCustomAIProcess(writer, input as string).catch(err => {
          console.error("Error in custom AI process:", err);
          // If an error occurs in the async process, try to write a stream error part
          // This requires careful error handling and writer management.
          try {
            if (!writer.closed) { // Check if writer is still open
              writer.writeError(err instanceof Error ? err.message : "Unknown error in background process");
              writer.close();
            }
          } catch (writeErrorErr) {
            console.error("Error writing stream error:", writeErrorErr);
          }
        });
    
        // Return the stream in a Response object with correct headers
        return new NextResponse(stream, {
          headers: {
            'Content-Type': 'text/event-stream; charset=utf-8',
            'x-vercel-ai-ui-message-stream': 'v1',
            'Cache-Control': 'no-cache', // Ensure no caching for SSE
            'Connection': 'keep-alive',  // Keep connection open for SSE
          },
        });
    
      } catch (error) {
        // Handle errors in setting up the stream itself
        console.error('[Manual Stream API Error]', error);
        return NextResponse.json({ error: (error as Error).message }, { status: 500 });
      }
    }
    

    Remember to handle errors robustly when manually managing streams, especially in the async processing function.

Choosing between these two patterns depends on your needs. For standard LLM interactions, result.toUIMessageStreamResponse() is simpler and safer. For more complex or custom scenarios, UIMessageStreamWriter gives you the power and control.

Take-aways / Migration Checklist Bullets

  • Use result.toUIMessageStreamResponse() for the easiest way to generate v5 UI Message Streams from V2 streamText calls.
  • This method handles transformation and SSE header setting automatically.
  • For manual stream generation, use createUIMessageStream() to get a stream and writer.
  • Use UIMessageStreamWriter methods to write individual UIMessageStreamParts.
  • Always call writer.close() when done with manual streaming to properly terminate the SSE stream.

5. Client Consumption

TL;DR: On the client, useChat internally uses processUIMessageStream to consume the v5 UI Message Stream, parse UIMessageStreamParts, and intelligently reconstruct/update UIMessage objects in its state, triggering reactive UI updates.

Why this matters?

Okay, so the server is dutifully sending this stream of finely crafted UIMessageStreamParts. How does the client make sense of it all? If you were to manually parse an SSE stream, manage message states, accumulate text deltas, handle tool lifecycle updates, and validate metadata, your client-side code would become incredibly complex and error-prone. We need a robust client-side utility to handle this consumption.

How it’s solved in v5? (The Client-Side Processor)

The Vercel AI SDK provides a core client-side utility for this: processUIMessageStream (located in packages/ai/src/ui/process-ui-message-stream.ts). This function is the workhorse that useChat (and potentially other v5 UI hooks) uses under the hood to consume and interpret the v5 UI Message Stream.

5.1 processUIMessageStream() algorithm

You generally won't call processUIMessageStream directly if you're using useChat, but understanding what it does is key to understanding how useChat works its magic.

  • Role and Inputs:
    • It takes a ReadableStream (typically obtained from the body of a fetch response) as its primary input. This stream is expected to be a v5 UI Message Stream (SSE of UIMessageStreamPart JSON objects).
    • It also takes an options object, which includes crucial callbacks.
  • Core Logic:
    1. SSE Parsing: It reads the ReadableStream, decodes the SSE events, and parses the data: field of each event as a JSON UIMessageStreamPart.
    2. UIMessage Reconstruction/Update: This is where the intelligence lies.
      • When a 'start' part arrives, processUIMessageStream uses its messageId to either identify an existing UIMessage to update or, more commonly, to create a new UIMessage instance in memory (e.g., for an assistant's reply). The id of this UIMessage is set from the messageId in the 'start' part, ensuring stability.
      • For content parts like 'text', 'reasoning', 'file', 'source', it uses the messageId to find the corresponding UIMessage object.
      • It then appends text deltas from 'text' parts to the appropriate TextUIPart (creating one if necessary).
      • It assembles ToolInvocationUIParts through their various states using 'tool-call-delta', 'tool-call', 'tool-result', and 'tool-error' stream parts, carefully managing the toolCallId and the toolInvocation.state.
      • It adds new FileUIParts, SourceUIParts, or ReasoningUIParts as they arrive.
      • For 'metadata' parts, it takes the metadata payload, validates it against the messageMetadataSchema (if provided in options), and then merges it into the UIMessage.metadata field.
    3. Callback Invocation: This is how processUIMessageStream communicates changes back to its consumer (e.g., useChat).
  • Key Callbacks for processUIMessageStream (Options):
    • onUpdate(message: UIMessage<METADATA>, updateReason: 'initial' | 'update' | 'finish'): This is arguably the most important callback. It's called whenever a UIMessage object is significantly constructed or updated by incoming stream parts.
      • The message argument is the current state of the UIMessage object being processed.
      • The updateReason (or similar parameter) might indicate why onUpdate was called (e.g., a new message started, an existing part was updated, or the message just finished).
      • This is the hook that useChat uses to reactively update its internal messages state array, which in turn triggers UI re-renders in your React components.
    • onDone(): Called when the entire stream has been successfully processed and closed (typically after the last 'finish' part for the last message in the stream, or if the stream ends for other reasons without an error).
    • onClientError(error: Error): Called if processUIMessageStream itself encounters an error during its processing (e.g., an error parsing an SSE event, a network error during consumption after the initial fetch, or if a required callback throws). This is distinct from server-sent 'error' stream parts, which are handled as data.
    • messageMetadataSchema: As mentioned, an optional schema (e.g., a Zod schema) for validating custom metadata payloads received via 'metadata' stream parts.
+-----------------+     +--------------------------+     +----------------------+     +------------------------+
| ReadableStream  | --> | processUIMessageStream() | --> | onUpdate(message,    | --> | Update UIMessage[]     |
| (SSE from API)  |     | (Parses StreamParts,     |     |   updateReason)      |     | in useChat's state     |
+-----------------+     |  Builds UIMessages)      |     |   callback           |     | (triggers UI rerender) |
                        +--------------------------+     +----------------------+     +------------------------+
                                   |  ^
                                   |  | (metadata validation)
                                   v  |
                        +--------------------------+
                        | messageMetadataSchema    |
                        | (Optional Zod schema)    |
                        +--------------------------+

[FIGURE 3: Diagram showing ReadableStream -> processUIMessageStream -> onUpdate callback -> updates UIMessage array]
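If you're curious what this looks like mechanically, the loop below is a stripped-down illustration of the parsing half of the job: reading SSE lines and JSON-parsing each data: payload. It is not the SDK's code and it skips the message-reconstruction logic entirely; it exists only to demystify the wire format:

    // Bare-bones SSE reader for illustration; the SDK's processUIMessageStream does much more.
    async function readUIMessageStream(
      body: ReadableStream<Uint8Array>,
      onPart: (part: unknown) => void,
    ) {
      const reader = body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });

        // SSE events are separated by a blank line; each data: line carries one stream part.
        const events = buffer.split('\n\n');
        buffer = events.pop() ?? ''; // keep any incomplete trailing event in the buffer
        for (const event of events) {
          for (const line of event.split('\n')) {
            if (line.startsWith('data: ')) {
              onPart(JSON.parse(line.slice('data: '.length))); // one UIMessageStreamPart
            }
          }
        }
      }
    }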

5.2 React integration inside useChat

So, how does useChat tie into all this?

  1. When you call an action that triggers an API call (e.g., handleSubmit, reload, experimental_resume), useChat internally uses a helper function (like callChatApi) to make the actual fetch request to your backend API endpoint.
  2. callChatApi gets the Response object from fetch.
  3. If the response is successful and has the x-vercel-ai-ui-message-stream: v1 header, callChatApi takes the response.body (which is a ReadableStream) and pipes it to processUIMessageStream.
  4. The onUpdate callback that useChat (via callChatApi) provides to processUIMessageStream is a function that knows how to update useChat's internal messages state. When onUpdate is called by processUIMessageStream with an updated UIMessage, useChat updates its state, which causes your React component to re-render with the new message data.
  5. Callbacks like onDone and onClientError from processUIMessageStream are also wired up to update useChat's status and error states.

This layered approach means that as a useChat user, you're abstracted away from the raw stream processing. You simply see the messages array update magically as data streams in, but now you know processUIMessageStream is the engine making it happen.
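On the rendering side, the practical upshot is that your components iterate over message.parts rather than a single content string. A minimal React sketch follows; the part type names and fields mirror what this post (and Post 1) describe, so adapt them to the exact part set your app uses:

    // Minimal rendering sketch; UIMessage comes from 'ai', part shapes follow this post.
    import type { UIMessage } from 'ai';

    function ChatMessage({ message }: { message: UIMessage }) {
      return (
        <div>
          {message.parts.map((part, i) => {
            switch (part.type) {
              case 'text':
                return <p key={i}>{part.text}</p>;
              case 'reasoning':
                return <em key={i}>{part.text}</em>;
              case 'tool-invocation':
                return <pre key={i}>{JSON.stringify(part.toolInvocation, null, 2)}</pre>;
              default:
                return null; // handle 'file', 'source', etc. as your app requires
            }
          })}
        </div>
      );
    }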

Take-aways / Migration Checklist Bullets

  • processUIMessageStream is the core client-side utility for consuming v5 UI Message Streams.
  • It parses SSE events containing UIMessageStreamParts.
  • It intelligently reconstructs/updates UIMessage objects using messageId.
  • Its onUpdate callback is key for reactive UI updates in hooks like useChat.
  • useChat uses processUIMessageStream internally, abstracting the raw processing from the developer.

6. Debugging Streams (curl, Browser Dev Tools)

TL;DR: Inspect v5 UI Message Streams using curl to see raw SSE events, or use your browser's Developer Tools Network tab to examine the EventStream response and individual messages when useChat makes requests.

Why this matters?

When things go wrong with streaming, or when you're just trying to understand what data your server is actually sending, being able to inspect the raw stream is invaluable. Blindly trusting that the stream is correct can lead to a lot of frustration.

How it’s solved in v5? (Inspection Techniques)

Here are a few ways to peek under the hood of your v5 UI Message Streams:

  • Using curl (Command Line):
    This is great for hitting your backend API endpoint directly from your terminal and seeing the raw SSE stream without any client-side processing.

    # Example: POSTing to a local Next.js API route
    # Make sure your server is running (e.g., npm run dev)
    curl -N -X POST -H "Content-Type: application/json" \
         -d '{"messages": [{"id":"user-1","role":"user","parts":[{"type":"text","text":"Hello, AI!"}]}]}' \
         http://localhost:3000/api/v5/chat
         # Replace /api/v5/chat with your actual endpoint
    
    • -N: Tells curl not to buffer the output, so you see events as they arrive.
    • -X POST -H "Content-Type: application/json" -d '{...}': Standard POST request with a JSON body. Adjust the body to match what your useChat client would send (an array of UIMessages).
    • What to look for: You'll see a sequence of lines, each starting with data:, followed by a JSON object. Each of these JSON objects is one of your UIMessageStreamParts.

      data: {"type":"start","messageId":"ai-123","createdAt":"2023-10-27T10:00:00Z"}
      
      data: {"type":"text","messageId":"ai-123","value":"Hello"}
      
      data: {"type":"text","messageId":"ai-123","value":" there"}
      
      data: {"type":"text","messageId":"ai-123","value":"!"}
      
      data: {"type":"finish","messageId":"ai-123","finishReason":"stop"}
      

    This is super helpful for verifying that your server is sending the parts you expect, in the correct order, and with the correct messageId.

  • Using npx sse-cat (More User-Friendly CLI for SSE):
    While curl is powerful, sse-cat is a small utility specifically for inspecting SSE streams and can sometimes format the output more nicely if the JSON is complex.

    # sse-cat is primarily for GET requests. For POST, it's trickier.
    # If your endpoint supports GET for streaming (e.g., for a fixed demo response):
    # npx sse-cat http://localhost:3000/api/v5/chat-demo-stream
    
    # For POST requests with SSE, tools like Postman or Insomnia are often better
    # than trying to wrangle sse-cat with POST bodies.
    

    For most v5 chat endpoints that expect POST, curl or a GUI API client is usually more practical (the small Node script after this list is another POST-friendly option). If you have a simple GET endpoint that streams SSE for testing, sse-cat can be handy.

  • Browser Developer Tools (Your Best Friend on the Client):
    When your application is running in the browser and useChat is making requests:

    1. Open your browser's Developer Tools (usually F12 or Right-click -> Inspect).
    2. Go to the Network tab.
    3. Trigger a chat message submission in your UI.
    4. You'll see a fetch (or XHR) request made to your API endpoint (e.g., /api/v5/chat). Click on it.
    5. Look at the Headers tab to confirm the request/response headers (e.g., x-vercel-ai-ui-message-stream: v1).
    6. The most useful tab here is often EventStream (Chrome), Response (Firefox, sometimes shows raw stream), or a similar tab that specifically decodes SSE.

      • In Chrome's EventStream tab, you'll see each individual SSE message received from the server, neatly displayed with its id, event (usually message), and data (your UIMessageStreamPart JSON).
      +--------------------------------------------------------------------+
      | Browser DevTools - Network Tab - EventStream View                  |
      +--------------------------------------------------------------------+
      | Request: /api/v5/chat                                              |
      +--------------------------------------------------------------------+
      | Events:                                                            |
      |                                                                    |
      | > Time: 10:00:01.100 | Event: message                              |
      |   data: {"type":"start","messageId":"ai-xyz", ...}                 |
      |                                                                    |
      | > Time: 10:00:01.200 | Event: message                              |
      |   data: {"type":"text","messageId":"ai-xyz","value":"Hello"}       |
      |                                                                    |
      | > Time: 10:00:01.350 | Event: message                              |
      |   data: {"type":"tool-call-delta","messageId":"ai-xyz", ...}       |
      |                                                                    |
      | ... (more stream parts as they arrive) ...                         |
      |                                                                    |
      | > Time: 10:00:02.500 | Event: message                              |
      |   data: {"type":"finish","messageId":"ai-xyz", ...}                |
      +--------------------------------------------------------------------+
      

      [FIGURE 4: Screenshot of Chrome DevTools Network tab, EventStream view, showing individual UIMessageStreamPart JSON objects being received]
      This allows you to see exactly what processUIMessageStream is receiving from the server, which is invaluable for debugging client-side processing issues or discrepancies between what the server thinks it's sending and what the client actually gets.
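If you want a scriptable middle ground between curl and the browser (for example, to POST a body and pretty-print each part), a few lines of Node 18+ work too. This is just a debugging convenience, not an SDK utility; the endpoint and request body are examples:

    // debug-stream.mjs — run with: node debug-stream.mjs
    const res = await fetch('http://localhost:3000/api/v5/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ id: 'user-1', role: 'user', parts: [{ type: 'text', text: 'Hello, AI!' }] }],
      }),
    });

    console.log('protocol header:', res.headers.get('x-vercel-ai-ui-message-stream'));

    const decoder = new TextDecoder();
    let buffer = '';
    for await (const chunk of res.body) {
      buffer += decoder.decode(chunk, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? ''; // keep a possibly incomplete last line
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          console.dir(JSON.parse(line.slice(6)), { depth: null }); // one UIMessageStreamPart
        }
      }
    }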

These debugging techniques are essential for building and troubleshooting robust streaming applications with v5.

Take-aways / Migration Checklist Bullets

  • Use curl -N to inspect raw v5 UI Message Stream events from your server endpoint.
  • Leverage browser Developer Tools (Network tab -> EventStream/Response) to see what useChat is receiving.
  • Look for data: {...} lines containing your UIMessageStreamPart JSON objects.
  • Verify headers like x-vercel-ai-ui-message-stream: v1 and Content-Type: text/event-stream.

7. Performance: Throttling & Back-pressure

TL;DR: AI SDK v5 offers client-side UI update throttling via experimental_throttleTimeMilliseconds in useChat to prevent jank from rapid token arrival, while server-side core functions like streamText are designed with back-pressure in mind to avoid overwhelming the client.

Why this matters?

Streaming is great for perceived performance, but if not handled carefully, it can also cause performance issues. Ultra-rapid token arrival from a fast LLM could lead to too many state updates and re-renders on the client, causing UI jank or even React errors like "maximum update depth exceeded." Conversely, if the server generates data much faster than the client or network can handle, resources can be wasted.

How it’s solved in v5? (Performance Considerations)

  • Client-Side Throttling (experimental_throttleTimeMilliseconds):

    • This is an option you can pass to useChat (and other UI hooks like useCompletion).

      const { messages, ... } = useChat({
        // ... other options
        experimental_throttleTimeMilliseconds: 50, // e.g., update UI at most every 50ms (20fps)
      });
      
    • How it helps: When tokens are arriving very quickly, useChat (via processUIMessageStream and its internal update logic) will buffer these rapid updates. Instead of triggering a React re-render for every single token, it will batch these updates and apply them to the messages state at most once per the specified interval (e.g., every 50 milliseconds).

    • Benefit: This significantly reduces the number of re-renders for very fast streams, leading to smoother UI animations (like the text appearing), lower CPU usage on the client, and a more stable UI. It's a trade-off between a tiny bit of added latency for individual token display versus overall UI responsiveness. You can tune this value based on your application's needs.

  • Server-Side Back-pressure (Conceptual with streamText):

    • Core SDK functions like streamText (when interacting with compliant V2 model providers) are generally designed with back-pressure in mind. This is a fundamental concept in stream processing.
    • What it means: Ideally, the LLM provider (or the SDK's handling of its stream) only generates new tokens or data as fast as the consuming end (your server route, and ultimately the client connection) can accept them. If the network connection to the client is slow, or if the client is busy, back-pressure signals should propagate back, causing the LLM to pause or slow down its generation.
    • Benefit: This helps prevent the server from generating a massive amount of data that just gets buffered and potentially discarded if the client disconnects. It saves resources and can reduce costs if your LLM provider charges per token generated (even if not delivered).
    • Contrast with result.consumeStream(): You might have seen the result.consumeStream() pattern on the server, often used with onFinish for persistence (especially in V4 docs for "Chatbot Message Persistence - Handling client disconnects"). Calling consumeStream() on the server explicitly removes back-pressure from the client connection for that server-side processing. The server will then try to read the entire LLM stream as fast as possible, regardless of whether the client is still connected or keeping up. This is a deliberate choice when you want to ensure the full AI response is generated and saved on the server even if the client disconnects mid-stream. It's a trade-off: you ensure full generation for persistence, but you lose the client-driven back-pressure for that specific consumption.
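If you adopt that pattern, it typically looks something like the sketch below. Hedged note: consumeStream() is described in the SDK's persistence guidance, but confirm it against the canary you're on; model and modelMessages are assumed to come from the surrounding route code:

    // Hedged sketch: deliberately giving up client back-pressure so the full response
    // is generated (and can be persisted in onFinish) even if the client disconnects.
    const result = streamText({ model, messages: modelMessages });
    result.consumeStream(); // server reads the LLM stream to completion in the background
    return result.toUIMessageStreamResponse(); // still streams to the client while connected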

Understanding these performance aspects helps you build chat applications that are not only interactive but also efficient and robust under various network conditions and generation speeds.

Take-aways / Migration Checklist Bullets

  • Use experimental_throttleTimeMilliseconds in useChat to batch UI updates from rapid streams and prevent jank.
  • V2 streamText and compliant providers should inherently support back-pressure, preventing the LLM from overwhelming the client.
  • Be aware that result.consumeStream() on the server bypasses client back-pressure to ensure full generation for persistence.

8. Summary & Checklist for Implementors

TL;DR: The v5 UI Message Streaming Protocol, identified by x-vercel-ai-ui-message-stream: v1, uses SSE to transport typed UIMessageStreamParts, enabling robust streaming of rich, multi-part UIMessages. Implementors need to ensure correct server-side emission and client-side rendering of these structured parts.

We've been on quite a journey through the internals of Vercel AI SDK v5's new UI Message Streaming Protocol! Let's wrap up with a quick summary and a checklist to keep in mind.

Recap of the v5 UI Message Stream:

  • It's an SSE-based protocol specifically designed for real-time chat updates.
  • Server responses are identified by the x-vercel-ai-ui-message-stream: v1 HTTP header.
  • The stream consists of a sequence of JSON objects, each being a typed UIMessageStreamPart.
  • These parts are categorized into lifecycle events ('start', 'finish', 'error') and content delivery events ('text', 'reasoning', the 'tool-call' family, 'file', 'source', 'metadata').
  • The protocol enables the efficient and robust streaming of rich, multi-part UIMessage objects, which are then reconstructed on the client by utilities like processUIMessageStream (used by useChat).

This new protocol is a huge step up from V4's generic data stream, providing the structured foundation needed for the advanced, generative UI experiences that v5 is targeting.

Checklist for Server-Side Implementors:

  • [ ] SSE Endpoint: Ensure your API endpoint that handles chat requests emits Server-Sent Events (SSE).
  • [ ] Correct Headers: Your SSE response must include:
    • Content-Type: text/event-stream; charset=utf-8
    • x-vercel-ai-ui-message-stream: v1
  • [ ] Stream Generation Method:
    • Recommended: If using a V2 core function like streamText(), use result.toUIMessageStreamResponse() to automatically generate the v5 UI Message Stream.
    • Manual: If you need custom stream generation, use createUIMessageStream() to get a stream and writer, then use UIMessageStreamWriter methods to write each UIMessageStreamPart.
  • [ ] Complete Message Lifecycle: Ensure you stream all necessary UIMessageStreamParts for each message:
    • Always start with a 'start' part.
    • Stream all relevant content parts (e.g., 'text', 'tool-call' family, 'file', etc.).
    • Always end a successful message stream with a 'finish' part (containing the finishReason).
    • If a stream-level error occurs, send an 'error' part.
  • [ ] Persistence: Implement persistence logic (e.g., saving UIMessage arrays to your database) typically in the onFinish callback of toUIMessageStreamResponse() or after manually constructing and closing your stream with UIMessageStreamWriter.

Checklist for Client-Side Implementors (using useChat):

  • [ ] streamProtocol Configuration: Ensure useChat is configured with streamProtocol: 'ui-message'. This is the default in v5 Canary, so you might not need to explicitly set it, but be aware of it.
  • [ ] Render message.parts: This is critical. Update your message rendering components to iterate over message.parts (an array on each UIMessage object) and render each part according to its type (e.g., TextUIPart, ToolInvocationUIPart, FileUIPart). Do not try to render a top-level message.content string, as it's no longer the primary content holder.
  • [ ] Handle Different Part Types: Your rendering logic should be able to handle all the UIMessagePart types your application expects to receive from the server.
  • [ ] Typed Metadata: If you're using custom metadata with your messages, provide a messageMetadataSchema (e.g., a Zod schema) to useChat for validation and type safety.
  • [ ] Error and Status Handling: Use the error object and status string returned by useChat to provide appropriate UI feedback to the user (e.g., display error messages, show loading indicators).

Tease Post 3: What's Next?
We've now seen how v5 structures messages (UIMessage and UIMessagePart) and how it streams them (UIMessageStreamParts via the UI Message Streaming Protocol). But where does the data for these streams originate from on the server? How does the SDK interact with different LLM providers to get the text, tool calls, and other rich data in the first place?

Next, we'll explore the V2 Model Interfaces in detail. We'll understand how AI SDK 5 standardizes interactions with diverse LLM providers (OpenAI, Anthropic, Google, etc.) and enables the rich multi-modal capabilities that ultimately flow through these structured streams we've just dissected. This is where the SDK's power to abstract away provider differences really shines!
