Parse and Inspect LLM Streaming JSON (SSE)
LLM streaming responses use Server-Sent Events (SSE) format: a sequence of data: lines, each containing a JSON object with a delta chunk of the response. Debugging streaming issues — dropped chunks, malformed JSON mid-stream, unexpected stop reasons — requires being able to parse the raw SSE data and reconstruct the full message from individual deltas. The streaming JSON viewer parses each data: line, validates the JSON structure, shows the accumulated text after each chunk, and flags any parsing errors.

The SSE format from OpenAI sends data: {JSON} lines followed by a final data: [DONE] line. Each JSON object contains the model name, a choices array with a delta field carrying the text increment, and a finish_reason that is null for all chunks except the last. Anthropic's SSE format uses typed event: lines (event: content_block_delta, event: message_delta) with slightly different JSON structure. The viewer supports both formats and auto-detects which provider the stream came from.

Common streaming issues this tool helps diagnose:
- A chunk containing a partial JSON string, which causes a parse error in client code that JSON-parses each chunk directly rather than accumulating and parsing the full response.
- Unexpected early termination with finish_reason: "length", indicating the max_tokens limit was hit.
- A missing final [DONE] signal when a network timeout interrupts the stream.
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"delta":{"role":"assistant","content":""},"finish_reason":null,"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"delta":{"content":"The"},"finish_reason":null,"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"delta":{"content":" answer"},"finish_reason":null,"index":0}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"gpt-4o","choices":[{"delta":{"content":" is 42."},"finish_reason":"stop","index":0}]}
data: [DONE]

FAQ
- Why does my streaming client break on some responses?
- The most common cause is trying to JSON.parse() each SSE line directly. SSE lines can contain partial or empty data fields. Always strip the "data: " prefix, skip empty lines and the "[DONE]" sentinel, then parse only valid JSON lines.
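The answer above can be sketched as a small accumulation loop in Python (hypothetical function name, assuming OpenAI-style chunks where the text increment lives in choices[0].delta.content):

```python
import json

def accumulate_sse(raw_stream: str) -> str:
    """Reassemble the full text from an OpenAI-style SSE stream.

    Strips the "data: " prefix, skips blank lines and the [DONE]
    sentinel, and only then JSON-parses each payload.
    """
    parts = []
    for line in raw_stream.splitlines():
        line = line.strip()
        if not line.startswith("data: "):
            continue  # blank keep-alive lines, comments, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))  # role-only chunks have no content
    return "".join(parts)
```

Feeding the four data: lines from the example stream above into this function yields the reconstructed text "The answer is 42.".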
- What does finish_reason: "length" mean?
- The model hit the max_tokens limit and the response was truncated. Increase max_tokens if you need longer responses, or detect this reason code in your client and prompt the user that the response was cut short.
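Detecting that reason code in a client can be as simple as the following sketch (hypothetical helper name, assuming a parsed OpenAI-style chunk):

```python
import json

def was_truncated(chunk_line: str) -> bool:
    """Return True when a chunk reports finish_reason "length",
    i.e. the model stopped because max_tokens was reached."""
    chunk = json.loads(chunk_line)
    return chunk["choices"][0].get("finish_reason") == "length"
```

A client would typically call this on the final chunk and surface a "response was cut short" notice to the user when it returns True.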
- Is the SSE format the same for all AI providers?
- No. OpenAI and providers using the OpenAI-compatible API use the data: {JSON}\n\n format with a [DONE] terminator. Anthropic uses typed event: lines with separate event types for message start, content deltas, and message end. The viewer handles both.
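That structural difference also suggests a simple auto-detection heuristic, sketched below (illustrative only, not the viewer's actual implementation): Anthropic streams carry typed event: lines, while OpenAI-compatible streams use only data: lines.

```python
def detect_provider(raw_stream: str) -> str:
    """Guess the stream's provider format from its line prefixes.

    Anthropic streams interleave "event:" lines with "data:" lines;
    OpenAI-compatible streams consist of "data:" lines alone.
    """
    for line in raw_stream.splitlines():
        if line.startswith("event:"):
            return "anthropic"
    return "openai-compatible"
```
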
Related Examples
- Build a JSON Schema for Structured AI Outputs
- Clean and Format Raw LLM Markdown Output
- Build an OpenAI Chat Completion Request