Getting Structured Output from LLMs

Getting an LLM to reliably return structured data such as JSON is a core skill in production AI engineering. Without enforcement, models occasionally produce malformed JSON, wrap the JSON in explanatory text, or invent fields. This guide covers the main mechanisms for enforcing structured output and how to choose between them.

Why Structured Output Is Hard

Language models generate text token by token, without any built-in constraint that the output must conform to a schema. Even with a prompt that says "return only JSON", the model might add "Here is the JSON:" before the opening brace, wrap the JSON in markdown code fences, or produce an object with slightly different key names than specified. These deviations break downstream JSON parsing and require error handling. The more complex the schema, the higher the probability of at least one field violating the type or format constraint.
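As a rough illustration of the error handling these deviations call for, the sketch below pulls the first balanced JSON object out of a reply that may carry a preamble or markdown fences. The function name extractJsonObject is hypothetical, and the approach assumes the surrounding prose contains no stray "{" before the real object; it is a fallback for unconstrained output, not a substitute for the enforcement mechanisms described later.

```typescript
// Pull the first balanced {...} object out of a model reply that may include
// a preamble, trailing prose, or markdown code fences.
// Returns undefined when no parseable object is found.
function extractJsonObject(reply: string): unknown {
  const start = reply.indexOf("{");
  if (start === -1) return undefined;

  let depth = 0;        // current brace nesting level
  let inString = false; // true while scanning inside a JSON string literal
  let escaped = false;  // true right after a backslash inside a string

  for (let i = start; i < reply.length; i++) {
    const ch = reply.charAt(i);
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue; // braces inside strings do not affect nesting
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) {
        try {
          return JSON.parse(reply.slice(start, i + 1));
        } catch (e) {
          return undefined; // balanced braces but not valid JSON
        }
      }
    }
  }
  return undefined; // unbalanced braces, e.g. a truncated reply
}
```

Tracking string state matters because a closing brace inside a string value would otherwise end the scan early.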

OpenAI JSON Mode and Structured Outputs

OpenAI provides two mechanisms: JSON mode and Structured Outputs. JSON mode (response_format: { type: "json_object" }) constrains the model to emit syntactically valid JSON but does not enforce a specific schema; the prompt must still mention JSON, and a response cut off by the token limit can be incomplete. Structured Outputs (response_format: { type: "json_schema", json_schema: {...} } with strict: true) enforce a specific JSON Schema by constraining generation at the token level so that only schema-conforming tokens can be produced. Structured Outputs are the preferred approach for production use because complete responses match the schema, which removes most format-repair code. They do not remove every failure mode: the model can refuse, the response can be truncated, and the schema cannot express business rules, so lightweight runtime validation is still worthwhile. The trade-off is that only a subset of JSON Schema is supported, and the first request with a new or complex schema can incur extra latency while the schema is processed.
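To make the request shape concrete, here is a minimal sketch of a Chat Completions request body that uses Structured Outputs. The invoice schema, the function name buildStructuredRequest, and the model name are illustrative assumptions, not part of the original guide; the body would be sent with whatever HTTP client or SDK you already use.

```typescript
// A JSON Schema for the object we want back. In strict mode, every property
// must be listed in `required` and `additionalProperties` must be false.
const invoiceSchema = {
  type: "object",
  properties: {
    vendor: { type: "string" },
    total: { type: "number" },
    currency: { type: "string", enum: ["USD", "EUR", "GBP"] },
  },
  required: ["vendor", "total", "currency"],
  additionalProperties: false,
};

// Build the request body only; sending it is left to your HTTP client or SDK.
function buildStructuredRequest(documentText: string) {
  return {
    model: "gpt-4o-mini", // assumption: substitute a model that supports Structured Outputs
    messages: [
      { role: "system", content: "Extract the invoice fields from the user's text." },
      { role: "user", content: documentText },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "invoice", // hypothetical schema name
        strict: true,    // turns on schema-constrained decoding
        schema: invoiceSchema,
      },
    },
  };
}
```

Switching to plain JSON mode would mean replacing the response_format value with { type: "json_object" } and describing the desired shape in the prompt instead.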

Claude Tool Use for Structured Output

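Claude has traditionally had no JSON Schema enforcement mode equivalent to OpenAI's Structured Outputs (check Anthropic's current documentation, as this area is evolving), but tool use (function calling) achieves a similar result. Define a tool with an input_schema that describes the JSON structure you want, then force Claude to call it by setting tool_choice to { type: "tool", name: "<your tool name>" } rather than relying on the prompt alone. The structured data arrives as the input field of the tool_use content block in the response. Claude's tool use is highly reliable because the model is trained specifically to produce tool calls that conform to the declared schema, but that conformance is learned behavior rather than a hard constraint, so validate the input as you would any other model output. This approach works well for data extraction tasks, where the tool schema acts as a contract for the output format.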
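The tool-based pattern can be sketched as a Messages API request body plus a helper that reads the structured data back out of the response. The tool name record_contact, its fields, the helper names, and the model name are all hypothetical; the sketch also forces the tool call with tool_choice so the model cannot answer in prose.

```typescript
// Minimal view of a response content block; real blocks carry more fields.
interface ContentBlock {
  type: string;
  name?: string;
  input?: unknown;
  text?: string;
}

// Build a request body whose single tool describes the output we want.
function buildExtractionRequest(documentText: string) {
  return {
    model: "claude-sonnet-4-5", // assumption: substitute the model you use
    max_tokens: 1024,
    tools: [
      {
        name: "record_contact", // hypothetical tool name
        description: "Record the contact details found in the text.",
        input_schema: {
          type: "object",
          properties: {
            full_name: { type: "string" },
            email: { type: "string" },
          },
          required: ["full_name", "email"],
        },
      },
    ],
    // Force a call to this specific tool instead of a free-text reply.
    tool_choice: { type: "tool", name: "record_contact" },
    messages: [{ role: "user", content: documentText }],
  };
}

// Return the `input` of the first tool_use block for the named tool.
function findToolInput(content: ContentBlock[], toolName: string): unknown {
  for (let i = 0; i < content.length; i++) {
    const block = content[i];
    if (block.type === "tool_use" && block.name === toolName) {
      return block.input;
    }
  }
  return undefined; // the model did not call the tool
}
```

The value returned by findToolInput is still untrusted data and should go through runtime validation before use.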

Zod and Runtime Schema Validation

Even with JSON mode or tool use, always validate the parsed output against a schema at runtime. In TypeScript/JavaScript, Zod is a widely used library for this: define a z.object() schema matching your expected output, run JSON.parse on the raw string, then pass the result to schema.parse(data) and handle the z.ZodError it throws when validation fails (or call schema.safeParse(data), which returns a result object instead of throwing). If validation fails, you can either surface the error or implement an automatic retry, capped at a small number of attempts, that sends the failed output back to the model with a message such as: "The output was invalid. Here is the JSON schema you must follow: ... Here is your incorrect output: ... Please correct it and return only valid JSON."
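The validate-then-retry flow might look like the sketch below. To keep it dependency-free, a hand-written validateTicket function stands in for a Zod schema, and the model call is passed in as a plain synchronous function; in a real application the validator would typically wrap schema.safeParse and the model call would be asynchronous. All names here (Validator, Ticket, parseWithRetry, askModel) are hypothetical.

```typescript
// Returns null when the value is valid, otherwise an error message.
type Validator = (value: unknown) => string | null;

// Stand-in for a Zod schema: checks a { title: string, priority: number } object.
function validateTicket(value: unknown): string | null {
  if (typeof value !== "object" || value === null) return "expected an object";
  const record = value as { [key: string]: unknown };
  if (typeof record["title"] !== "string") return "title must be a string";
  if (typeof record["priority"] !== "number") return "priority must be a number";
  return null;
}

// Build the corrective message sent back to the model after a failure.
function buildRetryPrompt(schemaText: string, badOutput: string, error: string): string {
  return [
    "The output was invalid: " + error,
    "Here is the JSON schema you must follow:",
    schemaText,
    "Here is your incorrect output:",
    badOutput,
    "Please correct it and return only valid JSON.",
  ].join("\n");
}

// Ask the model, validate, and retry with feedback up to maxAttempts times.
function parseWithRetry(
  askModel: (prompt: string) => string, // hypothetical model call, synchronous for brevity
  prompt: string,
  schemaText: string,
  validate: Validator,
  maxAttempts: number
): unknown {
  let currentPrompt = prompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = askModel(currentPrompt);
    let parsed: unknown;
    let error: string | null = null;
    try {
      parsed = JSON.parse(raw);
      error = validate(parsed);
    } catch (e) {
      error = "not parseable as JSON";
    }
    if (error === null) return parsed;
    currentPrompt = buildRetryPrompt(schemaText, raw, error);
  }
  throw new Error("No valid output after " + maxAttempts + " attempts");
}
```

Capping the attempts keeps a persistently failing request from looping and makes the failure visible to the caller.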

Prompting Strategies for Consistent JSON

When not using JSON mode, these prompt strategies improve JSON reliability: (1) provide a JSON schema or example in the system prompt; (2) use few-shot examples showing input → JSON output pairs; (3) end the user message with "Return only a JSON object. No other text.", since an instruction placed last tends to be followed more reliably; (4) for Claude, prefill the assistant turn with "{" so that the model continues the JSON object rather than opening with prose, and remember to prepend that "{" to the returned text before parsing; (5) use temperature=0 for structured data tasks, since lower temperature reduces creative deviations from the format, although it does not guarantee valid output.
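Strategies (3) and (4) can be combined as in the sketch below, which builds a message list ending in an assistant turn that already contains "{" and then reattaches that brace before parsing. The helper names are hypothetical, and prefill is not accepted in every mode (for example, it cannot be combined with extended thinking), so treat availability as an assumption to verify.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// The text we place in the assistant turn before the model starts generating.
const PREFILL = "{";

// Put the format instruction last and prefill the assistant turn with "{".
function buildPrefilledMessages(userPrompt: string): ChatMessage[] {
  return [
    { role: "user", content: userPrompt + "\n\nReturn only a JSON object. No other text." },
    { role: "assistant", content: PREFILL },
  ];
}

// The model returns only the continuation, so the prefill must be added back.
function parsePrefilledReply(continuation: string): unknown {
  return JSON.parse(PREFILL + continuation);
}
```

Forgetting to reattach the prefill is a common source of parse errors with this technique, which is why the constant is shared by both helpers.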

Handling Streaming with Structured Output

Streaming (SSE) and structured JSON are in tension: a partial JSON document is not valid until the stream is complete, so a standard parser such as JSON.parse fails on every intermediate chunk. For applications that need both streaming and structured data, two patterns work: (1) stream the text to the user as prose, then extract structured data from the completed response; (2) accumulate the chunks and run a partial-JSON parser (one that tolerates unterminated strings, arrays, and objects) on the buffer, optionally checking the result with Zod's safeParse against a schema whose fields are optional. The second pattern is complex but enables progress indicators while streaming. For most production use cases, disabling streaming on structured-output endpoints is simpler and more robust.
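A simple middle ground is to accumulate chunks, report progress from the buffer size, and attempt a full parse only to detect completion. The sketch below shows that idea with a hypothetical JsonStreamBuffer class; it does not repair partial JSON, so intermediate field values are not available until the object closes.

```typescript
// Accumulates streamed text chunks and reports when they form complete JSON.
class JsonStreamBuffer {
  private text = "";

  // Append the next chunk received from the stream.
  push(chunk: string): void {
    this.text += chunk;
  }

  // Number of characters received so far, usable for a progress indicator.
  size(): number {
    return this.text.length;
  }

  // Returns the parsed value once the buffer is complete JSON,
  // or undefined while the document is still incomplete.
  tryParse(): unknown {
    try {
      return JSON.parse(this.text);
    } catch (e) {
      return undefined;
    }
  }
}
```

Re-parsing the whole buffer on every chunk is quadratic in the worst case, which is acceptable for small payloads; for large ones, calling tryParse only when the stream signals completion avoids the repeated work.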

FAQ

What is the best way to extract structured data from a long document?
Use OpenAI's Structured Outputs with a JSON Schema defining the fields to extract, or use Claude tool use with an equivalent input schema. Provide 2-3 examples in the prompt showing the extraction pattern. For documents longer than 50,000 tokens, chunk the document and extract from each chunk, then merge the results.
Can I use structured outputs for streaming responses?
OpenAI supports streaming with JSON Schema structured outputs, but the output is only valid once the stream is complete. Parse the full response after streaming ends. Claude tool use also streams, but the tool_use input arrives as partial JSON fragments and is only complete at the end of its content block.
What should I do when the model returns invalid JSON despite JSON mode?
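JSON mode guarantees syntactically valid JSON but not schema compliance, and even that guarantee assumes the response was not cut off: check whether generation stopped at the token limit (a finish_reason of "length" in OpenAI's API) and raise the limit if so. If the model returns a JSON object that is missing required fields, implement a retry loop: send the invalid output back with "This is invalid because field X is missing. Return the corrected JSON." One retry often resolves schema violations; cap the number of attempts so that a persistent failure surfaces as an error.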
