How to Get JSON Output from ChatGPT and Other LLMs
By Dorian Laurenceau
📅 Last reviewed: April 24, 2026. Updated with April 2026 findings and community feedback.
📅 Last Updated: January 28, 2026
📚 Related: ChatGPT 5.2 Prompting Guide | Meta-Prompting Techniques | GPT-5.2-Codex Deep Dive
Getting AI to give you a nicely formatted JSON response instead of paragraphs of text can feel like magic, when it works. But it can also be frustrating when the model adds extra commentary or breaks the format.
Let's explore why structured output matters and the key concepts behind getting it right.
What actually works for JSON output in 2026 (and what stopped working)
The "how do I get reliable JSON" question has been answered, re-answered, and obsoleted multiple times since 2022. The current consensus on r/OpenAI, r/ClaudeAI, r/LocalLLaMA, and the LangChain Discord is much narrower than the older blog content suggests.
What actually works in 2026:
- →Structured outputs / strict JSON mode. OpenAI's structured outputs and Anthropic's tool use with strict input schemas enforce JSON schema at the decoding level. They are dramatically more reliable than prompt-based JSON requests.
- →Function calling / tool use as a JSON-shaping mechanism. Even if you don't actually need tools, defining a tool with the schema you want and asking the model to "call it" produces clean JSON.
- →Grammar-constrained decoding for local models. llama.cpp's GBNF grammars, outlines, and vLLM's guided decoding constrain output to valid JSON at the token level. For self-hosted setups this is the gold standard.
- →JSON Schema with explicit examples. Even with structured outputs, providing 1-2 concrete examples reduces edge-case failures significantly.
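The tool-use approach from the list above can be sketched as a request payload. This is a minimal sketch, assuming Anthropic's Messages API tool format (`name`, `description`, `input_schema`, plus `tool_choice` to force the call). The tool name `record_languages`, the helper `build_tool_request`, and the model id placeholder are hypothetical. The code only builds the payload and does not call any API.

```python
import json

# JSON Schema describing the shape we want back (hypothetical example schema).
LANGUAGES_SCHEMA = {
    "type": "object",
    "properties": {
        "languages": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "use_case": {"type": "string"},
                },
                "required": ["name", "use_case"],
            },
        }
    },
    "required": ["languages"],
}


def build_tool_request(user_prompt: str, model: str) -> dict:
    """Build keyword arguments for a tool-forcing request (sketch only)."""
    tool = {
        "name": "record_languages",  # hypothetical tool name
        "description": "Record a list of programming languages.",
        "input_schema": LANGUAGES_SCHEMA,
    }
    return {
        "model": model,
        "max_tokens": 1024,
        "tools": [tool],
        # Force the model to "call" our tool so the reply is schema-shaped.
        "tool_choice": {"type": "tool", "name": tool["name"]},
        "messages": [{"role": "user", "content": user_prompt}],
    }


request = build_tool_request("List top 3 programming languages", "your-model-id")
print(json.dumps(request["tool_choice"]))
```

With the Anthropic SDK, a payload like this would typically be passed as `client.messages.create(**request)`, and the structured data would come back as the `input` of the `tool_use` content block rather than as free text.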
What stopped working (or never really worked):
- →"Respond ONLY with JSON" prompts on non-strict models. Older models would ignore this 5-15 % of the time. Don't trust prompt-based enforcement for production.
- →Asking for JSON inside markdown code blocks. It works most of the time and breaks on long outputs. Markdown extraction is fragile; use the API's response_format parameter.
- →Regex post-processing as the only safeguard. Necessary as a fallback, insufficient as the primary mechanism. Models hallucinate JSON-looking-but-broken structures regularly.
- →Hand-crafted JSON examples in the system prompt without a schema. Better than nothing, but the schema-aware modes outperform it consistently.
What practitioners actually do for production-grade structured output:
- →Define a Pydantic / Zod schema as the source of truth, then export it to JSON Schema for the API call. Pydantic on Python and Zod on TypeScript are the dominant choices.
- →Validate after parsing, every time. Even with strict mode, edge cases happen. Reject and retry with clear error feedback rather than silently passing bad data downstream.
- →Log raw output for failure analysis. When something breaks, you need the exact tokens the model produced.
- →Evaluate output quality, not just format compliance. A well-formed JSON object with hallucinated values is still a bug. Your evals should test fields, not just structure.
- →Use libraries that handle the boilerplate. Instructor (Python), LangChain's structured output, and BAML all wrap the schema-validation-retry loop cleanly.
What's still hard:
- →Deeply nested schemas with many enums. Strict mode helps but still degrades quality on extremely complex schemas. Flatten where possible.
- →Mixed content (some prose, some JSON). The cleanest approach is two separate calls or a two-field schema (reasoning: string, result: object). Trying to embed JSON in prose breaks more often than it works.
- →Streaming structured output. Possible with newer APIs but requires careful UI handling. For most use cases, wait-and-parse is simpler.
The honest framing: in 2026 you should not be writing your own JSON-extraction logic from scratch. Use structured outputs, function calling, or grammar-constrained decoding depending on your stack. Define schemas in code, validate on every response, and treat the output as untrusted until validated. The prompting tricks of 2023 are now mostly obsolete.
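The validate, retry, and log advice above can be sketched with the standard library alone. This is a minimal sketch, not a replacement for Pydantic or Instructor: `REQUIRED_FIELDS`, `validate`, and `get_validated_json` are hypothetical names, and `call_model` stands in for whatever function actually calls your API.

```python
import json
import logging
from typing import Callable

logger = logging.getLogger("llm_json")

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"title": str, "summary": str, "tags": list}


def validate(payload: object) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(payload, dict):
        return ["top-level value must be a JSON object"]
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            errors.append(f"missing field '{field}'")
        elif not isinstance(payload[field], expected):
            errors.append(f"field '{field}' must be {expected.__name__}")
    return errors


def get_validated_json(
    call_model: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> dict:
    """Call the model, validate the reply, and retry with error feedback."""
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        raw = call_model(current_prompt)
        # Keep the raw text so failures can be analyzed later.
        logger.info("attempt %d raw output: %r", attempt, raw)
        try:
            payload = json.loads(raw)
            errors = validate(payload)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        if not errors:
            return payload
        # Feed the concrete errors back instead of silently accepting bad data.
        feedback = "; ".join(errors)
        current_prompt = (
            f"{prompt}\n\nYour previous reply was rejected: {feedback}. "
            "Return corrected JSON only."
        )
    raise ValueError(f"no valid JSON after {max_attempts} attempts")


# Stand-in for a real API call: fails validation once, then succeeds.
replies = iter(
    ['{"title": "Hi"}', '{"title": "Hi", "summary": "A greeting", "tags": ["demo"]}']
)
result = get_validated_json(lambda _prompt: next(replies), "Summarize the text as JSON.")
print(result["tags"])
```

In a real project the `validate` function would usually be a Pydantic or Zod model, as described above; the control flow stays the same.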
Why JSON Output Matters
When you're building applications with AI, you don't want prose; you want data you can parse.
The Problem with Unstructured Output
User: List the top 3 programming languages
AI: Sure! Here are the top 3 programming languages:
1. Python - Great for beginners and data science
2. JavaScript - Essential for web development
3. TypeScript - JavaScript with types
These are all excellent choices depending on your needs...
This is nice for humans but terrible for code. You can't reliably extract the data.
The Power of Structured Output
{
  "languages": [
    {"name": "Python", "use_case": "Data science, beginners"},
    {"name": "JavaScript", "use_case": "Web development"},
    {"name": "TypeScript", "use_case": "Type-safe JavaScript"}
  ]
}
Now your code can parse this directly. No regex gymnastics required.
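To make "parse this directly" concrete, here is a small sketch using Python's built-in `json` module on the exact response shown above. The variable names are illustrative only.

```python
import json

# The structured response from the example, as it would arrive from the model.
raw = """
{
  "languages": [
    {"name": "Python", "use_case": "Data science, beginners"},
    {"name": "JavaScript", "use_case": "Web development"},
    {"name": "TypeScript", "use_case": "Type-safe JavaScript"}
  ]
}
"""

data = json.loads(raw)  # one call turns the text into dicts and lists
names = [language["name"] for language in data["languages"]]
print(names)
```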
Basic Techniques for JSON Output
1. Explicit Format Instructions
Tell the AI exactly what format you want:
Return your response as a JSON object with this structure:
{
  "title": "string",
  "summary": "string",
  "tags": ["array", "of", "strings"]
}
Only output the JSON, no other text.
2. Provide an Example
Show the AI what you expect:
Extract the product info from this description and return it as JSON.
Example output:
{"name": "Widget Pro", "price": 29.99, "category": "Tools"}
Description: "The SuperBlender 3000 costs $149 and is in our Kitchen category"
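When the prompt is assembled in code, one option is to serialize the example with `json.dumps` so the example itself can never contain a hand-typed syntax slip. The following is a small sketch; `build_extraction_prompt` is a hypothetical helper, not part of any library.

```python
import json


def build_extraction_prompt(description: str, example: dict) -> str:
    """Assemble an extraction prompt whose example is valid JSON by construction."""
    example_json = json.dumps(example)  # serialized, so always well-formed
    return (
        "Extract the product info from this description and return it as JSON.\n"
        f"Example output:\n{example_json}\n"
        f'Description: "{description}"'
    )


prompt = build_extraction_prompt(
    "The SuperBlender 3000 costs $149 and is in our Kitchen category",
    {"name": "Widget Pro", "price": 29.99, "category": "Tools"},
)
print(prompt)
```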
3. Use System Messages (API)
When using the API, the system message can enforce format:
System: You are a JSON-only responder. Output valid JSON and nothing else.
Common Challenges
1. Extra Text Around JSON
The model might say "Here's the JSON:" before the actual output.
Solution: Be explicit: "Output ONLY valid JSON. No introduction, no explanation."
2. Invalid JSON Syntax
Missing quotes, trailing commas, or broken brackets.
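Solution: Ask the model to double-check ("Ensure the JSON is valid and parseable."), and still parse and validate the output in code, because a prompt instruction is not a guarantee.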
3. Inconsistent Structure
Sometimes keys are present, sometimes they're not.
Solution: Define the schema explicitly and state which fields are required.
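On the code side, inconsistent keys can be handled by rejecting records that lack required fields and filling defaults for optional ones. This is a minimal sketch; the field names and the `normalize` helper are hypothetical and would mirror whatever schema you state in the prompt.

```python
REQUIRED = {"name", "price"}            # hypothetical required keys
OPTIONAL_DEFAULTS = {"category": None}  # hypothetical optional key and its default


def normalize(record: dict) -> dict:
    """Reject records missing required keys; fill defaults for optional ones."""
    missing = REQUIRED - set(record)
    if missing:
        raise KeyError(f"missing required fields: {sorted(missing)}")
    allowed = REQUIRED | set(OPTIONAL_DEFAULTS)
    # Drop keys the schema does not know about, then layer values over defaults.
    known = {key: value for key, value in record.items() if key in allowed}
    return {**OPTIONAL_DEFAULTS, **known}


print(normalize({"name": "SuperBlender 3000", "price": 149}))
```

Every record that passes through `normalize` has the same set of keys, so downstream code does not need to guard against missing fields.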
When Structured Output Shines
Structured output is essential for:
- →API integrations: feeding AI output to other systems
- →Database storage: storing responses in a structured format
- →Automation workflows: Zapier, Make, and n8n integrations
- →Frontend rendering: displaying AI output in UI components
- →Data extraction: pulling structured info from unstructured text
JSON vs. Other Formats
| Format | Best For | Limitation |
|---|---|---|
| JSON | APIs, code integration | Verbose for simple data |
| Markdown | Documentation, readable output | Harder to parse |
| CSV | Tabular data | No nested structures |
| YAML | Config files, human-readable | Less common in APIs |
JSON is the most universal choice for programmatic use.
Native JSON Mode (API)
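Modern APIs now offer a JSON mode that constrains the model to emit syntactically valid JSON. This is a significant shift for reliability, though JSON mode on its own does not enforce a particular schema, and a response cut off by the token limit can still be incomplete.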
OpenAI JSON Mode (GPT-5.2)
import json

import openai

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-5.2",
    response_format={"type": "json_object"},  # enable JSON mode
    messages=[
        {"role": "system", "content": "Output valid JSON only."},
        {"role": "user", "content": "List top 3 programming languages with use cases"},
    ],
)
# JSON mode yields parseable JSON unless the reply was truncated;
# it does not enforce a schema, so validate the fields afterwards.
data = json.loads(response.choices[0].message.content)
Claude JSON Mode
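import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# Note: this call relies on prompt instructions alone; for schema
# enforcement, define a tool with the desired input schema instead.
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """Return a JSON object with this exact structure:
{"languages": [{"name": "...", "use_case": "..."}]}
List top 3 programming languages. Output ONLY the JSON.""",
        }
    ],
)
# The reply text lives in the first content block.
data = json.loads(message.content[0].text)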
Gemini Structured Output
Gemini supports schema-based structured output:
import google.generativeai as genai

model = genai.GenerativeModel('gemini-3-pro')
response = model.generate_content(
    "List top 3 programming languages",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema={
            "type": "object",
            "properties": {
                "languages": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "use_case": {"type": "string"}
                        }
                    }
                }
            }
        }
    )
)
JSON Output by Model
| Model | JSON Mode | Schema Validation | Reliability |
|---|---|---|---|
| GPT-5.2 | ✅ Native | Structured outputs or function calling | Excellent |
| GPT-4o | ✅ Native | Structured outputs or function calling | Very Good |
| Claude Sonnet 4.5 | ✅ Native | Via tool use | Excellent |
| Gemini 3 Pro | ✅ Native | ✅ Schema-based | Excellent |
| DeepSeek V3 | ✅ Native (hosted API) | Manual | Good |
| Llama 3 | ⚠️ Prompting, or grammar-constrained decoding when self-hosted | Manual | Variable |
Advanced JSON Patterns
Pattern 1: Schema Definition
Define your schema explicitly for consistent results:
Return a JSON object matching this TypeScript interface:
interface ProductAnalysis {
name: string;
category: string;
sentiment: "positive" | "negative" | "neutral";
keyFeatures: string[]; // exactly 3 items
priceRange: {
min: number;
max: number;
currency: "USD" | "EUR";
};
}
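A TypeScript interface in a prompt communicates the shape, but constraints such as "exactly 3 items" live only in a comment, so it helps to check them in code once the response arrives. Below is a hedged Python sketch of such a check for the `ProductAnalysis` shape above; `check_product_analysis` is a hypothetical helper, and a Pydantic or Zod model would be the more usual choice in production.

```python
def check_product_analysis(obj: dict) -> list[str]:
    """Return a list of violations of the ProductAnalysis shape (empty if valid)."""
    errors = []
    for key in ("name", "category"):
        if not isinstance(obj.get(key), str):
            errors.append(f"{key} must be a string")
    if obj.get("sentiment") not in ("positive", "negative", "neutral"):
        errors.append("sentiment must be positive, negative, or neutral")
    features = obj.get("keyFeatures")
    if not (
        isinstance(features, list)
        and len(features) == 3
        and all(isinstance(item, str) for item in features)
    ):
        errors.append("keyFeatures must be a list of exactly 3 strings")
    price = obj.get("priceRange")
    if not isinstance(price, dict):
        errors.append("priceRange must be an object")
    else:
        for bound in ("min", "max"):
            value = price.get(bound)
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"priceRange.{bound} must be a number")
        if price.get("currency") not in ("USD", "EUR"):
            errors.append("priceRange.currency must be USD or EUR")
    return errors


sample = {
    "name": "SuperBlender 3000",
    "category": "Kitchen",
    "sentiment": "positive",
    "keyFeatures": ["powerful motor", "quiet", "easy to clean"],
    "priceRange": {"min": 129, "max": 149, "currency": "USD"},
}
print(check_product_analysis(sample))
```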
Pattern 2: Validation Instructions
Ask the model to self-validate:
Before outputting, verify your JSON:
1. All required fields present
2. Arrays have correct length
3. No trailing commas
4. Strings properly quoted
5. Output parses with JSON.parse()
If any check fails, fix before outputting.
Pattern 3: Error Recovery
Handle malformed JSON gracefully:
import json
import re


def parse_llm_json(response: str) -> dict:
    """Best-effort fallback parser for replies that should contain a JSON object."""
    # 1. Try a direct parse
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        pass
    # 2. Extract JSON from a markdown code block (with or without a "json" tag)
    match = re.search(r'```(?:json)?\s*([\s\S]*?)```', response)
    if match:
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            pass
    # 3. Fall back to the span from the first "{" to the last "}"
    match = re.search(r'\{[\s\S]*\}', response)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    raise ValueError("Could not parse JSON from response")
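A greedy brace regex can fail when the reply contains another `}` after the JSON object. One alternative, sketched here under the assumption that the payload is a JSON object, is the standard library's `json.JSONDecoder.raw_decode`, which parses a value starting at a given index and ignores whatever follows. The helper name `first_json_object` is hypothetical.

```python
import json


def first_json_object(text: str) -> dict:
    """Return the first parseable JSON object embedded in text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value at `start` and ignores trailing text.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass
        else:
            if isinstance(value, dict):
                return value
        start = text.find("{", start + 1)  # try the next opening brace
    raise ValueError("no JSON object found")


reply = 'Here is the JSON: {"name": "Python", "tags": ["a", "b"]} Hope that helps {smile}!'
print(first_json_object(reply))
```

In this reply, a first-brace-to-last-brace regex would capture the trailing `{smile}` as well and fail to parse, while `raw_decode` stops at the end of the first complete object.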
FAQ
Does JSON mode cost more tokens?
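No. JSON mode itself carries no extra charge; you pay for prompt and response tokens as usual. JSON syntax (keys, quotes, brackets) does count toward output tokens, but structured responses are often more concise than prose.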
Can I use JSON mode with streaming?
Yes, but you'll need to buffer the response and parse once complete, as partial JSON isn't valid.
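The buffer-then-parse approach can be sketched as follows. The `collect_stream` helper is hypothetical, and the list of strings stands in for the text deltas a real streaming API would deliver.

```python
import json
from typing import Iterable


def collect_stream(chunks: Iterable[str]) -> dict:
    """Concatenate streamed text chunks, then parse once the stream ends."""
    buffer = []
    for chunk in chunks:
        buffer.append(chunk)  # a partial buffer is usually not valid JSON yet
    return json.loads("".join(buffer))


# Simulated stream: split points are arbitrary, as they would be in practice.
simulated = ['{"langu', 'ages": [{"name": "Py', 'thon"}]}']
print(collect_stream(simulated))
```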
What if the model returns invalid JSON despite JSON mode?
With native JSON mode, this is extremely rare. If it happens, retry the request. For critical applications, implement validation and retry logic.
Should I use function calling instead of JSON mode?
Function calling (tool use) is better when you need the model to choose between actions. JSON mode is better for pure data extraction.
In Brief
- →Structured output makes AI responses machine-readable
- →Use explicit format instructions and examples
- →State "JSON only, no other text" to prevent extras
- →Define your schema clearly for consistency
- →JSON Mode in modern APIs provides guarantees
Ready to Master Structured Outputs?
This article covered the what and why of getting JSON from AI. But reliable structured output requires deeper techniques.
In our Module 2, Structured Outputs, you'll learn:
- →Advanced schema definition techniques
- →How to handle optional vs. required fields
- →Validation strategies for production systems
- →Working with nested and complex structures
- →Using JSON Mode and function calling APIs
→ Explore Module 2: Structured Outputs
Dorian Laurenceau
Full-Stack Developer & Learning Designer
Full-stack web developer and learning designer. I spent 4 years as a freelance full-stack developer and 4 years teaching React, JavaScript, HTML/CSS and WordPress to adult learners. Today I design learning paths in web development and AI, grounded in learning science. I founded learn-prompting.fr to make AI practical and accessible, and built the Bluff app to gamify political transparency.
FAQ
How do I get JSON output from ChatGPT?
Use explicit format instructions, provide examples, or enable JSON mode in the API. State 'Output ONLY valid JSON, no other text.'
What is JSON mode in ChatGPT?
JSON mode makes the API return syntactically valid JSON. Enable it with response_format={'type': 'json_object'} in your API call.
Why does ChatGPT add text around my JSON?
The model tries to be helpful. Reduce this by explicitly stating 'No introduction, no explanation, only JSON output', or avoid it entirely by using JSON mode in the API.
Can Claude and Gemini output JSON?
Yes. All major LLMs support JSON output via explicit prompting. Claude and GPT-5.2 also support native JSON mode, and Gemini supports schema-based structured output.
How do I validate JSON from LLMs?
Use schema validation libraries like Pydantic (Python) or Zod (TypeScript). Define your expected structure and validate every response.