Parsing JSON from a language model in production is not the same as parsing JSON from a well-behaved REST API. LLMs produce syntactically broken output, schema-mismatched output, and sometimes no JSON at all. The naive `JSON.parse(response)` blows up constantly in production.
This post covers the full production pattern: repair → parse → validate → type-safe output.
## The Problem Stack
LLM JSON failures fall into three categories:

**1. Syntax errors** — the output isn't valid JSON at all

- Trailing commas: `{"a": 1,}`
- Single quotes: `{'key': 'val'}`
- Python literals: `True`, `False`, `None`
- Markdown fences: `` ```json {...} ``` ``
- Truncated output (hit `max_tokens`)

**2. Schema errors** — the JSON is valid but doesn't match your expected shape

- Missing required fields
- Wrong types (`"42"` instead of `42`)
- Extra fields you didn't ask for
- Differently structured nesting than expected

**3. Content failures** — the model refused or hallucinated

- Plain text response: "I cannot provide that information"
- Empty response
- JSON that looks right but has hallucinated field values
A production parser needs to handle all three categories gracefully.
## The Production Pattern
```typescript
import { jsonrepair } from 'jsonrepair';
import { z } from 'zod';

// Step 1: Define your expected schema
const ProductSchema = z.object({
  id: z.string().uuid(),
  name: z.string().min(1),
  price: z.number().positive(),
  category: z.string(),
  inStock: z.boolean(),
  tags: z.array(z.string()).default([]),
  description: z.string().optional(),
});

type Product = z.infer<typeof ProductSchema>;

// Step 2: Build a robust LLM JSON parser
async function parseLlmJson<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<{ data: T; repaired: boolean } | { error: string; raw: string }> {
  const jsonString = rawOutput.trim();

  // Try direct parse first
  try {
    const data = schema.parse(JSON.parse(jsonString));
    return { data, repaired: false };
  } catch {
    // not valid JSON or schema mismatch — try repair
  }

  // Attempt repair
  try {
    const fixed = jsonrepair(jsonString);
    const parsed = JSON.parse(fixed);
    const result = schema.safeParse(parsed);
    if (result.success) {
      return { data: result.data, repaired: true };
    }
    // Schema validation failed even after repair
    return {
      error: `Schema mismatch: ${JSON.stringify(result.error.flatten().fieldErrors)}`,
      raw: rawOutput,
    };
  } catch {
    return { error: 'Could not parse as JSON', raw: rawOutput };
  }
}

// Usage
const result = await parseLlmJson(llmResponse, ProductSchema);
if ('error' in result) {
  // Handle failure — log, retry, or fall back
  console.error('LLM JSON parse failed:', result.error);
} else {
  if (result.repaired) {
    // Optionally log repaired outputs for prompt tuning
    metrics.increment('llm.json.repaired');
  }
  const product: Product = result.data;
}
```
This pattern gives you:

- A clean `Product` type when everything works
- A structured error when it doesn't
- A `repaired` flag so you can track how often your LLM output needs repair (use this to tune prompts over time)
## Handling Markdown-Wrapped Responses
When you don't use JSON mode, models frequently wrap their response:
````
Here's the product data:

```json
{
  "id": "abc-123",
  "name": "Widget Pro"
}
```

Let me know if you need anything else.
````
Add a markdown extraction step before repair:
```typescript
function extractJsonFromMarkdown(text: string): string {
  // Try to find a code block
  const fenceMatch = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  if (fenceMatch) return fenceMatch[1].trim();
  // Try to find a bare JSON object or array
  const objectMatch = text.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
  if (objectMatch) return objectMatch[1];
  return text;
}
```
Or use the Extract JSON from Markdown tool in the browser to test your extraction logic against real examples.
For production code, the full pipeline becomes:
```typescript
async function parseLlmJsonFull<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<T> {
  // 1. Extract from markdown if needed
  const extracted = extractJsonFromMarkdown(rawOutput);

  // 2. Repair if needed
  let jsonStr: string;
  try {
    JSON.parse(extracted); // check if already valid
    jsonStr = extracted;
  } catch {
    jsonStr = jsonrepair(extracted);
  }

  // 3. Parse and validate
  const parsed = JSON.parse(jsonStr);
  return schema.parse(parsed);
}
```
## Coercion for Common Type Mismatches

LLMs frequently return numbers as strings (`"price": "9.99"` instead of `"price": 9.99`). Zod handles this with `.coerce`:
```typescript
const ProductSchema = z.object({
  id: z.string(),
  name: z.string(),
  price: z.coerce.number(),       // "9.99" → 9.99
  count: z.coerce.number().int(), // "5" → 5
  active: z.preprocess(
    (v) => (v === 'true' ? true : v === 'false' ? false : v),
    z.boolean(),
  ),
});
```
Use coercion carefully — it can mask real problems. Use it for known LLM quirks (numbers as strings, booleans as strings) but not as a general-purpose fix.
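To make the masking risk concrete: `z.coerce.number()` runs its input through `Number()` before validating, and `Number()` is more forgiving than you might expect:

```typescript
// Number() is the conversion z.coerce.number() applies.
// It fixes the quirk you want fixed:
console.log(Number('9.99')); // 9.99
// ...but it also silently accepts values you'd probably rather reject:
console.log(Number(null));   // 0 (a missing price becomes 0)
console.log(Number(true));   // 1
console.log(Number(''));     // 0
```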
## Retry Strategies
When validation fails, you have several options:
### Option 1: Retry with error feedback
Feed the validation error back to the model:
```typescript
async function parseLlmJsonWithRetry<T>(
  prompt: string,
  schema: z.ZodSchema<T>,
  maxRetries = 2,
): Promise<T> {
  let lastError: string | null = null;
  let lastAttempt = '';

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const messages = [
      { role: 'system', content: 'Respond only with valid JSON.' },
      { role: 'user', content: prompt },
    ];
    if (lastError) {
      messages.push(
        { role: 'assistant', content: lastAttempt },
        { role: 'user', content: `The JSON you returned had this error: ${lastError}. Please fix it.` },
      );
    }

    const response = await llm.complete(messages);
    const result = await parseLlmJson(response, schema);
    if ('data' in result) return result.data;

    lastError = result.error;
    lastAttempt = response;
  }
  throw new Error(`Failed after ${maxRetries} retries: ${lastError}`);
}
```
### Option 2: Use OpenAI Structured Outputs
The cleanest approach — zero repair needed:
```typescript
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'product',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          id: { type: 'string' },
          name: { type: 'string' },
          price: { type: 'number' },
          category: { type: 'string' },
          inStock: { type: 'boolean' },
        },
        required: ['id', 'name', 'price', 'category', 'inStock'],
        additionalProperties: false,
      },
    },
  },
  messages: [{ role: 'user', content: '...' }],
});

// With strict: true, any JSON returned is guaranteed to match the schema,
// so this parse should never fail (refusals arrive separately, with no content)
const product = ProductSchema.parse(
  JSON.parse(response.choices[0].message.content!)
);
```
Generate the JSON Schema from your sample data using the JSON to OpenAI Schema tool.
### Option 3: Partial schema acceptance

Sometimes you need the data even if it's incomplete. Use `.partial()`:
```typescript
const PartialProductSchema = ProductSchema.partial();
type PartialProduct = z.infer<typeof PartialProductSchema>;

// Accept whatever the model provided
const partialResult = PartialProductSchema.safeParse(parsed);
if (partialResult.success) {
  // Work with what we have
  processPartial(partialResult.data);
}
```
## Observability: Track Parse Quality Over Time
In production, track these metrics:
```typescript
interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}
```
If your `repaired` rate climbs above 20%, your prompts need tuning. If your `schemaError` rate is non-zero, your schema or prompt description has a mismatch. These metrics tell you where to invest prompt engineering effort.
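One way to collect these counters, sketched with a plain in-memory object (a real setup would forward to your metrics client, e.g. StatsD or Prometheus; the `record` helper is hypothetical):

```typescript
// Same shape as the LlmJsonMetrics interface above, repeated so the
// sketch is self-contained
interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}

const counters: LlmJsonMetrics = {
  total: 0, parseSuccess: 0, repaired: 0, schemaError: 0, totalFailure: 0,
};

// Call once per parse attempt with the outcome bucket
function record(outcome: keyof Omit<LlmJsonMetrics, 'total'>): void {
  counters.total++;
  counters[outcome]++;
}

// The rate that drives prompt tuning
function repairRate(): number {
  return counters.total === 0 ? 0 : counters.repaired / counters.total;
}
```

A periodic job or dashboard can then alert when `repairRate()` crosses your threshold, such as the 20% mark mentioned above.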
## Schema Design for LLM Outputs
Some Zod patterns that work better with LLMs:
**Use `.default()` for optional arrays** — LLMs often omit optional arrays entirely rather than returning `[]`:

```typescript
z.object({
  tags: z.array(z.string()).default([]), // never undefined
});
```

**Use `.transform()` to normalize** — handle variations in how the model formats values:

```typescript
z.object({
  status: z.string().transform(s => s.toLowerCase()),
  // "Active", "ACTIVE", "active" → all become "active"
});
```

**Use `z.union()` for inconsistent types** — when the model sometimes returns a string ID and sometimes a number:

```typescript
z.object({
  id: z.union([z.string(), z.number()]).transform(v => String(v)),
});
```
## Summary
The production-grade LLM JSON pipeline:

1. **Extract** — pull JSON out of markdown fences if needed
2. **Repair** — run `jsonrepair` as a fallback for syntax errors
3. **Validate** — use Zod's `safeParse()` for schema validation
4. **Coerce** — use `z.coerce` for known type mismatches
5. **Retry** — feed errors back to the model or use structured outputs
6. **Observe** — track repair rate and schema error rate
The goal isn't zero errors — LLMs are probabilistic. The goal is catching every error gracefully so a malformed response never reaches your application logic unchecked.
Tools mentioned in this post:
- Fix LLM JSON — browser-based JSON repair
- JSON to Zod Schema — generate Zod schemas from sample JSON
- JSON to OpenAI Schema — generate JSON Schema for structured outputs
- Extract JSON from Markdown — pull JSON from LLM markdown responses