How to Validate JSON from an LLM in Production

LLMs produce unreliable JSON. Here's the production-grade pattern for parsing, repairing, and validating LLM output with Zod — with real error handling.

7 min read · Tags: llm, zod, production, typescript

Parsing JSON from a language model in production is not the same as parsing JSON from a well-behaved REST API. LLMs produce syntactically broken output, schema-mismatched output, and sometimes no JSON at all. The naive JSON.parse(response) blows up in production constantly.

This post covers the full production pattern: repair → parse → validate → type-safe output.

The Problem Stack

LLM JSON failures fall into three categories:

1. Syntax errors — the output isn't valid JSON at all

  • Trailing commas: {"a": 1,}
  • Single quotes: {'key': 'val'}
  • Python literals: True, False, None
  • Markdown fences: ```json {...} ```
  • Truncated output (hit max_tokens)

2. Schema errors — the JSON is valid but doesn't match your expected shape

  • Missing required fields
  • Wrong types ("42" instead of 42)
  • Extra fields you didn't ask for
  • Differently structured nesting than expected

3. Content failures — the model refused or hallucinated

  • Plain text response: "I cannot provide that information"
  • Empty response
  • JSON that looks right but has hallucinated field values

A production parser needs to handle all three categories gracefully.
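A quick, dependency-free illustration of the first two categories (the strings are made up but typical of real model output):

```typescript
// Category 1: not valid JSON at all. JSON.parse throws on every one of these.
const syntaxBroken = [
  '{"a": 1,}',        // trailing comma
  "{'key': 'val'}",   // single quotes
  '{"flag": True}',   // Python literal
  '{"a": 1, "b":',    // truncated at max_tokens
];

const parseFailures = syntaxBroken.filter((s) => {
  try { JSON.parse(s); return false; } catch { return true; }
}).length;
// parseFailures === 4: a strict parse rejects all of them

// Category 2: valid JSON, wrong shape. JSON.parse succeeds, the contract does not.
const parsed = JSON.parse('{"id": "abc", "name": "Widget", "price": "42"}');
const shapeErrors: string[] = [];
if (typeof parsed.price !== 'number') shapeErrors.push('price: expected number, got string');
if (!('category' in parsed)) shapeErrors.push('category: missing');
// shapeErrors has two entries: this is the layer schema validation covers
```

Category 3 needs no code: a plain-text refusal fails the parse the same way category 1 does, which is why the error path below keeps the raw output for logging.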

The Production Pattern

import { jsonrepair } from 'jsonrepair';
import { z } from 'zod';

// Step 1: Define your expected schema
const ProductSchema = z.object({
  id:          z.string().uuid(),
  name:        z.string().min(1),
  price:       z.number().positive(),
  category:    z.string(),
  inStock:     z.boolean(),
  tags:        z.array(z.string()).default([]),
  description: z.string().optional(),
});

type Product = z.infer<typeof ProductSchema>;

// Step 2: Build a robust LLM JSON parser
async function parseLlmJson<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<{ data: T; repaired: boolean } | { error: string; raw: string }> {
  const jsonString = rawOutput.trim();
  let repaired = false;

  // Try direct parse first
  try {
    const data = schema.parse(JSON.parse(jsonString));
    return { data, repaired: false };
  } catch {
    // not valid JSON or schema mismatch — try repair
  }

  // Attempt repair
  try {
    const fixed = jsonrepair(jsonString);
    const parsed = JSON.parse(fixed);
    repaired = true;

    const result = schema.safeParse(parsed);
    if (result.success) {
      return { data: result.data, repaired };
    }

    // Schema validation failed even after repair
    return {
      error: `Schema mismatch: ${JSON.stringify(result.error.flatten().fieldErrors)}`,
      raw: rawOutput,
    };
  } catch {
    return { error: 'Could not parse as JSON', raw: rawOutput };
  }
}

// Usage
const result = await parseLlmJson(llmResponse, ProductSchema);
if ('error' in result) {
  // Handle failure — log, retry, or fall back
  console.error('LLM JSON parse failed:', result.error);
} else {
  if (result.repaired) {
    // Optionally log repaired outputs for prompt tuning
    metrics.increment('llm.json.repaired');
  }
  const product: Product = result.data;
}

This pattern gives you:

  • A clean Product type when everything works
  • A structured error when it doesn't
  • A repaired flag so you can track how often your LLM output needs repair (use this to tune prompts over time)

Handling Markdown-Wrapped Responses

When you don't use JSON mode, models frequently wrap their response:

Here's the product data:

```json
{
  "id": "abc-123",
  "name": "Widget Pro"
}
```

Let me know if you need anything else.

Add a markdown extraction step before repair:

function extractJsonFromMarkdown(text: string): string {
  // Try to find a code block
  const fenceMatch = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  if (fenceMatch) return fenceMatch[1].trim();

  // Try to find a bare JSON object or array
  const objectMatch = text.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
  if (objectMatch) return objectMatch[1];

  return text;
}

Or use the Extract JSON from Markdown tool in the browser to test your extraction logic against real examples.

For production code, the full pipeline becomes:

// Note: unlike parseLlmJson above, this version throws on failure, so wrap
// call sites in try/catch
async function parseLlmJsonFull<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<T> {
  // 1. Extract from markdown if needed
  const extracted = extractJsonFromMarkdown(rawOutput);

  // 2. Repair if needed
  let jsonStr: string;
  try {
    JSON.parse(extracted); // check if already valid
    jsonStr = extracted;
  } catch {
    jsonStr = jsonrepair(extracted);
  }

  // 3. Parse and validate
  const parsed = JSON.parse(jsonStr);
  return schema.parse(parsed);
}

Coercion for Common Type Mismatches

LLMs frequently return numbers as strings ("price": "9.99" instead of "price": 9.99). Zod handles this with .coerce:

const ProductSchema = z.object({
  id:    z.string(),
  name:  z.string(),
  price: z.coerce.number(),     // "9.99" → 9.99
  count: z.coerce.number().int(), // "5" → 5
  active: z.preprocess(
    (v) => v === 'true' ? true : v === 'false' ? false : v,
    z.boolean()
  ),
});

Use coercion carefully — it can mask real problems. Use it for known LLM quirks (numbers as strings, booleans as strings) but not as a general-purpose fix.
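The caveat is worth making concrete. In Zod v3, z.coerce.number() runs its input through JavaScript's Number(), and Number() accepts more than you might expect:

```typescript
const wanted   = Number('9.99'); // 9.99, the conversion you actually wanted
const emptyStr = Number('');     // 0: silent bug, an empty string becomes a "valid" price
const fromNull = Number(null);   // 0: another silent zero
const fromBool = Number(true);   // 1
const garbage  = Number('abc');  // NaN, which at least gets rejected downstream
```

An empty string or null coercing to 0 will sail straight past a plain z.coerce.number(); pair it with refinements like .positive() when 0 is not a legitimate value.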

Retry Strategies

When validation fails, you have several options:

Option 1: Retry with error feedback

Feed the validation error back to the model:

async function parseLlmJsonWithRetry<T>(
  prompt: string,
  schema: z.ZodSchema<T>,
  maxRetries = 2,
): Promise<T> {
  let lastError: string | null = null;
  let lastAttempt = '';

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const messages = [
      { role: 'system', content: 'Respond only with valid JSON.' },
      { role: 'user', content: prompt },
    ];

    if (lastError) {
      messages.push(
        { role: 'assistant', content: lastAttempt },
        { role: 'user', content: `The JSON you returned had this error: ${lastError}. Please fix it.` }
      );
    }

    const response = await llm.complete(messages);
    const result = await parseLlmJson(response, schema);

    if ('data' in result) return result.data;

    lastError = result.error;
    lastAttempt = response;
  }

  throw new Error(`Failed after ${maxRetries + 1} attempts: ${lastError}`);
}

Option 2: Use OpenAI Structured Outputs

The cleanest approach — zero repair needed:

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'product',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          id:       { type: 'string' },
          name:     { type: 'string' },
          price:    { type: 'number' },
          category: { type: 'string' },
          inStock:  { type: 'boolean' },
        },
        required: ['id', 'name', 'price', 'category', 'inStock'],
        additionalProperties: false,
      },
    },
  },
  messages: [{ role: 'user', content: '...' }],
});

// With strict: true the output is guaranteed to match the schema, though
// refusals still happen and arrive in message.refusal rather than content
const product = ProductSchema.parse(
  JSON.parse(response.choices[0].message.content!)
);

Generate the JSON Schema from your sample data using the JSON to OpenAI Schema tool.
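If you would rather generate the schema inline, here is a minimal sketch of the idea. schemaFromSample is a hypothetical helper, not the tool's implementation: it only handles flat objects with primitive values:

```typescript
// Derive a flat, strict-mode JSON Schema from a sample object.
// Only top-level primitives are handled; nested objects and arrays are not.
function schemaFromSample(sample: Record<string, unknown>) {
  const jsonType = (v: unknown): string =>
    typeof v === 'number' ? 'number'
    : typeof v === 'boolean' ? 'boolean'
    : 'string';

  const properties = Object.fromEntries(
    Object.entries(sample).map(([key, value]) => [key, { type: jsonType(value) }] as const),
  );

  return {
    type: 'object',
    properties,
    required: Object.keys(sample), // strict mode requires every property listed
    additionalProperties: false,
  };
}

const schema = schemaFromSample({ id: 'abc', price: 9.99, inStock: true });
// schema.properties.price.type === 'number'; all three keys appear in required
```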

Option 3: Partial schema acceptance

Sometimes you need the data even if it's incomplete. Use .partial():

const PartialProductSchema = ProductSchema.partial();
type PartialProduct = z.infer<typeof PartialProductSchema>;

// Accept whatever the model provided
const partialResult = PartialProductSchema.safeParse(parsed);
if (partialResult.success) {
  // Work with what we have
  processPartial(partialResult.data);
}

Observability: Track Parse Quality Over Time

In production, track these metrics:

interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}

If your repaired rate climbs above 20%, your prompts need tuning. If your schemaError rate is non-zero, your Zod schema and your prompt are describing different shapes; fix whichever one is wrong. These metrics tell you where to invest prompt-engineering effort.
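A minimal in-memory tracker for these counters might look like the following (the interface is restated so the snippet stands alone; in production you would forward these to your real metrics backend):

```typescript
interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}

type Outcome = 'parseSuccess' | 'repaired' | 'schemaError' | 'totalFailure';

class LlmJsonTracker {
  private m: LlmJsonMetrics = {
    total: 0, parseSuccess: 0, repaired: 0, schemaError: 0, totalFailure: 0,
  };

  // Count one parse attempt and its outcome
  record(outcome: Outcome): void {
    this.m.total += 1;
    this.m[outcome] += 1;
  }

  // Fraction of outputs that needed jsonrepair; watch for this creeping up
  repairRate(): number {
    return this.m.total === 0 ? 0 : this.m.repaired / this.m.total;
  }

  snapshot(): LlmJsonMetrics {
    return { ...this.m };
  }
}

const tracker = new LlmJsonTracker();
tracker.record('parseSuccess');
tracker.record('parseSuccess');
tracker.record('repaired');
tracker.record('schemaError');
// tracker.repairRate() === 0.25, past the 20% threshold worth investigating
```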

Schema Design for LLM Outputs

Some Zod patterns that work better with LLMs:

Use .default() for optional arrays — LLMs often omit optional arrays entirely rather than returning []:

z.object({
  tags: z.array(z.string()).default([]),  // never undefined
})

Use .transform() to normalize — handle variations in how the model formats values:

z.object({
  status: z.string().transform(s => s.toLowerCase()),
  // "Active", "ACTIVE", "active" → all become "active"
})

Use z.union() for inconsistent types — when the model sometimes returns a string ID and sometimes a number:

z.object({
  id: z.union([z.string(), z.number()]).transform(v => String(v)),
})

Summary

The production-grade LLM JSON pipeline:

  1. Extract — pull JSON out of markdown fences if needed
  2. Repair — run jsonrepair as a fallback for syntax errors
  3. Validate — use Zod's safeParse() for schema validation
  4. Coerce — use z.coerce for known type mismatches
  5. Retry — feed errors back to the model or use structured outputs
  6. Observe — track repair rate and schema error rate

The goal isn't zero errors — LLMs are probabilistic. The goal is catching every error gracefully so a malformed response never reaches your application logic unchecked.


Tools mentioned in this post:

  • Extract JSON from Markdown
  • JSON to OpenAI Schema

Try the JSON Kit tools

Everything mentioned in this post is available as a free browser-side tool.

Browse all 30+ tools →