How to Validate JSON from an LLM in Production

LLMs produce unreliable JSON. Here's the production-grade pattern for parsing, repairing, and validating LLM output with Zod — with real error handling.

7 min read · Tags: llm, zod, production, typescript

Parsing JSON from a language model in production is not the same as parsing JSON from a well-behaved REST API. LLMs produce syntactically broken output, schema-mismatched output, and sometimes no JSON at all. The naive JSON.parse(response) blows up in production constantly.

This post covers the full production pattern: repair → parse → validate → type-safe output.

The Problem Stack

LLM JSON failures fall into three categories:

1. Syntax errors — the output isn't valid JSON at all

  • Trailing commas: {"a": 1,}
  • Single quotes: {'key': 'val'}
  • Python literals: True, False, None
  • Markdown fences: ```json {...} ```
  • Truncated output (hit max_tokens)

2. Schema errors — the JSON is valid but doesn't match your expected shape

  • Missing required fields
  • Wrong types ("42" instead of 42)
  • Extra fields you didn't ask for
  • Differently structured nesting than expected

3. Content failures — the model refused or hallucinated

  • Plain text response: "I cannot provide that information"
  • Empty response
  • JSON that looks right but has hallucinated field values

A production parser needs to handle all three categories gracefully.
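A quick, dependency-free illustration of the first two categories (the strings are made up but typical of real model output):

```typescript
// Category 1: not valid JSON at all. JSON.parse throws on every one of these.
const syntaxBroken = [
  '{"a": 1,}',        // trailing comma
  "{'key': 'val'}",   // single quotes
  '{"flag": True}',   // Python literal
  '{"a": 1, "b":',    // truncated at max_tokens
];

const parseFailures = syntaxBroken.filter((s) => {
  try { JSON.parse(s); return false; } catch { return true; }
}).length;
// parseFailures === 4: a strict parse rejects all of them

// Category 2: valid JSON, wrong shape. JSON.parse succeeds, the contract does not.
const parsed = JSON.parse('{"id": "abc", "name": "Widget", "price": "42"}');
const shapeErrors: string[] = [];
if (typeof parsed.price !== 'number') shapeErrors.push('price: expected number, got string');
if (!('category' in parsed)) shapeErrors.push('category: missing');
// shapeErrors has two entries: this is the layer schema validation covers
```

Category 3 needs no code: a plain-text refusal fails the parse the same way category 1 does, which is why the error path below keeps the raw output for logging.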

The Production Pattern

import { jsonrepair } from 'jsonrepair';
import { z } from 'zod';

// Step 1: Define your expected schema
const ProductSchema = z.object({
  id:          z.string().uuid(),
  name:        z.string().min(1),
  price:       z.number().positive(),
  category:    z.string(),
  inStock:     z.boolean(),
  tags:        z.array(z.string()).default([]),
  description: z.string().optional(),
});

type Product = z.infer<typeof ProductSchema>;

// Step 2: Build a robust LLM JSON parser
async function parseLlmJson<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<{ data: T; repaired: boolean } | { error: string; raw: string }> {
  const jsonString = rawOutput.trim();
  let repaired = false;

  // Try direct parse first
  try {
    const data = schema.parse(JSON.parse(jsonString));
    return { data, repaired: false };
  } catch {
    // not valid JSON or schema mismatch — try repair
  }

  // Attempt repair
  try {
    const fixed = jsonrepair(jsonString);
    const parsed = JSON.parse(fixed);
    repaired = true;

    const result = schema.safeParse(parsed);
    if (result.success) {
      return { data: result.data, repaired };
    }

    // Schema validation failed even after repair
    return {
      error: `Schema mismatch: ${JSON.stringify(result.error.flatten().fieldErrors)}`,
      raw: rawOutput,
    };
  } catch {
    return { error: 'Could not parse as JSON', raw: rawOutput };
  }
}

// Usage
const result = await parseLlmJson(llmResponse, ProductSchema);
if ('error' in result) {
  // Handle failure — log, retry, or fall back
  console.error('LLM JSON parse failed:', result.error);
} else {
  if (result.repaired) {
    // Optionally log repaired outputs for prompt tuning
    metrics.increment('llm.json.repaired');
  }
  const product: Product = result.data;
}

This pattern gives you:

  • A clean Product type when everything works
  • A structured error when it doesn't
  • A repaired flag so you can track how often your LLM output needs repair (use this to tune prompts over time)

Handling Markdown-Wrapped Responses

When you don't use JSON mode, models frequently wrap their response:

Here's the product data:

```json
{
  "id": "abc-123",
  "name": "Widget Pro"
}
```

Let me know if you need anything else.

Add a markdown extraction step before repair:

function extractJsonFromMarkdown(text: string): string {
  // Try to find a code block
  const fenceMatch = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  if (fenceMatch) return fenceMatch[1].trim();

  // Try to find a bare JSON object or array
  const objectMatch = text.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
  if (objectMatch) return objectMatch[1];

  return text;
}

Or use the Extract JSON from Markdown tool in the browser to test your extraction logic against real examples.

For production code, the full pipeline becomes:

// Note: unlike parseLlmJson above, this version throws on failure, so wrap
// call sites in try/catch
async function parseLlmJsonFull<T>(
  rawOutput: string,
  schema: z.ZodSchema<T>,
): Promise<T> {
  // 1. Extract from markdown if needed
  const extracted = extractJsonFromMarkdown(rawOutput);

  // 2. Repair if needed
  let jsonStr: string;
  try {
    JSON.parse(extracted); // check if already valid
    jsonStr = extracted;
  } catch {
    jsonStr = jsonrepair(extracted);
  }

  // 3. Parse and validate
  const parsed = JSON.parse(jsonStr);
  return schema.parse(parsed);
}

Coercion for Common Type Mismatches

LLMs frequently return numbers as strings ("price": "9.99" instead of "price": 9.99). Zod handles this with .coerce:

const ProductSchema = z.object({
  id:    z.string(),
  name:  z.string(),
  price: z.coerce.number(),     // "9.99" → 9.99
  count: z.coerce.number().int(), // "5" → 5
  active: z.preprocess(
    (v) => v === 'true' ? true : v === 'false' ? false : v,
    z.boolean()
  ),
});

Use coercion carefully — it can mask real problems. Use it for known LLM quirks (numbers as strings, booleans as strings) but not as a general-purpose fix.
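The caveat is worth making concrete. In Zod v3, z.coerce.number() runs its input through JavaScript's Number(), and Number() accepts more than you might expect:

```typescript
const wanted   = Number('9.99'); // 9.99, the conversion you actually wanted
const emptyStr = Number('');     // 0: silent bug, an empty string becomes a "valid" price
const fromNull = Number(null);   // 0: another silent zero
const fromBool = Number(true);   // 1
const garbage  = Number('abc');  // NaN, which at least gets rejected downstream
```

An empty string or null coercing to 0 will sail straight past a plain z.coerce.number(); pair it with refinements like .positive() when 0 is not a legitimate value.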

Retry Strategies

When validation fails, you have several options:

Option 1: Retry with error feedback

Feed the validation error back to the model:

async function parseLlmJsonWithRetry<T>(
  prompt: string,
  schema: z.ZodSchema<T>,
  maxRetries = 2,
): Promise<T> {
  let lastError: string | null = null;
  let lastAttempt = '';

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const messages = [
      { role: 'system', content: 'Respond only with valid JSON.' },
      { role: 'user', content: prompt },
    ];

    if (lastError) {
      messages.push(
        { role: 'assistant', content: lastAttempt },
        { role: 'user', content: `The JSON you returned had this error: ${lastError}. Please fix it.` }
      );
    }

    const response = await llm.complete(messages);
    const result = await parseLlmJson(response, schema);

    if ('data' in result) return result.data;

    lastError = result.error;
    lastAttempt = response;
  }

  throw new Error(`Failed after ${maxRetries + 1} attempts: ${lastError}`);
}

Option 2: Use OpenAI Structured Outputs

The cleanest approach — zero repair needed:

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'product',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          id:       { type: 'string' },
          name:     { type: 'string' },
          price:    { type: 'number' },
          category: { type: 'string' },
          inStock:  { type: 'boolean' },
        },
        required: ['id', 'name', 'price', 'category', 'inStock'],
        additionalProperties: false,
      },
    },
  },
  messages: [{ role: 'user', content: '...' }],
});

// With strict: true the output is guaranteed to match the schema, though
// refusals still happen and arrive in message.refusal rather than content
const product = ProductSchema.parse(
  JSON.parse(response.choices[0].message.content!)
);

Generate the JSON Schema from your sample data using the JSON to OpenAI Schema tool.
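If you would rather generate the schema inline, here is a minimal sketch of the idea. schemaFromSample is a hypothetical helper, not the tool's implementation: it only handles flat objects with primitive values:

```typescript
// Derive a flat, strict-mode JSON Schema from a sample object.
// Only top-level primitives are handled; nested objects and arrays are not.
function schemaFromSample(sample: Record<string, unknown>) {
  const jsonType = (v: unknown): string =>
    typeof v === 'number' ? 'number'
    : typeof v === 'boolean' ? 'boolean'
    : 'string';

  const properties = Object.fromEntries(
    Object.entries(sample).map(([key, value]) => [key, { type: jsonType(value) }] as const),
  );

  return {
    type: 'object',
    properties,
    required: Object.keys(sample), // strict mode requires every property listed
    additionalProperties: false,
  };
}

const schema = schemaFromSample({ id: 'abc', price: 9.99, inStock: true });
// schema.properties.price.type === 'number'; all three keys appear in required
```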

Option 3: Partial schema acceptance

Sometimes you need the data even if it's incomplete. Use .partial():

const PartialProductSchema = ProductSchema.partial();
type PartialProduct = z.infer<typeof PartialProductSchema>;

// Accept whatever the model provided
const partialResult = PartialProductSchema.safeParse(parsed);
if (partialResult.success) {
  // Work with what we have
  processPartial(partialResult.data);
}

Observability: Track Parse Quality Over Time

In production, track these metrics:

interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}

If your repaired rate climbs above 20%, your prompts need tuning. If your schemaError rate is non-zero, your Zod schema and your prompt are describing different shapes; fix whichever one is wrong. These metrics tell you where to invest prompt-engineering effort.
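A minimal in-memory tracker for these counters might look like the following (the interface is restated so the snippet stands alone; in production you would forward these to your real metrics backend):

```typescript
interface LlmJsonMetrics {
  total: number;
  parseSuccess: number;
  repaired: number;
  schemaError: number;
  totalFailure: number;
}

type Outcome = 'parseSuccess' | 'repaired' | 'schemaError' | 'totalFailure';

class LlmJsonTracker {
  private m: LlmJsonMetrics = {
    total: 0, parseSuccess: 0, repaired: 0, schemaError: 0, totalFailure: 0,
  };

  // Count one parse attempt and its outcome
  record(outcome: Outcome): void {
    this.m.total += 1;
    this.m[outcome] += 1;
  }

  // Fraction of outputs that needed jsonrepair; watch for this creeping up
  repairRate(): number {
    return this.m.total === 0 ? 0 : this.m.repaired / this.m.total;
  }

  snapshot(): LlmJsonMetrics {
    return { ...this.m };
  }
}

const tracker = new LlmJsonTracker();
tracker.record('parseSuccess');
tracker.record('parseSuccess');
tracker.record('repaired');
tracker.record('schemaError');
// tracker.repairRate() === 0.25, past the 20% threshold worth investigating
```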

Schema Design for LLM Outputs

Some Zod patterns that work better with LLMs:

Use .default() for optional arrays — LLMs often omit optional arrays entirely rather than returning []:

z.object({
  tags: z.array(z.string()).default([]),  // never undefined
})

Use .transform() to normalize — handle variations in how the model formats values:

z.object({
  status: z.string().transform(s => s.toLowerCase()),
  // "Active", "ACTIVE", "active" → all become "active"
})

Use z.union() for inconsistent types — when the model sometimes returns a string ID and sometimes a number:

z.object({
  id: z.union([z.string(), z.number()]).transform(v => String(v)),
})

Summary

The production-grade LLM JSON pipeline:

  1. Extract — pull JSON out of markdown fences if needed
  2. Repair — run jsonrepair as a fallback for syntax errors
  3. Validate — use Zod's safeParse() for schema validation
  4. Coerce — use z.coerce for known type mismatches
  5. Retry — feed errors back to the model or use structured outputs
  6. Observe — track repair rate and schema error rate

The goal isn't zero errors — LLMs are probabilistic. The goal is catching every error gracefully so a malformed response never reaches your application logic unchecked.


Tools mentioned in this post:

  • Extract JSON from Markdown
  • JSON to OpenAI Schema

Try the JSON Kit tools

Everything mentioned in this post is available as a free browser-side tool.

Browse all 30+ tools →