Guides

Building Structured Output / JSON Mode with open-source t...

This guide outlines the technical implementation of schema-validated LLM outputs for production data pipelines. It focuses on moving beyond basic prompt engineering to a robust, type-safe architecture that uses Zod and modern SDKs to deliver reliably parseable, schema-conformant results.

45 minutes · 6 steps
1

Define the Zod Schema with Semantic Descriptions

Create a Zod schema that defines your data structure. Crucially, call the .describe() method on each field: most LLM providers forward these descriptions to the model as per-field instructions, giving it the context it needs to populate each field correctly during generation.

schema.ts
import { z } from 'zod';

export const ExtractionSchema = z.object({
  companyName: z.string().describe('The full legal name of the entity'),
  valuation: z.number().describe('USD valuation in millions'),
  tags: z.array(z.string()).max(5).describe('Industry keywords'),
  sentiment: z.enum(['positive', 'neutral', 'negative'])
});

⚠ Common Pitfalls

  • Avoid deeply nested objects, which increase the likelihood of the model losing context.
  • Do not leave fields ambiguous; a missing description often leads to hallucinated data formats.
2

Select the Enforcement Mode

Decide between 'JSON Mode' and 'Tool/Function Calling'. JSON Mode ensures the output is valid JSON but doesn't guarantee schema adherence. Tool/Function Calling (or OpenAI's 'Structured Outputs' feature) forces the model to follow the specific JSON schema provided in the API call.

api-call.ts
import OpenAI from 'openai';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { ExtractionSchema } from './schema';

const openai = new OpenAI();

// OpenAI Structured Output Example
const response = await openai.chat.completions.create({
  model: 'gpt-4o-2024-08-06',
  messages: [{ role: 'user', content: 'Extract data from...' }],
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'extraction',
      strict: true,
      schema: zodToJsonSchema(ExtractionSchema)
    }
  }
});

⚠ Common Pitfalls

  • Using JSON Mode without a validation library like Zod on the client side will lead to runtime crashes when keys are missing.
  • Strict mode in OpenAI requires every field to be listed as required in the JSON schema; model optional data with .nullable() rather than .optional().
3

Implement the Extraction Logic with Instructor or Vercel AI SDK

Using a wrapper library like 'Instructor' or 'Vercel AI SDK' abstracts the boilerplate of converting Zod to JSON Schema and handling the parsing. This reduces the risk of manual parsing errors.

generate.ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { ExtractionSchema } from './schema';

const { object } = await generateObject({
  model: openai('gpt-4o-mini'),
  schema: ExtractionSchema,
  prompt: 'Analyze this press release: [TEXT]',
});

console.log(object.companyName); // Fully typed
4

Handle Partial Results in Streaming UI

If building a dashboard, use partial streaming to show data as it is generated. This improves perceived latency. Use libraries that can parse incomplete JSON chunks into a partial object state.

stream.ts
import { streamObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { ExtractionSchema } from './schema';

const { partialObjectStream } = await streamObject({
  model: openai('gpt-4o'),
  schema: ExtractionSchema,
  prompt: 'Extracting...',
});

for await (const partial of partialObjectStream) {
  console.log(partial); // Partial object updates
}

⚠ Common Pitfalls

  • Do not attempt to use partial data for database writes; only use it for UI state.
5

Implement Automated Retries for Validation Failures

Even with structured output modes, models can fail due to token limits or internal errors. Implement a retry loop (maximum 2-3 attempts) that feeds the validation error back into the next prompt to 'self-correct' the output.

⚠ Common Pitfalls

  • Infinite retry loops can quickly drain API credits if a schema is impossible for the model to satisfy.
  • Generic error messages like 'invalid json' don't help the model; pass the specific Zod error path and message.
6

Benchmark and Cost Optimization

Structured output consumes more tokens than raw text because keys, quotes, and braces are all billed as output tokens. Compare smaller models such as GPT-4o-mini and Gemini 1.5 Flash for high-volume pipelines, and use task-specific benchmarks to confirm that the smaller model maintains the required extraction accuracy for your particular schema.
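A back-of-the-envelope estimator can make the cost trade-off concrete. The per-token prices and the 4-characters-per-token heuristic below are rough assumptions for illustration; check current provider rate cards and use a real tokenizer before committing to a budget.

cost.ts

```typescript
// Illustrative USD prices per 1M output tokens (assumptions, not current rates).
const OUTPUT_PRICE_USD_PER_1M: Record<string, number> = {
  'gpt-4o-mini': 0.6,
  'gemini-1.5-flash': 0.3,
};

// Crude token estimate: roughly 4 characters per token for English JSON.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Monthly output-token cost for one schema at a given record volume.
function monthlyOutputCost(
  sampleOutput: string,
  recordsPerMonth: number,
  model: string,
): number {
  const tokensPerRecord = estimateTokens(sampleOutput);
  return (tokensPerRecord * recordsPerMonth * OUTPUT_PRICE_USD_PER_1M[model]) / 1_000_000;
}

// A representative record from the step 1 schema: note how much of it
// is repeated key overhead rather than extracted data.
const sample = JSON.stringify({
  companyName: 'Acme Corp',
  valuation: 120,
  tags: ['saas', 'fintech'],
  sentiment: 'positive',
});

console.log(monthlyOutputCost(sample, 1_000_000, 'gpt-4o-mini').toFixed(2));
```

Running the same sample against both price entries shows why repetitive JSON keys in large arrays (the first pitfall below) dominate cost at scale.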

⚠ Common Pitfalls

  • Ignoring the 'output token' cost of repetitive JSON keys in large arrays.
  • Assuming a model that works for a simple schema will work for a complex one without re-testing.

What you built

Transitioning to strictly typed structured outputs eliminates the most common failure point in LLM integrations. By combining Zod schemas with provider-native enforcement (like OpenAI Strict Mode) and robust client-side parsing, you can build reliable data pipelines that function like traditional APIs.