A Beginner’s Guide to Returning Structured Outputs in LangChain

Apr 18, 2025 - 23:23

When building with AI models, the hardest problems often aren’t about generating text; they’re about controlling it. You don’t just want an answer; you want a predictable, structured response that fits cleanly into your system. Whether you’re feeding outputs into a database, orchestrating tools, or powering a UI, you need data that behaves.

That’s where structured output comes in. Instead of prompting a model and hoping it formats the response correctly, you define a schema for the model to conform to. With this, you can move from loose, text-based interaction to tightly scoped, schema-driven responses.

LangChain makes this not only possible but surprisingly clean. It gives you multiple tools, from schema binding and tool calling to JSON mode and the withStructuredOutput() helper, that make structured generation reliable and production-ready.

This article dives into how to return structured outputs in LangChain, covering key concepts, schema definitions, tool calling, JSON mode, and LangChain’s built-in helper functions to streamline the process. By the end, you'll understand how to make LangChain models return structured, developer-native responses that integrate seamlessly with your applications.

What is Structured Output?

Structured output refers to instructing an AI model to generate responses that follow a predefined format, such as JSON or another schema-based structure. Instead of producing free-text answers, the model outputs data in an organized and predictable way.

Let's say you’re building an AI-powered customer support system. A user asks, "What's the refund policy?" You don’t just want a free-text response. Instead, you want a structured output that separates the answer, potential follow-up questions, and maybe even a relevant link to the refund policy page. This ensures your application can process the response programmatically, display it neatly in a UI, or store it in a database without extra parsing.
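To make the support example concrete, here is a minimal sketch of what such a response could look like as a TypeScript type. The field names (answer, followupQuestions, policyUrl), the renderSupportResponse helper, and the sample data are all hypothetical, chosen for illustration only.

```typescript
// Hypothetical shape for the refund-policy example; field names are illustrative.
interface SupportResponse {
  answer: string;
  followupQuestions: string[];
  policyUrl?: string; // optional: not every answer has a matching page
}

// Because the shape is known up front, rendering needs no text parsing.
function renderSupportResponse(res: SupportResponse): string {
  const lines = [res.answer];
  for (const question of res.followupQuestions) {
    lines.push(`- ${question}`);
  }
  if (res.policyUrl) {
    lines.push(`Read more: ${res.policyUrl}`);
  }
  return lines.join("\n");
}

// Made-up sample data, not a real policy.
const example: SupportResponse = {
  answer: "Refunds are available within 30 days of purchase.",
  followupQuestions: ["How do I start a refund?"],
  policyUrl: "https://example.com/refund-policy",
};

console.log(renderSupportResponse(example));
```

Each field can be routed independently: the answer to the chat window, the follow-up questions to suggestion buttons, and the link to a footer.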

Structured output is all about predictability. Instead of an AI model generating unstructured, free-flowing text, we define a schema, a blueprint that dictates the format of the model's response. This is crucial for applications that rely on AI-generated data because:

✅ Consistency – Every response follows the same structure, making it easier to parse and use.

✅ Reliability – No surprises. The output adheres to a predefined format, preventing unexpected results.

✅ Automation-Friendly – Structured responses integrate seamlessly with databases, APIs, and other systems.

Structured vs. Unstructured Output: A Quick Comparison

Feature	Unstructured Output (Free Text)	Structured Output (Schema-Based)
Format	Plain text, varies each time	Consistent, predefined structure
Ease of Parsing	Hard to extract key info programmatically	Easy to process automatically
Use Case	Chatbots, casual conversations	APIs, databases, automation tasks

For AI developers, structured output is a necessity when integrating AI-generated responses into larger systems.

Now that we understand why structured output matters, let’s explore how to implement it in LangChain.

Methods for Returning Structured Outputs in LangChain

LangChain offers several ways to enforce structured responses. These methods help ensure that AI models return data in a reliable format that can be easily processed by your application.

Using Schema Definitions

The first strategy in enforcing structured output is defining a schema. A schema is like a template that defines how an AI model should format its response. Think of it as setting ground rules for what the output should contain and how it should be structured. Without a schema, models generate open-ended text, which can be inconsistent and difficult to process.

In LangChain, schemas can be defined in multiple ways, but the most common approaches are Zod schema definitions (in TypeScript), Pydantic models (in Python), or a plain JSON Schema.

Zod Schema Definition (TypeScript): Zod is a powerful TypeScript library for defining and validating object structures. Here’s how we can define a schema using Zod:

import { z } from "zod";

const ResponseSchema = z.object({
  answer: z.string().describe("The answer to the user's question"),
  followup_question: z.string().describe("A possible follow-up question"),
});

This schema ensures that every AI response includes:

✔ An "answer" field – The AI-generated answer.

✔ A "followup_question" field – A suggested follow-up question.
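Beyond documenting the shape, a Zod schema can also validate data at runtime and derive a static type. The sketch below uses Zod's safeParse and z.infer; the StructuredResponse type name and the sample objects are made up for illustration.

```typescript
import { z } from "zod";

const ResponseSchema = z.object({
  answer: z.string().describe("The answer to the user's question"),
  followup_question: z.string().describe("A possible follow-up question"),
});

// Derive a static TypeScript type from the schema.
type StructuredResponse = z.infer<typeof ResponseSchema>;

// A well-formed object passes validation.
const good = ResponseSchema.safeParse({
  answer: "Mitochondria",
  followup_question: "What is ATP?",
});

// An object missing a required field fails, without throwing.
const bad = ResponseSchema.safeParse({ answer: "Mitochondria" });

if (good.success) {
  const data: StructuredResponse = good.data;
  console.log(data.followup_question);
}
if (!bad.success) {
  // Each issue records which field failed and why.
  console.log(bad.error.issues.map((issue) => issue.path.join(".")));
}
```

Using safeParse rather than parse keeps failures as ordinary values, which is convenient when a malformed model response should trigger a fallback instead of an exception.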

JSON Schema Definition: JSON Schema is a widely accepted format for defining structured data. It is especially useful for enforcing structure in systems that don’t use TypeScript. Here’s the same schema in JSON Schema format:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "ResponseSchema",
  "type": "object",
  "properties": {
    "answer": {
      "type": "string",
      "description": "The answer to the user's question"
    },
    "followup_question": {
      "type": "string",
      "description": "A possible follow-up question"
    }
  },
  "required": ["answer", "followup_question"]
}

Both Zod and JSON Schema provide a way to define the expected output, but defining the schema alone isn’t enough—we also need a way to enforce it when the AI generates a response.
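To illustrate what enforcing a schema means in practice, here is a deliberately minimal, dependency-free sketch that checks a parsed object against the flat JSON Schema above. The checkAgainstSchema helper and the FlatSchema type are hypothetical and handle only required fields and primitive type checks; a real project would typically rely on a full JSON Schema validator library instead.

```typescript
// Only the subset of JSON Schema used by the example above.
interface FlatSchema {
  type: "object";
  properties: Record<string, { type: "string" | "number" | "boolean" }>;
  required: string[];
}

const responseSchema: FlatSchema = {
  type: "object",
  properties: {
    answer: { type: "string" },
    followup_question: { type: "string" },
  },
  required: ["answer", "followup_question"],
};

// Returns a list of human-readable problems; an empty list means the value conforms.
function checkAgainstSchema(value: unknown, schema: FlatSchema): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["value is not an object"];
  }
  const record = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!(key in record)) errors.push(`missing required field "${key}"`);
  }
  for (const [key, spec] of Object.entries(schema.properties)) {
    if (key in record && typeof record[key] !== spec.type) {
      errors.push(`field "${key}" should be of type ${spec.type}`);
    }
  }
  return errors;
}

console.log(checkAgainstSchema({ answer: "Mitochondria" }, responseSchema));
```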

Tool Calling

Tool calling lets a model request that an external tool be run, passing arguments that follow the tool's schema. Because those arguments have to match the schema, tool calling doubles as a way to get structured responses in a specific data format.

How Tool Calling Works in LangChain

1. Define a tool whose schema describes the structure you want.
2. Bind the tool to the model so the model knows the schema and can call the tool.
3. Invoke the model; when it calls the tool, the structured data arrives as the tool call's arguments.

Example:

import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";
import { tool } from "@langchain/core/tools"; 

const ResponseSchema = z.object({
  answer: z.string().describe("The answer to the user's question"),
  followup_question: z.string().describe("A possible follow-up question"),
});

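const model = new ChatOpenAI({
  model: "gpt-4",
  temperature: 0,
});

// Create a tool with the predefined schema. The function body is a no-op
// because the tool exists only to carry the schema.
const responseFormatterTool = tool(async () => {}, {
  name: "responseFormatter",
  description: "Format the final response to the user",
  schema: ResponseSchema,
});

// Bind the tool to the model
const modelWithTools = model.bindTools([responseFormatterTool]);

// Invoke the model
const aiMsg = await modelWithTools.invoke("What is the powerhouse of the cell?");

// The structured data lives in the tool call arguments, not in aiMsg.content
console.log(aiMsg.tool_calls?.[0]?.args);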

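With tool calling, whenever the model calls the tool, the arguments it supplies follow the schema, which gives you consistency and reliability. Keep in mind that the model can still choose to answer in plain text unless tool use is forced, for example through the provider's tool-choice setting.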
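Invoking a tool-bound model returns a message object rather than the structured data itself; in LangChain, the arguments the model supplied are exposed on the message's tool_calls array. The sketch below shows one way to pull them out. To stay self-contained it uses a simplified MessageLike type that mirrors only the content and tool_calls fields; the extractToolArgs helper and the sample message are hypothetical.

```typescript
// Simplified stand-ins for the fields of interest on a chat model's reply.
interface ToolCallLike {
  name: string;
  args: Record<string, unknown>;
  id?: string;
}

interface MessageLike {
  content: string;
  tool_calls?: ToolCallLike[];
}

// Returns the arguments of the first call to `toolName`, or undefined when
// the model replied in plain text without calling that tool.
function extractToolArgs(
  msg: MessageLike,
  toolName: string
): Record<string, unknown> | undefined {
  const call = (msg.tool_calls ?? []).find((c) => c.name === toolName);
  return call?.args;
}

// Made-up message, shaped like a reply that called the formatter tool.
const sampleMsg: MessageLike = {
  content: "",
  tool_calls: [
    {
      name: "responseFormatter",
      args: {
        answer: "The mitochondria.",
        followup_question: "What does ATP do?",
      },
      id: "call_1",
    },
  ],
};

const args = extractToolArgs(sampleMsg, "responseFormatter");
console.log(args?.answer);
```

When the helper returns undefined, the model answered in free text, which is a case worth handling explicitly rather than assuming a tool call is always present.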

JSON Mode

Some AI models, including OpenAI's GPT models, support a special JSON Mode that forces responses to be formatted as valid JSON objects. This is a form of structured output, but unlike schema-based methods, the structure is implied through the prompt, not programmatically enforced.

How JSON Mode Works in LangChain:

You enable JSON Mode when initializing the model and provide an instruction in natural language that describes the expected JSON shape.


import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  // JSON mode needs a model version that supports response_format (e.g. gpt-4o)
  model: "gpt-4o",
}).bind({
  response_format: { type: "json_object" },
});

const aiMsg = await model.invoke(
  "Return a JSON object with key 'random_nums' and a value of 10 random numbers in [0-99]"
);

console.log(aiMsg.content);
// Example output (values will vary): { "random_nums": [23, 47, 89, 15, 34, 76, 58, 3, 62, 91] }

Since the response is returned as a string, you still need to parse it:

// content is typed as a string or an array of content parts; JSON mode returns a string
const jsonObject = JSON.parse(aiMsg.content as string);
console.log(jsonObject.random_nums);
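Because JSON mode constrains the syntax of the reply but not its shape, it can help to wrap the parse step defensively. The sketch below is one possible approach; parseRandomNums is a hypothetical helper, and its unknown parameter type reflects that a message's content is not always a plain string.

```typescript
// Parses JSON-mode output and checks that it has the shape the prompt asked for.
// Returns null instead of throwing when anything is off.
function parseRandomNums(content: unknown): number[] | null {
  if (typeof content !== "string") return null; // e.g. multi-part content
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return null; // not valid JSON (for example, a truncated response)
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const nums = (parsed as Record<string, unknown>).random_nums;
  if (!Array.isArray(nums)) return null;
  if (!nums.every((n) => typeof n === "number")) return null;
  return nums as number[];
}

console.log(parseRandomNums('{"random_nums": [23, 47, 89]}'));
console.log(parseRandomNums('{"numbers": [1, 2, 3]}'));
```

The second call returns null because the key name differs from what the prompt requested, which is exactly the kind of drift JSON mode alone does not prevent.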

Why JSON Mode is Useful:

Produces syntactically valid JSON, so a plain JSON.parse() call is enough, with no cleanup of stray text needed.
Reduces the risk of malformed text outputs.
Useful when working with APIs or systems that expect clean JSON.

Note: If you need to strictly enforce both structure and format, use withStructuredOutput() instead. JSON Mode is powerful, but it relies on the model following instructions—not on actual schema enforcement.

Each method—Schema Definitions, Tool Calling, and JSON Mode—offers a unique way to return structured outputs in LangChain. However, they all come with their own trade-offs.

To simplify the process, LangChain provides a built-in helper function: withStructuredOutput(), which we’ll explore next. This function automates schema binding and output parsing, making structured output even more reliable.

The withStructuredOutput() Method

We’ve explored different ways to enforce structured output in LangChain, but each method comes with its own complexities—parsing tool call arguments, enforcing tool usage, and handling JSON parsing manually.

This is where withStructuredOutput() comes in.

withStructuredOutput() is a wrapper function that binds a schema to the model, ensuring that all outputs follow a predefined structure. It also automatically parses the model’s response, so there’s no need for manual parsing. This makes enforcing structured outputs much simpler compared to using raw tool calling or JSON mode.

Instead of dealing with complex parsing logic, withStructuredOutput() lets you focus on getting clean, structured responses effortlessly.
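Conceptually, the helper combines two steps: calling the model and parsing the reply against a schema. The dependency-free sketch below illustrates that idea only; it is not LangChain's implementation. The names withStructure, fakeModel, and isStructuredAnswer are hypothetical, and the "model" is a stub that returns a canned JSON string.

```typescript
type Model = (prompt: string) => Promise<string>;

interface StructuredAnswer {
  answer: string;
  followup_question: string;
}

// Type guard standing in for a schema.
function isStructuredAnswer(value: unknown): value is StructuredAnswer {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.answer === "string" &&
    typeof record.followup_question === "string"
  );
}

// Wraps a text-returning model so callers receive a parsed, checked object.
function withStructure<T>(
  model: Model,
  guard: (value: unknown) => value is T
): (prompt: string) => Promise<T> {
  return async (prompt: string): Promise<T> => {
    const raw = await model(prompt);
    const parsed: unknown = JSON.parse(raw);
    if (!guard(parsed)) {
      throw new Error("Model output did not match the expected structure");
    }
    return parsed;
  };
}

// Stub model with a canned reply, so the sketch runs without an API key.
const fakeModel: Model = async () =>
  '{"answer": "Mitochondria", "followup_question": "What is ATP?"}';

const structuredModel = withStructure(fakeModel, isStructuredAnswer);

structuredModel("What is the powerhouse of the cell?").then((out) => {
  console.log(out.answer);
});
```

The caller never sees the raw string: it either receives a typed object or gets an error, which is the same contract the real helper aims to provide.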

How withStructuredOutput() Works

Using withStructuredOutput(), we can enforce a structured response in just three simple steps:

1️⃣ Define a Schema – Use Zod to specify the expected output format.

2️⃣ Bind the Schema to the Model – Use withStructuredOutput() to attach the schema to the model.

3️⃣ Invoke the Model – Get a structured response that automatically matches the schema.

Example:

Let’s say we want the model to return a structured response with an answer and a follow-up question. Here’s how easy it is with withStructuredOutput():

import { ChatOpenAI } from "@langchain/openai";
import { z } from "zod";

// Step 1: Define the schema
const ResponseSchema = z.object({
  answer: z.string().describe("The answer to the user's question"),
  followup_question: z.string().describe("A follow-up question the user could ask"),
});

// Step 2: Bind the schema to the model
const model = new ChatOpenAI({ model: "gpt-4", temperature: 0 });
const modelWithStructure = model.withStructuredOutput(ResponseSchema);

// Step 3: Invoke the model and get a structured response
const structuredOutput = await modelWithStructure.invoke(
  "What is the powerhouse of the cell?"
);

console.log(structuredOutput);

// Example output (exact wording will vary):
// {
//   answer: "Mitochondria are the powerhouse of the cell.",
//   followup_question: "What role does ATP play in cellular energy?"
// }

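withStructuredOutput() is usually the simplest and most reliable way to enforce structured outputs in LangChain. It removes the pain points of manual parsing and schema enforcement, allowing developers to focus on building robust AI-driven applications. Under the hood it relies on the model's tool calling or JSON mode support, so it works only with models that offer one of these.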

Conclusion

Structured outputs are essential for ensuring AI-generated responses are consistent, predictable, and easy to integrate into real-world applications. Whether you're storing data in a database, powering an API, or building an AI-driven tool, structured responses eliminate ambiguity and streamline automation.

LangChain makes implementing structured outputs simple and efficient by providing built-in support for schema definitions, tool calling, JSON mode, and the withStructuredOutput() method. These features allow developers to enforce structured responses without complex parsing or unreliable free-text outputs.