Structured Output for Agent

Build agents that return reliable, schema-validated structured responses.

Structured output means the agent returns its answer in a format you define, such as a dictionary, JSON-like object, or typed schema. Instead of replying with free-form text, the agent fills in the exact fields and structure you specify.

Why Structured Output?

Structured output is useful because:

It gives consistent and predictable responses
Your code can easily read and use the output without parsing text
It prevents the model from adding extra text or changing the format
It makes agents reliable for automation, APIs, databases, UI generation, and workflows

Enforcing a simple schema

We will use the pydantic package to define the output schema. Pydantic is a data validation and settings management library built on Python type annotations. More about pydantic: https://docs.pydantic.dev/latest/

Make sure the pydantic package is installed: pip install pydantic

from pydantic import BaseModel, Field
from peargent import create_agent
from peargent.models import openai

# 1. Define your schema
class Summary(BaseModel):
    title: str = Field(description="Short title for the summary")
    points: list[str] = Field(description="Key points extracted from the text")

# 2. Create agent with structured output
agent = create_agent(
    name="Summarizer",
    description="Summarizes long text into structured key points.",
    persona="You are a precise summarizer.",
    model=openai("gpt-4o"),
    output_schema=Summary,   # ← enforce structured output
)

# 3. Run the agent
result = agent.run("Long text to summarize")
print(result)

Output (always structured):

{
  "title": "Understanding Black Holes",
  "points": [
    "They form when massive stars collapse.",
    "Their gravity is extremely strong.",
    "Nothing can escape once inside the event horizon."
  ]
}
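
To make "schema-validated" concrete, the sketch below uses Pydantic on its own, without Peargent or a model call, to show what the Summary schema accepts and rejects. The JSON strings are hand-written stand-ins for a model reply, not real agent output.

```python
from pydantic import BaseModel, Field, ValidationError


class Summary(BaseModel):
    title: str = Field(description="Short title for the summary")
    points: list[str] = Field(description="Key points extracted from the text")


# A well-formed payload parses into a typed object.
good = '{"title": "Understanding Black Holes", "points": ["They form when massive stars collapse."]}'
summary = Summary.model_validate_json(good)
print(summary.title)
print(len(summary.points))

# A payload that is missing a required field is rejected.
bad = '{"title": "Understanding Black Holes"}'
try:
    Summary.model_validate_json(bad)
    rejected = False
except ValidationError as exc:
    rejected = True
    missing_field = exc.errors()[0]["loc"]  # path to the field that failed
    print(missing_field)
```

Pydantic's error entries carry the field path, which gives a precise message to report or feed back when a reply does not match the schema.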

Schema with Custom Validators

Sometimes models return values that don't match the schema. Peargent integrates with Pydantic validators to enforce rules such as rejecting incorrect values or cleaning fields. If validation fails, Peargent automatically retries, up to max_retries, until the output is valid.

This is particularly useful for ensuring that generated data meets strict business logic requirements, such as validating email formats, checking price ranges, or ensuring strings meet specific length constraints.

from pydantic import BaseModel, Field, field_validator
from peargent import create_agent
from peargent.models import openai

# Define a simple structured output with validators
class Product(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(description="Price in USD", ge=0)
    category: str = Field(description="Product category")

    # Validator: ensure product name is not too generic
    @field_validator("name") 
    @classmethod
    def name_not_generic(cls, v):
        """
        Ensure the product name is not overly generic (e.g., 'item', 'product', 'thing').
        Helps maintain meaningful and descriptive product naming.
        """
        forbidden = ["item", "product", "thing"]
        if v.lower() in forbidden:
            raise ValueError("Product name is too generic")
        return v
    
    # Validator: enforce category capitalization
    @field_validator("category")
    @classmethod
    def category_must_be_titlecase(cls, v):
        """
        Automatically convert the product category into Title Case
        to maintain consistent formatting across all entries.
        """
        return v.title()

# Create agent with structured output validation
agent = create_agent(
    name="ProductGenerator",
    description="Generates product details",
    persona="You describe products clearly and accurately.",
    model=openai("gpt-4o"),
    output_schema=Product,
    max_retries=3
)

product = agent.run("Generate a new gadget idea for travelers")
print(product)

Why validator docstrings matter: Docstrings are included in the prompt sent to the model. They explain your custom validation rules in natural language, helping the LLM avoid mistakes before they happen. This reduces failed validations, retries, and extra API calls, which saves cost and improves reliability.
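
It can help to exercise the validators directly before attaching the schema to an agent. The sketch below is plain Pydantic with no model call; the product values are made up for illustration. It shows one validator that rewrites a value and two rules that reject input.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Product(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(description="Price in USD", ge=0)
    category: str = Field(description="Product category")

    @field_validator("name")
    @classmethod
    def name_not_generic(cls, v):
        """Ensure the product name is not overly generic."""
        if v.lower() in ["item", "product", "thing"]:
            raise ValueError("Product name is too generic")
        return v

    @field_validator("category")
    @classmethod
    def category_must_be_titlecase(cls, v):
        """Convert the product category into Title Case."""
        return v.title()


# The category validator cleans the value instead of rejecting it.
ok = Product(name="FoldPack Adapter", price=29.99, category="travel accessories")
print(ok.category)  # Travel Accessories

# A generic name and a negative price are both rejected and both reported.
try:
    Product(name="Thing", price=-5, category="misc")
    failed_fields = []
except ValidationError as exc:
    failed_fields = sorted(err["loc"][0] for err in exc.errors())
print(failed_fields)  # ['name', 'price']
```

Because Pydantic reports every failing field in one ValidationError, a single retry can address all problems at once.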

Nested Output Schema

You can also nest multiple Pydantic models inside each other. This allows your agent to return clean, hierarchical, and well-organized structured output, perfect for complex data like profiles, products, events, or summaries.

from pydantic import BaseModel, Field
from typing import List
from peargent import create_agent
from peargent.models import openai

# ----- Nested Models -----
class Address(BaseModel):
    street: str = Field(description="Street name")
    city: str = Field(description="City name")
    country: str = Field(description="Country name")

class UserProfile(BaseModel): 
    name: str = Field(description="Full name of the user")
    age: int = Field(description="Age of the user", ge=0, le=120)
    email: str = Field(description="Email address")
    # Nesting Address schema
    address: Address = Field(description="Residential address") 
    hobbies: List[str] = Field(description="List of hobbies")

# ----- Create Agent with Nested Schema -----

agent = create_agent(
    name="ProfileBuilder",
    description="Builds structured user profiles",
    persona="You extract and organize user information accurately.",
    model=openai("gpt-4o"),
    output_schema=UserProfile
)

profile = agent.run("Generate a profile for John Doe who lives in London.")
print(profile)

Output shape:

{
  "name": "John Doe",
  "age": 32,
  "email": "john.doe@example.com",
  "address": {
    "street": "221B Baker Street",
    "city": "London",
    "country": "United Kingdom"
  },
  "hobbies": ["reading", "cycling"]
}
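
The nesting behavior comes from Pydantic itself, so it can be checked without an agent. In this sketch the dictionary stands in for a model reply; field descriptions are omitted for brevity.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Address(BaseModel):
    street: str
    city: str
    country: str


class UserProfile(BaseModel):
    name: str
    age: int = Field(ge=0, le=120)
    email: str
    address: Address
    hobbies: List[str]


data = {
    "name": "John Doe",
    "age": 32,
    "email": "john.doe@example.com",
    "address": {"street": "221B Baker Street", "city": "London", "country": "United Kingdom"},
    "hobbies": ["reading", "cycling"],
}

# The nested dictionary becomes an Address instance, not a plain dict.
profile = UserProfile.model_validate(data)
print(type(profile.address).__name__)  # Address
print(profile.address.city)            # London

# Errors inside a nested model are reported with the full path to the field.
broken = {**data, "address": {"street": "221B Baker Street", "city": "London"}}
try:
    UserProfile.model_validate(broken)
    error_path = None
except ValidationError as exc:
    error_path = exc.errors()[0]["loc"]
print(error_path)  # ('address', 'country')
```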

How Structured Output Works

Output Schema Is Extracted

Agent first reads your output_schema (Pydantic model) and extracts field names, types, required fields, and constraints (e.g., min_length, ge, le). This forms the core JSON schema that the model must follow.
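
Pydantic can produce this kind of JSON schema through its own model_json_schema() method. Whether Peargent calls this method internally is an assumption; the sketch only shows the information a Pydantic model exposes. The min_length constraint and the default value are added here for illustration and are not part of the Product schema above.

```python
from pydantic import BaseModel, Field


class Product(BaseModel):
    name: str = Field(description="Product name", min_length=3)
    price: float = Field(description="Price in USD", ge=0)
    category: str = Field(default="General", description="Product category")


schema = Product.model_json_schema()

# Field names and which of them are required (fields with defaults are optional).
print(sorted(schema["properties"]))  # ['category', 'name', 'price']
print(schema["required"])            # ['name', 'price']

# Constraints are translated into JSON Schema keywords.
print(schema["properties"]["name"]["minLength"])  # 3
print(schema["properties"]["price"]["type"])      # number
```

Note that ge=0 appears in the schema as a "minimum" keyword, so a numeric bound reaches the model as part of the structure rather than as free text.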

Validator Docstrings Are Collected

Next, Agent scans your Pydantic validators and collects the docstrings you wrote inside them. These docstrings describe custom rules in natural language, such as “Name must not be generic” or “Price must be realistic”.

These docstrings are critical because:

The LLM understands rules written in natural language
They reduce retries, which lowers cost
They help the model produce valid JSON on the first attempt
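
One way such a collection step could look is sketched below. collect_validator_docs is a hypothetical helper, not Peargent's API, and it takes a shortcut: it gathers docstrings from every public classmethod declared on the model, which is where validators live in the examples on this page.

```python
import inspect

from pydantic import BaseModel, field_validator


class Product(BaseModel):
    name: str
    category: str

    @field_validator("name")
    @classmethod
    def name_not_generic(cls, v):
        """Ensure the product name is not overly generic (e.g., 'item', 'product', 'thing')."""
        if v.lower() in ["item", "product", "thing"]:
            raise ValueError("Product name is too generic")
        return v

    @field_validator("category")
    @classmethod
    def category_must_be_titlecase(cls, v):
        """Automatically convert the product category into Title Case."""
        return v.title()


def collect_validator_docs(model):
    """Hypothetical helper: map classmethod names declared on `model` to their docstrings."""
    docs = {}
    for attr_name in list(vars(model)):
        if attr_name.startswith("_"):
            continue  # skip private and dunder attributes
        attr = getattr(model, attr_name, None)
        # Classmethods accessed through the class are methods bound to that class.
        if inspect.ismethod(attr) and attr.__self__ is model:
            doc = inspect.getdoc(attr)
            if doc:
                docs[attr_name] = doc
    return docs


docs = collect_validator_docs(Product)
for validator_name, rule in docs.items():
    print(f"{validator_name}: {rule}")
```

A validator without a docstring contributes nothing here, which is one reason to document each rule.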

Schema + Validator Rules Are Combined and Sent to the Model

Agent merges the JSON schema, field constraints, and validator docstrings into a single structured prompt.

At this point, the Model receives:

The complete structure it must output
Every validation rule it must follow
Clear natural-language constraints

This ensures the Model is fully aware of what the final response should look like.
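
The merge step can be pictured as plain string assembly. The prompt wording and the build_prompt helper below are assumptions made for illustration; Peargent's actual prompt format is not shown in this guide.

```python
import json

from pydantic import BaseModel, Field


class Product(BaseModel):
    name: str = Field(description="Product name")
    price: float = Field(description="Price in USD", ge=0)


def build_prompt(task, model, rules):
    """Hypothetical helper: combine the task, the JSON schema, and the rules into one prompt."""
    schema_text = json.dumps(model.model_json_schema(), indent=2)
    rule_lines = "\n".join(f"- {rule}" for rule in rules)
    return (
        f"{task}\n\n"
        "Respond with a single JSON object that matches this schema:\n"
        f"{schema_text}\n\n"
        "Additional rules:\n"
        f"{rule_lines}"
    )


prompt = build_prompt(
    "Generate a new gadget idea for travelers",
    Product,
    ["Ensure the product name is not overly generic."],
)
print(prompt)
```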

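Response Is Validated and Retried on Failure

The Model's response is parsed and validated against your schema. If validation fails:

Agent sends the error back to the Model
The Model tries again
The loop continues until the output is valid or max_retries is reached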
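
The retry behavior described above can be sketched as a small loop. run_with_retries and fake_model are hypothetical names, the feedback message is invented, and treating max_retries as the total number of attempts is an assumption; Peargent may count differently.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str
    price: float = Field(ge=0)


def run_with_retries(call_model, prompt, schema, max_retries=3):
    """Hypothetical loop: validate each reply, feed errors back, stop at max_retries."""
    message = prompt
    last_error = None
    for attempt in range(1, max_retries + 1):
        raw = call_model(message)
        try:
            return schema.model_validate_json(raw), attempt
        except ValidationError as exc:
            last_error = exc
            # Send the validation error back so the next reply can correct it.
            message = f"{prompt}\n\nYour previous reply was invalid:\n{exc}\nPlease try again."
    raise RuntimeError(f"No valid output after {max_retries} attempts") from last_error


# Stand-in for a real model: the first reply breaks the price rule, the second is valid.
replies = iter(['{"name": "TravelMug", "price": -4}', '{"name": "TravelMug", "price": 19.5}'])
seen_messages = []


def fake_model(message):
    seen_messages.append(message)
    return next(replies)


product, attempts = run_with_retries(fake_model, "Generate a gadget", Product)
print(attempts)       # 2
print(product.price)  # 19.5
```

Each failed attempt costs one extra model call, which is why clear field descriptions and validator docstrings pay off.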

Final Clean Pydantic Object Returned

After validation succeeds, Agent returns a fully typed, fully validated, and safe-to-use object.
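
Assuming the returned object is an instance of your Pydantic model, as described above, it can be passed on with Pydantic's standard serialization methods. The Summary instance below is constructed by hand to stand in for an agent result.

```python
import json

from pydantic import BaseModel


class Summary(BaseModel):
    title: str
    points: list[str]


# Hand-built stand-in for the object an agent run would return.
result = Summary(
    title="Understanding Black Holes",
    points=["They form when massive stars collapse.", "Their gravity is extremely strong."],
)

# Typed attribute access, no text parsing needed.
first_point = result.points[0]

# Convert to plain Python data or a JSON string for APIs, databases, or UIs.
as_dict = result.model_dump()
as_json = result.model_dump_json()
print(as_dict["title"])
print(json.loads(as_json) == as_dict)  # True
```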