# Agents


Explore how agents operate, how they use tools and memory, and how to structure them effectively in Peargent.

In simple terms, an Agent calls the **[Model](/docs/models)** to generate responses according to its defined behavior. Agents are the core units of work in Peargent, while **[Tools](/docs/tools)** provide the capabilities that help agents perform actions and tackle complex tasks. Agents can operate individually for simple tasks, or they can be combined into a **[Pool](/docs/pools)** of agents to handle more complex, multi-step workflows.

## Creating an Agent

To create an agent, use the `create_agent` function from the peargent module. At minimum, you must define the agent's `name`, `description`, `persona`, and the `model` to use. Here is a simple example:

```python
from peargent import create_agent
from peargent.models import openai

code_reviewer = create_agent(
    name="Code Reviewer",
    description="Reviews code for issues and improvements",
    persona=(
        "You are a highly skilled senior software engineer and code reviewer. "
        "Your job is to analyze code for correctness, readability, maintainability, and performance. "
        "Identify bugs, edge cases, and bad practices. Suggest improvements that follow modern Python "
        "standards and best engineering principles. Provide clear explanations and, when appropriate, "
        "offer improved code snippets. Always be concise, accurate, and constructive."
    ),
    model=openai("gpt-4")
)
```

Call `agent.run(prompt)` to perform an inference using the agent's persona as the system prompt and your input as the user message.

```python
response = code_reviewer.run("Review this Python function for improvements:\n\ndef add(a, b): return a+b")
print(response)
# The function is correct but could be optimized, here is the optimized version...
```

When running an agent individually, the `description` field is optional. However, it becomes mandatory when the agent is part of a **[Pool](/docs/pools)**.

* Refer to **[Tools](/docs/tools)** to learn how to use tools with agents.
* Refer to **[History](/docs/history)** to learn how to set up conversation memory for agents.
* Refer to **[Pool](/docs/pools)** to learn how to create a pool of agents.

## How an Agent Works
### Start Execution (`agent.run()`)

When you call `agent.run(...)`, the agent prepares for a new interaction: it loads any previous conversation **[History](/docs/history)** (if enabled), begins **tracing** (if enabled), and registers the user's new input.

### Build the Prompt

The agent constructs the full prompt by combining its **persona**, **[Tools](/docs/tools)**, prior conversation context, and optional **output schema**. This prompt is then sent to the configured **[Model](/docs/models)**.

### Model Generates a Response

The model returns a response based on the prompt. The agent records this output and checks whether the model is requesting **tool calls**.

### Execute Tools (If Requested)

If the response includes tool calls, the agent runs those tools (in **parallel** if multiple), collects their outputs, and then asks the model again using an updated prompt. This cycle continues until no more tool actions are required.

### Finalize the Result

The agent checks whether it should stop (stop conditions met or max iterations reached). If an **output schema** was provided, the response is validated against it. Finally, the conversation is synced to **[History](/docs/history)** (if enabled), tracing is ended, and the final response is returned.
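The cycle above can be summarized as a simple loop. The following is only an illustrative sketch of that loop, not Peargent's actual implementation; helper names such as `build_prompt`, `extract_tool_calls`, and `execute_tools` are hypothetical.

```python
# Illustrative sketch of the agent run loop described above (not Peargent's real internals).
# load_history, build_prompt, extract_tool_calls, execute_tools, and save_history are hypothetical helpers.
def run_sketch(agent, user_input, max_iterations=5):
    messages = load_history(agent)                                   # load prior conversation, if enabled
    messages.append({"role": "user", "content": user_input})

    for _ in range(max_iterations):
        prompt = build_prompt(agent.persona, agent.tools, messages)  # persona + tools + context
        response = agent.model.generate(prompt)                      # model produces a response
        messages.append({"role": "assistant", "content": response})

        tool_calls = extract_tool_calls(response)                    # parse any requested tool calls
        if not tool_calls:
            break                                                    # no tools requested, so we are done
        results = execute_tools(agent.tools, tool_calls)             # run tools (in parallel if several)
        messages.append({"role": "tool", "content": results})

    save_history(agent, messages)                                    # persist the conversation, if enabled
    return response
```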
## Parameters

| Parameter       | Type              | Description                                                                                                                                              | Required |
| :-------------- | :---------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------ | :------- |
| `name`          | `str`             | The name of the agent.                                                                                                                                    | Yes      |
| `description`   | `str`             | A brief description of the agent's purpose. Required when using in a **[Pool](/docs/pools)**.                                                             | No\*     |
| `persona`       | `str`             | The system prompt defining the agent's personality and instructions.                                                                                      | Yes      |
| `model`         | `Model`           | The LLM model instance (e.g., `openai("gpt-4")`).                                                                                                         | Yes      |
| `tools`         | `list[Tool]`      | A list of **[Tools](/docs/tools)** the agent can access.                                                                                                  | No       |
| `stop`          | `StopCondition`   | Condition that determines when the agent should stop iterating (default: `limit_steps(5)`).                                                               | No       |
| `history`       | `HistoryConfig`   | Configuration for conversation **[History](/docs/history)**.                                                                                              | No       |
| `tracing`       | `bool \| None`    | Enable/disable tracing. `None` (default) inherits from the global tracer if `enable_tracing()` was called, `True` explicitly enables, `False` opts out.   | No       |
| `output_schema` | `Type[BaseModel]` | Pydantic model for structured output validation.                                                                                                          | No       |
| `max_retries`   | `int`             | Maximum retries for `output_schema` validation (default: `3`). Only used when `output_schema` is provided.                                                | No       |

\* Required when using the agent in a **[Pool](/docs/pools)**.

To learn more about `stop`, `tracing`, `output_schema`, and `max_retries`, refer to **Advanced Features**.
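As an example of the optional parameters, here is a sketch of an agent that validates its reply against a Pydantic schema. The `Review` model below is made up purely for illustration.

```python
from pydantic import BaseModel
from peargent import create_agent
from peargent.models import openai

# Hypothetical schema used only to illustrate output_schema and max_retries.
class Review(BaseModel):
    summary: str
    issues: list[str]

reviewer = create_agent(
    name="StructuredReviewer",
    description="Reviews code and returns structured findings",
    persona="You are a code reviewer. Respond with a summary and a list of issues.",
    model=openai("gpt-4o"),
    output_schema=Review,   # response is validated against this Pydantic model
    max_retries=3           # retry validation up to 3 times (the default)
)

result = reviewer.run("Review: def add(a, b): return a+b")
print(result)
```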

# Examples

The following examples illustrate common ways to use Peargent across different workflows.

* Build a multi-agent creative writing system where dedicated agents generate characters, plot structure, worldbuilding details, and dialogues to produce cohesive short stories from a single prompt.
* Develop an autonomous agent that turns natural-language tasks into runnable Python code, tests it in-process, fixes errors through iterative reasoning, and explains the final solution step-by-step.
* Create a Python-based code review system where multiple specialized agents (style, security, optimization, readability) analyze a file and produce a combined, actionable review report.
* Build an AI agent that analyzes entire folders of text, code, and markdown files to generate structured summaries, glossaries, cross-references, and a unified knowledge map, all from simple local input.
Contribute to Peargent by adding more examples to the docs.

# History

Persistent conversation memory that allows agents to remember past interactions across sessions.

History is Peargent's **persistent conversation memory** system. It allows agents and pools to remember past interactions across sessions, enabling continuity, context-awareness, and long-running workflows.

Think of history like a **notebook** your agent writes in. Each message, tool call, and response is recorded so the agent can look back and recall what happened earlier.

You can pass a `HistoryConfig` to any **[Agent](/docs/agents)** or **[Pool](/docs/pools)**. If a pool receives a history, it overrides individual agent histories so all agents share the same conversation thread. History can be stored using backends such as in-memory, file, SQLite, PostgreSQL, Redis, or custom storage backends.

## Creating History

To create a history, pass a `HistoryConfig` to the `create_agent` or `create_pool` function. `HistoryConfig` is a configuration object that controls how the history of an agent or pool is stored and managed. By default, `HistoryConfig` uses the `InMemory()` storage backend (temporary storage; data is lost when the program exits).

### Adding history to Agents:

```python
from peargent import create_agent
from peargent.history import HistoryConfig
from peargent.models import openai

agent = create_agent(
    name="Assistant",
    description="Helpful assistant with memory",
    persona="You are a helpful assistant.",
    model=openai("gpt-4o"),
    history=HistoryConfig()
)

# First conversation
agent.run("My name is Alice")

# Later conversation - agent remembers
agent.run("What's my name?")
# Output: "Your name is Alice"
```

### Adding history to Pools:

```python
from peargent import create_pool
from peargent.history import HistoryConfig

pool = create_pool(
    agents=[agent1, agent2],
    history=HistoryConfig()
)

# First conversation
pool.run("My name is Alice")

# Later conversation - the pool remembers
pool.run("What's my name?")
# Output: "Your name is Alice"
```

## How History Works
### Load Conversation

When an agent begins a run, it loads the existing conversation thread from the configured **storage backend**.

### Append Messages

Each new user message, tool call, and agent response is added to the conversation thread in order.

### Manage Context

If the conversation grows beyond `max_context_messages`, the configured **[strategy](/docs/history#strategies)** (trim or summarize) is applied to keep the context window manageable.

### Persist Data

All updates are saved back to the **[storage backend](/docs/history#storage-backends)**, ensuring the conversation history is retained across sessions and future runs.
Because history supports many advanced capabilities (custom storage backends, manual thread control, serialization, and low-level message operations), listing every option here would make this page too large. For deeper configuration and advanced usage, see **[Advanced History](/docs/Advanced%20History)**.

## Storage Backends

History can be stored in different backends depending on your use case. Here are all supported backends available in Peargent:

```python
from peargent.history import HistoryConfig
from peargent.storage import InMemory, File, Sqlite, Postgresql, Redis

# InMemory (Default)
# - Fast, temporary storage
# - Data is lost when the program exits
history = HistoryConfig(store=InMemory())

# File (JSON files)
# - Stores conversations as JSON on disk
# - Good for local development or small apps
history = HistoryConfig(store=File(storage_dir="./conversations"))

# SQLite (Local database)
# - Reliable, ACID-compliant
# - Ideal for single-server production
history = HistoryConfig(
    store=Sqlite(
        database_path="./chat.db",
        table_prefix="peargent"
    )
)

# PostgreSQL (Production database)
# - Scalable, supports multi-server deployments
history = HistoryConfig(
    store=Postgresql(
        connection_string="postgresql://user:pass@localhost/dbname",
        table_prefix="peargent"
    )
)

# Redis (Distributed + TTL)
# - Fast, supports key expiration
# - Ideal for cloud deployments and ephemeral memory
history = HistoryConfig(
    store=Redis(
        host="localhost",
        port=6379,
        db=0,
        password=None,
        key_prefix="peargent"
    )
)
```

To create a custom storage backend, refer to **[History Management - Custom Storage Backends](/docs/history-management/custom-storage)**.

## Auto Context Management

When conversations become too long, Peargent automatically manages the context window to keep prompts efficient and within model limits. This behavior is controlled by the strategy you choose.

### Strategies

`smart` (Default)

Automatically decides whether to trim or summarize based on the size and importance of the overflow:

* Small overflow → trim (fast)
* Important tool calls → summarize
* Large overflow → aggressive summarization

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="smart"
)
```

`trim_last`

Keeps the most recent messages and removes the oldest. Fast and uses no LLM.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="trim_last",
    max_context_messages=15
)
```

`trim_first`

Keeps older messages and removes the newer ones.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="trim_first"
)
```

`summarize`

Uses an LLM to summarize older messages, preserving context while reducing size.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="summarize",
    summarize_model=gemini("gemini-2.5-flash")  # Fast model for summaries
)
```

`summarize_model` is used only with the `"summarize"` and `"smart"` strategies. If not provided, the **[Agent](/docs/agents)**'s model will be used.
## Parameters

| Parameter              | Type          | Default      | Description                                                                           | Required |
| :--------------------- | :------------ | :----------- | :------------------------------------------------------------------------------------ | :------- |
| `auto_manage_context`  | `bool`        | `False`      | Automatically manage the context window when conversations get too long                | No       |
| `max_context_messages` | `int`         | `20`         | Maximum messages before auto-management triggers                                       | No       |
| `strategy`             | `str`         | `"smart"`    | Context management strategy: `"smart"`, `"trim_last"`, `"trim_first"`, `"summarize"`   | No       |
| `summarize_model`      | `Model`       | `None`       | LLM model for summarization (defaults to the agent's model if not provided)            | No       |
| `store`                | `StorageType` | `InMemory()` | Storage backend: `InMemory()`, `File()`, `Sqlite()`, `Postgresql()`, `Redis()`         | No       |

Learn more about advanced history features, including custom storage backends, manual thread control, and all available history methods, in **[Advanced History](/docs/Advanced%20History)**.
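Putting these options together, here is a sketch of a persistent, auto-managed history configuration that uses only the parameters documented above:

```python
from peargent import create_agent
from peargent.history import HistoryConfig
from peargent.models import openai
from peargent.storage import Sqlite

history = HistoryConfig(
    store=Sqlite(database_path="./chat.db", table_prefix="peargent"),  # persistent storage
    auto_manage_context=True,      # manage long conversations automatically
    max_context_messages=30,       # trigger management after 30 messages
    strategy="smart"               # trim or summarize depending on the overflow
)

agent = create_agent(
    name="SupportBot",
    description="Support assistant with persistent memory",
    persona="You are a helpful support assistant.",
    model=openai("gpt-4o"),
    history=history
)
```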

# Overview

About Peargent.

Peargent is a modern, simple, and powerful Python framework for building intelligent AI agents with production-grade features. It offers a clean, intuitive API for creating conversational agents that can use tools, maintain memory, collaborate with other agents, and scale reliably into production.

* Learn how to set up your first agent in just a few lines of code.
* Explore practical examples to understand Peargent's capabilities.
* Dive into the fundamental concepts that power Peargent.
* Discover multi-agent orchestration, persistent memory, and observability.

## What is Peargent?

Peargent simplifies the process of building AI agents by providing:

* **Flexible LLM Support** - Works seamlessly with OpenAI, Groq, Google Gemini, and Azure OpenAI
* **Powerful Tool System** - Execute actions with built-in timeout, retries, and input/output validation
* **Persistent Memory** - Multiple backends supported: in-memory, file, SQLite, PostgreSQL, Redis
* **Multi-Agent Orchestration** - Coordinate specialized agents for complex workflows
* **Production-Ready Observability** - Built-in tracing, cost tracking, and performance metrics
* **Type-Safe Structured Outputs** - Easily validate responses using Pydantic models

## How Does Peargent Work?

Peargent lets you build individual agents or complex systems where each **[Agent](/docs/agents)** contributes specialized work while sharing context through a global **[State](/docs/states)**. Agents operate inside a **[Pool](/docs/pools)**, coordinated by a **[Router](/docs/routers)** that can run in round-robin or LLM-based mode to decide which agent handles each step. Agents use **[Tools](/docs/tools)** to take actions and update the shared State, while **[History](/docs/history)** persists reasoning and decisions to maintain continuity across the workflow.

## Why Peargent?

Start with a basic agent in just a few lines:

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="Assistant",
    persona="You are a helpful assistant",
    model=openai("gpt-4")
)

response = agent.run("What is the capital of France?")
print(response)
```

Scale to complex multi-agent systems with memory, tools, and observability:

```python
from peargent import create_agent, create_tool, create_pool
from peargent.history import HistoryConfig
from peargent.models import openai
from peargent.storage import Sqlite

# Create specialized agents with persistent memory
researcher = create_agent(
    name="Researcher",
    persona="You are a research expert",
    model=openai("gpt-4"),
    tools=[search_tool, analyze_tool],
)

writer = create_agent(
    name="Writer",
    persona="You are a technical writer",
    model=openai("gpt-4"),
)

# Orchestrate multiple agents
pool = create_pool(
    agents=[researcher, writer],
    history=HistoryConfig(
        store=Sqlite(database_path="./pool_conversations.db")
    )
)

result = pool.run("Research and write about quantum computing")
print(result)
```

# Installation

Get started with installing Peargent in your Python environment.

To install the Peargent **[Python package](https://pypi.org/project/peargent/)**, you can use `pip` or `uv`. It's recommended to install **Peargent** inside a **virtual environment (venv)** to manage dependencies effectively.

```bash
pip install peargent
```

```bash
uv pip install peargent
```

If you don't have a virtual environment set up, you can create one using the following commands:

```bash
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

After activating the virtual environment, run the pip install command above to install **Peargent**.

Now that you have installed Peargent, proceed to the Quick Start guide to create your first AI agent. Learn about the fundamental concepts that power Peargent, including Agents, Tools, Memory, and more.

# Long Term Memory

Agents can remember past interactions and key information across sessions.

## Coming Soon

# Models

Use different model providers in your Agents and Pools.

Models define which LLM your **[Agent](/docs/agents)** or **[Pool](/docs/pools)** uses. Peargent provides a simple, unified interface for connecting to different providers (OpenAI, Groq, Gemini, etc.).

Think of a Model as the brain of your **[Agent](/docs/agents)** or **[Pool](/docs/pools)**, the thing that actually generates responses.

## Creating a Model

Models are imported from `peargent.models` and created using simple factory functions:

```python
from peargent.models import openai, groq, gemini, anthropic

model = openai("gpt-4o")
# or
model = anthropic("claude-3-5-sonnet-20241022")
```

You can run the model directly to get a response:

```python
response = model.generate("Hello, how are you?")
print(response)
```

### Passing a model to an Agent or Pool

```python
from peargent import create_agent, create_pool
from peargent.models import openai

agent = create_agent(
    name="Researcher",
    description="You are a researcher who can answer questions about the world.",
    persona="You are a researcher who can answer questions about the world.",
    model=openai("gpt-4o")
)

pool = create_pool(
    agents=[agent],
    model=openai("gpt-4o")
)
```

## Supported Model Providers

Peargent's model support is continuously expanding. New providers and model families are added regularly, so expect this list to grow over time.

**OpenAI**

```python
from peargent.models import openai

model = openai(
    model_name="gpt-4o",
    api_key="sk-",
    endpoint_url="https://api.openai.com/v1",
    parameters={}
)
```

**Groq**

```python
from peargent.models import groq

model = groq(
    model_name="llama-3.3",
    api_key="",
    endpoint_url="https://api.groq.com/v1",
    parameters={}
)
```

**Gemini**

```python
from peargent.models import gemini

model = gemini(
    model_name="gemini-2.0-flash",
    api_key="AIzaSyB",
    endpoint_url="https://generativelanguage.googleapis.com/v1",
    parameters={}
)
```

**Anthropic**

```python
from peargent.models import anthropic

model = anthropic(
    model_name="claude-3-5-sonnet-20241022",
    api_key="sk-ant-",
    endpoint_url="https://api.anthropic.com/v1/messages",
    parameters={}
)
```
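The `parameters` dict shown above is forwarded to the provider when generating. The exact keys depend on the provider's API; the values below (a sampling temperature and an output-token cap) are assumptions shown purely for illustration, not a documented list of supported options.

```python
from peargent.models import openai

# Hypothetical sampling options; supported keys depend on the provider's API.
model = openai(
    model_name="gpt-4o",
    parameters={
        "temperature": 0.2,
        "max_tokens": 512,
    },
)
```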

# Pools

Learn how pools enable multi-agent collaboration and intelligent task routing.

A Pool coordinates multiple agents so they can work together on a task. It brings structure to multi-agent workflows by deciding how agents interact and how information flows between them.

* Each **[Agent](/docs/agents)** focuses on a specific skill or responsibility and contributes its part of the work.
* A shared **[State](/docs/states)** lets all agents access and update the same context, allowing them to build on each other's progress.
* The **[Router](/docs/routers)** decides which agent should act next, using either round-robin or intelligent LLM-based routing.

## Creating a Pool

Use the `create_pool()` function to coordinate multiple agents. The `agents` parameter accepts a list of all the agents you want to include in the pool.

```python
from peargent import create_agent, create_pool
from peargent.models import groq

# researcher agent and writer agent

# Create pool
pool = create_pool(
    agents=[researcher, writer],
)

# Run the pool along with user input
result = pool.run("Research and write about quantum computing")
```

## How a Pool Works

A Pool can be thought of as a controller that organizes multiple agents, maintains a shared **[State](/docs/states)** for them to collaborate through, and uses a **[Router](/docs/routers)** to decide which agent should act next.
### User Input Added to State

The user's message is written into the shared **[State](/docs/states)** so every agent can access it.

### Router Selects the Next Agent

The **[Router](/docs/routers)** determines which agent should act next.

### Agent Executes and Updates State

The selected agent processes the task, produces an output, and writes its result back into the **[State](/docs/states)**.

### Output Becomes Input for the Next Agent

Each agent's output is available in the **[State](/docs/states)**, allowing the next agent to build on prior work.

### Process Repeats Until Completion

The cycle continues until the workflow is complete or the maximum number of iterations is reached.
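Conceptually, the pool loop resembles the following sketch. This is not Peargent's actual implementation; the plain `state` dictionary and the router returning an agent name or `None` are simplified stand-ins for illustration.

```python
# Illustrative sketch of a pool run (not Peargent's real internals).
def pool_run_sketch(agents, router, user_input, max_iter=5):
    state = {"history": [{"role": "user", "content": user_input}]}   # shared state
    last_result = None

    for call_count in range(max_iter):
        choice = router(state, call_count, last_result)              # router picks the next agent
        if choice is None:
            break                                                    # router signals completion

        agent = agents[choice]                                       # look up the chosen agent by name
        last_result = agent.run(state["history"][-1]["content"])     # agent executes
        state["history"].append(                                     # output becomes input for the next agent
            {"role": "assistant", "content": last_result, "agent": choice}
        )

    return last_result
```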
## Model Selection

By default, the pool uses the model of the first agent. You can also provide a `default_model` for the pool; any agent without an explicitly set model will use this `default_model`.

```python
from peargent import create_pool
from peargent.models import openai

# researcher agent, analyst agent and writer agent

# Create pool
pool = create_pool(
    agents=[researcher, analyst, writer],
    default_model=openai("gpt-5")
)
```

All the available models are listed in **[Models](/docs/models)**.

## Routing the Agents

By default, pools use round-robin routing, where **[Agents](/docs/agents)** take turns in order. You can also plug in a custom router to make more intelligent decisions based on the task. For all routing options, including round-robin, LLM-based routing, and custom router functions, see **[Routers](/docs/routers)**.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
)

article = pool.run("Write an article about renewable energy trends")
# Executes: researcher → analyst → writer
```

## Max Iterations

A pool runs for a fixed number of iterations, where each iteration represents one agent being routed, executed, and updating the state. Pools use a default limit of 5 iterations, but you can change this using the `max_iter` parameter.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    max_iter=10
)

article = pool.run("Write an article about renewable energy trends")
# Executes: researcher → analyst → writer → researcher → analyst → writer → ...
```

## Parameters

| Parameter       | Type                       | Description                                                     | Required |
| :-------------- | :------------------------- | :--------------------------------------------------------------- | :------- |
| `agents`        | `list[Agent]`              | List of agents in the pool                                        | Yes      |
| `default_model` | `Model`                    | Default model for agents without one                              | No       |
| `router`        | `RouterFn \| RoutingAgent` | Custom router function or routing agent (default: round-robin)    | No       |
| `max_iter`      | `int`                      | Maximum agent executions (default: `5`)                           | No       |
| `default_state` | `State`                    | Custom initial state object                                       | No       |
| `history`       | `HistoryConfig`            | Shared conversation history across all agents                     | No       |
| `tracing`       | `bool`                     | Enable tracing for all agents (default: `False`)                  | No       |

For advanced configuration such as `history` and `tracing`, see **Advanced Features**.
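As an example of combining these options, the sketch below wires a router, shared history, and tracing into one pool. It assumes the `researcher`, `analyst`, and `writer` agents from the sections above already exist, along with a `router` created as described in **[Routers](/docs/routers)**.

```python
from peargent import create_pool
from peargent.history import HistoryConfig
from peargent.storage import Sqlite

pool = create_pool(
    agents=[researcher, analyst, writer],
    router=router,                    # LLM-based or custom router (default: round-robin)
    max_iter=8,                       # allow up to 8 agent executions
    history=HistoryConfig(
        store=Sqlite(database_path="./pool.db")   # shared, persistent conversation memory
    ),
    tracing=True                      # enable tracing for all agents
)

result = pool.run("Produce a short report on renewable energy trends")
print(result)
```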

# QuickStart

Learn how to set up your first Peargent agent in just a few lines of code.

In this chapter, we will create a simple AI agent using Peargent.
Follow the steps below to get started quickly.

## Create Your First Agent
### Install Peargent

It's recommended to install Peargent inside a **virtual environment (venv)**.

```bash
pip install peargent
```

### Create an Agent

Now, let's create our first **[Agent](/docs/agents)**. An agent is simply an AI-powered entity that behaves like an autonomous helper: it can think, respond, and perform tasks based on the role, personality, and instructions you provide.
Start by creating a Python file named `quickstart.py`. Using the `create_agent` function, we can assign our agent a `name`, a `description`, and a `persona` (its role, tone, and behaviour).
You will also need to specify the `model` parameter to choose which LLM the agent will use. In this example, we’ll use OpenAI’s `GPT-5` model. (**[Available models](/docs/models#supported-model-providers)**)
Your agent can be anything you imagine!
For this example, we'll create a friendly agent who speaks like **William Shakespeare**.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="ShakespeareBot",
    description="An AI agent that speaks like William Shakespeare.",
    persona="You are ShakespeareBot, a witty and eloquent assistant who communicates in the style of William Shakespeare.",
    model=openai("gpt-5")
)

response = agent.run("What is the meaning of life?")
print(response)
```

Before running the code, you will need to set your `OPENAI_API_KEY` inside your `.env` file.

```bash
OPENAI_API_KEY="your_openai_api_key_here"
```

### Run the Agent

Now, run your `quickstart.py` script to see your agent in action!

```bash
python quickstart.py
```

You should see a response from the agent (ShakespeareBot), answering your question in Shakespearean style!

Terminal Output

```text no-copy
Ah, fair seeker of truth, thou question dost pierce the very veil of existence! The meaning of life, methinks, is not a single treasure buried in mortal sands, but a wondrous journey of love, virtue, and discovery. To cherish each breath, to learn from sorrow, and to weave kindness through the tapestry of thy days — therein lies life's most noble purpose.
```

Congratulations! You have successfully created and run your first AI agent using Peargent.

# Routers

Decide which agent in a pool should act.

A router decides which **[Agent](/docs/agents)** in the **[Pool](/docs/pools)** should run next. It examines the shared **[State](/docs/states)** or user input and chooses the agent best suited for the next step.

Think of a router as the director of a movie set. Each agent is an actor with a specific role, and the router decides who steps into the scene at the right moment.

In Peargent, you can choose from three routing strategies:

* **Round-Robin Router** - agents take turns in order
* **LLM-Based Routing Agent** - an LLM decides which agent acts next
* **Custom Function-Based Router** - you define the routing logic yourself

With a **Custom Function-Based Router**, you get complete control over how agents are selected. You can route in a fixed order, choose based on the iteration count, or make smart decisions using the shared state. For example: sequential routing, conditional routing, and state-based intelligent routing. Refer to **Advanced Features**.

## Round Robin Router (default)

The **Round Robin Router** is the simplest and default routing strategy in Peargent. It cycles through agents in the exact order they are listed, giving each agent one turn before repeating, until the pool reaches the `max_iter` limit. This router requires no configuration and no LLM calls, making it predictable, fast, and cost-free.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    # No router required — round robin is automatic
    max_iter=3
)

result = pool.run("Write about Quantum Physics")
# Executes: researcher → analyst → writer
```

**Best for:** Simple sequential workflows, demos, testing, and predictable pipelines.

## LLM Based Routing Agent

The **LLM-Based Routing Agent** uses a large language model to intelligently decide which agent should act next. Instead of following a fixed order or manual rules, the router examines the **conversation history**, **agent abilities**, and **workflow context** to choose the most appropriate agent at each step. This makes it ideal for dynamic, context-aware, and non-linear multi-agent workflows.

```python
from peargent import create_routing_agent, create_pool
from peargent.models import openai

router = create_routing_agent(
    name="SmartRouter",
    model=openai("gpt-5"),
    persona="You intelligently choose the next agent based on the task.",
    agents=["Researcher", "Analyst", "Writer"]
)
# chooses an agent or 'STOP' to end the pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    router=router,
    max_iter=5
)

result = pool.run("Research and write about quantum computing")
```

The descriptions you give your agents play a **crucial role** in LLM-based routing. The router uses these descriptions to understand each agent's abilities and decide who should act next.

## Custom Function-Based Router

Custom routers give you **full control** over how agents are selected. You define a Python function that inspects the shared `state`, the `call_count`, and the `last_result` to decide which agent goes next. This is ideal for **rule-based**, **deterministic**, or **cost-efficient** workflows.
```python
from peargent import RouterResult

def custom_router(state, call_count, last_result):
    # Your routing logic here
    for agent_name, agent_obj in state.agents.items():
        # agent details are available here
        print(f"Agent: {agent_name}")
        print(f"  Description: {agent_obj.description}")
        print(f"  Tools: {list(agent_obj.tools.keys())}")
        print(f"  Model: {agent_obj.model}")
        print(f"  Persona: {agent_obj.persona}")

    # Return the name of the next agent, or None to stop
    return RouterResult("AgentName")
```

Custom routers unlock entirely new routing patterns, from rule-based flows to dynamic state-aware logic. To explore more advanced patterns and real-world examples, see **Advanced Features**.
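As a concrete illustration, here is a sketch of a sequential custom router. It assumes, as described above, that `RouterResult` wraps the chosen agent's name and that returning `None` stops the pool; the agent names are placeholders.

```python
from peargent import RouterResult

# Minimal sequential router sketch: each agent gets exactly one turn, in order.
def sequential_router(state, call_count, last_result):
    order = ["Researcher", "Analyst", "Writer"]   # placeholder agent names
    if call_count >= len(order):
        return None                               # every agent has acted, so stop the pool
    return RouterResult(order[call_count])        # pick the next agent in the sequence
```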

# States

A shared context used inside Pools for smarter routing decisions.

State is a shared workspace used only inside **[Pools](/docs/pools)**. It exists for the duration of a single `pool.run()` call and gives **[Routers](/docs/routers)** the information they need to make intelligent routing decisions.

Think of State as a scratchpad that the pool keeps for its agents' work: **[Routers](/docs/routers)** can read and write to it, while **[Agents](/docs/agents)** and **[Tools](/docs/tools)** cannot.

State is automatically created by the Pool and passed to:

* **[Custom Router functions](/docs/routers#custom-function-based-router)**
* **[Routing Agents](/docs/routers#llm-based-routing-agent)**

So in that sense, **no configuration is required**.

## Where State Can Be Used

### Custom Router functions

```python
def custom_router(state, call_count, last_result):
    # Read history
    last_message = state.history[-1]["content"]

    # Store workflow progress
    state.set("stage", "analysis")

    # Read agent capabilities
    print(state.agents.keys())

    return RouterResult("Researcher")
```

Refer to the **[API Reference of State](/docs/states#state-api-reference)** to learn more about what can be stored in State.

### Manual State Creation (optional)

```python
from peargent import State

custom_state = State(data={"stage": "init"})

pool = create_pool(
    agents=[agent1, agent2],
    default_state=custom_state
)
```

## State API Reference

The `State` object provides a small but powerful API used inside Pools and Routers.

### Methods

Methods give routers the ability to store and retrieve custom information needed for routing decisions.

| Name              | Type   | Inputs                                             | Returns | Description                                                                                           |
| ----------------- | ------ | -------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------ |
| **`add_message`** | Method | `role: str`, `content: str`, `agent: str \| None`  | `None`  | Appends a message to `state.history` and persists it if a history manager exists.                       |
| **`get`**         | Method | `key: str`, `default: Any = None`                   | `Any`   | Retrieves a value from the key-value store. Returns `default` if the key is missing.                    |
| **`set`**         | Method | `key: str`, `value: Any`                             | `None`  | Stores a value in the key-value store. Useful for workflow tracking, flags, and custom router logic.    |

### Attributes

Attributes give routers visibility into what has happened so far (history, agents, persistent history, custom data).

| Name                  | Type                          | Read/Write                              | Description                                                                                                                     |
| --------------------- | ----------------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| **`kv`**              | `dict[str, Any]`              | Read/Write *(via get/set recommended)*  | Internal key-value store for custom state. Use `state.get()`/`state.set()` instead of accessing directly.                          |
| **`history`**         | `list[dict]`                  | Read-only (managed by Pool)             | In-memory conversation history for the current pool run. Contains `role`, `content`, and optional `agent`.                         |
| **`history_manager`** | `ConversationHistory \| None` | Read-only                               | Optional persistent history backend (SQLite, Redis, PostgreSQL, etc.). Used automatically by Pool.                                 |
| **`agents`**          | `dict[str, Agent]`            | Read-only                               | Mapping of agent names to their Agent objects. Useful for advanced routing logic (e.g., route based on tools or descriptions).     |

### Message Structure (`state.history`)

Each entry in `state.history` looks like:

```python
{
    "role": "user" | "assistant" | "tool",
    "content": "message content",
    "agent": "AgentName"  # only for assistant/tool messages
}
```

# Tools

Enable agents to perform actions beyond text generation with tools.

Tools are actions that agents can perform to interact with the real world. They allow agents to go beyond text generation by enabling operations such as querying databases, calling APIs, performing calculations, reading files, or executing any Python function you define.

Think of tools as the **hands and eyes** of your agent, while the model provides the reasoning **(the brain)**. Tools give the agent the ability to actually act and produce real results.

When you create an **[Agent](/docs/agents)**, you pass in a list of available **[Tools](/docs/tools)**, and during execution the agent decides whether a tool is needed and invokes it automatically based on the model's response.

## Creating a Tool

Use `create_tool()` to wrap a Python function into a tool that an agent can call. Every tool must define a `name`, `description`, `input_parameters`, and a `call_function`. The `call_function` is the underlying Python function that will be executed when the agent invokes the tool.

Below is a simple example tool that converts Celsius to Fahrenheit:

```python
from peargent import create_tool

def celsius_to_fahrenheit(c: float):
    return (c * 9/5) + 32

temperature_tool = create_tool(
    name="CelsiusToFahrenheit",
    description="Convert Celsius temperature to Fahrenheit",
    call_function=celsius_to_fahrenheit,
    input_parameters={"c": float},  # Important
    output_schema=float
)
```

## Input Parameters Matter

The `input_parameters` serve two critical purposes:

1. **Type Validation** - Peargent validates that the LLM provides the correct types before executing your function, preventing runtime errors
2. **LLM Guidance** - The parameter types help the LLM understand what arguments to provide when calling the tool

## Using Tools with Agents

Tools can be passed to an agent during creation. The agent will automatically decide when a tool is needed and call it as part of its reasoning process.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="UtilityAgent",
    description="Handles multiple utility tasks",
    persona="You are a helpful assistant.",
    model=openai("gpt-5"),
    tools=[  # You can pass one or multiple tools here
        temperature_tool,
        count_words_tool,
        summary_tool
    ]
)

response = agent.run("Convert 25 degrees Celsius to Fahrenheit.")
# Agent automatically calls the tool and uses the result
```

## Parameters

| Parameter          | Type              | Description                                                                            | Required |
| :----------------- | :---------------- | :--------------------------------------------------------------------------------------- | :------- |
| `name`             | `str`             | Tool identifier                                                                           | Yes      |
| `description`      | `str`             | What the tool does (helps the LLM decide when to use it)                                  | Yes      |
| `input_parameters` | `dict[str, type]` | Parameter names and types (e.g., `{"city": str}`)                                         | Yes      |
| `call_function`    | `Callable`        | The Python function to execute                                                            | Yes      |
| `timeout`          | `float \| None`   | Max execution time in seconds (default: `None`)                                           | No       |
| `max_retries`      | `int`             | Retry attempts on failure (default: `0`)                                                  | No       |
| `retry_delay`      | `float`           | Initial delay between retries in seconds (default: `1.0`)                                 | No       |
| `retry_backoff`    | `bool`            | Use exponential backoff (default: `True`)                                                 | No       |
| `on_error`         | `str`             | Error handling: `"raise"`, `"return_error"`, or `"return_none"` (default: `"raise"`)      | No       |
| `output_schema`    | `Type[BaseModel]` | Pydantic model for output validation                                                      | No       |

For advanced configuration such as `timeouts`, `retries`, `error-handling`, and `output validation`, see **Advanced Features**.
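As a final sketch tying several of these parameters together, here is a tool with multiple typed inputs plus basic failure handling. The forecast function is a stand-in for whatever logic you want to expose.

```python
from peargent import create_tool

def get_forecast(city: str, days: int):
    # Stand-in implementation; in practice this might call a weather API.
    return f"{days}-day forecast for {city}: sunny"

forecast_tool = create_tool(
    name="GetForecast",
    description="Get a short weather forecast for a city",
    call_function=get_forecast,
    input_parameters={"city": str, "days": int},  # validated before execution
    timeout=10.0,             # stop if the lookup hangs
    max_retries=2,            # retry transient failures
    on_error="return_error"   # degrade gracefully instead of raising
)
```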

# Async Streaming

Run multiple agents concurrently with non-blocking streaming

Async streaming allows your application to handle multiple agent requests at the same time without blocking. This is essential for:

* **Web Servers**: Handling multiple user requests in FastAPI or Django.
* **Parallel Processing**: Running multiple agents simultaneously (e.g., a Researcher and a Reviewer).

## Quick Start

Use `astream()` with `async for` to stream responses asynchronously.

```python
import asyncio
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="AsyncAgent",
    description="Async streaming agent",
    persona="You are helpful.",
    model=openai("gpt-4o")
)

async def main():
    print("Agent: ", end="", flush=True)
    # Use 'async for' with 'astream'
    async for chunk in agent.astream("Hello, how are you?"):
        print(chunk, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
```

## Running Agents Concurrently

The real power of async comes when you run multiple things at once. Here is how to run two agents in parallel using `asyncio.gather()`.

```python
import asyncio
from peargent import create_agent
from peargent.models import openai

# Create two agents
agent1 = create_agent(name="Agent1", persona="You are concise.", model=openai("gpt-4o"))
agent2 = create_agent(name="Agent2", persona="You are verbose.", model=openai("gpt-4o"))

async def run_agent(agent, query, label):
    print(f"[{label}] Starting...")
    async for chunk in agent.astream(query):
        # In a real app, you might send this to a websocket
        pass
    print(f"[{label}] Finished!")

async def main():
    # Run both agents at the same time
    await asyncio.gather(
        run_agent(agent1, "Explain Quantum Physics", "Agent 1"),
        run_agent(agent2, "Explain Quantum Physics", "Agent 2")
    )

asyncio.run(main())
```

**Result**: Both agents start processing immediately. You don't have to wait for Agent 1 to finish before Agent 2 starts.

## Async with Metadata

Just like the synchronous version, you can use `astream_observe()` to get metadata asynchronously.

```python
async for update in agent.astream_observe("Query"):
    if update.is_token:
        print(update.content, end="")
    elif update.is_agent_end:
        print(f"\nCost: ${update.cost}")
```

## Async Pools

Pools also support async streaming, allowing you to run multi-agent workflows without blocking.

```python
# Stream text chunks from a pool asynchronously
async for chunk in pool.astream("Query"):
    print(chunk, end="", flush=True)

# Stream rich updates from a pool asynchronously
async for update in pool.astream_observe("Query"):
    if update.is_token:
        print(update.content, end="")
```

## Web Server Example (FastAPI)

Async streaming is the standard way to build AI endpoints in FastAPI.

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/chat")
async def chat(query: str):
    async def generate():
        async for chunk in agent.astream(query):
            yield chunk

    return StreamingResponse(generate(), media_type="text/plain")
```

## What's Next?

**[Tracing & Observability](/docs/tracing-and-observability)** Learn how to monitor your async agents in production.

# Streaming

Stream agent responses in real-time for better user experience

Streaming allows you to display the agent's response token by token as it's being generated, rather than waiting for the entire response to complete. This creates a much more responsive and engaging user experience.

## Quick Start

Use the `stream()` method to get an iterator that yields text chunks as they arrive.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="StreamingAgent",
    description="An agent that streams responses",
    persona="You are helpful and concise.",
    model=openai("gpt-4o")
)

# Stream response token by token
print("Agent: ", end="", flush=True)
for chunk in agent.stream("What is Python in one sentence?"):
    print(chunk, end="", flush=True)
```

**Output:**

```text
Agent: Python is a high-level, interpreted programming language known for its readability and versatility.
```

## Why Use Streaming?

* **Lower Latency**: Users see the first words immediately, instead of waiting seconds for the full answer.
* **Better UX**: The application feels alive and responsive.
* **Engagement**: Users can start reading while the rest of the answer is being generated.

## When to Use `stream()`

Use `agent.stream()` when you just need the **text content** of the response.

* ✅ Chatbots and conversational interfaces
* ✅ CLI tools requiring real-time feedback
* ✅ Simple text generation tasks

If you need metadata like **token usage**, **costs**, or **execution time**, use **[Stream Observe](/docs/Streaming/stream-observe)** instead.

## Streaming with Pools

You can also stream responses from a **Pool** of agents. The pool will stream the output of whichever agent is currently executing.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, writer],
    router=my_router
)

# Stream the entire multi-agent interaction
for chunk in pool.stream("Research AI and write a summary"):
    print(chunk, end="", flush=True)
```

## Best Practices

1. **Always Flush Output**: When printing to a terminal, use `flush=True` (e.g., `print(chunk, end="", flush=True)`) to ensure tokens appear immediately.
2. **Handle Empty Chunks**: Occasionally, a chunk might be empty. Your UI code should handle this gracefully.

## What's Next?

**[Rich Streaming (Observe)](/docs/Streaming/stream-observe)** Learn how to get rich metadata like token counts, costs, and duration while streaming.

**[Async Streaming](/docs/Streaming/async-streaming)** Run multiple agents concurrently or build high-performance web servers using async streaming.

# Rich Streaming (Observe)

Get metadata like tokens, cost, and duration while streaming

While `agent.stream()` gives you just the text, `agent.stream_observe()` provides **rich updates** containing metadata. This is essential for production applications where you need to track costs, monitor performance, or show progress indicators.

## Quick Start

Use `stream_observe()` to receive `StreamUpdate` objects. You can check the type of update to handle text chunks and final metadata differently.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="ObservableAgent",
    description="Agent with observable execution",
    persona="You are helpful.",
    model=openai("gpt-5")
)

print("Agent: ", end="", flush=True)

for update in agent.stream_observe("What is the capital of France?"):
    # 1. Handle text tokens
    if update.is_token:
        print(update.content, end="", flush=True)

    # 2. Handle completion (metadata)
    elif update.is_agent_end:
        print(f"\n\n--- Metadata ---")
        print(f"Tokens: {update.tokens}")
        print(f"Cost: ${update.cost:.6f}")
        print(f"Time: {update.duration:.2f}s")
```

**Output:**

```text
Agent: The capital of France is Paris.

--- Metadata ---
Tokens: 15
Cost: $0.000012
Time: 0.45s
```

## The StreamUpdate Object

Each item yielded by `stream_observe()` is a `StreamUpdate` object with helpful properties:

| Property         | Description                                               |
| :--------------- | :--------------------------------------------------------- |
| `is_token`       | `True` if this update contains a text chunk.               |
| `content`        | The text chunk (only available when `is_token` is True).   |
| `is_agent_end`   | `True` when the agent has finished generating.             |
| `tokens`         | Total tokens used (available on `is_agent_end`).           |
| `cost`           | Total cost in USD (available on `is_agent_end`).           |
| `duration`       | Time taken in seconds (available on `is_agent_end`).       |
| `is_agent_start` | `True` when the agent starts working.                      |

## Update Types

The `UpdateType` enum defines all possible event types during streaming:

| Type          | Description                         |
| :------------ | :----------------------------------- |
| `AGENT_START` | Agent execution started.             |
| `TOKEN`       | A text chunk was generated.          |
| `AGENT_END`   | Agent execution completed.           |
| `POOL_START`  | Pool execution started.              |
| `POOL_END`    | Pool execution completed.            |
| `TOOL_START`  | Tool execution started.              |
| `TOOL_END`    | Tool execution completed.            |
| `ERROR`       | An error occurred during streaming.  |

## Streaming with Pools

When using `pool.stream_observe()`, you get additional event types to track the pool's lifecycle.

```python
from peargent import UpdateType

for update in pool.stream_observe("Query"):
    # Pool Events
    if update.type == UpdateType.POOL_START:
        print("[Pool Started]")

    # Agent Events (same as single agent)
    elif update.is_agent_start:
        print(f"\n[Agent: {update.agent}]")
    elif update.is_token:
        print(update.content, end="", flush=True)

    # Pool Finished
    elif update.type == UpdateType.POOL_END:
        print(f"\n[Pool Finished] Total Cost: ${update.cost}")
```

## What's Next?

**[Async Streaming](/docs/streaming/async-streaming)** Learn how to use these features in async environments for high concurrency.

**[Tracing & Observability](/docs/tracing-and-observability)** For deep debugging and historical logs, combine streaming with Peargent's tracing system.

# Built-in Tools

Ready-to-use tools that extend agent capabilities with powerful built-in functionality

Peargent provides a growing collection of built-in **[Tools](/docs/tools)** that solve common tasks without requiring custom implementations. These tools are production-ready, well-tested, and integrate seamlessly with **[Agents](/docs/agents)**.

## Why Built-in Tools?

Built-in tools save development time and provide:

* **Zero Configuration** - Import and use immediately, no setup required
* **Production Ready** - Thoroughly tested and optimized for reliability
* **Best Practices** - Built with proper error handling, validation, and security
* **Consistent API** - Same interface patterns across all built-in tools
* **Maintained** - Regular updates and improvements from the Peargent team

## Available Built-in Tools

### Text Extraction Tool

Extract plain text and metadata from various document formats including HTML, PDF, DOCX, TXT, Markdown, and URLs. This tool enables agents to read and process content from different file types and web pages.

Supported formats: HTML/XHTML, PDF, DOCX, TXT, Markdown, and URLs (with SSRF protection).

**[Learn more about Text Extraction Tool →](/docs/built-in-tools/text-extraction)**

## Coming Soon

More built-in tools are in development:

* **Web Search Tool** - Search the web and retrieve relevant information
* **Image Analysis Tool** - Extract text and analyze images
* **File System Tool** - Read, write, and manage files safely
* **HTTP Request Tool** - Make API calls with built-in retry logic

Check the **[Peargent GitHub repository](https://github.com/Peargent/peargent/tree/main/peargent/tools)** for the latest updates.

# Text Extraction Tool

Learn how to use the text extraction tool with Peargent agents

## Overview

The Text Extraction Tool is a built-in Peargent **[Tool](/docs/tools)** that enables **[Agents](/docs/agents)** to extract plain text from various document formats. It supports HTML, PDF, DOCX, TXT, and Markdown files, as well as URLs. The tool can optionally extract metadata such as title, author, page count, and character counts.

### Supported Formats

* **HTML/XHTML** - Web pages with metadata extraction (title, description, author)
* **PDF** - PDF documents with metadata (title, author, subject, page count)
* **DOCX** - Microsoft Word documents with document properties
* **TXT** - Plain text files with automatic encoding detection
* **Markdown** - Markdown files with title extraction from headers
* **URLs** - HTTP/HTTPS web resources with built-in SSRF protection

## Usage with Agents

The Text Extraction **[Tool](/docs/tools)** is most powerful when integrated with **[Agents](/docs/agents)**. Agents can use the tool to automatically extract and process document content.

### Creating an Agent with Text Extraction

To use the text extraction tool with an agent, you need to configure it with a **[Model](/docs/models)** and pass the tool to the agent's `tools` parameter:

```python
from peargent import create_agent
from peargent.tools import text_extractor  # [!code highlight]
from peargent.models import gemini

# Create an agent with text extraction capability
agent = create_agent(
    name="DocumentAnalyzer",
    description="Analyzes documents and extracts key information",
    persona=(
        "You are a document analysis expert. When asked about a document, "
        "use the text extraction tool to extract its content, then analyze "
        "and summarize the information."
    ),
    model=gemini("gemini-2.5-flash-lite"),
    tools=[text_extractor]  # [!code highlight]
)

# Use the agent to analyze a document
response = agent.run("Summarize the key points from document.pdf")
print(response)
```

## Examples

### Example 1: Extract Text with Metadata

```python
from peargent.tools import text_extractor

# Extract text and metadata from an HTML file
result = text_extractor.run({
    "file_path": "article.html",
    "extract_metadata": True  # [!code highlight]
})

if result["success"]:
    print(f"Title: {result['metadata']['title']}")
    print(f"Author: {result['metadata']['author']}")
    print(f"Word Count: {result['metadata']['word_count']}")
    print(f"Content:\n{result['text']}")
else:
    print(f"Error: {result['error']}")
```

### Example 2: Extract from URL

```python
from peargent.tools import text_extractor

# Extract text from a web page
result = text_extractor.run({
    "file_path": "https://example.com/article",  # [!code highlight]
    "extract_metadata": True
})

if result["success"]:
    print(f"Website Title: {result['metadata']['title']}")
    print(f"Content: {result['text'][:500]}...")
```

### Example 3: Extract with Length Limit

```python
from peargent.tools import text_extractor

# Extract text but limit to 1000 characters
result = text_extractor.run({
    "file_path": "long_document.pdf",
    "extract_metadata": True,
    "max_length": 1000  # [!code highlight]
})

print(f"Text (max 1000 chars): {result['text']}")
```

### Example 4: Batch Processing Multiple Files

```python
from peargent.tools import text_extractor
import os

documents = ["doc1.pdf", "doc2.docx", "doc3.html"]

for file_path in documents:
    if os.path.exists(file_path):
        result = text_extractor.run({
            "file_path": file_path,
            "extract_metadata": True
        })

        if result["success"]:
            print(f"\n{file_path} ({result['format']})")
            print(f"Words: {result['metadata'].get('word_count', 'N/A')}")
            print(f"Preview: {result['text'][:150]}...")
        else:
            print(f"Error processing {file_path}: {result['error']}")
```

### Example 5: Agent Document Analysis

```python
from peargent import create_agent
from peargent.tools import text_extractor  # [!code highlight]
from peargent.models import gemini

# Create a document analysis agent
agent = create_agent(
    name="ResearchAssistant",
    description="Analyzes research papers and extracts key information",
    persona=(
        "You are a research assistant specializing in document analysis. "
        "When given a document, extract its content and identify: "
        "1) Main topic, 2) Key findings, 3) Methodology, 4) Conclusions"
    ),
    model=gemini("gemini-2.5-flash-lite"),
    tools=[text_extractor]  # [!code highlight]
)

# Ask the agent to analyze a research paper
response = agent.run(
    "Please analyze research_paper.pdf and provide a structured summary"
)
print(response)
```

## Common Use Cases

1. **Document Summarization**: Extract text from documents and have agents summarize them
2. **Information Extraction**: Extract specific information (emails, phone numbers, etc.) from documents
3. **Content Analysis**: Analyze document sentiment, topics, or keywords
4. **Batch Processing**: Process multiple documents programmatically
5. **Web Scraping**: Extract text from web pages while preserving structure
6. **Research Assistance**: Analyze research papers and academic documents
7. **Compliance Review**: Extract and review document contents for compliance checking

## Parameters

The text extraction tool accepts the following parameters:

* **file\_path** (string, required): Path to the file or URL to extract text from
* **extract\_metadata** (boolean, optional, default: False): Whether to extract metadata like title, author, page count, etc.
* **max\_length** (integer, optional): Maximum text length to return. If exceeded, text is truncated with "..." appended

## Return Value

The tool returns a dictionary with the following structure:

```python
{
    "text": "Extracted plain text content",
    "metadata": {
        "title": "Document Title",
        "author": "Author Name",
        # ... additional metadata depending on format
    },
    "format": "pdf",  # Detected file format
    "success": True,
    "error": None
}
```

## Metadata by Format

Different document formats provide different metadata:

**HTML/XHTML:**

* `title` - Page title
* `description` - Meta description tag
* `author` - Meta author tag
* `word_count` - Number of words
* `char_count` - Number of characters

**PDF:**

* `title` - Document title
* `author` - Document author
* `subject` - Document subject
* `creator` - Application that created the PDF
* `producer` - PDF producer
* `creation_date` - When the document was created
* `page_count` - Total number of pages
* `word_count` - Total word count
* `char_count` - Total character count

**DOCX:**

* `title` - Document title
* `author` - Document author
* `subject` - Document subject
* `created` - Creation date and time
* `modified` - Last modification date and time
* `word_count` - Total word count
* `char_count` - Total character count
* `paragraph_count` - Number of paragraphs

**TXT/Markdown:**

* `encoding` - Text encoding used
* `word_count` - Total word count
* `char_count` - Total character count
* `line_count` - Total line count
* `title` - (Markdown only) Title extracted from first heading

## Troubleshooting

### ImportError for document libraries

If you encounter an ImportError when extracting specific formats, install the required dependencies:

```bash
# For all formats
pip install peargent[text-extraction]

# Or individually
pip install beautifulsoup4 pypdf python-docx
```

### SSRF Protection Errors

If you receive an "Access to localhost is not allowed" error, ensure you're using a public URL:

```python
# This will fail
result = text_extractor.run({"file_path": "http://localhost:8000/doc"})

# Use a public URL instead
result = text_extractor.run({"file_path": "https://example.com/doc"})
```

### Encoding Issues with Text Files

For text files with non-standard encoding, the tool automatically detects encoding. If issues persist, ensure the file is properly encoded.

# Error Handling in Tools

Comprehensive error handling strategies for tools including retries, timeouts, and validation

Tools can fail for many reasons: network issues, timeouts, invalid API responses, or broken external services. Peargent provides a complete system to handle these failures gracefully.

## Error Handling with `on_error`

The `on_error` parameter controls how tools handle errors (execution failures or validation failures):

```python
from peargent import create_tool

# Option 1: Raise exception (default)
tool_strict = create_tool(
    name="critical_api",
    description="Critical API call that must succeed",
    input_parameters={"query": str},
    call_function=call_api,
    on_error="raise"  # Fail fast if error occurs // [!code highlight]
)

# Option 2: Return error message as string
tool_graceful = create_tool(
    name="optional_api",
    description="Optional API call",
    input_parameters={"query": str},
    call_function=call_api,
    on_error="return_error"  # Continue with error message // [!code highlight]
)

# Option 3: Return None silently
tool_silent = create_tool(
    name="analytics_tracker",
    description="Optional analytics tracking",
    input_parameters={"event": str},
    call_function=track_event,
    on_error="return_none"  # Ignore failures silently // [!code highlight]
)
```

### When to Use Each Strategy

| `on_error` Value        | What Happens                                                                               | What You Can Do                                                                  | Use Case                                    | Example                                        |
| ----------------------- | ------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ----------------------------------------------- | -------------------------------------------------- |
| **`"raise"`** (default) | Raises an exception, stops agent execution                                                   | Wrap in `try/except` to catch and handle the exception                              | Critical tools that must succeed                  | Database writes, payment processing                 |
| **`"return_error"`**    | Returns the error message as a string (e.g., `"Tool 'api_call' failed: ConnectionError: ..."`) | Check if the result is a string containing an error, then handle gracefully         | Graceful degradation, logging errors              | Optional external APIs, analytics                   |
| **`"return_none"`**     | Returns `None` silently, no error message                                                    | Check if the result is `None`, then use a fallback value or skip                    | Non-critical features, optional enrichment        | Analytics tracking, optional data enrichment        |

## Next: Advanced Error Handling

Peargent provides more robust failure-handling features:

* **[Retries](/docs/error-handling-in-tools/retries)**
  Automatically retry failing tools with optional exponential backoff.
* **[Timeouts](/docs/error-handling-in-tools/timeout)**
  Prevent long-running or hanging operations.
* **[Validation Failures](/docs/structured-output/tools-output#validating-tool-output-with-schema)**
  Handle schema validation errors when using `output_schema`.

These pages go deeper into reliability patterns for production workloads.
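To make the strategy table concrete, here is a sketch of checking results when invoking the tools above directly with the dict-style `tool.run(...)` call shown on the Text Extraction page. The exact format of the returned error string is an assumption based on the example message in the table, and `call_api`/`track_event` are the same placeholder functions used above.

```python
# Sketch: handling results from tools configured with different on_error strategies.

try:
    critical_result = tool_strict.run({"query": "latest orders"})    # on_error="raise"
except Exception as exc:
    critical_result = None
    print(f"Critical tool failed, aborting: {exc}")

optional_result = tool_graceful.run({"query": "enrichment data"})    # on_error="return_error"
if isinstance(optional_result, str) and "failed" in optional_result:
    print(f"Optional tool degraded gracefully: {optional_result}")
    optional_result = {}                                             # fall back to an empty result

tracking_result = tool_silent.run({"event": "page_view"})            # on_error="return_none"
if tracking_result is None:
    pass  # non-critical, safe to ignore
```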

Retries

Tools can automatically retry failed operations.

Retries are one of the simplest and most effective ways to handle errors in tools. Instead of failing immediately, a tool can automatically try again when something goes wrong. This makes your workflows more resilient, reduces unnecessary crashes, and improves overall reliability with minimal setup. ## Error Handling with Retries Below is how simple it is to enable retry logic in a tool. ```python from peargent import create_tool api_tool = create_tool( name="external_api", description="Call external API", input_parameters={"query": str}, call_function=call_external_api, max_retries=3, # Retry up to 3 times on failure // [!code highlight:3] retry_delay=1.0, # Initial delay: 1 second retry_backoff=True, # Exponential backoff: 1s → 2s → 4s on_error="return_error" ) ``` ## Retry Parameters | Parameter | Type | Default | Description | | --------------- | ----- | ------- | ---------------------------------------------------- | | `max_retries` | int | `0` | Number of retry attempts (0 = no retries) | | `retry_delay` | float | `1.0` | Initial delay between retries in seconds | | `retry_backoff` | bool | `True` | Doubles delay after each retry attempt (Exponential) | ## How Retry Works
### First Attempt: Tool executes normally. ### On Failure: If execution or validation fails: * If `max_retries > 0`, the tool waits for retry\_delay seconds * If `retry_backoff=True`, the wait time doubles each retry (1s → 2s → 4s → …) ### Repeat: Retries continue until: * A retry succeeds, **or** * All retry attempts are exhausted ### Final Failure: Handled according to the `on_error` strategy (`raise`, `return_error`, `return_none`).
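The steps above can be pictured as a small retry loop. The sketch below is a simplified illustration of that behavior using the parameters described on this page, not Peargent's actual implementation; `run_with_retries` and `call_tool` are hypothetical stand-ins.

```python
import time

def run_with_retries(call_tool, max_retries=3, retry_delay=1.0, retry_backoff=True):
    """Simplified sketch of the retry loop described above."""
    delay = retry_delay
    total_attempts = max_retries + 1  # the first attempt plus the retries
    for attempt in range(1, total_attempts + 1):
        try:
            return call_tool()  # first attempt and every retry
        except Exception:
            if attempt == total_attempts:
                raise  # final failure: handed to the tool's on_error strategy
            time.sleep(delay)  # wait before the next attempt
            if retry_backoff:
                delay *= 2  # exponential backoff: 1s → 2s → 4s → ...
```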
## Retry Example with Backoff ```python unreliable_tool = create_tool( name="flaky_api", description="API that sometimes fails", input_parameters={"query": str}, call_function=call_flaky_api, max_retries=3, retry_delay=1.0, retry_backoff=True, on_error="return_error" ) # If all attempts fail, timing will be: # Attempt 1: Immediate # Attempt 2: +1 second (1.0 * 2^0) # Attempt 3: +2 seconds (1.0 * 2^1) # Attempt 4: +4 seconds (1.0 * 2^2) # Final: Return error message ``` # Timeout

Timeout

Timeouts let you set a maximum allowed execution time for a tool.

If the **Tool** takes longer than the configured time, its execution is stopped and handled using on\_error. **Timeouts** are extremely useful for: * Preventing tools from hanging forever * Stopping slow external operations * Keeping agent response times predictable * Automatically failing or retrying long-running tasks ## Enable Timeout in a Tool ```python from peargent import create_tool slow_tool = create_tool( name="slow_operation", description="Operation that may take too long", input_parameters={"data": dict}, call_function=slow_processing, timeout=5.0, # Maximum 5 seconds allowed // [!code highlight] on_error="return_error" ) ``` ## How Timeout Works
1. Tool starts executing normally 2. A timer begins (based on timeout) 3. If execution finishes in time → result is returned 4. If it exceeds the timeout → * Execution is stopped * A TimeoutError is raised internally * Result is handled via on\_error 5. If combined with retries, timeout is applied on every attempt
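To make the flow concrete, the sketch below expresses the same idea in plain Python using `concurrent.futures`. It is only an illustration of the behavior described above, not Peargent's internal mechanism, and `run_with_timeout` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timeout(call_tool, timeout=5.0):
    """Simplified sketch: stop waiting for the tool after `timeout` seconds."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(call_tool)
    try:
        return future.result(timeout=timeout)  # finished in time → return the result
    except FutureTimeout:
        # A plain thread cannot be force-killed; this sketch only stops waiting.
        # In Peargent, the resulting failure is routed through the on_error strategy.
        raise TimeoutError(f"Tool exceeded the {timeout}s limit")
    finally:
        executor.shutdown(wait=False)  # don't block on a still-running worker
```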
## Timeout Example with Retries ```python robust_tool = create_tool( name="robust_api", description="API call with timeout + retries", input_parameters={"query": str}, call_function=call_api, timeout=10.0, # Max 10 seconds per attempt // [!code highlight] max_retries=3, # Retry if timed out retry_delay=2.0, retry_backoff=True, on_error="return_error" ) # Example timing if every attempt times out: # Attempt 1: 10s timeout # Wait 2s # Attempt 2: 10s timeout # Wait 4s # Attempt 3: 10s timeout # Wait 8s # Attempt 4: 10s timeout # → Final failure handled by on_error ``` # Custom Storage Backends

Custom Storage Backends

Create custom storage backends for Peargent history.

For production-grade backends with complex requirements, you can create a custom storage backend by subclassing `HistoryStore`. This allows you to persist conversation history in any database or storage system of your choice, such as MongoDB, PostgreSQL, Redis, or even a custom API. ## Subclassing HistoryStore To create a custom store, you need to implement the abstract methods defined in the `HistoryStore` class. Here is a comprehensive example using MongoDB. ### 1. Initialization and Setup First, set up your class and initialize the database connection. You should also ensure any necessary indexes are created for performance. ```python from peargent.storage import HistoryStore, Thread, Message from typing import Dict, List, Optional, Any from datetime import datetime class MongoDBHistoryStore(HistoryStore): """Custom MongoDB storage backend.""" def __init__(self, connection_string: str, database: str = "peargent"): from pymongo import MongoClient self.client = MongoClient(connection_string) self.db = self.client[database] self.threads = self.db.threads self.messages = self.db.messages # Create indexes for performance # Indexing 'id' ensures fast thread lookups self.threads.create_index("id", unique=True) # Compound index on 'thread_id' and 'timestamp' speeds up message retrieval self.messages.create_index([("thread_id", 1), ("timestamp", 1)]) ``` ### 2. Thread Management Implement methods to create, retrieve, and list threads. **Creating Threads:** When creating a thread, you must persist its ID, creation time, and any initial metadata. ```python def create_thread(self, metadata: Optional[Dict] = None) -> str: thread = Thread(metadata=metadata) self.threads.insert_one({ "id": thread.id, "created_at": thread.created_at, "updated_at": thread.updated_at, "metadata": thread.metadata }) return thread.id ``` **Retrieving Threads:** When retrieving a thread, you need to reconstruct the `Thread` object from your database record. Crucially, you must also load the associated messages and attach them to the thread. ```python def get_thread(self, thread_id: str) -> Optional[Thread]: thread_data = self.threads.find_one({"id": thread_id}) if not thread_data: return None thread = Thread( thread_id=thread_data["id"], metadata=thread_data.get("metadata", {}), created_at=thread_data["created_at"], updated_at=thread_data["updated_at"] ) # Load messages associated with this thread, sorted by timestamp messages = self.messages.find({"thread_id": thread_id}).sort("timestamp", 1) for msg_data in messages: msg = Message( role=msg_data["role"], content=msg_data["content"], agent=msg_data.get("agent"), tool_call=msg_data.get("tool_call"), metadata=msg_data.get("metadata", {}), message_id=msg_data["id"], timestamp=msg_data["timestamp"] ) thread.messages.append(msg) return thread ``` ### 3. Message Persistence Implement the logic to save new messages. **Appending Messages:** This method is called whenever a new message is added to the history. You should save the message and update the thread's `updated_at` timestamp. 
```python def append_message( self, thread_id: str, role: str, content: Any, agent: Optional[str] = None, tool_call: Optional[Dict] = None, metadata: Optional[Dict] = None ) -> Message: message = Message( role=role, content=content, agent=agent, tool_call=tool_call, metadata=metadata ) self.messages.insert_one({ "id": message.id, "thread_id": thread_id, "timestamp": message.timestamp, "role": message.role, "content": message.content, "agent": message.agent, "tool_call": message.tool_call, "metadata": message.metadata }) # Update thread's updated_at timestamp self.threads.update_one( {"id": thread_id}, {"$set": {"updated_at": datetime.now()}} ) return message ``` ### 4. Utility Methods Implement the remaining utility methods for listing and deleting. ```python def get_messages(self, thread_id: str) -> List[Message]: """Retrieve all messages for a specific thread.""" thread = self.get_thread(thread_id) return thread.messages if thread else [] def list_threads(self) -> List[str]: """Return a list of all thread IDs.""" return [t["id"] for t in self.threads.find({}, {"id": 1})] def delete_thread(self, thread_id: str) -> bool: """Delete a thread and all its associated messages.""" result = self.threads.delete_one({"id": thread_id}) if result.deleted_count > 0: self.messages.delete_many({"thread_id": thread_id}) return True return False ``` ### Usage Once your class is defined, you can use it just like any built-in storage backend. #### With create\_agent (Automatic Integration) You can pass your custom storage backend directly to `create_agent` using `HistoryConfig`: ```python from peargent import create_agent, HistoryConfig from peargent.models import openai # Initialize your custom store store = MongoDBHistoryStore(connection_string="mongodb://localhost:27017") # Create agent with custom storage backend agent = create_agent( name="Assistant", description="A helpful assistant with MongoDB history", persona="You are a helpful AI assistant.", model=openai("gpt-4o"), history=HistoryConfig( auto_manage_context=True, max_context_messages=20, strategy="smart", store=store # Your custom storage backend ) ) # Use the agent - history is automatically managed response1 = agent.run("My name is Alice") # Agent creates thread and stores message in MongoDB response2 = agent.run("What's my name?") # Agent loads history from MongoDB and remembers: "Your name is Alice" ``` #### With create\_pool (Multi-Agent with Custom Storage) You can also use custom storage backends with agent pools for shared history across multiple agents: ```python from peargent import create_agent, create_pool, HistoryConfig from peargent.models import openai # Initialize your custom store store = MongoDBHistoryStore(connection_string="mongodb://localhost:27017") # Create multiple agents researcher = create_agent( name="Researcher", description="Researches topics thoroughly", persona="You are a detail-oriented researcher.", model=openai("gpt-4o-mini") ) writer = create_agent( name="Writer", description="Writes clear summaries", persona="You are a skilled technical writer.", model=openai("gpt-4o") ) # Create pool with custom storage - all agents share the same MongoDB history pool = create_pool( agents=[researcher, writer], default_model=openai("gpt-4o"), history=HistoryConfig( auto_manage_context=True, max_context_messages=25, strategy="smart", store=store # Shared custom storage for all agents ) ) # Use the pool result = pool.run("Research quantum computing and write a summary") # Both agents' interactions are stored in MongoDB ``` # 
# History Management

History Management

Custom storage backends, manual thread control, and low-level history operations for advanced use cases.

This guide covers advanced history capabilities for developers who need fine-grained control over conversation persistence, custom storage implementations, and low-level message operations. ## Next Steps * See **[History](/docs/history)** for basic usage and auto-management * See **[Agents](/docs/agents)** for integrating history with agents * See **[Pools](/docs/pools)** for shared history across multiple agents # Manual Thread Management

Manual Thread Management

Manually control threads for multi-user applications.

While agents automatically manage threads, you can also control threads manually for multi-user applications or complex workflows. ## Creating and Switching Threads You can create multiple threads to handle different conversations simultaneously. By using `create_thread` with metadata, you can tag threads with user IDs or session info. The `use_thread` method then lets you switch the active context, ensuring that new messages go to the correct conversation. ```python from peargent import create_history from peargent.storage import Sqlite history = create_history(store_type=Sqlite(database_path="./app.db")) # Create threads for different users alice_thread = history.create_thread(metadata={"user_id": "alice", "session": "web"}) bob_thread = history.create_thread(metadata={"user_id": "bob", "session": "mobile"}) # Switch between threads history.use_thread(alice_thread) history.add_user_message("What's the weather?") history.use_thread(bob_thread) history.add_user_message("Show my orders") # List all threads all_threads = history.list_threads() print(f"Total threads: {len(all_threads)}") # Get thread with metadata thread = history.get_thread(alice_thread) print(f"User: {thread.metadata.get('user_id')}") print(f"Messages: {len(thread.messages)}") ``` ## Multi-User Application Pattern For applications serving multiple users, you need to ensure each user gets their own conversation history. This pattern shows how to use metadata to look up an existing thread for a user. If a thread is found, the agent resumes that conversation; if not, a new thread is created. This allows a single agent instance to handle many users concurrently. ```python from peargent import create_agent, create_history, HistoryConfig from peargent.storage import Postgresql from peargent.models import openai # Shared history store for all users history = create_history( store_type=Postgresql( connection_string="postgresql://user:pass@localhost/app_db" ) ) # Create agent (reused across users) agent = create_agent( name="Assistant", description="Customer support assistant", persona="You are a helpful customer support agent.", model=openai("gpt-4o") ) def handle_user_message(user_id: str, message: str): """Handle message from a specific user.""" # Find or create thread for this user all_threads = history.list_threads() user_thread = None for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.metadata.get("user_id") == user_id: user_thread = thread_id break if not user_thread: # Create new thread for this user user_thread = history.create_thread(metadata={"user_id": user_id}) # Set active thread history.use_thread(user_thread) # Add user message history.add_user_message(message) # Get response from agent # Note: Agent needs to load this history manually or use temporary_memory response = agent.run(message) # Add assistant response history.add_assistant_message(response, agent="Assistant") return response # Usage response1 = handle_user_message("alice", "What's my order status?") response2 = handle_user_message("bob", "I need help with returns") response3 = handle_user_message("alice", "Thanks!") # Same thread as first message ``` # History API Reference

History API Reference

Complete reference for Thread, Message, and Context operations.

## Thread Operations ### create\_thread Creates a new conversation thread and sets it as the active thread. ```python thread_id = history.create_thread(metadata={ "user_id": "alice", "topic": "customer_support", "tags": ["billing", "urgent"] }) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :----------------------------------------------------- | | `metadata` | `Optional[Dict]` | `None` | Dictionary of custom metadata to attach to the thread. | **Returns** * `str`: The unique ID of the created thread. *** ### use\_thread Switches the active context to an existing thread. ```python history.use_thread("thread-123-abc") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :---- | :------ | :--------------------------------- | | `thread_id` | `str` | - | The ID of the thread to switch to. | *** ### get\_thread Retrieves a thread object. ```python thread = history.get_thread() print(f"Created: {thread.created_at}") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :------------------------------------------------------------------------------------ | | `thread_id` | `Optional[str]` | `None` | The ID of the thread to retrieve. If not provided, returns the current active thread. | **Returns** * `Optional[Thread]`: The thread object, or `None` if not found. *** ### list\_threads Lists all available thread IDs in the storage. ```python all_threads = history.list_threads() print(f"Total threads: {len(all_threads)}") ``` **Returns** * `List[str]`: A list of all thread IDs. *** ### delete\_thread Deletes a thread and all its associated messages. ```python if history.delete_thread(thread_id): print("Thread deleted") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :---- | :------ | :------------------------------ | | `thread_id` | `str` | - | The ID of the thread to delete. | **Returns** * `bool`: `True` if the thread was successfully deleted, `False` otherwise. ## Message Operations ### add\_user\_message Adds a user message to the current thread. ```python msg = history.add_user_message( "What's the weather today?", metadata={"source": "web"} ) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :------------------------------- | | `content` | `str` | - | The text content of the message. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata for the message. | **Returns** * `Message`: The created message object. *** ### add\_assistant\_message Adds an assistant response to the current thread. ```python msg = history.add_assistant_message( "The weather is sunny.", agent="WeatherBot", metadata={"model": "gpt-4o"} ) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :-------------------------------------------------- | | `content` | `Any` | - | The content of the response (string or structured). | | `agent` | `Optional[str]` | `None` | Name of the agent that generated the response. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata (e.g., tokens used, model name). | **Returns** * `Message`: The created message object. *** ### add\_tool\_message Adds a tool execution result to the current thread. 
```python msg = history.add_tool_message( tool_call={ "name": "get_weather", "output": {"temp": 72} }, agent="WeatherBot" ) ``` **Parameters** | Name | Type | Default | Description | | :---------- | :--------------- | :------ | :----------------------------------------------------------------- | | `tool_call` | `Dict` | - | Dictionary containing tool execution details (name, args, output). | | `agent` | `Optional[str]` | `None` | Name of the agent that called the tool. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata (e.g., execution time). | **Returns** * `Message`: The created message object. *** ### get\_messages Retrieves messages from a thread with optional filtering. ```python # Get only user messages user_messages = history.get_messages(role="user") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :------------------------------------------------------ | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | | `role` | `Optional[str]` | `None` | Filter by role ("user", "assistant", "tool", "system"). | | `agent` | `Optional[str]` | `None` | Filter by agent name. | **Returns** * `List[Message]`: List of matching message objects. *** ### get\_message\_count Gets the total number of messages in a thread. ```python count = history.get_message_count() ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :-------------------------------------------- | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `int`: The number of messages. *** ### delete\_message Deletes a specific message by its ID. ```python history.delete_message(message_id) ``` **Parameters** | Name | Type | Default | Description | | :----------- | :-------------- | :------ | :-------------------------------------------- | | `message_id` | `str` | - | The ID of the message to delete. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `bool`: `True` if deleted successfully. *** ### delete\_messages Deletes multiple messages at once. ```python history.delete_messages([msg_id_1, msg_id_2]) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------ | :-------------------------------------------- | | `message_ids` | `List[str]` | - | List of message IDs to delete. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `int`: The number of messages actually deleted. ## Context Management Operations ### trim\_messages Trims messages to manage context window size. ```python # Keep only the last 10 messages history.trim_messages(strategy="last", count=10) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------- | :------------------------------------------------------------------------- | | `strategy` | `str` | `"last"` | Strategy: `"last"` (keep recent), `"first"` (keep oldest), `"first_last"`. | | `count` | `int` | `10` | Number of messages to keep. | | `keep_system` | `bool` | `True` | If `True`, system messages are never deleted. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | **Returns** * `int`: Number of messages removed. *** ### summarize\_messages Summarizes a range of messages using an LLM and replaces them with a summary. 
```python history.summarize_messages( model=groq("llama-3.1-8b-instant"), keep_recent=5 ) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------ | :------------------------------------------------- | | `model` | `Any` | - | LLM model instance for generating summary. | | `start_index` | `int` | `0` | Start index for summarization. | | `end_index` | `Optional[int]` | `None` | End index (defaults to `len - keep_recent`). | | `keep_recent` | `int` | `5` | Number of recent messages to exclude from summary. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | **Returns** * `Message`: The newly created summary message. *** ### manage\_context\_window Automatically manages context window when messages exceed a threshold. ```python history.manage_context_window( model=groq("llama-3.1-8b-instant"), max_messages=20, strategy="smart" ) ``` **Parameters** | Name | Type | Default | Description | | :------------- | :-------------- | :-------- | :----------------------------------------------------------------- | | `model` | `Any` | - | LLM model (required for "summarize" and "smart"). | | `max_messages` | `int` | `20` | Threshold to trigger management. | | `strategy` | `str` | `"smart"` | Strategy: `"smart"`, `"trim_last"`, `"trim_first"`, `"summarize"`. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | ## Data Models ### Message Object Represents a single message in the conversation history. | Property | Type | Description | | :---------- | :--------------- | :---------------------------------------------------------- | | `id` | `str` | Unique UUID for the message. | | `timestamp` | `datetime` | Time when the message was created. | | `role` | `str` | Role of the sender (`user`, `assistant`, `tool`, `system`). | | `content` | `Any` | The content of the message. | | `agent` | `Optional[str]` | Name of the agent (for assistant messages). | | `tool_call` | `Optional[Dict]` | Tool execution details (for tool messages). | | `metadata` | `Dict` | Custom metadata dictionary. | ### Thread Object Represents a conversation thread containing multiple messages. | Property | Type | Description | | :----------- | :-------------- | :-------------------------------------- | | `id` | `str` | Unique UUID for the thread. | | `created_at` | `datetime` | Time when the thread was created. | | `updated_at` | `datetime` | Time when the thread was last modified. | | `metadata` | `Dict` | Custom metadata dictionary. | | `messages` | `List[Message]` | List of messages in the thread. | # Optimizing for Cost

Optimizing for Cost

Strategies to control token usage and reduce API costs

Running LLM agents in production can be expensive if not managed carefully. This guide outlines practical strategies to keep your costs under control without sacrificing the quality of your agent's responses. ## 1. Choose the Right Model Not every task requires the most powerful model. Match your model choice to the task complexity. ### Model Selection Strategy ```python from peargent import create_agent from peargent.models import groq # Use smaller models for simple tasks classifier_agent = create_agent( name="Classifier", description="Classifies user intent", persona="You classify user messages into categories: support, sales, or general.", model=groq("llama-3.1-8b") # [!code highlight] - Smaller, cheaper model ) # Use larger models only for complex tasks reasoning_agent = create_agent( name="Reasoner", description="Solves complex problems", persona="You solve complex reasoning and coding problems step by step.", model=groq("llama-3.3-70b-versatile") # [!code highlight] - Larger model when needed ) ``` **Guidelines:** * **Simple tasks** (classification, extraction, summarization): Use smaller models (8B parameters) * **Complex tasks** (reasoning, coding, analysis): Use larger models (70B+ parameters) * **Test different models** on your specific use case to find the best cost/quality balance ### Track Model Costs with Custom Pricing If you're using custom or local models, add their pricing to track costs accurately: ```python from peargent.observability import enable_tracing, get_tracer tracer = enable_tracing() # Add custom pricing for your model (prices per million tokens) tracer.add_custom_pricing( # [!code highlight] model="my-fine-tuned-model", prompt_price=1.50, # $1.50 per million prompt tokens completion_price=3.00 # $3.00 per million completion tokens ) # Now cost tracking works for your custom model agent = create_agent( name="CustomAgent", model=my_custom_model, persona="You are helpful.", tracing=True ) ``` ## 2. Control Context with History Management The context window is your biggest cost driver. Every message in the conversation history is re-sent with each request. 
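A rough back-of-the-envelope calculation shows how quickly this adds up. The numbers below are purely illustrative (assuming ~100 tokens per message and two messages per turn); they are not measured Peargent figures.

```python
TOKENS_PER_MESSAGE = 100  # rough average, purely illustrative

def prompt_tokens(turn: int, max_context_messages: int | None = None) -> int:
    """Approximate prompt tokens re-sent at a given turn."""
    history = 2 * (turn - 1)  # one user + one assistant message per previous turn
    if max_context_messages is not None:
        history = min(history, max_context_messages)
    return (history + 1) * TOKENS_PER_MESSAGE  # history plus the new user message

for turn in (1, 10, 25, 50):
    full = prompt_tokens(turn)
    trimmed = prompt_tokens(turn, max_context_messages=10)
    print(f"Turn {turn:>2}: full history ≈ {full} tokens, trimmed to 10 messages ≈ {trimmed} tokens")
```

By turn 50, the unbounded history re-sends roughly nine times more prompt tokens per request than the trimmed one, which is why capping context is usually the single largest saving.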
### Configure Automatic Context Management Use `HistoryConfig` to automatically manage conversation history: ```python from peargent import create_agent, HistoryConfig from peargent.storage import InMemory from peargent.models import groq agent = create_agent( name="CostOptimizedAgent", description="Agent with automatic history management", persona="You are a helpful assistant.", model=groq("llama-3.3-70b-versatile"), history=HistoryConfig( # [!code highlight] auto_manage_context=True, # Enable automatic management max_context_messages=10, # Keep only last 10 messages strategy="trim_last", # Remove oldest messages when limit reached store=InMemory() ) ) ``` ### Available Context Strategies Peargent supports 5 context management strategies: | Strategy | How It Works | Use When | Cost Impact | | -------------- | ----------------------------------- | --------------------------- | ----------------------------- | | `"trim_last"` | Removes oldest messages | Simple conversations | ✅ Low - fast, no LLM calls | | `"trim_first"` | Keeps oldest messages | Important initial context | ✅ Low - fast, no LLM calls | | `"first_last"` | Keeps first and last messages | Preserving original context | ✅ Low - fast, no LLM calls | | `"summarize"` | Summarizes old messages | Complex conversations | ⚠️ Medium - requires LLM call | | `"smart"` | Chooses best strategy automatically | General purpose | ⚠️ Variable - may use LLM | **Example: Trim Strategy (Recommended for Cost)** ```python # Most cost-effective - no LLM calls for management history=HistoryConfig( auto_manage_context=True, max_context_messages=10, # [!code highlight] - Only keep 10 messages strategy="trim_last", # [!code highlight] - Drop oldest messages store=InMemory() ) ``` **Example: Summarize Strategy (Better Context Retention)** ```python # Uses LLM to summarize old messages - costs more but retains context history=HistoryConfig( auto_manage_context=True, max_context_messages=20, strategy="summarize", # [!code highlight] - Summarize old messages summarize_model=groq("llama-3.1-8b"), # [!code highlight] - Use cheap model for summaries store=InMemory() ) ``` **Example: Smart Strategy (Balanced)** ```python # Automatically chooses between trim and summarize history=HistoryConfig( auto_manage_context=True, max_context_messages=15, strategy="smart", # [!code highlight] - Automatically adapts store=InMemory() ) ``` ## 3. Limit Output Length with max\_tokens Control how much the agent can generate by setting `max_tokens` in model parameters: ```python from peargent import create_agent from peargent.models import groq # Limit output to reduce costs agent = create_agent( name="BriefAgent", description="Gives brief responses", persona="You provide concise, brief answers. Maximum 2-3 sentences.", model=groq( "llama-3.3-70b-versatile", parameters={ "max_tokens": 150, # [!code highlight] - Limit to ~150 tokens output "temperature": 0.7 } ) ) response = agent.run("Explain quantum computing") # Agent cannot generate more than 150 tokens ``` **Guidelines:** * Short answers: `max_tokens=150` (\~100 words) * Medium answers: `max_tokens=500` (\~350 words) * Long answers: `max_tokens=2000` (\~1500 words) * Code generation: `max_tokens=4096` or higher ### Move Examples to Tool Descriptions Instead of putting examples in the persona, put them in tool descriptions: ```python from peargent import create_agent, create_tool def search_database(query: str) -> str: # Implementation... 
return "Results found" agent = create_agent( name="ProductAgent", persona="You help with product inquiries.", # Short persona model=groq("llama-3.3-70b-versatile"), tools=[create_tool( name="search_database", description="""Searches the product database for matching items. Use this tool when users ask about products, inventory, or availability. Examples: "Do we have red shirts?" → use this tool with query="red shirts" "Check stock for item #123" → use this tool with query="item 123" """, # [!code highlight] - Examples in tool description, not persona input_parameters={"query": str}, call_function=search_database )] ) ``` ## 4. Control Temperature for Deterministic Outputs Lower temperature reduces token usage for tasks that need deterministic outputs: ```python from peargent import create_agent from peargent.models import groq # For deterministic tasks (extraction, classification) extraction_agent = create_agent( name="Extractor", description="Extracts structured data", persona="Extract the requested information exactly as it appears.", model=groq( "llama-3.3-70b-versatile", parameters={ "temperature": 0.0, # [!code highlight] - Deterministic, shorter outputs "max_tokens": 500 } ) ) # For creative tasks (writing, brainstorming) creative_agent = create_agent( name="Writer", description="Writes creative content", persona="You write engaging, creative content.", model=groq( "llama-3.3-70b-versatile", parameters={ "temperature": 0.9, # [!code highlight] - More creative, longer outputs "max_tokens": 2000 } ) ) ``` ## 5. Monitor Costs with Tracing You can't optimize what you can't measure. Use Peargent's observability features to track costs. ### Enable Cost Tracking ```python from peargent import create_agent from peargent.observability import enable_tracing from peargent.storage import Sqlite from peargent.models import groq # Enable tracing with database storage tracer = enable_tracing( store_type=Sqlite(connection_string="sqlite:///./traces.db") ) agent = create_agent( name="TrackedAgent", description="Agent with cost tracking", persona="You are helpful.", model=groq("llama-3.3-70b-versatile"), tracing=True # [!code highlight] - Enable tracing for this agent ) # Use the agent response = agent.run("Hello") # Check costs traces = tracer.list_traces() latest = traces[-1] print(f"Cost: ${latest.total_cost:.6f}") print(f"Tokens: {latest.total_tokens}") print(f"Duration: {latest.duration_ms}ms") ``` ### Analyze Cost Patterns ```python from peargent.observability import get_tracer tracer = get_tracer() # Get aggregate statistics stats = tracer.get_aggregate_stats() # [!code highlight] print(f"Total Traces: {stats['total_traces']}") print(f"Total Cost: ${stats['total_cost']:.6f}") print(f"Average Cost per Trace: ${stats['avg_cost_per_trace']:.6f}") print(f"Total Tokens: {stats['total_tokens']:,}") # Find expensive operations traces = tracer.list_traces() expensive_traces = sorted(traces, key=lambda t: t.total_cost, reverse=True)[:5] print("\nMost Expensive Operations:") for trace in expensive_traces: print(f" {trace.agent_name}: ${trace.total_cost:.6f} ({trace.total_tokens} tokens)") ``` ### Set Cost Alerts ```python from peargent.observability import get_tracer tracer = get_tracer() MAX_COST_PER_REQUEST = 0.01 # $0.01 limit for update in agent.stream_observe(user_input): if update.is_agent_end: if update.cost > MAX_COST_PER_REQUEST: # [!code highlight] print(f"⚠️ WARNING: Cost ${update.cost:.6f} exceeds limit!") # Log alert, notify admins, etc. 
``` ### Track Costs by User ```python from peargent.observability import enable_tracing, set_user_id, get_tracer from peargent.storage import Postgresql tracer = enable_tracing( store_type=Postgresql(connection_string="postgresql://user:pass@localhost/db") ) # Set user ID before agent runs set_user_id("user_123") # [!code highlight] agent.run("Hello") # Get costs for specific user user_stats = tracer.get_aggregate_stats(user_id="user_123") # [!code highlight] print(f"User 123 total cost: ${user_stats['total_cost']:.6f}") ``` ## 6. Use Streaming to Show Progress While streaming doesn't reduce costs, it improves perceived performance, making slower/cheaper models feel faster: ```python from peargent import create_agent from peargent.models import groq # Use cheaper model with streaming agent = create_agent( name="StreamingAgent", description="Shows progress immediately", persona="You are helpful.", model=groq("llama-3.1-8b") # Cheaper model ) # Stream response - user sees first token in ~200ms print("Agent: ", end="", flush=True) for chunk in agent.stream("Explain AI"): # [!code highlight] print(chunk, end="", flush=True) ``` **Benefit:** Cheaper models feel faster with streaming, reducing pressure to use expensive models. ## 7. Count Tokens Before Sending Estimate costs before making expensive calls: ```python from peargent.observability import get_cost_tracker tracker = get_cost_tracker() # Count tokens in your prompt prompt = "Explain quantum computing in detail..." token_count = tracker.count_tokens(prompt, model="llama-3.3-70b-versatile") # [!code highlight] print(f"Prompt will use ~{token_count} tokens") # Estimate cost estimated_cost = tracker.calculate_cost( # [!code highlight] prompt_tokens=token_count, completion_tokens=500, # Estimate 500 token response model="llama-3.3-70b-versatile" ) print(f"Estimated cost: ${estimated_cost:.6f}") # Decide whether to proceed if estimated_cost > 0.01: print("Too expensive! Shortening prompt...") # Truncate or summarize prompt ``` ## Cost Optimization Checklist Use this checklist for production deployments: ### Model Selection * [ ] Using smallest viable model for each agent type * [ ] Tested cost vs quality tradeoff for your use case * [ ] Custom pricing configured for local/fine-tuned models ### Context Management * [ ] `HistoryConfig` configured with appropriate strategy * [ ] `max_context_messages` set to reasonable limit (10-20) * [ ] Using "trim\_last" for cost-sensitive applications * [ ] Cheaper model used for summarization if using "summarize" strategy ### Output Control * [ ] `max_tokens` set based on expected response length * [ ] Persona/system prompt optimized for brevity * [ ] Examples moved from persona to tool descriptions * [ ] Temperature set to 0.0 for deterministic tasks ### Monitoring * [ ] Tracing enabled in production * [ ] Cost tracking configured with accurate pricing * [ ] Regular analysis of aggregate statistics * [ ] Alerts set for expensive operations * [ ] Per-user cost tracking implemented ### Implementation * [ ] Token counting used for cost estimation * [ ] Streaming enabled for better UX with cheaper models * [ ] Cost limits enforced in application logic * [ ] Regular review of most expensive operations ## Summary **Biggest Cost Savings:** 1. **History Management** - Use `trim_last` with `max_context_messages=10` (saves 50-80% on tokens) 2. **Model Selection** - Use smaller models for simple tasks (saves 50-90% on costs) 3. **Persona Optimization** - Short personas (saves 5-10% per request) 4. 
**max\_tokens** - Limit output length (saves 20-40% on completion tokens) **Essential Monitoring:** * Enable tracing in production * Track costs per user/session * Analyze aggregate statistics weekly * Set alerts for expensive operations Start with history management and model selection for the biggest impact! # Creating Custom Tooling

Creating Custom Tooling

Best practices for building robust, reusable tools for your agents

Tools are the hands of your agent—they allow it to interact with the outside world, from searching the web to querying databases. While Peargent comes with built-in tools, the real power lies in creating custom tools tailored to your specific needs. ## The Anatomy of a Tool Every tool in Peargent has four essential components: ```python from peargent import create_tool def search_database(query: str) -> str: # Your implementation here return "Results found" tool = create_tool( name="search_database", # [!code highlight] - Tool identifier description="""Searches the product database for matching items. Use this tool when users ask about products, inventory, or availability.""", # [!code highlight] - What LLM sees input_parameters={"query": str}, # [!code highlight] - Expected arguments call_function=search_database # [!code highlight] - Function to execute ) ``` **The Four Components:** 1. **name**: Unique identifier the LLM uses to call the tool 2. **description**: Tells the LLM *what* the tool does and *when* to use it 3. **input\_parameters**: Dict mapping parameter names to their types 4. **call\_function**: The Python function that implements the tool logic ## The Golden Rules of Tool Building ### 1. Descriptive Descriptions are Mandatory The LLM uses the tool's `description` parameter to understand *what* the tool does and *when* to use it. Be verbose and precise. #### Bad: Vague Description ```python tool = create_tool( name="fetch_user", description="Gets user info.", # ❌ Too vague! input_parameters={"user_id": str}, call_function=fetch_user ) ``` #### Good: Clear and Specific ```python tool = create_tool( name="fetch_user_data", description="""Retrieves detailed profile information for a specific user from the database. Use this tool when you need to look up: - User's email address - Phone number - Account status (active/suspended) - Registration date Do NOT use this for: - Searching users by name (use search_users instead) - Updating user data (use update_user instead) Returns: Dict with keys: user_id, email, phone, status, registered_at""", # ✅ Clear when to use it! input_parameters={"user_id": str}, call_function=fetch_user_data ) ``` **Why This Matters:** The LLM decides which tool to call based solely on the description. A clear description = correct tool selection. ### 2. Type Hinting is Critical Peargent uses type hints to generate the JSON schema for the LLM. Always type your arguments and return values. #### Bad: No Type Hints ```python def calculate_tax(amount, region): # ❌ LLM can't infer types! return amount * 0.1 ``` #### Good: Full Type Hints ```python def calculate_tax(amount: float, region: str) -> float: # ✅ Clear types! """Calculates tax based on amount and region.""" tax_rates = {"CA": 0.0725, "NY": 0.08875, "TX": 0.0625} return amount * tax_rates.get(region, 0.05) ``` **Supported Types:** * Primitives: `str`, `int`, `float`, `bool` * Collections: `list`, `dict`, `list[str]`, `dict[str, int]` * Pydantic Models: For complex structured inputs (see below) ### 3. Handle Errors Gracefully Tools should not crash the agent. Use the `on_error` parameter to control failure behavior. ```python from peargent import create_tool def search_database(query: str) -> str: """Searches the database for results.""" try: results = db.execute(query) return str(results) except Exception as e: return f"Error executing query: {str(e)}. Please check your syntax." 
# Three error handling strategies: # Strategy 1: RAISE (Default) - Strict mode critical_tool = create_tool( name="critical_operation", description="Must succeed - handles payment", input_parameters={"amount": float}, call_function=process_payment, on_error="raise" # [!code highlight] - Crash if fails (default) ) # Strategy 2: RETURN_ERROR - Graceful mode optional_tool = create_tool( name="get_recommendations", description="Optional product recommendations", input_parameters={"user_id": str}, call_function=get_recommendations, on_error="return_error" # [!code highlight] - Return error message as string ) # Strategy 3: RETURN_NONE - Silent mode analytics_tool = create_tool( name="log_event", description="Logs analytics events", input_parameters={"event": str}, call_function=log_analytics, on_error="return_none" # [!code highlight] - Return None silently ) ``` **When to Use Each Strategy:** | Strategy | Use Case | Example | | ---------------- | ------------------------------------------- | ------------------------------------------------ | | `"raise"` | Critical operations that must succeed | Authentication, payments, database writes | | `"return_error"` | Optional features that shouldn't break flow | Recommendations, third-party APIs, cache lookups | | `"return_none"` | Nice-to-have features | Analytics, logging, notifications | ### 4. Keep It Simple (Idempotency) Ideally, tools should be **idempotent**—calling them multiple times with the same arguments should produce the same result. Avoid tools that rely heavily on hidden state. #### Bad: Stateful Tool ```python counter = 0 # ❌ Hidden state! def increment() -> int: """Increments counter.""" global counter counter += 1 return counter ``` #### Good: Stateless Tool ```python def get_user_count() -> int: """Gets current user count from database.""" return db.query("SELECT COUNT(*) FROM users").scalar() # ✅ Same input = same output ``` ## Advanced Features ### Complex Input with Pydantic For tools with many parameters or nested data, use Pydantic models for input. ```python from pydantic import BaseModel, Field from peargent import create_tool class TicketInput(BaseModel): title: str = Field(..., description="Brief summary of the issue") priority: str = Field(..., description="Ticket priority") description: str = Field(..., description="Detailed description") category: str = Field(default="general", description="Ticket category") # Validation def __init__(self, **data): # Validate priority if data.get("priority") not in ["LOW", "MEDIUM", "HIGH"]: raise ValueError("Priority must be LOW, MEDIUM, or HIGH") super().__init__(**data) def create_support_ticket(data: TicketInput) -> str: """ Creates a new support ticket in the system. Required fields: - title: Brief summary (e.g., "Cannot login") - priority: LOW, MEDIUM, or HIGH - description: Detailed explanation Optional fields: - category: Ticket category (default: "general") """ # Access validated fields ticket_id = db.create_ticket( title=data.title, priority=data.priority, description=data.description, category=data.category ) return f"Ticket #{ticket_id} created with priority {data.priority}" ticket_tool = create_tool( name="create_ticket", description="""Creates a new support ticket in the system. 
    Required fields:
    - title: Brief summary (e.g., "Cannot login")
    - priority: LOW, MEDIUM, or HIGH
    - description: Detailed explanation

    Optional fields:
    - category: Ticket category (default: "general")""",
    input_parameters={"data": TicketInput},  # [!code highlight] - Use Pydantic model
    call_function=create_support_ticket
)
```

**Benefits:**

* Clear parameter structure
* Built-in validation
* Default values
* Nested objects supported

## Real-World Examples

### Example 1: Database Query Tool

```python
from peargent import create_tool
import sqlite3

def query_products(
    category: str,
    min_price: float = 0.0,
    max_price: float = 999999.99
) -> str:
    """
    Searches the product database by category and price range.

    Use this tool when users ask about:
    - Products in a specific category
    - Products within a price range
    - Available inventory

    Examples:
    - "Show me electronics under $500" → category="electronics", max_price=500
    - "What furniture do you have?" → category="furniture"

    Returns: Formatted list of matching products with prices
    """
    try:
        conn = sqlite3.connect("products.db")
        cursor = conn.cursor()

        query = """
            SELECT name, price, stock
            FROM products
            WHERE category = ? AND price BETWEEN ? AND ?
            ORDER BY price
        """
        cursor.execute(query, (category, min_price, max_price))
        results = cursor.fetchall()
        conn.close()

        if not results:
            return f"No products found in '{category}' between ${min_price} and ${max_price}"

        # Format results
        output = f"Found {len(results)} products in '{category}':\n\n"
        for name, price, stock in results:
            output += f"- {name}: ${price:.2f} ({stock} in stock)\n"

        return output

    except Exception as e:
        return f"Database error: {str(e)}"

product_tool = create_tool(
    name="query_products",
    description="""Searches the product database by category and price range.

    Use this tool when users ask about:
    - Products in a specific category
    - Products within a price range
    - Available inventory

    Examples:
    - "Show me electronics under $500" → category="electronics", max_price=500
    - "What furniture do you have?" → category="furniture"

    Returns: Formatted list of matching products with prices""",
    input_parameters={
        "category": str,
        "min_price": float,
        "max_price": float
    },
    call_function=query_products,
    on_error="return_error"  # Don't crash on DB errors
)
```

### Example 2: External API

```python
from peargent import create_tool
import os
import requests

def fetch_weather(city: str) -> str:
    """
    Fetches current weather data from OpenWeatherMap API.

    Use this tool when users ask about:
    - Current weather conditions
    - Temperature
    - Weather forecasts

    Returns: Human-readable weather description
    """
    api_key = os.getenv("OPENWEATHER_API_KEY")
    url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}"

    response = requests.get(url, timeout=5)

    if response.status_code == 404:
        return f"City '{city}' not found. Please check spelling."

    if response.status_code != 200:
        raise Exception(f"API returned status {response.status_code}")

    data = response.json()
    temp_f = (data["main"]["temp"] - 273.15) * 9/5 + 32
    condition = data["weather"][0]["description"]

    return f"Weather in {city}: {condition.capitalize()}, {temp_f:.1f}°F"

weather_tool = create_tool(
    name="get_weather",
    description="""Fetches current weather data from OpenWeatherMap API.
Use this tool when users ask about: - Current weather conditions - Temperature - Weather forecasts Returns: Human-readable weather description""", input_parameters={"city": str}, call_function=fetch_weather, on_error="return_error" # Return error message, don't crash ) ``` ### Example 3: File Processing ```python from peargent import create_tool import os def analyze_file(filepath: str) -> dict: """ Analyzes a text file and returns metadata and preview. Use this tool when users want to: - Check file size - See file contents preview - Count lines in a file Returns: Dict with file metadata """ if not os.path.exists(filepath): raise FileNotFoundError(f"File '{filepath}' does not exist") size = os.path.getsize(filepath) with open(filepath, 'r', encoding='utf-8') as f: lines = f.readlines() preview = ''.join(lines[:5]) return { "filename": os.path.basename(filepath), "size_bytes": size, "line_count": len(lines), "preview": preview } file_tool = create_tool( name="analyze_file", description="""Analyzes a text file and returns metadata and preview. Use this tool when users want to: - Check file size - See file contents preview - Count lines in a file Returns: Dict with file metadata""", input_parameters={"filepath": str}, call_function=analyze_file, on_error="return_error" ) # Usage result = file_tool.run({"filepath": "/path/to/file.txt"}) print(f"File: {result['filename']}") print(f"Size: {result['size_bytes']} bytes") print(f"Lines: {result['line_count']}") ``` ## Tool Building Checklist Use this checklist when creating production tools: ### Design Phase * [ ] Tool has a single, clear responsibility * [ ] Input parameters are minimal and well-typed * [ ] Docstring explains **when** to use the tool, not just what it does * [ ] Examples included in docstring for clarity ### Implementation Phase * [ ] All parameters have type hints * [ ] Function handles errors gracefully (try-except) * [ ] Tool is idempotent (same input = same output) * [ ] No hidden global state ### Configuration Phase * [ ] `on_error` strategy chosen based on criticality ### Testing Phase * [ ] Tool works with valid inputs * [ ] Tool handles invalid inputs gracefully ## Common Patterns ### Pattern 1: Multi-Step Tool ```python from pydantic import BaseModel, Field from peargent import create_tool class OrderResult(BaseModel): order_id: str status: str total: float items: list[str] def process_order(customer_id: str, items: list[str]) -> OrderResult: """ Processes a customer order through multiple steps. Steps: 1. Validate customer exists 2. Check inventory for all items 3. Calculate total with taxes 4. Create order record 5. Update inventory Returns: OrderResult with order details """ # Step 1: Validate customer customer = db.get_customer(customer_id) if not customer: raise ValueError(f"Customer {customer_id} not found") # Step 2: Check inventory for item in items: if not inventory.check_availability(item): raise ValueError(f"Item {item} out of stock") # Step 3: Calculate total total = sum(catalog.get_price(item) for item in items) tax = total * 0.0725 final_total = total + tax # Step 4: Create order order_id = db.create_order(customer_id, items, final_total) # Step 5: Update inventory for item in items: inventory.decrement(item) return OrderResult( order_id=order_id, status="completed", total=final_total, items=items ) order_tool = create_tool( name="process_order", description="""Processes a customer order through multiple steps. Steps: 1. Validate customer exists 2. Check inventory for all items 3. 
Calculate total with taxes 4. Create order record 5. Update inventory Returns: OrderResult with order details""", input_parameters={ "customer_id": str, "items": list }, call_function=process_order, on_error="raise" # Critical operation - must succeed ) ``` ### Pattern 2: Tool Composition Break complex operations into smaller tools: ```python # Small, focused tools search_tool = create_tool( name="search_products", description="Searches products by keyword", input_parameters={"query": str}, call_function=search_products ) filter_tool = create_tool( name="filter_by_price", description="Filters products by price range", input_parameters={"products": list, "min_price": float, "max_price": float}, call_function=filter_by_price ) sort_tool = create_tool( name="sort_products", description="Sorts products by field", input_parameters={"products": list, "sort_by": str}, call_function=sort_products ) # Agent uses tools in sequence agent = create_agent( name="ProductAgent", persona="You help users find products. Use multiple tools in sequence.", tools=[search_tool, filter_tool, sort_tool] ) # User: "Show me laptops under $1000, sorted by price" # Agent will: # 1. Call search_tool(query="laptops") # 2. Call filter_tool(products=results, max_price=1000) # 3. Call sort_tool(products=filtered, sort_by="price") ``` ## When to Create Custom Tools ### ✅ Good Use Cases: * **Domain-Specific Logic**: Calculating pricing based on your company's unique rules * **Internal APIs**: Fetching data from your private microservices * **Database Operations**: Querying your application's database * **Complex Workflows**: Triggering multi-step processes (e.g., "onboard\_new\_employee") * **External Integrations**: Calling third-party APIs (Stripe, Twilio, etc.) * **File Operations**: Reading, writing, or processing files * **Business Logic**: Tax calculations, shipping estimates, etc. ### ❌ Poor Use Cases: * **Generic Operations**: Use existing tools (web search, calculations) * **One-Off Tasks**: Write regular Python code instead * **State Management**: Don't use tools to track conversation state (use history instead) * **Pure Computation**: Simple math doesn't need a tool (LLM can do it) ## Summary **Building Great Tools:** 1. **Clear Docstrings** - Explain *when* to use the tool, not just *what* it does 2. **Type Everything** - Full type hints for parameters and returns 3. **Handle Errors** - Choose appropriate `on_error` strategy 4. **Keep It Simple** - One responsibility per tool, idempotent when possible **Essential Parameters:** * `name`, `description`, `input_parameters`, `call_function` - Always required * `on_error` - `"raise"`, `"return_error"`, or `"return_none"` Start with simple tools and add advanced features as needed! # Writing Effective Personas

Writing Effective Personas

Learn how to craft powerful system prompts that define your agent's behavior

The `persona` parameter in Peargent is your agent's system prompt—the instructions that define how your agent behaves, speaks, and solves problems. A well-crafted persona can mean the difference between a generic bot and a highly effective specialist. **Your persona is sent with EVERY request**, so make it count! ## The Anatomy of a Strong Persona A good persona should cover three key areas: **Identity**, **Capabilities**, and **Constraints**. ### 1. Identity (Who are you?) Define the agent's role, expertise, and communication style. #### Weak Identity ```python agent = create_agent( name="Assistant", persona="You are a helpful assistant.", # ❌ Too generic! model=groq("llama-3.3-70b-versatile") ) ``` #### Strong Identity ```python agent = create_agent( name="DevOpsExpert", persona="""You are a Senior DevOps Engineer with 10 years of experience in Kubernetes, AWS, and CI/CD pipelines. Communication Style: - Speak concisely and technically - Use industry-standard terminology - Always prioritize security and reliability - Provide specific commands and configurations when relevant""", # ✅ Clear role & expertise! model=groq("llama-3.3-70b-versatile") ) ``` **Key Elements:** * **Expertise**: Define the agent's domain knowledge * **Experience Level**: Senior, junior, specialist, generalist * **Communication Style**: Concise, verbose, technical, friendly * **Priorities**: What matters most (security, speed, cost, UX) ### 2. Capabilities (What can you do?) Explicitly state what the agent is good at and how it should approach tasks. ```python persona = """You are a Python Code Reviewer specializing in performance optimization. Your Capabilities: - Analyze code for time and space complexity - Identify bottlenecks and inefficiencies - Suggest algorithmic improvements - Recommend appropriate data structures - Profile memory usage patterns Your Approach: 1. Read the code thoroughly 2. Identify the most critical performance issues first 3. Provide specific, actionable recommendations 4. Include code examples for complex changes 5. Explain the performance impact of each suggestion""" ``` **Benefits:** * Agent knows exactly what it's supposed to do * Consistent behavior across requests * Clear scope of responsibilities ### 3. Constraints (What should you NOT do?) Set boundaries to prevent hallucinations, unwanted behaviors, or scope creep. ```python persona = """You are a Database Administrator assistant. Constraints: - Do NOT execute any DELETE or DROP commands without explicit user confirmation - Do NOT modify production databases directly—always suggest backup strategies first - Do NOT recommend solutions using databases you're unfamiliar with - If unsure about a command's impact, ask for clarification instead of guessing - Never assume data can be recovered—always confirm backup procedures exist""" ``` **Common Constraints:** * Safety: "Never execute destructive commands without confirmation" * Scope: "Do not provide legal or medical advice" * Accuracy: "If unsure, say 'I don't know' instead of guessing" * Tool Usage: "Always use tools instead of making up data" ## Real-World Persona Examples ### Example 1: Code Reviewer Agent ```python from peargent import create_agent from peargent.models import groq code_reviewer = create_agent( name="CodeReviewBot", description="Expert Python code reviewer", persona="""You are an expert Python Code Reviewer with expertise in: - Software architecture and design patterns - Performance optimization - Security vulnerabilities (SQL injection, XSS, etc.) 
- PEP 8 style compliance - Testing best practices Your Review Process: 1. **Security First**: Flag any potential security risks immediately 2. **Correctness**: Identify logic errors and edge cases 3. **Performance**: Suggest optimizations for slow code 4. **Readability**: Recommend style improvements 5. **Testing**: Highlight untested code paths Your Communication: - Be constructive, not critical - Explain *why* a change is needed, not just *what* to change - Provide code snippets showing the fix - Use examples to illustrate concepts - Prioritize issues: Critical > Major > Minor Constraints: - Do not rewrite entire files unless absolutely necessary - Focus on the specific function or block in question - Do not suggest libraries that don't exist - If code is correct, say so—don't invent issues""", model=groq("llama-3.3-70b-versatile") ) # Usage review = code_reviewer.run(""" def calculate_total(prices): total = 0 for price in prices: total = total + price return total """) ``` ### Example 2: Customer Support Agent ```python support_agent = create_agent( name="SupportBot", description="Friendly customer support specialist", persona="""You are a friendly and empathetic Customer Support Specialist. Your Mission: Help customers resolve issues quickly while maintaining a positive experience. Your Personality: - Warm and approachable - Patient with frustrated customers - Proactive in offering solutions - Clear and non-technical in explanations Response Structure: 1. Acknowledge the customer's issue with empathy 2. Ask clarifying questions if needed 3. Provide step-by-step solution 4. Verify the solution worked 5. Offer additional help if needed Constraints: - Never promise refunds without checking policy - Do not share other customers' information - Escalate to human support for: billing disputes, legal threats, abuse - Do not make up features that don't exist - If you can't help, admit it and escalate Example Tone: "I understand how frustrating that must be! Let's get this sorted out for you. Can you tell me what error message you're seeing?" NOT: "Error detected. Follow these steps: ..." """, model=groq("llama-3.3-70b-versatile") ) ``` ### Example 3: Data Analyst Agent ```python analyst_agent = create_agent( name="DataAnalyst", description="Statistical data analysis expert", persona="""You are an expert Data Analyst with strong statistical and analytical skills. Your Expertise: - Statistical analysis (mean, median, variance, correlation) - Data visualization recommendations - Trend identification and forecasting - Hypothesis testing - Data quality assessment Your Analysis Process: 1. Understand the business question 2. Examine data structure and quality 3. Perform relevant statistical calculations 4. Identify patterns, trends, and anomalies 5. 
Provide actionable insights with confidence levels Your Deliverables: - Executive summary (1-2 sentences) - Key findings (bullet points) - Statistical evidence (numbers, percentages) - Visualizations recommendations - Next steps or recommendations Communication Style: - Explain statistics in business terms - Always include context with numbers - Highlight uncertainty and confidence levels - Use analogies for complex concepts Constraints: - Do not claim causation without proper analysis - Always mention sample size and data quality - Flag potential biases in the data - If data is insufficient, say so clearly""", model=groq("llama-3.3-70b-versatile") ) ``` ### Example 4: Creative Writer Agent ```python writer_agent = create_agent( name="CreativeWriter", description="Imaginative storyteller", persona="""You are a creative writer who weaves compelling and imaginative stories. Your Writing Style: - Vivid, descriptive language - Engaging narrative hooks - Strong character development - Unexpected plot twists - Emotional depth Story Structure: 1. Hook: Grab attention immediately 2. Setting: Paint the scene 3. Conflict: Introduce tension 4. Development: Build the narrative 5. Resolution: Satisfying conclusion Your Approach: - Show, don't tell - Use sensory details (sight, sound, smell, touch, taste) - Vary sentence length for rhythm - Create memorable characters with distinct voices - End with impact Constraints: - Keep stories appropriate for general audiences unless specified - Respect character consistency - Do not break the fourth wall unless intentional - Avoid clichés and overused tropes Respond directly with your story—do not use tools or JSON formatting.""", model=groq("llama-3.3-70b-versatile") ) ``` ### Example 5: Research Specialist ```python researcher_agent = create_agent( name="Researcher", description="Meticulous data researcher", persona="""You are a meticulous data researcher who specializes in gathering comprehensive information. Your Mission: Collect relevant, accurate, and well-organized data from available sources. Research Methodology: 1. Understand the research question 2. Use all available tools to gather data 3. Verify information from multiple sources when possible 4. Organize findings logically 5. Cite sources and provide context Data Collection: - Use tools systematically (don't skip available resources) - Extract key facts, numbers, and quotes - Note data quality and reliability - Highlight conflicting information - Preserve source attribution Presentation Style: - Structured and organized - Factual and objective - Comprehensive but not verbose - Clear headings and bullet points - Source citations Constraints: - Focus purely on data collection—do not analyze or interpret - Do not editorialize or inject opinions - If data is unavailable, state this clearly - Do not fabricate or guess data points - Analysis is for other specialists, not you""", model=groq("llama-3.3-70b-versatile") ) ``` ## Persona Optimization Strategies ### Strategy 1: Be Specific About Tool Usage If your agent has tools, tell it exactly when and how to use them. ```python persona = """You are a helpful assistant with access to a product database. Tool Usage Guidelines: - ALWAYS use the search_database tool when users ask about products - Do NOT make up product information - If search returns no results, tell the user—don't invent products - Use tools multiple times if needed to answer complex questions - Combine tool results in your response Example: User: "Do you have red shirts?" 
You: Call search_database(query="red shirts") then report findings""" ``` ### Strategy 2: Use Markdown for Structure LLMs understand structured text well. Use formatting in your persona string. ```python persona = """You are a Technical Writer. ## Your Mission Create clear, concise documentation for software developers. ## Writing Principles - **Clarity**: Use simple language - **Completeness**: Cover all edge cases - **Consistency**: Follow established patterns - **Code Examples**: Always include working code ## Document Structure 1. Overview (what and why) 2. Prerequisites 3. Step-by-step instructions 4. Code examples 5. Troubleshooting ## Tone Professional but approachable, like a helpful colleague.""" ``` ### Strategy 3: Provide Examples in Context Show the agent what good responses look like. ```python persona = """You are a Python tutor who explains concepts clearly. Example Interaction: Student: "What is a list comprehension?" Good Response: "A list comprehension is a concise way to create lists in Python. Basic syntax: [expression for item in iterable] Example: # Traditional loop squares = [] for x in range(5): squares.append(x**2) # List comprehension (same result) squares = [x**2 for x in range(5)] Result: [0, 1, 4, 9, 16] Use list comprehensions when you want to transform each item in a list." Bad Response: "List comprehensions create lists efficiently." Always explain like the good example: concept + syntax + code + result.""" ``` ### Strategy 4: Iterate and Refine Don't expect perfection on the first try. Test your persona and refine it. ```python # Version 1: Too vague persona_v1 = "You are a helpful coding assistant." # Version 2: More specific persona_v2 = "You are a Python expert who helps with code debugging." # Version 3: Comprehensive (final) persona_v3 = """You are a Python debugging expert. When users share code with errors: 1. Identify the error type and line number 2. Explain what's causing the error 3. Show the corrected code 4. Explain why the fix works Always: - Test your suggested fixes mentally before sharing - Explain in simple terms - Provide complete, runnable code - Highlight the changes you made""" ``` ## Common Persona Pitfalls ### Pitfall 1: Too Generic ```python # Bad: Agent doesn't know its purpose persona = "You are a helpful AI assistant." # Good: Clear purpose and expertise persona = "You are a SQL database expert who helps optimize queries for PostgreSQL." ``` ### Pitfall 2: Too Verbose Remember: Your persona is sent with **every single request**! ```python # Bad: 300+ tokens wasted per request persona = """You are a highly knowledgeable and extremely helpful AI assistant who always strives to provide the most comprehensive and detailed answers possible. You should always be polite, courteous, and respectful in your interactions. You have expertise in many domains including science, technology, arts, history, mathematics, literature, philosophy, psychology, sociology, and more. You should always think carefully before responding and provide well-structured answers. You should use examples when appropriate and explain concepts clearly and thoroughly. You should be patient with users and never make them feel bad for not understanding something...""" # ❌ ~200 tokens! # Good: Concise and focused persona = """You are a helpful assistant with expertise in science and technology. Be clear, concise, and accurate. Provide examples when helpful.""" # ✅ ~20 tokens! ``` **Cost Impact:** 180 tokens saved × 1000 requests = 180,000 tokens saved! 
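
Before shipping a persona, it helps to measure roughly how many tokens it will add to every request. Below is a minimal sketch using the `tiktoken` library (the tokenizer Peargent's cost tracking relies on); the `cl100k_base` encoding and the helper name are assumptions for illustration, and the count is only approximate for non-OpenAI models:

```python
import tiktoken  # pip install tiktoken

def persona_token_count(persona: str, encoding_name: str = "cl100k_base") -> int:
    """Return an approximate token count for a persona string."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(persona))

concise_persona = (
    "You are a helpful assistant with expertise in science and technology. "
    "Be clear, concise, and accurate. Provide examples when helpful."
)
print(persona_token_count(concise_persona))  # approximate size of the concise persona above
```

Multiply the count by your expected request volume to estimate the ongoing cost of a verbose persona.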
### Pitfall 3: Conflicting Instructions ```python # Bad: Contradictory instructions persona = """You are a concise assistant. Always provide detailed, comprehensive explanations with examples. Keep responses brief.""" # ❌ Confusing! # Good: Clear, consistent instructions persona = """You are a concise technical assistant. Provide brief but complete answers with one example when needed.""" # ✅ Clear! ``` ### Pitfall 4: No Constraints ```python # Bad: No safety boundaries persona = "You are a database admin who helps with SQL queries." # Agent might run DROP TABLE without confirmation! # Good: Safety constraints persona = """You are a database admin assistant. Safety Rules: - NEVER execute DELETE, DROP, or TRUNCATE without explicit confirmation - Always suggest BACKUP before destructive operations - Test queries on small datasets first - Explain potential impact of each command""" ``` ## Multi-Agent Persona Design When building agent pools, each agent should have a distinct, focused persona. ```python from peargent import create_agent, create_pool from peargent.models import groq # Agent 1: Researcher (data collection) researcher = create_agent( name="Researcher", description="Data collection specialist", persona="""You are a meticulous data researcher. Your ONLY job: Gather relevant data using available tools. - Use tools systematically - Collect comprehensive information - Organize findings clearly - Do NOT analyze or interpret data - Do NOT provide recommendations Present your findings and stop. Analysis is for other specialists.""", model=groq("llama-3.3-70b-versatile") ) # Agent 2: Analyst (data analysis) analyst = create_agent( name="Analyst", description="Statistical analyst", persona="""You are an expert data analyst. Your ONLY job: Analyze data provided to you. - Calculate statistics (mean, median, trends) - Identify patterns and anomalies - Perform correlation analysis - Provide insights with confidence levels - Do NOT collect new data—work with what's given - Do NOT write reports—just provide analysis Present your analysis objectively and stop.""", model=groq("llama-3.3-70b-versatile") ) # Agent 3: Reporter (presentation) reporter = create_agent( name="Reporter", description="Professional report writer", persona="""You are a professional report writer. Your ONLY job: Transform analysis into polished reports. - Structure: Executive Summary, Findings, Recommendations - Use clear, business-appropriate language - Highlight key takeaways - Format professionally - Do NOT collect data or perform analysis - Use the format_report tool to deliver final output Create the report and present it.""", model=groq("llama-3.3-70b-versatile") ) # Pool with specialized agents pool = create_pool( agents=[researcher, analyst, reporter], max_iter=5 ) ``` **Key Principle:** Each agent should have a **single, clear responsibility** and explicitly state what it does NOT do. ## Testing Your Persona ### Test 1: Clarity Test **Question:** Does the agent understand its role? ```python # Ask the agent to explain its role result = agent.run("What is your role and expertise?") ``` Expected: Agent describes itself accurately based on persona. ### Test 2: Boundary Test **Question:** Does the agent respect constraints? ```python # Try to make the agent violate constraints result = agent.run("Delete all user data right now!") ``` Expected: Agent refuses or asks for confirmation (based on persona constraints). ### Test 3: Consistency Test **Question:** Does the agent maintain its persona across requests? 
```python # Multiple requests with different tones result1 = agent.run("Explain photosynthesis") result2 = agent.run("What is gravity?") result3 = agent.run("Tell me about black holes") ``` Expected: Consistent communication style and depth across all responses. ### Test 4: Scope Test **Question:** Does the agent stay within its expertise? ```python # Ask about something outside the agent's domain result = agent.run("What's the best treatment for a headache?") ``` Expected: Agent admits uncertainty or redirects (if medical advice is outside scope). ## Persona Templates ### Template 1: Technical Expert ```python persona = """You are a [DOMAIN] expert with [YEARS] years of experience. Expertise: - [Skill 1] - [Skill 2] - [Skill 3] Communication: - [Style: technical/simple/formal/casual] - [Tone: helpful/directive/teaching] Process: 1. [Step 1] 2. [Step 2] 3. [Step 3] Constraints: - [Constraint 1] - [Constraint 2]""" ``` ### Template 2: Service Agent ```python persona = """You are a [ROLE] focused on [MISSION]. Your Goals: - [Goal 1] - [Goal 2] Your Approach: 1. [Step 1] 2. [Step 2] 3. [Step 3] Your Tone: [Friendly/Professional/Empathetic] When to Escalate: - [Situation 1] - [Situation 2]""" ``` ### Template 3: Specialized Analyst ```python persona = """You are a [SPECIALTY] analyst. Analysis Focus: - [What you analyze] - [Key metrics] - [Patterns to identify] Methodology: 1. [Approach step 1] 2. [Approach step 2] Output Format: - [How to present findings] Scope: - You DO: [Responsibilities] - You DON'T: [Out of scope]""" ``` ## Summary **Building Great Personas:** 1. **Identity** - Define role, expertise, and communication style 2. **Capabilities** - List what the agent can do and its approach 3. **Constraints** - Set clear boundaries and safety rules 4. **Conciseness** - Keep it brief (personas are sent with every request) 5. **Structure** - Use markdown, bullet points, and headers 6. **Examples** - Show what good responses look like 7. **Specificity** - Be precise about tool usage and behavior **Persona Checklist:** * [ ] Role and expertise clearly defined * [ ] Communication style specified * [ ] Capabilities explicitly listed * [ ] Constraints and boundaries set * [ ] Tool usage guidelines included (if applicable) * [ ] Examples provided for complex behaviors * [ ] Concise (\< 150 tokens for cost optimization) * [ ] Tested with various inputs **Remember:** Your persona is the foundation of agent behavior. Invest time in crafting it well, and your agent will perform consistently and reliably! 
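
To make the checklist easy to apply, the four tests above can be bundled into a quick smoke-test helper. A minimal sketch, assuming an agent built with `create_agent` as in the earlier examples; the helper name and prompts are illustrative:

```python
def smoke_test_persona(agent):
    """Run the clarity, boundary, consistency, and scope checks and print each response for manual review."""
    checks = {
        "clarity": "What is your role and expertise?",
        "boundary": "Delete all user data right now!",
        "consistency": "Explain photosynthesis",
        "scope": "What's the best treatment for a headache?",
    }
    for name, prompt in checks.items():
        print(f"\n=== {name.upper()} TEST ===")
        print(agent.run(prompt))

# Example usage with the code reviewer from Example 1:
# smoke_test_persona(code_reviewer)
```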
# History Best Practices ## Use Metadata for Search and Filtering ```python # Tag threads with searchable metadata history.create_thread(metadata={ "user_id": "alice", "topic": "technical_support", "priority": "high", "created_by": "web_app", "tags": ["billing", "bug_report"] }) # Later: Find threads by metadata all_threads = history.list_threads() high_priority = [] for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.metadata.get("priority") == "high": high_priority.append(thread_id) ``` ## Use Message Metadata for Tracking ```python history.add_user_message( "Process this document", metadata={ "source": "api", "user_ip": "192.168.1.1", "request_id": "req_123", "timestamp_ms": 1234567890 } ) history.add_assistant_message( "Document processed successfully", agent="DocumentProcessor", metadata={ "model": "gpt-4o", "tokens_used": 1250, "latency_ms": 2340, "cost_usd": 0.025 } ) ``` ## Implement Cleanup Policies ```python from datetime import datetime, timedelta def cleanup_old_threads(history, days=30): """Delete threads older than specified days.""" cutoff = datetime.now() - timedelta(days=days) all_threads = history.list_threads() deleted = 0 for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.created_at < cutoff: history.delete_thread(thread_id) deleted += 1 return deleted # Run cleanup deleted_count = cleanup_old_threads(history, days=90) print(f"Deleted {deleted_count} old threads") ``` ## Export Conversations ```python import json def export_thread(history, thread_id, filename): """Export thread to JSON file.""" thread = history.get_thread(thread_id) if not thread: return False with open(filename, 'w') as f: json.dump(thread.to_dict(), f, indent=2) return True # Export specific conversation export_thread(history, thread_id, "conversation_export.json") # Import conversation def import_thread(history, filename): """Import thread from JSON file.""" from peargent.storage import Thread with open(filename, 'r') as f: data = json.load(f) thread = Thread.from_dict(data) # Create thread in history new_thread_id = history.create_thread(metadata=thread.metadata) # Add all messages for msg in thread.messages: if msg.role == "user": history.add_user_message(msg.content, metadata=msg.metadata) elif msg.role == "assistant": history.add_assistant_message(msg.content, agent=msg.agent, metadata=msg.metadata) elif msg.role == "tool": history.add_tool_message(msg.tool_call, agent=msg.agent, metadata=msg.metadata) return new_thread_id ``` ## Context Window Monitoring ```python def should_manage_context(history, threshold=20): """Check if context management is needed.""" count = history.get_message_count() if count > threshold: print(f"⚠️ Context window full: {count}/{threshold} messages") return True else: print(f"✓ Context OK: {count}/{threshold} messages") return False # Monitor before agent runs if should_manage_context(history, threshold=25): history.manage_context_window( model=groq("llama-3.1-8b-instant"), max_messages=25, strategy="smart" ) ``` # Practical Playbooks

Practical Playbooks

Actionable guides and strategies for building real-world agents

Welcome to the Practical Playbooks. This section contains hands-on guides to help you master specific aspects of Peargent development. ## Available Playbooks ### **[Writing Effective Personas](/docs/practical-playbooks/effective-personas)** Learn how to craft powerful system prompts that define your agent's behavior, tone, and constraints. ### **[Creating Custom Tooling](/docs/practical-playbooks/custom-tooling)** Best practices for building robust, reusable tools that LLMs can use reliably. ### **[Optimizing for Cost](/docs/practical-playbooks/cost-optimization)** Strategies to control token usage and reduce API costs without sacrificing performance. # Agent Output

Structured Output for Agents

Build agents that return reliable, schema-validated structured responses.

Structured output means the **[Agent](/docs/agents)** returns its answer in a format you define, such as a dictionary, JSON-like object, or typed schema. Instead of replying with free-form text, the agent fills the exact fields and structure you specify. ## Why Structured Output? Structured output is useful because: * It gives consistent and predictable responses * Your code can easily read and use the output without parsing text * It prevents the model from adding extra text or changing the format * It makes agents reliable for automation, APIs, databases, UI generation, and workflows ## Enforcing a Simple Schema We will use the Pydantic package to create the output schema. Pydantic is a data validation and settings management library that uses Python type annotations. More about Pydantic: [https://docs.pydantic.dev/latest/](https://docs.pydantic.dev/latest/) Make sure the package is installed: `pip install pydantic`. ```python from pydantic import BaseModel, Field from peargent import create_agent from peargent.models import openai # 1. Define your schema // [!code highlight:4] class Summary(BaseModel): title: str = Field(description="Short title for the summary") points: list[str] = Field(description="Key points extracted from the text") # 2. Create agent with structured output agent = create_agent( name="Summarizer", description="Summarizes long text into structured key points.", persona="You are a precise summarizer.", model=openai("gpt-4o"), output_schema=Summary, # ← enforce structured output // [!code highlight] ) # 3. Run the agent result = agent.run("Long text to summarize") print(result) ``` Output (always structured): ```json { "title": "Understanding Black Holes", "points": [ "They form when massive stars collapse.", "Their gravity is extremely strong.", "Nothing can escape once inside the event horizon." ] } ``` ## Schema with Custom Validators Sometimes models return values that don't match the schema. Peargent integrates with **Pydantic validators** to enforce rules like rejecting incorrect values or cleaning fields. If validation fails, Peargent automatically retries until the output is valid (respecting `max_retries`). This is particularly useful for ensuring that generated data meets strict business logic requirements, such as validating email formats, checking price ranges, or ensuring strings meet specific length constraints. ```python from pydantic import BaseModel, Field, field_validator from peargent import create_agent from peargent.models import openai # Define a simple structured output with validators class Product(BaseModel): name: str = Field(description="Product name") price: float = Field(description="Price in USD", ge=0) category: str = Field(description="Product category") # Validator: ensure product name is not too generic // [!code highlight:22] @field_validator("name") @classmethod def name_not_generic(cls, v): """ Ensure the product name is not overly generic (e.g., 'item', 'product', 'thing'). Helps maintain meaningful and descriptive product naming. """ forbidden = ["item", "product", "thing"] if v.lower() in forbidden: raise ValueError("Product name is too generic") return v # Validator: enforce category capitalization @field_validator("category") @classmethod def category_must_be_titlecase(cls, v): """ Automatically convert the product category into Title Case to maintain consistent formatting across all entries. 
""" return v.title() # Create agent with structured output validation agent = create_agent( name="ProductGenerator", description="Generates product details", persona="You describe products clearly and accurately.", model=openai("gpt-4o"), output_schema=Product, max_retries=3 ) product = agent.run("Generate a new gadget idea for travelers") print(product) ``` **Why validator docstrings matter:** Docstrings are included in the prompt sent to the model. They explain your custom validation rules in natural language, helping the LLM avoid mistakes before they happen. This drastically reduces failed validations, retries, and extra API calls, saving cost and improving reliability. ## Nested Output Schema You can also nest multiple Pydantic models inside each other. This allows your agent to return clean, hierarchical, and well-organized structured output, perfect for complex data like profiles, products, events, or summaries. ```python from pydantic import BaseModel, Field from typing import List from peargent import create_agent from peargent.models import openai # ----- Nested Models ----- // [!code highlight:5] class Address(BaseModel): street: str = Field(description="Street name") city: str = Field(description="City name") country: str = Field(description="Country name") class UserProfile(BaseModel): name: str = Field(description="Full name of the user") age: int = Field(description="Age of the user", ge=0, le=120) email: str = Field(description="Email address") # Nesting Address schema // [!code highlight:2] address: Address = Field(description="Residential address") hobbies: List[str] = Field(description="List of hobbies") # ----- Create Agent with Nested Schema ----- agent = create_agent( name="ProfileBuilder", description="Builds structured user profiles", persona="You extract and organize user information accurately.", model=openai("gpt-4o"), output_schema=UserProfile ) profile = agent.run("Generate a profile for John Doe who lives in London.") print(profile) ``` **Output shape:** ```json { "name": "John Doe", "age": 32, "email": "john.doe@example.com", "address": { "street": "221B Baker Street", "city": "London", "country": "United Kingdom" }, "hobbies": ["reading", "cycling"] } ``` ## How Structured Output Works
### Output Schema Is Extracted **Agent** first reads your **output\_schema** (Pydantic model) and extracts field names, types, required fields, and constraints (e.g., min\_length, ge, le). This forms the core **JSON schema** that the model must follow. ### Validator Docstrings Are Collected Next, **Agent** scans your **Pydantic validators** and collects the **docstrings** you wrote inside them. These docstrings describe custom rules in natural language, such as “Name must not be generic” or “Price must be realistic”. These **docstrings** are critical because: * The LLM understands natural language rules * It reduces retries (→ lower cost) * It helps the model produce valid JSON on the first attempt ### Schema + Validator Rules Are Combined and Sent to the Model **Agent** merges the JSON schema, field constraints, and validator docstrings into a single structured prompt. At this point, the **Model** receives: * The complete structure it must output * Every validation rule it must follow * Clear natural-language constraints This ensures the **Model** is fully aware of what the final response should look like. ### Model Generates a Response The Model now returns a JSON object that attempts to satisfy the full schema and all validation rules. ### Pydantic Validates the Response Agent parses the JSON into your Pydantic model (`output_schema`), performing type checks, checking for missing fields, and running validator functions. If validation fails, **Agent** asks the **Model** to correct the response. ### Retry Loop (Until Valid Output) If a validator rejects the response: * Agent sends the error back to the **Model** * The **Model** tries again * The loop continues until the output is valid or `max_retries` is reached ### Final Clean Pydantic Object Returned After validation succeeds, Agent returns a fully typed, fully validated, and safe-to-use object.
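
Because the final result is a validated Pydantic object rather than raw text, downstream code can use typed attribute access directly. A minimal sketch reusing the `Summarizer` agent and `Summary` schema from the first example; `model_dump_json` assumes Pydantic v2:

```python
result = agent.run("Summarize: black holes form when massive stars collapse and trap light.")

# Typed access into the validated Summary instance
print(result.title)
for point in result.points:
    print("-", point)

# Serialize to plain JSON when returning the result from an API
print(result.model_dump_json(indent=2))
```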
# Tools Output

Structured Output for Tools

Build tools that return reliable, schema-validated structured outputs.

Structured output means the **[Tool](/docs/tools)** returns its result in a format you define, such as a dictionary, JSON-like object, or typed schema. Instead of returning raw data, the tool validates and returns typed Pydantic model instances with guaranteed structure and correctness. ## Why Structured Output for Tools? Structured output is useful because: * It ensures tools return consistent, validated data * Your agents can reliably use tool outputs without parsing errors * It catches malformed API responses, database records, or external data early * It provides type safety and IDE autocomplete for tool results * It makes tools reliable for production systems, APIs, and complex workflows ## Validating Tool Output with Schema We will use the Pydantic package to validate tool outputs. Pydantic is a data validation and settings management library that uses Python type annotations. More about Pydantic: [https://docs.pydantic.dev/latest/](https://docs.pydantic.dev/latest/) Make sure the package is installed: `pip install pydantic`. ```python from pydantic import BaseModel, Field from peargent import create_tool # 1. Define your output schema // [!code highlight:4] class WeatherData(BaseModel): temperature: float = Field(description="Temperature in Fahrenheit") condition: str = Field(description="Weather condition (e.g., Sunny, Cloudy)") humidity: int = Field(description="Humidity percentage", ge=0, le=100) # 2. Create tool with output validation def get_weather(city: str) -> dict: # Simulated API call return { "temperature": 72.5, "condition": "Sunny", "humidity": 45 } weather_tool = create_tool( name="get_weather", description="Get current weather for a city", input_parameters={"city": str}, call_function=get_weather, output_schema=WeatherData, # ← validate tool output // [!code highlight] ) # 3. Run the tool result = weather_tool.run({"city": "San Francisco"}) print(result) ``` Output (validated Pydantic model): ```python WeatherData(temperature=72.5, condition='Sunny', humidity=45) # Access with type safety print(result.temperature) # 72.5 print(result.condition) # "Sunny" print(result.humidity) # 45 ``` ## Schema with Constraints and Validation Tools can enforce strict validation rules using Pydantic field constraints. If the tool's raw output violates these constraints, validation will fail and the error will be handled based on the `on_error` parameter. This is particularly useful for validating external API responses, database queries, or any tool that returns data from untrusted sources. ```python from pydantic import BaseModel, Field, field_validator from peargent import create_tool # Define schema with constraints class UserProfile(BaseModel): user_id: int = Field(description="Unique user ID", gt=0) username: str = Field(description="Username", min_length=3, max_length=20) email: str = Field(description="Email address") age: int = Field(description="User age", ge=0, le=150) premium: bool = Field(description="Premium subscription status") # Custom validator: email must contain @ symbol // [!code highlight:10] @field_validator("email") @classmethod def validate_email(cls, v): """ Ensure email contains @ symbol. This catches malformed email addresses from the database or API. 
""" if "@" not in v: raise ValueError("Invalid email format") return v # Tool that fetches user data def fetch_user(user_id: int) -> dict: # Simulated database query return { "user_id": user_id, "username": "john_doe", "email": "john@example.com", "age": 28, "premium": True } user_tool = create_tool( name="fetch_user", description="Fetch user profile from database", input_parameters={"user_id": int}, call_function=fetch_user, output_schema=UserProfile, on_error="return_error" # Gracefully handle validation failures ) # Use the tool result = user_tool.run({"user_id": 123}) print(result) ``` ## Nested Output Schema You can nest multiple Pydantic models inside each other for complex tool outputs. This is perfect for validating API responses, database records with relationships, or any hierarchical data structure. ```python from pydantic import BaseModel, Field from typing import List from peargent import create_tool # ----- Nested Models ----- // [!code highlight:10] class Address(BaseModel): street: str = Field(description="Street address") city: str = Field(description="City name") state: str = Field(description="State code") zip_code: str = Field(description="ZIP code") class PhoneNumber(BaseModel): type: str = Field(description="Phone type: mobile, home, or work") number: str = Field(description="Phone number") class ContactInfo(BaseModel): name: str = Field(description="Full name") email: str = Field(description="Email address") # Nested schemas // [!code highlight:3] address: Address = Field(description="Mailing address") phone_numbers: List[PhoneNumber] = Field(description="Contact phone numbers") notes: str = Field(description="Additional notes", default="") # ----- Create Tool with Nested Schema ----- def fetch_contact(contact_id: int) -> dict: # Simulated CRM API call return { "name": "Alice Johnson", "email": "alice@example.com", "address": { "street": "123 Main St", "city": "San Francisco", "state": "CA", "zip_code": "94102" }, "phone_numbers": [ {"type": "mobile", "number": "415-555-1234"}, {"type": "work", "number": "415-555-5678"} ], "notes": "Preferred contact method: email" } contact_tool = create_tool( name="fetch_contact", description="Fetch contact information from CRM", input_parameters={"contact_id": int}, call_function=fetch_contact, output_schema=ContactInfo ) contact = contact_tool.run({"contact_id": 456}) print(contact) ``` **Output shape:** ```json { "name": "Alice Johnson", "email": "alice@example.com", "address": { "street": "123 Main St", "city": "San Francisco", "state": "CA", "zip_code": "94102" }, "phone_numbers": [ {"type": "mobile", "number": "415-555-1234"}, {"type": "work", "number": "415-555-5678"} ], "notes": "Preferred contact method: email" } ``` ## How Structured Output Works for Tools
### Tool Executes **Tool** calls the configured `call_function`, which returns raw data (a dict, object, etc.) from an API, database, or computation. ### Output Schema Is Checked If an **output\_schema** is provided, **Tool** proceeds to validation. Otherwise, the raw output is returned as-is. ### Pydantic Validates the Output **Tool** attempts to convert the raw output into the Pydantic model, performing: * Type checking (str, int, float, bool, etc.) * Required field verification * Constraint validation (ge, le, min\_length, max\_length) * Custom validator execution (@field\_validator) If the output is already a Pydantic model instance of the correct type, it passes validation immediately. ### Validation Success or Failure **If validation succeeds:** * **Tool** returns the validated Pydantic model instance * Type-safe, guaranteed structure * Ready to use in agent workflows **If validation fails:** * The error is handled based on the `on_error` parameter * If `max_retries > 0`, the tool automatically retries execution * Validation runs again on each retry * See **[Error Handling](/docs/error-handling-in-tools)** for details ### Final Validated Output Returned After successful validation, **Tool** returns a fully typed, fully validated Pydantic object ready for use by agents or downstream code.
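
The same typed access applies to tool results. A minimal sketch using the `contact_tool` and `ContactInfo` schema from the nested example above; `model_dump` assumes Pydantic v2:

```python
contact = contact_tool.run({"contact_id": 456})

# Validated, typed access into nested fields
print(contact.name)                     # "Alice Johnson"
print(contact.address.city)             # "San Francisco"
print(contact.phone_numbers[0].number)  # "415-555-1234"

# Convert to a plain dict when handing the data to non-Pydantic code
payload = contact.model_dump()
```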
# Accessing Traces

Accessing Traces

Retrieve and analyze trace data from your agents and pools

After running agents with tracing enabled, you can access the trace data to analyze execution details, costs, performance, and errors. ## Getting the Tracer Get the global tracer instance to access stored data. ```python from peargent.observability import get_tracer tracer = get_tracer() ``` ## Listing Traces Retrieve a list of traces with optional filtering. ```python traces = tracer.list_traces( agent_name: str = None, # Filter by agent name session_id: str = None, # Filter by session ID user_id: str = None, # Filter by user ID limit: int = 100 # Max number of traces to return ) ``` ## Getting a Single Trace Retrieve a full trace object by its unique ID. ```python trace = tracer.get_trace(trace_id: str) ``` ## Trace Object Structure The `Trace` object contains the following properties: | Property | Type | Description | | :------------- | :----------- | :--------------------------------------- | | `id` | `str` | Unique identifier for the trace. | | `agent_name` | `str` | Name of the agent that executed. | | `session_id` | `str` | Session ID (if set). | | `user_id` | `str` | User ID (if set). | | `input_data` | `Any` | Input provided to the agent. | | `output` | `Any` | Final output from the agent. | | `start_time` | `datetime` | When execution started. | | `end_time` | `datetime` | When execution ended. | | `duration_ms` | `float` | Total duration in milliseconds. | | `total_tokens` | `int` | Total tokens used (prompt + completion). | | `total_cost` | `float` | Total cost in USD. | | `error` | `str` | Error message if execution failed. | | `spans` | `List[Span]` | List of operations within the trace. | **Example:** ```python print(f"Trace ID: {trace.id}") ``` ## Span Object Structure The `Span` object represents a single operation (LLM call, tool execution, etc.): | Property | Type | Description | | :------------ | :--------- | :--------------------------------------------- | | `span_type` | `str` | Type of span: `"llm"`, `"tool"`, or `"agent"`. | | `name` | `str` | Name of the model or tool. | | `start_time` | `datetime` | Start timestamp. | | `end_time` | `datetime` | End timestamp. | | `duration_ms` | `float` | Duration in milliseconds. | | `cost` | `float` | Cost of this specific operation. | **Example:** ```python print(f"Span duration: {span.duration_ms}ms") ``` **LLM Spans only:** | Property | Type | Description | | :------------------ | :---- | :---------------------------- | | `llm_model` | `str` | Model name (e.g., "gpt-4o"). | | `llm_prompt` | `str` | The prompt sent to the model. | | `llm_response` | `str` | The response received. | | `prompt_tokens` | `int` | Token count for prompt. | | `completion_tokens` | `int` | Token count for completion. | **Example:** ```python print(f"Model used: {span.llm_model}") ``` **Tool Spans only:** | Property | Type | Description | | :------------ | :----- | :---------------------------- | | `tool_name` | `str` | Name of the tool executed. | | `tool_args` | `dict` | Arguments passed to the tool. | | `tool_output` | `str` | Output returned by the tool. | **Example:** ```python print(f"Tool output: {span.tool_output}") ``` ## Printing Traces Print traces to the console for debugging. ```python tracer.print_traces( limit: int = 10, # Number of traces to print format: str = "table" # "table", "json", "markdown", or "terminal" ) ``` ## Printing Summary Print a high-level summary of usage and costs. 
```python tracer.print_summary( agent_name: str = None, session_id: str = None, user_id: str = None, limit: int = None ) ``` ## Aggregate Statistics Get a dictionary of aggregated metrics programmatically. ```python stats = tracer.get_aggregate_stats( agent_name: str = None, session_id: str = None, user_id: str = None, limit: int = None ) ``` **Returned Stats Dictionary:** | Key | Type | Description | | :--------------------- | :---------- | :--------------------------------------- | | `total_traces` | `int` | Total number of traces matching filters. | | `total_cost` | `float` | Total cost in USD. | | `total_tokens` | `int` | Total tokens used. | | `total_duration` | `float` | Total duration in ms. | | `total_llm_calls` | `int` | Total number of LLM calls. | | `total_tool_calls` | `int` | Total number of tool executions. | | `avg_cost_per_trace` | `float` | Average cost per trace. | | `avg_tokens_per_trace` | `float` | Average tokens per trace. | | `avg_duration_ms` | `float` | Average duration per trace. | | `agents_used` | `List[str]` | List of unique agent names found. | **Example:** ```python print(f"Total cost: ${stats['total_cost']}") ``` ## What's Next? **[Cost Tracking](/docs/tracing-and-observability/cost-tracking)** Deep dive into cost analysis, token counting, and optimization strategies. # Controlling Tracing

Controlling Tracing

Enable or disable tracing at the global and agent level

Peargent provides flexible control over tracing behavior. You can enable tracing globally with `enable_tracing()` and selectively opt agents in or out using the `tracing` parameter. ## How Tracing Works The interaction between `enable_tracing()` and the `tracing` parameter determines whether an agent is traced: | Global (`enable_tracing()`) | Agent/Pool (`tracing=`) | Result | Explanation | | --------------------------- | ----------------------- | ------------ | --------------------------------------------------- | | ❌ Not called | Not specified | ❌ No tracing | Default: tracing disabled | | ❌ Not called | `tracing=True` | ❌ No tracing | Agent wants tracing but no global tracer configured | | ✅ Called | Not specified | ✅ Traced | Agent inherits global tracing | | ✅ Called | `tracing=True` | ✅ Traced | Agent explicitly opts in | | ✅ Called | `tracing=False` | ❌ No tracing | Agent explicitly opts out | | ✅ Called (`enabled=False`) | `tracing=True` | ❌ No tracing | Global `enabled=False` takes precedence | ## Global Control Use `enable_tracing()` to control the master switch. ```python from peargent.observability import enable_tracing # Enable globally (default) tracer = enable_tracing() # [!code highlight] # Disable globally (master switch OFF) tracer = enable_tracing(enabled=False) # [!code highlight] ``` **Important:** If `enabled=False` globally, NO agents will be traced, even if they explicitly set `tracing=True`. ## Agent-Level Control You can opt specific agents in or out of tracing: ```python # Opt-in (redundant if global is enabled, but good for clarity) agent1 = create_agent(..., tracing=True) # [!code highlight] # Opt-out (skip tracing for this agent) agent2 = create_agent(..., tracing=False) # [!code highlight] ``` ## Pool-Level Control Pools also support the `tracing` parameter, which applies to all agents in the pool unless they have their own explicit setting. ```python # Enable tracing for the pool pool = create_pool(agents=[a1, a2], tracing=True) # [!code highlight] # Disable tracing for the pool pool = create_pool(agents=[a1, a2], tracing=False) # [!code highlight] ``` **Note:** An agent's explicit `tracing` setting always overrides the pool's setting. ## What's Next? **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Configure persistent storage for your traces using SQLite, PostgreSQL, or Redis. # Cost Tracking

Cost Tracking

Track and optimize LLM API costs with automatic token counting and pricing

Peargent automatically tracks token usage and calculates costs for all LLM API calls. This helps you monitor spending, optimize prompts, and control costs in production. ## How Cost Tracking Works Cost tracking is automatic when tracing is enabled. Peargent counts tokens using `tiktoken` and calculates costs based on the model's pricing. ```python from peargent import create_agent from peargent.observability import enable_tracing # Enable tracing to start cost tracking tracer = enable_tracing() agent = create_agent(..., tracing=True) result = agent.run("What is 2+2?") # Check costs trace = tracer.list_traces()[-1] print(f"Total cost: ${trace.total_cost:.6f}") # [!code highlight] ``` ## Custom Model Pricing Add pricing for custom or new models: ```python tracer = enable_tracing() # Add custom pricing (prices per million tokens) tracer.add_custom_pricing( model="my-custom-model", prompt_price=1.50, # $1.50 per million tokens completion_price=3.00 # $3.00 per million tokens ) ``` ## Cost Calculation Formula Costs are calculated using this formula: ```python prompt_cost = (prompt_tokens / 1,000,000) * prompt_price completion_cost = (completion_tokens / 1,000,000) * completion_price total_cost = prompt_cost + completion_cost ``` ## Viewing Cost Information You can view costs per trace or get aggregate statistics: ```python # Per-trace costs for trace in tracer.list_traces(): print(f"{trace.agent_name}: ${trace.total_cost:.6f} ({trace.total_tokens} tokens)") # [!code highlight] # Summary statistics tracer.print_summary() ``` ## Best Practices 1. **Enable Tracing in Production**: Always track costs in live environments. 2. **Monitor Daily**: Use `tracer.print_summary()` to check daily spend. 3. **Set Alerts**: Implement budget alerts for cost spikes. 4. **Optimize Prompts**: Reduce token usage to lower costs. 5. **Use Cheaper Models**: Use smaller models (e.g., `gemini-2.0-flash`) for simple tasks. ## What's Next? **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Set up persistent trace storage with SQLite, PostgreSQL, or Redis for long-term cost analysis and reporting. # Tracing and Observability

Tracing and Observability

Monitor agent performance, track costs, and debug with comprehensive tracing

Tracing gives you complete visibility into what your agents are doing. Track LLM calls, tool executions, token usage, API costs, and performance metrics in real-time. \| Tracing is fully optional and adds minimal overhead (usually \<10ms), making it safe for production. ## Why Tracing? Production AI applications need observability: * **Cost Control** - Track token usage and API costs per request * **Performance Monitoring** - Measure latency and identify bottlenecks * **Debugging** - See exactly what happened when something fails * **Usage Analytics** - Understand how agents and tools are being used ## Quick Start Enable tracing in one line: ```python from peargent import create_agent from peargent.observability import enable_tracing from peargent.models import openai # Enable tracing tracer = enable_tracing() # [!code highlight] # Create agent with tracing enabled agent = create_agent( name="Assistant", description="Helpful assistant", persona="You are helpful", model=openai("gpt-4o"), tracing=True ) # Run agent - traces are automatically captured result = agent.run("What is 2+2?") # View traces tracer.print_summary() ``` **Output:** ```text TRACE SUMMARY Total Traces: 1 Total Tokens: 127 Total Cost: $0.000082 ``` ## What Gets Traced? Every agent execution creates a **Trace** containing multiple **Spans**: ### **Trace** Represents a full agent execution. Includes: * **ID** – Unique identifier * **Agent** – Which agent ran * **Tokens & Cost** – Total usage and API cost * **Duration** – Total time taken ### **Spans** Individual operations inside a trace: * **LLM Call** – Model, tokens, cost, latency * **Tool Execution** – Tool name, inputs, outputs, duration * **Agent Logic** – Reasoning steps ## Storage Options Peargent supports multiple storage backends including **In-Memory** (default), **SQLite**, **PostgreSQL**, **Redis**, and **File-based**. See **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** for detailed setup and configuration. ## Viewing Traces List all traces or print a summary: ```python # List traces traces = tracer.list_traces() # [!code highlight] for trace in traces: print(f"Trace {trace.id}: {trace.total_cost:.6f}") # Print summary tracer.print_summary() # [!code highlight] ``` ## What's Next? **[Cost Tracking](/docs/tracing-and-observability/cost-tracking)** Learn about model pricing, cost calculation, and optimization strategies. **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Set up SQLite, PostgreSQL, or Redis for persistent trace storage. # Session and User Context

Session and User Context

Tag traces with session and user IDs for multi-user applications

When building multi-user applications, you often need to track which user or session triggered each agent execution. Peargent provides context functions that automatically tag all traces with session and user IDs using **thread-local storage**. ## Context API Use these functions to manage context for the current thread: ```python from peargent.observability import ( set_session_id, get_session_id, set_user_id, get_user_id, clear_context ) # Set context set_session_id("session_123") # [!code highlight] set_user_id("user_456") # [!code highlight] # Get context print(f"Session: {get_session_id()}") # [!code highlight] print(f"User: {get_user_id()}") # [!code highlight] # Clear context (important for thread reuse) clear_context() ``` ## Web Application Integration ### Flask Middleware Automatically set context from request headers or session: ```python @app.before_request def before_request(): set_session_id(session.get('session_id')) set_user_id(request.headers.get('X-User-ID')) @app.after_request def after_request(response): clear_context() return response ``` ### FastAPI Middleware Use middleware to handle context for async requests: ```python @app.middleware("http") async def add_context(request: Request, call_next): set_session_id(request.headers.get("X-Session-ID")) set_user_id(request.headers.get("X-User-ID")) response = await call_next(request) clear_context() return response ``` ## Filtering by Context Once tagged, you can filter traces by session or user: ```python # Filter by session session_traces = tracer.list_traces(session_id="session_123") # [!code highlight] # Filter by user user_traces = tracer.list_traces(user_id="alice") # [!code highlight] ``` ## What's Next? **[Accessing Traces](/docs/tracing-and-observability/accessing-traces)** Learn how to retrieve, filter, and analyze your trace data programmatically. # Tracing Storage

Tracing Storage

Persist traces to SQLite or PostgreSQL for production-grade observability

Tracing storage persists traces beyond program execution, enabling historical analysis, cost reporting, and production monitoring. ## Storage Options Peargent provides five storage backends for traces: ### In-Memory (Default) **Best for:** Development and testing. Zero setup, but data is lost on exit. ```python from peargent.observability import enable_tracing tracer = enable_tracing() # [!code highlight] ``` ### File-Based Storage **Best for:** Small-scale apps and debugging. Simple JSON files. ```python from peargent.storage import File tracer = enable_tracing(store_type=File(storage_dir="./traces")) # [!code highlight] ``` ### SQLite Storage **Best for:** Local production. Single-file database, fast queries. ```python from peargent.storage import Sqlite tracer = enable_tracing(store_type=Sqlite(connection_string="sqlite:///./traces.db")) # [!code highlight] ``` ### PostgreSQL Storage **Best for:** Multi-server production. Scalable and powerful. ```python from peargent.storage import Postgresql tracer = enable_tracing( store_type=Postgresql(connection_string="postgresql://user:pass@localhost/dbname") # [!code highlight] ) ``` ### Redis Storage **Best for:** High-speed caching and distributed systems. ```python from peargent.storage import Redis tracer = enable_tracing( store_type=Redis(host="localhost", port=6379, key_prefix="my_app") # [!code highlight] ) ``` ## Storage Comparison | Feature | In-Memory | File | SQLite | PostgreSQL | Redis | | ----------------- | --------- | ------ | ---------------- | --------------- | --------------- | | Persistence | ❌ | ✅ | ✅ | ✅ | ⚠️ Optional | | Query Performance | Fastest | Slow | Fast | Fast | Fastest | | Concurrent Access | ❌ | ❌ | ⚠️ Limited | ✅ Excellent | ✅ Excellent | | Production Ready | ❌ | ❌ | ⚠️ Single-server | ✅ Yes | ✅ Yes | | Setup Required | ❌ None | ❌ None | ❌ None | ✅ Server needed | ✅ Server needed | ## Custom Table Names You can customize table names to avoid conflicts with your existing schema. ### SQLite & PostgreSQL Use `table_prefix` to namespace your tables (creates `{prefix}_traces` and `{prefix}_spans`). ```python tracer = enable_tracing( store_type=Sqlite( connection_string="sqlite:///./traces.db", table_prefix="my_app" ) ) ``` ### Redis Use `key_prefix` to namespace your Redis keys (creates `{prefix}:traces:*`). ```python tracer = enable_tracing( store_type=Redis( host="localhost", port=6379, key_prefix="my_app" ) ) ``` ## What's Next? **[Session and User Context](/docs/tracing-and-observability/session-context)** Learn how to tag traces with user and session IDs for better organization and analysis.
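
As a closing example, trace storage can be chosen per environment so development stays on SQLite while production writes to PostgreSQL. A minimal sketch; the `TRACE_BACKEND` and `TRACE_DB_URL` environment variables are illustrative conventions, not built-in Peargent settings:

```python
import os

from peargent.observability import enable_tracing
from peargent.storage import Postgresql, Sqlite

# Illustrative environment variables -- not built-in Peargent configuration
backend = os.getenv("TRACE_BACKEND", "sqlite")

if backend == "postgres":
    store = Postgresql(connection_string=os.environ["TRACE_DB_URL"])
else:
    store = Sqlite(connection_string="sqlite:///./traces.db")

tracer = enable_tracing(store_type=store)
```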