# Agents


Explore how agents operate, how they use tools and memory, and how to structure them effectively in Peargent.

In simple terms, an Agent calls the **[Model](/docs/models)** to generate responses according to its defined behavior. Agents are the core units of work in Peargent, while **[Tools](/docs/tools)** provide the capabilities that help agents perform actions and tackle complex tasks. Agents can operate individually for simple tasks, or they can be combined into a **[Pool](/docs/pools)** of agents to handle more complex, multi-step workflows.

## Creating an Agent

To create an agent, use the `create_agent` function from the peargent module. At minimum, you must define the agent's `name`, `description`, `persona`, and the `model` to use. Here is a simple example:

```python
from peargent import create_agent
from peargent.models import openai

code_reviewer = create_agent(
    name="Code Reviewer",
    description="Reviews code for issues and improvements",
    persona=(
        "You are a highly skilled senior software engineer and code reviewer. "
        "Your job is to analyze code for correctness, readability, maintainability, and performance. "
        "Identify bugs, edge cases, and bad practices. Suggest improvements that follow modern Python "
        "standards and best engineering principles. Provide clear explanations and, when appropriate, "
        "offer improved code snippets. Always be concise, accurate, and constructive."
    ),
    model=openai("gpt-4")
)
```

Call `agent.run(prompt)` to perform an inference using the agent's persona as the system prompt and your input as the user message.

```python
response = code_reviewer.run("Review this Python function for improvements:\n\ndef add(a, b): return a+b")
print(response)
# The function is correct but could be optimized, here is the optimized version...
```

When running an agent individually, the `description` field is optional. However, it becomes mandatory when the agent is part of a **[Pool](/docs/pools)**.

* Refer to **[Tools](/docs/tools)** to learn how to use tools with agents.
* Refer to **[History](/docs/history)** to learn how to set up conversation memory for agents.
* Refer to **[Pool](/docs/pools)** to learn how to create a pool of agents.

## How an Agent Works
### Start Execution (`agent.run()`)

When you call `agent.run(...)`, the agent prepares for a new interaction: it loads any previous conversation **[History](/docs/history)** (if enabled), begins **tracing** (if enabled), and registers the user's new input.

### Build the Prompt

The agent constructs the full prompt by combining its **persona**, **[Tools](/docs/tools)**, prior conversation context, and optional **output schema**. This prompt is then sent to the configured **[Model](/docs/models)**.

### Model Generates a Response

The model returns a response based on the prompt. The agent records this output and checks whether the model is requesting **tool calls**.

### Execute Tools (If Requested)

If the response includes tool calls, the agent runs those tools (in **parallel** if multiple), collects their outputs, and then asks the model again using an updated prompt. This cycle continues until no more tool actions are required.

### Finalize the Result

The agent checks whether it should stop (stop conditions met or max iterations reached). If an **output schema** was provided, the response is validated against it. Finally, the conversation is synced to **[History](/docs/history)** (if enabled), tracing is ended, and the final response is returned.
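The cycle above can be summarized as a simple loop. The following is only an illustrative sketch of that loop, not Peargent's actual implementation; helper names such as `build_prompt`, `extract_tool_calls`, and `execute_tools` are hypothetical.

```python
# Illustrative sketch of the agent run loop described above (not Peargent's real internals).
# load_history, build_prompt, extract_tool_calls, execute_tools, and save_history are hypothetical helpers.
def run_sketch(agent, user_input, max_iterations=5):
    messages = load_history(agent)                                   # load prior conversation, if enabled
    messages.append({"role": "user", "content": user_input})

    for _ in range(max_iterations):
        prompt = build_prompt(agent.persona, agent.tools, messages)  # persona + tools + context
        response = agent.model.generate(prompt)                      # model produces a response
        messages.append({"role": "assistant", "content": response})

        tool_calls = extract_tool_calls(response)                    # parse any requested tool calls
        if not tool_calls:
            break                                                    # no tools requested, so we are done
        results = execute_tools(agent.tools, tool_calls)             # run tools (in parallel if several)
        messages.append({"role": "tool", "content": results})

    save_history(agent, messages)                                    # persist the conversation, if enabled
    return response
```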
## Parameters

| Parameter       | Type              | Description                                                                                                                                              | Required |
| :-------------- | :---------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------ | :------- |
| `name`          | `str`             | The name of the agent.                                                                                                                                    | Yes      |
| `description`   | `str`             | A brief description of the agent's purpose. Required when using in a **[Pool](/docs/pools)**.                                                             | No\*     |
| `persona`       | `str`             | The system prompt defining the agent's personality and instructions.                                                                                      | Yes      |
| `model`         | `Model`           | The LLM model instance (e.g., `openai("gpt-4")`).                                                                                                         | Yes      |
| `tools`         | `list[Tool]`      | A list of **[Tools](/docs/tools)** the agent can access.                                                                                                  | No       |
| `stop`          | `StopCondition`   | Condition that determines when the agent should stop iterating (default: `limit_steps(5)`).                                                               | No       |
| `history`       | `HistoryConfig`   | Configuration for conversation **[History](/docs/history)**.                                                                                              | No       |
| `tracing`       | `bool \| None`    | Enable/disable tracing. `None` (default) inherits from the global tracer if `enable_tracing()` was called, `True` explicitly enables, `False` opts out.   | No       |
| `output_schema` | `Type[BaseModel]` | Pydantic model for structured output validation.                                                                                                          | No       |
| `max_retries`   | `int`             | Maximum retries for `output_schema` validation (default: `3`). Only used when `output_schema` is provided.                                                | No       |

\* Required when using the agent in a **[Pool](/docs/pools)**.

To learn more about `stop`, `tracing`, `output_schema`, and `max_retries`, refer to **Advanced Features**.
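As an example of the optional parameters, here is a sketch of an agent that validates its reply against a Pydantic schema. The `Review` model below is made up purely for illustration.

```python
from pydantic import BaseModel
from peargent import create_agent
from peargent.models import openai

# Hypothetical schema used only to illustrate output_schema and max_retries.
class Review(BaseModel):
    summary: str
    issues: list[str]

reviewer = create_agent(
    name="StructuredReviewer",
    description="Reviews code and returns structured findings",
    persona="You are a code reviewer. Respond with a summary and a list of issues.",
    model=openai("gpt-4o"),
    output_schema=Review,   # response is validated against this Pydantic model
    max_retries=3           # retry validation up to 3 times (the default)
)

result = reviewer.run("Review: def add(a, b): return a+b")
print(result)
```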

# Examples

The following examples illustrate common ways to use Peargent across different workflows.

* Build a multi-agent creative writing system where dedicated agents generate characters, plot structure, worldbuilding details, and dialogues to produce cohesive short stories from a single prompt.
* Develop an autonomous agent that turns natural-language tasks into runnable Python code, tests it in-process, fixes errors through iterative reasoning, and explains the final solution step-by-step.
* Create a Python-based code review system where multiple specialized agents (style, security, optimization, readability) analyze a file and produce a combined, actionable review report.
* Build an AI agent that analyzes entire folders of text, code, and markdown files to generate structured summaries, glossaries, cross-references, and a unified knowledge map, all from simple local input.
Contribute to Peargent by adding more examples to the docs.

# History

Persistent conversation memory that allows agents to remember past interactions across sessions.

History is Peargent's **persistent conversation memory** system. It allows agents and pools to remember past interactions across sessions, enabling continuity, context-awareness, and long-running workflows.

Think of history like a **notebook** your agent writes in. Each message, tool call, and response is recorded so the agent can look back and recall what happened earlier.

You can pass a `HistoryConfig` to any **[Agent](/docs/agents)** or **[Pool](/docs/pools)**. If a pool receives a history, it overrides individual agent histories so all agents share the same conversation thread. History can be stored using backends such as in-memory, file, SQLite, PostgreSQL, Redis, or custom storage backends.

## Creating History

To create a history, pass a `HistoryConfig` to the `create_agent` or `create_pool` function. `HistoryConfig` is a configuration object that controls how the history of an agent or pool is stored and managed. By default, `HistoryConfig` uses the `InMemory()` storage backend (temporary storage; data is lost when the program exits).

### Adding history to Agents:

```python
from peargent import create_agent
from peargent.history import HistoryConfig
from peargent.models import openai

agent = create_agent(
    name="Assistant",
    description="Helpful assistant with memory",
    persona="You are a helpful assistant.",
    model=openai("gpt-4o"),
    history=HistoryConfig()
)

# First conversation
agent.run("My name is Alice")

# Later conversation - agent remembers
agent.run("What's my name?")
# Output: "Your name is Alice"
```

### Adding history to Pools:

```python
from peargent import create_pool
from peargent.history import HistoryConfig

pool = create_pool(
    agents=[agent1, agent2],
    history=HistoryConfig()
)

# First conversation
pool.run("My name is Alice")

# Later conversation - the pool remembers
pool.run("What's my name?")
# Output: "Your name is Alice"
```

## How History Works
### Load Conversation

When an agent begins a run, it loads the existing conversation thread from the configured **storage backend**.

### Append Messages

Each new user message, tool call, and agent response is added to the conversation thread in order.

### Manage Context

If the conversation grows beyond `max_context_messages`, the configured **[strategy](/docs/history#strategies)** (trim or summarize) is applied to keep the context window manageable.

### Persist Data

All updates are saved back to the **[storage backend](/docs/history#storage-backends)**, ensuring the conversation history is retained across sessions and future runs.
Because history supports many advanced capabilities (custom storage backends, manual thread control, serialization, and low-level message operations), listing every option here would make this page too large. For deeper configuration and advanced usage, see **[Advanced History](/docs/Advanced%20History)**.

## Storage Backends

History can be stored in different backends depending on your use case. Here are all supported backends available in Peargent:

```python
from peargent.history import HistoryConfig
from peargent.storage import InMemory, File, Sqlite, Postgresql, Redis

# InMemory (Default)
# - Fast, temporary storage
# - Data is lost when the program exits
history = HistoryConfig(store=InMemory())

# File (JSON files)
# - Stores conversations as JSON on disk
# - Good for local development or small apps
history = HistoryConfig(store=File(storage_dir="./conversations"))

# SQLite (Local database)
# - Reliable, ACID-compliant
# - Ideal for single-server production
history = HistoryConfig(
    store=Sqlite(
        database_path="./chat.db",
        table_prefix="peargent"
    )
)

# PostgreSQL (Production database)
# - Scalable, supports multi-server deployments
history = HistoryConfig(
    store=Postgresql(
        connection_string="postgresql://user:pass@localhost/dbname",
        table_prefix="peargent"
    )
)

# Redis (Distributed + TTL)
# - Fast, supports key expiration
# - Ideal for cloud deployments and ephemeral memory
history = HistoryConfig(
    store=Redis(
        host="localhost",
        port=6379,
        db=0,
        password=None,
        key_prefix="peargent"
    )
)
```

To create a custom storage backend, refer to **[History Management - Custom Storage Backends](/docs/history-management/custom-storage)**.

## Auto Context Management

When conversations become too long, Peargent automatically manages the context window to keep prompts efficient and within model limits. This behavior is controlled by the strategy you choose.

### Strategies

`smart` (Default)

Automatically decides whether to trim or summarize based on the size and importance of the overflow:

* Small overflow → trim (fast)
* Important tool calls → summarize
* Large overflow → aggressive summarization

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="smart"
)
```

`trim_last`

Keeps the most recent messages and removes the oldest. Fast and uses no LLM.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="trim_last",
    max_context_messages=15
)
```

`trim_first`

Keeps older messages and removes the newer ones.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="trim_first"
)
```

`summarize`

Uses an LLM to summarize older messages, preserving context while reducing size.

```python
history = HistoryConfig(
    auto_manage_context=True,
    strategy="summarize",
    summarize_model=gemini("gemini-2.5-flash")  # Fast model for summaries
)
```

`summarize_model` is used only with the `"summarize"` and `"smart"` strategies. If not provided, the **[Agent](/docs/agents)**'s model will be used.
## Parameters

| Parameter              | Type          | Default      | Description                                                                           | Required |
| :--------------------- | :------------ | :----------- | :------------------------------------------------------------------------------------ | :------- |
| `auto_manage_context`  | `bool`        | `False`      | Automatically manage the context window when conversations get too long                | No       |
| `max_context_messages` | `int`         | `20`         | Maximum messages before auto-management triggers                                       | No       |
| `strategy`             | `str`         | `"smart"`    | Context management strategy: `"smart"`, `"trim_last"`, `"trim_first"`, `"summarize"`   | No       |
| `summarize_model`      | `Model`       | `None`       | LLM model for summarization (defaults to the agent's model if not provided)            | No       |
| `store`                | `StorageType` | `InMemory()` | Storage backend: `InMemory()`, `File()`, `Sqlite()`, `Postgresql()`, `Redis()`         | No       |

Learn more about advanced history features, including custom storage backends, manual thread control, and all available history methods, in **[Advanced History](/docs/Advanced%20History)**.
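Putting these options together, here is a sketch of a persistent, auto-managed history configuration that uses only the parameters documented above:

```python
from peargent import create_agent
from peargent.history import HistoryConfig
from peargent.models import openai
from peargent.storage import Sqlite

history = HistoryConfig(
    store=Sqlite(database_path="./chat.db", table_prefix="peargent"),  # persistent storage
    auto_manage_context=True,      # manage long conversations automatically
    max_context_messages=30,       # trigger management after 30 messages
    strategy="smart"               # trim or summarize depending on the overflow
)

agent = create_agent(
    name="SupportBot",
    description="Support assistant with persistent memory",
    persona="You are a helpful support assistant.",
    model=openai("gpt-4o"),
    history=history
)
```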

# Overview

About Peargent.

Peargent is a modern, simple, and powerful Python framework for building intelligent AI agents with production-grade features. It offers a clean, intuitive API for creating conversational agents that can use tools, maintain memory, collaborate with other agents, and scale reliably into production.

* Learn how to set up your first agent in just a few lines of code.
* Explore practical examples to understand Peargent's capabilities.
* Dive into the fundamental concepts that power Peargent.
* Discover multi-agent orchestration, persistent memory, and observability.

## What is Peargent?

Peargent simplifies the process of building AI agents by providing:

* **Flexible LLM Support** - Works seamlessly with OpenAI, Groq, Google Gemini, and Azure OpenAI
* **Powerful Tool System** - Execute actions with built-in timeout, retries, and input/output validation
* **Persistent Memory** - Multiple backends supported: in-memory, file, SQLite, PostgreSQL, Redis
* **Multi-Agent Orchestration** - Coordinate specialized agents for complex workflows
* **Production-Ready Observability** - Built-in tracing, cost tracking, and performance metrics
* **Type-Safe Structured Outputs** - Easily validate responses using Pydantic models

## How Does Peargent Work?

Peargent lets you build individual agents or complex systems where each **[Agent](/docs/agents)** contributes specialized work while sharing context through a global **[State](/docs/states)**. Agents operate inside a **[Pool](/docs/pools)**, coordinated by a **[Router](/docs/routers)** that can run in round-robin or LLM-based mode to decide which agent handles each step. Agents use **[Tools](/docs/tools)** to take actions and update the shared State, while **[History](/docs/history)** persists reasoning and decisions to maintain continuity across the workflow.

## Why Peargent?

Start with a basic agent in just a few lines:

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="Assistant",
    persona="You are a helpful assistant",
    model=openai("gpt-4")
)

response = agent.run("What is the capital of France?")
print(response)
```

Scale to complex multi-agent systems with memory, tools, and observability:

```python
from peargent import create_agent, create_tool, create_pool
from peargent.history import HistoryConfig
from peargent.models import openai
from peargent.storage import Sqlite

# Create specialized agents with persistent memory
researcher = create_agent(
    name="Researcher",
    persona="You are a research expert",
    model=openai("gpt-4"),
    tools=[search_tool, analyze_tool],
)

writer = create_agent(
    name="Writer",
    persona="You are a technical writer",
    model=openai("gpt-4"),
)

# Orchestrate multiple agents
pool = create_pool(
    agents=[researcher, writer],
    history=HistoryConfig(
        store=Sqlite(database_path="./pool_conversations.db")
    )
)

result = pool.run("Research and write about quantum computing")
print(result)
```

# Installation

Get started with installing Peargent in your Python environment.

To install the Peargent **[Python package](https://pypi.org/project/peargent/)**, you can use `pip` or `uv`. It's recommended to install **Peargent** inside a **virtual environment (venv)** to manage dependencies effectively.

```bash
pip install peargent
```

```bash
uv pip install peargent
```

If you don't have a virtual environment set up, you can create one using the following commands:

```bash
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

After activating the virtual environment, run the pip install command above to install **Peargent**.

Now that you have installed Peargent, proceed to the Quick Start guide to create your first AI agent. Learn about the fundamental concepts that power Peargent, including Agents, Tools, Memory, and more.

# Long Term Memory

Agents can remember past interactions and key information across sessions.

## Coming Soon

# Models

Use different model providers in your Agents and Pools.

Models define which LLM your **[Agent](/docs/agents)** or **[Pool](/docs/pools)** uses. Peargent provides a simple, unified interface for connecting to different providers (OpenAI, Groq, Gemini, etc.).

Think of a Model as the brain of your **[Agent](/docs/agents)** or **[Pool](/docs/pools)**, the thing that actually generates responses.

## Creating a Model

Models are imported from `peargent.models` and created using simple factory functions:

```python
from peargent.models import openai, groq, gemini, anthropic

model = openai("gpt-4o")
# or
model = anthropic("claude-3-5-sonnet-20241022")
```

You can run the model directly to get a response:

```python
response = model.generate("Hello, how are you?")
print(response)
```

### Passing a model to an Agent or Pool

```python
from peargent import create_agent, create_pool
from peargent.models import openai

agent = create_agent(
    name="Researcher",
    description="You are a researcher who can answer questions about the world.",
    persona="You are a researcher who can answer questions about the world.",
    model=openai("gpt-4o")
)

pool = create_pool(
    agents=[agent],
    model=openai("gpt-4o")
)
```

## Supported Model Providers

Peargent's model support is continuously expanding. New providers and model families are added regularly, so expect this list to grow over time.

**OpenAI**

```python
from peargent.models import openai

model = openai(
    model_name="gpt-4o",
    api_key="sk-",
    endpoint_url="https://api.openai.com/v1",
    parameters={}
)
```

**Groq**

```python
from peargent.models import groq

model = groq(
    model_name="llama-3.3",
    api_key="",
    endpoint_url="https://api.groq.com/v1",
    parameters={}
)
```

**Gemini**

```python
from peargent.models import gemini

model = gemini(
    model_name="gemini-2.0-flash",
    api_key="AIzaSyB",
    endpoint_url="https://generativelanguage.googleapis.com/v1",
    parameters={}
)
```

**Anthropic**

```python
from peargent.models import anthropic

model = anthropic(
    model_name="claude-3-5-sonnet-20241022",
    api_key="sk-ant-",
    endpoint_url="https://api.anthropic.com/v1/messages",
    parameters={}
)
```
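The `parameters` dict shown above is forwarded to the provider when generating. The exact keys depend on the provider's API; the values below (a sampling temperature and an output-token cap) are assumptions shown purely for illustration, not a documented list of supported options.

```python
from peargent.models import openai

# Hypothetical sampling options; supported keys depend on the provider's API.
model = openai(
    model_name="gpt-4o",
    parameters={
        "temperature": 0.2,
        "max_tokens": 512,
    },
)
```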

# Pools

Learn how pools enable multi-agent collaboration and intelligent task routing.

A Pool coordinates multiple agents so they can work together on a task. It brings structure to multi-agent workflows by deciding how agents interact and how information flows between them.

* Each **[Agent](/docs/agents)** focuses on a specific skill or responsibility and contributes its part of the work.
* A shared **[State](/docs/states)** lets all agents access and update the same context, allowing them to build on each other's progress.
* The **[Router](/docs/routers)** decides which agent should act next, using either round-robin or intelligent LLM-based routing.

## Creating a Pool

Use the `create_pool()` function to coordinate multiple agents. The `agents` parameter accepts a list of all the agents you want to include in the pool.

```python
from peargent import create_agent, create_pool
from peargent.models import groq

# researcher agent and writer agent

# Create pool
pool = create_pool(
    agents=[researcher, writer],
)

# Run the pool along with user input
result = pool.run("Research and write about quantum computing")
```

## How a Pool Works

A Pool can be thought of as a controller that organizes multiple agents, maintains a shared **[State](/docs/states)** for them to collaborate through, and uses a **[Router](/docs/routers)** to decide which agent should act next.
### User Input Added to State

The user's message is written into the shared **[State](/docs/states)** so every agent can access it.

### Router Selects the Next Agent

The **[Router](/docs/routers)** determines which agent should act next.

### Agent Executes and Updates State

The selected agent processes the task, produces an output, and writes its result back into the **[State](/docs/states)**.

### Output Becomes Input for the Next Agent

Each agent's output is available in the **[State](/docs/states)**, allowing the next agent to build on prior work.

### Process Repeats Until Completion

The cycle continues until the workflow is complete or the maximum number of iterations is reached.
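Conceptually, the pool loop resembles the following sketch. This is not Peargent's actual implementation; the plain `state` dictionary and the router returning an agent name or `None` are simplified stand-ins for illustration.

```python
# Illustrative sketch of a pool run (not Peargent's real internals).
def pool_run_sketch(agents, router, user_input, max_iter=5):
    state = {"history": [{"role": "user", "content": user_input}]}   # shared state
    last_result = None

    for call_count in range(max_iter):
        choice = router(state, call_count, last_result)              # router picks the next agent
        if choice is None:
            break                                                    # router signals completion

        agent = agents[choice]                                       # look up the chosen agent by name
        last_result = agent.run(state["history"][-1]["content"])     # agent executes
        state["history"].append(                                     # output becomes input for the next agent
            {"role": "assistant", "content": last_result, "agent": choice}
        )

    return last_result
```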
## Model Selection

By default, the pool uses the model of the first agent. You can also provide a `default_model` for the pool; any agent without an explicitly set model will use this `default_model`.

```python
from peargent import create_pool
from peargent.models import openai

# researcher agent, analyst agent and writer agent

# Create pool
pool = create_pool(
    agents=[researcher, analyst, writer],
    default_model=openai("gpt-5")
)
```

All the available models are listed in **[Models](/docs/models)**.

## Routing the Agents

By default, pools use round-robin routing, where **[Agents](/docs/agents)** take turns in order. You can also plug in a custom router to make more intelligent decisions based on the task. For all routing options, including round-robin, LLM-based routing, and custom router functions, see **[Routers](/docs/routers)**.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
)

article = pool.run("Write an article about renewable energy trends")
# Executes: researcher → analyst → writer
```

## Max Iterations

A pool runs for a fixed number of iterations, where each iteration represents one agent being routed, executed, and updating the state. Pools use a default limit of 5 iterations, but you can change this using the `max_iter` parameter.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    max_iter=10
)

article = pool.run("Write an article about renewable energy trends")
# Executes: researcher → analyst → writer → researcher → analyst → writer → ...
```

## Parameters

| Parameter       | Type                       | Description                                                     | Required |
| :-------------- | :------------------------- | :--------------------------------------------------------------- | :------- |
| `agents`        | `list[Agent]`              | List of agents in the pool                                        | Yes      |
| `default_model` | `Model`                    | Default model for agents without one                              | No       |
| `router`        | `RouterFn \| RoutingAgent` | Custom router function or routing agent (default: round-robin)    | No       |
| `max_iter`      | `int`                      | Maximum agent executions (default: `5`)                           | No       |
| `default_state` | `State`                    | Custom initial state object                                       | No       |
| `history`       | `HistoryConfig`            | Shared conversation history across all agents                     | No       |
| `tracing`       | `bool`                     | Enable tracing for all agents (default: `False`)                  | No       |

For advanced configuration such as `history` and `tracing`, see **Advanced Features**.
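As an example of combining these options, the sketch below wires a router, shared history, and tracing into one pool. It assumes the `researcher`, `analyst`, and `writer` agents from the sections above already exist, along with a `router` created as described in **[Routers](/docs/routers)**.

```python
from peargent import create_pool
from peargent.history import HistoryConfig
from peargent.storage import Sqlite

pool = create_pool(
    agents=[researcher, analyst, writer],
    router=router,                    # LLM-based or custom router (default: round-robin)
    max_iter=8,                       # allow up to 8 agent executions
    history=HistoryConfig(
        store=Sqlite(database_path="./pool.db")   # shared, persistent conversation memory
    ),
    tracing=True                      # enable tracing for all agents
)

result = pool.run("Produce a short report on renewable energy trends")
print(result)
```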

# QuickStart

Learn how to set up your first Peargent agent in just a few lines of code.

In this chapter, we will create a simple AI agent using Peargent.
Follow the steps below to get started quickly.

## Create Your First Agent
### Install Peargent

It's recommended to install Peargent inside a **virtual environment (venv)**.

```bash
pip install peargent
```

### Create an Agent

Now, let's create our first **[Agent](/docs/agents)**. An agent is simply an AI-powered entity that behaves like an autonomous helper: it can think, respond, and perform tasks based on the role, personality, and instructions you provide.
Start by creating a Python file named `quickstart.py`. Using the `create_agent` function, we can assign our agent a `name`, a `description`, and a `persona` (its role, tone, and behaviour).
You will also need to specify the `model` parameter to choose which LLM the agent will use. In this example, we’ll use OpenAI’s `GPT-5` model. (**[Available models](/docs/models#supported-model-providers)**)
Your agent can be anything you imagine!
For this example, we'll create a friendly agent who speaks like **William Shakespeare**.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="ShakespeareBot",
    description="An AI agent that speaks like William Shakespeare.",
    persona="You are ShakespeareBot, a witty and eloquent assistant who communicates in the style of William Shakespeare.",
    model=openai("gpt-5")
)

response = agent.run("What is the meaning of life?")
print(response)
```

Before running the code, you will need to set your `OPENAI_API_KEY` inside your `.env` file.

```bash
OPENAI_API_KEY="your_openai_api_key_here"
```

### Run the Agent

Now, run your `quickstart.py` script to see your agent in action!

```bash
python quickstart.py
```

You should see a response from the agent (ShakespeareBot), answering your question in Shakespearean style!

Terminal Output

```text no-copy
Ah, fair seeker of truth, thou question dost pierce the very veil of existence! The meaning of life, methinks, is not a single treasure buried in mortal sands, but a wondrous journey of love, virtue, and discovery. To cherish each breath, to learn from sorrow, and to weave kindness through the tapestry of thy days — therein lies life's most noble purpose.
```

Congratulations! You have successfully created and run your first AI agent using Peargent.

# Routers

Decide which agent in a pool should act.

A router decides which **[Agent](/docs/agents)** in the **[Pool](/docs/pools)** should run next. It examines the shared **[State](/docs/states)** or user input and chooses the agent best suited for the next step.

Think of a router as the director of a movie set. Each agent is an actor with a specific role, and the router decides who steps into the scene at the right moment.

In Peargent, you can choose from three routing strategies:

* **Round-Robin Router** - agents take turns in order
* **LLM-Based Routing Agent** - an LLM decides which agent acts next
* **Custom Function-Based Router** - you define the routing logic yourself

With a **Custom Function-Based Router**, you get complete control over how agents are selected. You can route in a fixed order, choose based on the iteration count, or make smart decisions using the shared state. For example: sequential routing, conditional routing, and state-based intelligent routing. Refer to **Advanced Features**.

## Round Robin Router (default)

The **Round Robin Router** is the simplest and default routing strategy in Peargent. It cycles through agents in the exact order they are listed, giving each agent one turn before repeating, until the pool reaches the `max_iter` limit. This router requires no configuration and no LLM calls, making it predictable, fast, and cost-free.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    # No router required — round robin is automatic
    max_iter=3
)

result = pool.run("Write about Quantum Physics")
# Executes: researcher → analyst → writer
```

**Best for:** Simple sequential workflows, demos, testing, and predictable pipelines.

## LLM Based Routing Agent

The **LLM-Based Routing Agent** uses a large language model to intelligently decide which agent should act next. Instead of following a fixed order or manual rules, the router examines the **conversation history**, **agent abilities**, and **workflow context** to choose the most appropriate agent at each step. This makes it ideal for dynamic, context-aware, and non-linear multi-agent workflows.

```python
from peargent import create_routing_agent, create_pool
from peargent.models import openai

router = create_routing_agent(
    name="SmartRouter",
    model=openai("gpt-5"),
    persona="You intelligently choose the next agent based on the task.",
    agents=["Researcher", "Analyst", "Writer"]
)
# chooses an agent or 'STOP' to end the pool

pool = create_pool(
    agents=[researcher, analyst, writer],
    router=router,
    max_iter=5
)

result = pool.run("Research and write about quantum computing")
```

The descriptions you give your agents play a **crucial role** in LLM-based routing. The router uses these descriptions to understand each agent's abilities and decide who should act next.

## Custom Function-Based Router

Custom routers give you **full control** over how agents are selected. You define a Python function that inspects the shared `state`, the `call_count`, and the `last_result` to decide which agent goes next. This is ideal for **rule-based**, **deterministic**, or **cost-efficient** workflows.
```python
from peargent import RouterResult

def custom_router(state, call_count, last_result):
    # Your routing logic here
    for agent_name, agent_obj in state.agents.items():
        # agent details are available here
        print(f"Agent: {agent_name}")
        print(f"  Description: {agent_obj.description}")
        print(f"  Tools: {list(agent_obj.tools.keys())}")
        print(f"  Model: {agent_obj.model}")
        print(f"  Persona: {agent_obj.persona}")

    # Return the name of the next agent, or None to stop
    return RouterResult("AgentName")
```

Custom routers unlock entirely new routing patterns, from rule-based flows to dynamic state-aware logic. To explore more advanced patterns and real-world examples, see **Advanced Features**.
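As a concrete illustration, here is a sketch of a sequential custom router. It assumes, as described above, that `RouterResult` wraps the chosen agent's name and that returning `None` stops the pool; the agent names are placeholders.

```python
from peargent import RouterResult

# Minimal sequential router sketch: each agent gets exactly one turn, in order.
def sequential_router(state, call_count, last_result):
    order = ["Researcher", "Analyst", "Writer"]   # placeholder agent names
    if call_count >= len(order):
        return None                               # every agent has acted, so stop the pool
    return RouterResult(order[call_count])        # pick the next agent in the sequence
```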

# States

A shared context used inside Pools for smarter routing decisions.

State is a shared workspace used only inside **[Pools](/docs/pools)**. It exists for the duration of a single `pool.run()` call and gives **[Routers](/docs/routers)** the information they need to make intelligent routing decisions.

Think of State as a scratchpad that the pool keeps for its agents' work: **[Routers](/docs/routers)** can read and write to it, while **[Agents](/docs/agents)** and **[Tools](/docs/tools)** cannot.

State is automatically created by the Pool and passed to:

* **[Custom Router functions](/docs/routers#custom-function-based-router)**
* **[Routing Agents](/docs/routers#llm-based-routing-agent)**

So in that sense, **no configuration is required**.

## Where State Can Be Used

### Custom Router functions

```python
def custom_router(state, call_count, last_result):
    # Read history
    last_message = state.history[-1]["content"]

    # Store workflow progress
    state.set("stage", "analysis")

    # Read agent capabilities
    print(state.agents.keys())

    return RouterResult("Researcher")
```

Refer to the **[API Reference of State](/docs/states#state-api-reference)** to learn more about what can be stored in State.

### Manual State Creation (optional)

```python
from peargent import State

custom_state = State(data={"stage": "init"})

pool = create_pool(
    agents=[agent1, agent2],
    default_state=custom_state
)
```

## State API Reference

The `State` object provides a small but powerful API used inside Pools and Routers.

### Methods

Methods give routers the ability to store and retrieve custom information needed for routing decisions.

| Name              | Type   | Inputs                                             | Returns | Description                                                                                           |
| ----------------- | ------ | -------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------ |
| **`add_message`** | Method | `role: str`, `content: str`, `agent: str \| None`  | `None`  | Appends a message to `state.history` and persists it if a history manager exists.                       |
| **`get`**         | Method | `key: str`, `default: Any = None`                   | `Any`   | Retrieves a value from the key-value store. Returns `default` if the key is missing.                    |
| **`set`**         | Method | `key: str`, `value: Any`                             | `None`  | Stores a value in the key-value store. Useful for workflow tracking, flags, and custom router logic.    |

### Attributes

Attributes give routers visibility into what has happened so far (history, agents, persistent history, custom data).

| Name                  | Type                          | Read/Write                              | Description                                                                                                                     |
| --------------------- | ----------------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| **`kv`**              | `dict[str, Any]`              | Read/Write *(via get/set recommended)*  | Internal key-value store for custom state. Use `state.get()`/`state.set()` instead of accessing directly.                          |
| **`history`**         | `list[dict]`                  | Read-only (managed by Pool)             | In-memory conversation history for the current pool run. Contains `role`, `content`, and optional `agent`.                         |
| **`history_manager`** | `ConversationHistory \| None` | Read-only                               | Optional persistent history backend (SQLite, Redis, PostgreSQL, etc.). Used automatically by Pool.                                 |
| **`agents`**          | `dict[str, Agent]`            | Read-only                               | Mapping of agent names to their Agent objects. Useful for advanced routing logic (e.g., route based on tools or descriptions).     |

### Message Structure (`state.history`)

Each entry in `state.history` looks like:

```python
{
    "role": "user" | "assistant" | "tool",
    "content": "message content",
    "agent": "AgentName"  # only for assistant/tool messages
}
```

# Tools

Enable agents to perform actions beyond text generation with tools.

Tools are actions that agents can perform to interact with the real world. They allow agents to go beyond text generation by enabling operations such as querying databases, calling APIs, performing calculations, reading files, or executing any Python function you define.

Think of tools as the **hands and eyes** of your agent, while the model provides the reasoning **(the brain)**. Tools give the agent the ability to actually act and produce real results.

When you create an **[Agent](/docs/agents)**, you pass in a list of available **[Tools](/docs/tools)**, and during execution the agent decides whether a tool is needed and invokes it automatically based on the model's response.

## Creating a Tool

Use `create_tool()` to wrap a Python function into a tool that an agent can call. Every tool must define a `name`, `description`, `input_parameters`, and a `call_function`. The `call_function` is the underlying Python function that will be executed when the agent invokes the tool.

Below is a simple example tool that converts Celsius to Fahrenheit:

```python
from peargent import create_tool

def celsius_to_fahrenheit(c: float):
    return (c * 9/5) + 32

temperature_tool = create_tool(
    name="CelsiusToFahrenheit",
    description="Convert Celsius temperature to Fahrenheit",
    call_function=celsius_to_fahrenheit,
    input_parameters={"c": float},  # Important
    output_schema=float
)
```

## Input Parameters Matter

The `input_parameters` serve two critical purposes:

1. **Type Validation** - Peargent validates that the LLM provides the correct types before executing your function, preventing runtime errors
2. **LLM Guidance** - The parameter types help the LLM understand what arguments to provide when calling the tool

## Using Tools with Agents

Tools can be passed to an agent during creation. The agent will automatically decide when a tool is needed and call it as part of its reasoning process.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="UtilityAgent",
    description="Handles multiple utility tasks",
    persona="You are a helpful assistant.",
    model=openai("gpt-5"),
    tools=[  # You can pass one or multiple tools here
        temperature_tool,
        count_words_tool,
        summary_tool
    ]
)

response = agent.run("Convert 25 degrees Celsius to Fahrenheit.")
# Agent automatically calls the tool and uses the result
```

## Parameters

| Parameter          | Type              | Description                                                                            | Required |
| :----------------- | :---------------- | :--------------------------------------------------------------------------------------- | :------- |
| `name`             | `str`             | Tool identifier                                                                           | Yes      |
| `description`      | `str`             | What the tool does (helps the LLM decide when to use it)                                  | Yes      |
| `input_parameters` | `dict[str, type]` | Parameter names and types (e.g., `{"city": str}`)                                         | Yes      |
| `call_function`    | `Callable`        | The Python function to execute                                                            | Yes      |
| `timeout`          | `float \| None`   | Max execution time in seconds (default: `None`)                                           | No       |
| `max_retries`      | `int`             | Retry attempts on failure (default: `0`)                                                  | No       |
| `retry_delay`      | `float`           | Initial delay between retries in seconds (default: `1.0`)                                 | No       |
| `retry_backoff`    | `bool`            | Use exponential backoff (default: `True`)                                                 | No       |
| `on_error`         | `str`             | Error handling: `"raise"`, `"return_error"`, or `"return_none"` (default: `"raise"`)      | No       |
| `output_schema`    | `Type[BaseModel]` | Pydantic model for output validation                                                      | No       |

For advanced configuration such as `timeouts`, `retries`, `error-handling`, and `output validation`, see **Advanced Features**.
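As a final sketch tying several of these parameters together, here is a tool with multiple typed inputs plus basic failure handling. The forecast function is a stand-in for whatever logic you want to expose.

```python
from peargent import create_tool

def get_forecast(city: str, days: int):
    # Stand-in implementation; in practice this might call a weather API.
    return f"{days}-day forecast for {city}: sunny"

forecast_tool = create_tool(
    name="GetForecast",
    description="Get a short weather forecast for a city",
    call_function=get_forecast,
    input_parameters={"city": str, "days": int},  # validated before execution
    timeout=10.0,             # stop if the lookup hangs
    max_retries=2,            # retry transient failures
    on_error="return_error"   # degrade gracefully instead of raising
)
```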

# Async Streaming

Run multiple agents concurrently with non-blocking streaming

Async streaming allows your application to handle multiple agent requests at the same time without blocking. This is essential for:

* **Web Servers**: Handling multiple user requests in FastAPI or Django.
* **Parallel Processing**: Running multiple agents simultaneously (e.g., a Researcher and a Reviewer).

## Quick Start

Use `astream()` with `async for` to stream responses asynchronously.

```python
import asyncio
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="AsyncAgent",
    description="Async streaming agent",
    persona="You are helpful.",
    model=openai("gpt-4o")
)

async def main():
    print("Agent: ", end="", flush=True)
    # Use 'async for' with 'astream'
    async for chunk in agent.astream("Hello, how are you?"):
        print(chunk, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
```

## Running Agents Concurrently

The real power of async comes when you run multiple things at once. Here is how to run two agents in parallel using `asyncio.gather()`.

```python
import asyncio
from peargent import create_agent
from peargent.models import openai

# Create two agents
agent1 = create_agent(name="Agent1", persona="You are concise.", model=openai("gpt-4o"))
agent2 = create_agent(name="Agent2", persona="You are verbose.", model=openai("gpt-4o"))

async def run_agent(agent, query, label):
    print(f"[{label}] Starting...")
    async for chunk in agent.astream(query):
        # In a real app, you might send this to a websocket
        pass
    print(f"[{label}] Finished!")

async def main():
    # Run both agents at the same time
    await asyncio.gather(
        run_agent(agent1, "Explain Quantum Physics", "Agent 1"),
        run_agent(agent2, "Explain Quantum Physics", "Agent 2")
    )

asyncio.run(main())
```

**Result**: Both agents start processing immediately. You don't have to wait for Agent 1 to finish before Agent 2 starts.

## Async with Metadata

Just like the synchronous version, you can use `astream_observe()` to get metadata asynchronously.

```python
async for update in agent.astream_observe("Query"):
    if update.is_token:
        print(update.content, end="")
    elif update.is_agent_end:
        print(f"\nCost: ${update.cost}")
```

## Async Pools

Pools also support async streaming, allowing you to run multi-agent workflows without blocking.

```python
# Stream text chunks from a pool asynchronously
async for chunk in pool.astream("Query"):
    print(chunk, end="", flush=True)

# Stream rich updates from a pool asynchronously
async for update in pool.astream_observe("Query"):
    if update.is_token:
        print(update.content, end="")
```

## Web Server Example (FastAPI)

Async streaming is the standard way to build AI endpoints in FastAPI.

```python
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/chat")
async def chat(query: str):
    async def generate():
        async for chunk in agent.astream(query):
            yield chunk

    return StreamingResponse(generate(), media_type="text/plain")
```

## What's Next?

**[Tracing & Observability](/docs/tracing-and-observability)** Learn how to monitor your async agents in production.

# Streaming

Stream agent responses in real-time for better user experience

Streaming allows you to display the agent's response token by token as it's being generated, rather than waiting for the entire response to complete. This creates a much more responsive and engaging user experience.

## Quick Start

Use the `stream()` method to get an iterator that yields text chunks as they arrive.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="StreamingAgent",
    description="An agent that streams responses",
    persona="You are helpful and concise.",
    model=openai("gpt-4o")
)

# Stream response token by token
print("Agent: ", end="", flush=True)
for chunk in agent.stream("What is Python in one sentence?"):
    print(chunk, end="", flush=True)
```

**Output:**

```text
Agent: Python is a high-level, interpreted programming language known for its readability and versatility.
```

## Why Use Streaming?

* **Lower Latency**: Users see the first words immediately, instead of waiting seconds for the full answer.
* **Better UX**: The application feels alive and responsive.
* **Engagement**: Users can start reading while the rest of the answer is being generated.

## When to Use `stream()`

Use `agent.stream()` when you just need the **text content** of the response.

* ✅ Chatbots and conversational interfaces
* ✅ CLI tools requiring real-time feedback
* ✅ Simple text generation tasks

If you need metadata like **token usage**, **costs**, or **execution time**, use **[Stream Observe](/docs/Streaming/stream-observe)** instead.

## Streaming with Pools

You can also stream responses from a **Pool** of agents. The pool will stream the output of whichever agent is currently executing.

```python
from peargent import create_pool

pool = create_pool(
    agents=[researcher, writer],
    router=my_router
)

# Stream the entire multi-agent interaction
for chunk in pool.stream("Research AI and write a summary"):
    print(chunk, end="", flush=True)
```

## Best Practices

1. **Always Flush Output**: When printing to a terminal, use `flush=True` (e.g., `print(chunk, end="", flush=True)`) to ensure tokens appear immediately.
2. **Handle Empty Chunks**: Occasionally, a chunk might be empty. Your UI code should handle this gracefully.

## What's Next?

**[Rich Streaming (Observe)](/docs/Streaming/stream-observe)** Learn how to get rich metadata like token counts, costs, and duration while streaming.

**[Async Streaming](/docs/Streaming/async-streaming)** Run multiple agents concurrently or build high-performance web servers using async streaming.

# Rich Streaming (Observe)

Get metadata like tokens, cost, and duration while streaming

While `agent.stream()` gives you just the text, `agent.stream_observe()` provides **rich updates** containing metadata. This is essential for production applications where you need to track costs, monitor performance, or show progress indicators.

## Quick Start

Use `stream_observe()` to receive `StreamUpdate` objects. You can check the type of update to handle text chunks and final metadata differently.

```python
from peargent import create_agent
from peargent.models import openai

agent = create_agent(
    name="ObservableAgent",
    description="Agent with observable execution",
    persona="You are helpful.",
    model=openai("gpt-5")
)

print("Agent: ", end="", flush=True)

for update in agent.stream_observe("What is the capital of France?"):
    # 1. Handle text tokens
    if update.is_token:
        print(update.content, end="", flush=True)

    # 2. Handle completion (metadata)
    elif update.is_agent_end:
        print(f"\n\n--- Metadata ---")
        print(f"Tokens: {update.tokens}")
        print(f"Cost: ${update.cost:.6f}")
        print(f"Time: {update.duration:.2f}s")
```

**Output:**

```text
Agent: The capital of France is Paris.

--- Metadata ---
Tokens: 15
Cost: $0.000012
Time: 0.45s
```

## The StreamUpdate Object

Each item yielded by `stream_observe()` is a `StreamUpdate` object with helpful properties:

| Property         | Description                                               |
| :--------------- | :--------------------------------------------------------- |
| `is_token`       | `True` if this update contains a text chunk.               |
| `content`        | The text chunk (only available when `is_token` is True).   |
| `is_agent_end`   | `True` when the agent has finished generating.             |
| `tokens`         | Total tokens used (available on `is_agent_end`).           |
| `cost`           | Total cost in USD (available on `is_agent_end`).           |
| `duration`       | Time taken in seconds (available on `is_agent_end`).       |
| `is_agent_start` | `True` when the agent starts working.                      |

## Update Types

The `UpdateType` enum defines all possible event types during streaming:

| Type          | Description                         |
| :------------ | :----------------------------------- |
| `AGENT_START` | Agent execution started.             |
| `TOKEN`       | A text chunk was generated.          |
| `AGENT_END`   | Agent execution completed.           |
| `POOL_START`  | Pool execution started.              |
| `POOL_END`    | Pool execution completed.            |
| `TOOL_START`  | Tool execution started.              |
| `TOOL_END`    | Tool execution completed.            |
| `ERROR`       | An error occurred during streaming.  |

## Streaming with Pools

When using `pool.stream_observe()`, you get additional event types to track the pool's lifecycle.

```python
from peargent import UpdateType

for update in pool.stream_observe("Query"):
    # Pool Events
    if update.type == UpdateType.POOL_START:
        print("[Pool Started]")

    # Agent Events (same as single agent)
    elif update.is_agent_start:
        print(f"\n[Agent: {update.agent}]")
    elif update.is_token:
        print(update.content, end="", flush=True)

    # Pool Finished
    elif update.type == UpdateType.POOL_END:
        print(f"\n[Pool Finished] Total Cost: ${update.cost}")
```

## What's Next?

**[Async Streaming](/docs/streaming/async-streaming)** Learn how to use these features in async environments for high concurrency.

**[Tracing & Observability](/docs/tracing-and-observability)** For deep debugging and historical logs, combine streaming with Peargent's tracing system.

# Built-in Tools

Ready-to-use tools that extend agent capabilities with powerful built-in functionality

Peargent provides a growing collection of built-in **[Tools](/docs/tools)** that solve common tasks without requiring custom implementations. These tools are production-ready, well-tested, and integrate seamlessly with **[Agents](/docs/agents)**.

## Why Built-in Tools?

Built-in tools save development time and provide:

* **Zero Configuration** - Import and use immediately, no setup required
* **Production Ready** - Thoroughly tested and optimized for reliability
* **Best Practices** - Built with proper error handling, validation, and security
* **Consistent API** - Same interface patterns across all built-in tools
* **Maintained** - Regular updates and improvements from the Peargent team

## Available Built-in Tools

### Text Extraction Tool

Extract plain text and metadata from various document formats including HTML, PDF, DOCX, TXT, Markdown, and URLs. This tool enables agents to read and process content from different file types and web pages.

Supported formats: HTML/XHTML, PDF, DOCX, TXT, Markdown, and URLs (with SSRF protection).

**[Learn more about Text Extraction Tool →](/docs/built-in-tools/text-extraction)**

## Coming Soon

More built-in tools are in development:

* **Web Search Tool** - Search the web and retrieve relevant information
* **Image Analysis Tool** - Extract text and analyze images
* **File System Tool** - Read, write, and manage files safely
* **HTTP Request Tool** - Make API calls with built-in retry logic

Check the **[Peargent GitHub repository](https://github.com/Peargent/peargent/tree/main/peargent/tools)** for the latest updates.

# Text Extraction Tool

Learn how to use the text extraction tool with Peargent agents

## Overview

The Text Extraction Tool is a built-in Peargent **[Tool](/docs/tools)** that enables **[Agents](/docs/agents)** to extract plain text from various document formats. It supports HTML, PDF, DOCX, TXT, and Markdown files, as well as URLs. The tool can optionally extract metadata such as title, author, page count, and character counts.

### Supported Formats

* **HTML/XHTML** - Web pages with metadata extraction (title, description, author)
* **PDF** - PDF documents with metadata (title, author, subject, page count)
* **DOCX** - Microsoft Word documents with document properties
* **TXT** - Plain text files with automatic encoding detection
* **Markdown** - Markdown files with title extraction from headers
* **URLs** - HTTP/HTTPS web resources with built-in SSRF protection

## Usage with Agents

The Text Extraction **[Tool](/docs/tools)** is most powerful when integrated with **[Agents](/docs/agents)**. Agents can use the tool to automatically extract and process document content.

### Creating an Agent with Text Extraction

To use the text extraction tool with an agent, you need to configure it with a **[Model](/docs/models)** and pass the tool to the agent's `tools` parameter:

```python
from peargent import create_agent
from peargent.tools import text_extractor  # [!code highlight]
from peargent.models import gemini

# Create an agent with text extraction capability
agent = create_agent(
    name="DocumentAnalyzer",
    description="Analyzes documents and extracts key information",
    persona=(
        "You are a document analysis expert. When asked about a document, "
        "use the text extraction tool to extract its content, then analyze "
        "and summarize the information."
    ),
    model=gemini("gemini-2.5-flash-lite"),
    tools=[text_extractor]  # [!code highlight]
)

# Use the agent to analyze a document
response = agent.run("Summarize the key points from document.pdf")
print(response)
```

## Examples

### Example 1: Extract Text with Metadata

```python
from peargent.tools import text_extractor

# Extract text and metadata from an HTML file
result = text_extractor.run({
    "file_path": "article.html",
    "extract_metadata": True  # [!code highlight]
})

if result["success"]:
    print(f"Title: {result['metadata']['title']}")
    print(f"Author: {result['metadata']['author']}")
    print(f"Word Count: {result['metadata']['word_count']}")
    print(f"Content:\n{result['text']}")
else:
    print(f"Error: {result['error']}")
```

### Example 2: Extract from URL

```python
from peargent.tools import text_extractor

# Extract text from a web page
result = text_extractor.run({
    "file_path": "https://example.com/article",  # [!code highlight]
    "extract_metadata": True
})

if result["success"]:
    print(f"Website Title: {result['metadata']['title']}")
    print(f"Content: {result['text'][:500]}...")
```

### Example 3: Extract with Length Limit

```python
from peargent.tools import text_extractor

# Extract text but limit to 1000 characters
result = text_extractor.run({
    "file_path": "long_document.pdf",
    "extract_metadata": True,
    "max_length": 1000  # [!code highlight]
})

print(f"Text (max 1000 chars): {result['text']}")
```

### Example 4: Batch Processing Multiple Files

```python
from peargent.tools import text_extractor
import os

documents = ["doc1.pdf", "doc2.docx", "doc3.html"]

for file_path in documents:
    if os.path.exists(file_path):
        result = text_extractor.run({
            "file_path": file_path,
            "extract_metadata": True
        })

        if result["success"]:
            print(f"\n{file_path} ({result['format']})")
            print(f"Words: {result['metadata'].get('word_count', 'N/A')}")
            print(f"Preview: {result['text'][:150]}...")
        else:
            print(f"Error processing {file_path}: {result['error']}")
```

### Example 5: Agent Document Analysis

```python
from peargent import create_agent
from peargent.tools import text_extractor  # [!code highlight]
from peargent.models import gemini

# Create a document analysis agent
agent = create_agent(
    name="ResearchAssistant",
    description="Analyzes research papers and extracts key information",
    persona=(
        "You are a research assistant specializing in document analysis. "
        "When given a document, extract its content and identify: "
        "1) Main topic, 2) Key findings, 3) Methodology, 4) Conclusions"
    ),
    model=gemini("gemini-2.5-flash-lite"),
    tools=[text_extractor]  # [!code highlight]
)

# Ask the agent to analyze a research paper
response = agent.run(
    "Please analyze research_paper.pdf and provide a structured summary"
)
print(response)
```

## Common Use Cases

1. **Document Summarization**: Extract text from documents and have agents summarize them
2. **Information Extraction**: Extract specific information (emails, phone numbers, etc.) from documents
3. **Content Analysis**: Analyze document sentiment, topics, or keywords
4. **Batch Processing**: Process multiple documents programmatically
5. **Web Scraping**: Extract text from web pages while preserving structure
6. **Research Assistance**: Analyze research papers and academic documents
7. **Compliance Review**: Extract and review document contents for compliance checking

## Parameters

The text extraction tool accepts the following parameters:

* **file\_path** (string, required): Path to the file or URL to extract text from
* **extract\_metadata** (boolean, optional, default: False): Whether to extract metadata like title, author, page count, etc.
* **max\_length** (integer, optional): Maximum text length to return. If exceeded, text is truncated with "..." appended

## Return Value

The tool returns a dictionary with the following structure:

```python
{
    "text": "Extracted plain text content",
    "metadata": {
        "title": "Document Title",
        "author": "Author Name",
        # ... additional metadata depending on format
    },
    "format": "pdf",  # Detected file format
    "success": True,
    "error": None
}
```

## Metadata by Format

Different document formats provide different metadata:

**HTML/XHTML:**

* `title` - Page title
* `description` - Meta description tag
* `author` - Meta author tag
* `word_count` - Number of words
* `char_count` - Number of characters

**PDF:**

* `title` - Document title
* `author` - Document author
* `subject` - Document subject
* `creator` - Application that created the PDF
* `producer` - PDF producer
* `creation_date` - When the document was created
* `page_count` - Total number of pages
* `word_count` - Total word count
* `char_count` - Total character count

**DOCX:**

* `title` - Document title
* `author` - Document author
* `subject` - Document subject
* `created` - Creation date and time
* `modified` - Last modification date and time
* `word_count` - Total word count
* `char_count` - Total character count
* `paragraph_count` - Number of paragraphs

**TXT/Markdown:**

* `encoding` - Text encoding used
* `word_count` - Total word count
* `char_count` - Total character count
* `line_count` - Total line count
* `title` - (Markdown only) Title extracted from first heading

## Troubleshooting

### ImportError for document libraries

If you encounter an ImportError when extracting specific formats, install the required dependencies:

```bash
# For all formats
pip install peargent[text-extraction]

# Or individually
pip install beautifulsoup4 pypdf python-docx
```

### SSRF Protection Errors

If you receive an "Access to localhost is not allowed" error, ensure you're using a public URL:

```python
# This will fail
result = text_extractor.run({"file_path": "http://localhost:8000/doc"})

# Use a public URL instead
result = text_extractor.run({"file_path": "https://example.com/doc"})
```

### Encoding Issues with Text Files

For text files with non-standard encoding, the tool automatically detects encoding. If issues persist, ensure the file is properly encoded.

# Error Handling in Tools

Comprehensive error handling strategies for tools including retries, timeouts, and validation

Tools can fail for many reasons: network issues, timeouts, invalid API responses, or broken external services. Peargent provides a complete system to handle these failures gracefully.

## Error Handling with `on_error`

The `on_error` parameter controls how tools handle errors (execution failures or validation failures):

```python
from peargent import create_tool

# Option 1: Raise exception (default)
tool_strict = create_tool(
    name="critical_api",
    description="Critical API call that must succeed",
    input_parameters={"query": str},
    call_function=call_api,
    on_error="raise"  # Fail fast if error occurs // [!code highlight]
)

# Option 2: Return error message as string
tool_graceful = create_tool(
    name="optional_api",
    description="Optional API call",
    input_parameters={"query": str},
    call_function=call_api,
    on_error="return_error"  # Continue with error message // [!code highlight]
)

# Option 3: Return None silently
tool_silent = create_tool(
    name="analytics_tracker",
    description="Optional analytics tracking",
    input_parameters={"event": str},
    call_function=track_event,
    on_error="return_none"  # Ignore failures silently // [!code highlight]
)
```

### When to Use Each Strategy

| `on_error` Value        | What Happens                                                                               | What You Can Do                                                                  | Use Case                                    | Example                                        |
| ----------------------- | ------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ----------------------------------------------- | -------------------------------------------------- |
| **`"raise"`** (default) | Raises an exception, stops agent execution                                                   | Wrap in `try/except` to catch and handle the exception                              | Critical tools that must succeed                  | Database writes, payment processing                 |
| **`"return_error"`**    | Returns the error message as a string (e.g., `"Tool 'api_call' failed: ConnectionError: ..."`) | Check if the result is a string containing an error, then handle gracefully         | Graceful degradation, logging errors              | Optional external APIs, analytics                   |
| **`"return_none"`**     | Returns `None` silently, no error message                                                    | Check if the result is `None`, then use a fallback value or skip                    | Non-critical features, optional enrichment        | Analytics tracking, optional data enrichment        |

## Next: Advanced Error Handling

Peargent provides more robust failure-handling features:

* **[Retries](/docs/error-handling-in-tools/retries)**
  Automatically retry failing tools with optional exponential backoff.
* **[Timeouts](/docs/error-handling-in-tools/timeout)**
  Prevent long-running or hanging operations.
* **[Validation Failures](/docs/structured-output/tools-output#validating-tool-output-with-schema)**
  Handle schema validation errors when using `output_schema`.

These pages go deeper into reliability patterns for production workloads.
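To make the strategy table concrete, here is a sketch of checking results when invoking the tools above directly with the dict-style `tool.run(...)` call shown on the Text Extraction page. The exact format of the returned error string is an assumption based on the example message in the table, and `call_api`/`track_event` are the same placeholder functions used above.

```python
# Sketch: handling results from tools configured with different on_error strategies.

try:
    critical_result = tool_strict.run({"query": "latest orders"})    # on_error="raise"
except Exception as exc:
    critical_result = None
    print(f"Critical tool failed, aborting: {exc}")

optional_result = tool_graceful.run({"query": "enrichment data"})    # on_error="return_error"
if isinstance(optional_result, str) and "failed" in optional_result:
    print(f"Optional tool degraded gracefully: {optional_result}")
    optional_result = {}                                             # fall back to an empty result

tracking_result = tool_silent.run({"event": "page_view"})            # on_error="return_none"
if tracking_result is None:
    pass  # non-critical, safe to ignore
```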

Retries

Tools can automatically retry failed operations.

Retries are one of the simplest and most effective ways to handle errors in tools. Instead of failing immediately, a tool can automatically try again when something goes wrong. This makes your workflows more resilient, reduces unnecessary crashes, and improves overall reliability with minimal setup. ## Error Handling with Retries Below is how simple it is to enable retry logic in a tool. ```python from peargent import create_tool api_tool = create_tool( name="external_api", description="Call external API", input_parameters={"query": str}, call_function=call_external_api, max_retries=3, # Retry up to 3 times on failure // [!code highlight:3] retry_delay=1.0, # Initial delay: 1 second retry_backoff=True, # Exponential backoff: 1s → 2s → 4s on_error="return_error" ) ``` ## Retry Parameters | Parameter | Type | Default | Description | | --------------- | ----- | ------- | ---------------------------------------------------- | | `max_retries` | int | `0` | Number of retry attempts (0 = no retries) | | `retry_delay` | float | `1.0` | Initial delay between retries in seconds | | `retry_backoff` | bool | `True` | Doubles delay after each retry attempt (Exponential) | ## How Retry Works
### First Attempt: Tool executes normally. ### On Failure: If execution or validation fails: * If `max_retries > 0`, the tool waits for retry\_delay seconds * If `retry_backoff=True`, the wait time doubles each retry (1s → 2s → 4s → …) ### Repeat: Retries continue until: * A retry succeeds, **or** * All retry attempts are exhausted ### Final Failure: Handled according to the `on_error` strategy (`raise`, `return_error`, `return_none`).
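The steps above can be pictured as a small retry loop. The sketch below is a simplified illustration of that behavior using the parameters described on this page, not Peargent's actual implementation; `run_with_retries` and `call_tool` are hypothetical stand-ins.

```python
import time

def run_with_retries(call_tool, max_retries=3, retry_delay=1.0, retry_backoff=True):
    """Simplified sketch of the retry loop described above."""
    delay = retry_delay
    total_attempts = max_retries + 1  # the first attempt plus the retries
    for attempt in range(1, total_attempts + 1):
        try:
            return call_tool()  # first attempt and every retry
        except Exception:
            if attempt == total_attempts:
                raise  # final failure: handed to the tool's on_error strategy
            time.sleep(delay)  # wait before the next attempt
            if retry_backoff:
                delay *= 2  # exponential backoff: 1s → 2s → 4s → ...
```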
## Retry Example with Backoff ```python unreliable_tool = create_tool( name="flaky_api", description="API that sometimes fails", input_parameters={"query": str}, call_function=call_flaky_api, max_retries=3, retry_delay=1.0, retry_backoff=True, on_error="return_error" ) # If all attempts fail, timing will be: # Attempt 1: Immediate # Attempt 2: +1 second (1.0 * 2^0) # Attempt 3: +2 seconds (1.0 * 2^1) # Attempt 4: +4 seconds (1.0 * 2^2) # Final: Return error message ``` # Timeout

Timeout

Timeouts let you set a maximum allowed execution time for a tool.

If the **Tool** takes longer than the configured time, its execution is stopped and handled using on\_error. **Timeouts** are extremely useful for: * Preventing tools from hanging forever * Stopping slow external operations * Keeping agent response times predictable * Automatically failing or retrying long-running tasks ## Enable Timeout in a Tool ```python from peargent import create_tool slow_tool = create_tool( name="slow_operation", description="Operation that may take too long", input_parameters={"data": dict}, call_function=slow_processing, timeout=5.0, # Maximum 5 seconds allowed // [!code highlight] on_error="return_error" ) ``` ## How Timeout Works
1. Tool starts executing normally 2. A timer begins (based on timeout) 3. If execution finishes in time → result is returned 4. If it exceeds the timeout → * Execution is stopped * A TimeoutError is raised internally * Result is handled via on\_error 5. If combined with retries, timeout is applied on every attempt
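To make the flow concrete, the sketch below expresses the same idea in plain Python using `concurrent.futures`. It is only an illustration of the behavior described above, not Peargent's internal mechanism, and `run_with_timeout` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timeout(call_tool, timeout=5.0):
    """Simplified sketch: stop waiting for the tool after `timeout` seconds."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(call_tool)
    try:
        return future.result(timeout=timeout)  # finished in time → return the result
    except FutureTimeout:
        # A plain thread cannot be force-killed; this sketch only stops waiting.
        # In Peargent, the resulting failure is routed through the on_error strategy.
        raise TimeoutError(f"Tool exceeded the {timeout}s limit")
    finally:
        executor.shutdown(wait=False)  # don't block on a still-running worker
```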
## Timeout Example with Retries ```python robust_tool = create_tool( name="robust_api", description="API call with timeout + retries", input_parameters={"query": str}, call_function=call_api, timeout=10.0, # Max 10 seconds per attempt // [!code highlight] max_retries=3, # Retry if timed out retry_delay=2.0, retry_backoff=True, on_error="return_error" ) # Example timing if every attempt times out: # Attempt 1: 10s timeout # Wait 2s # Attempt 2: 10s timeout # Wait 4s # Attempt 3: 10s timeout # Wait 8s # Attempt 4: 10s timeout # → Final failure handled by on_error ``` # Custom Storage Backends

Custom Storage Backends

Create custom storage backends for Peargent history.

For production-grade backends with complex requirements, you can create a custom storage backend by subclassing `HistoryStore`. This allows you to persist conversation history in any database or storage system of your choice, such as MongoDB, PostgreSQL, Redis, or even a custom API. ## Subclassing HistoryStore To create a custom store, you need to implement the abstract methods defined in the `HistoryStore` class. Here is a comprehensive example using MongoDB. ### 1. Initialization and Setup First, set up your class and initialize the database connection. You should also ensure any necessary indexes are created for performance. ```python from peargent.storage import HistoryStore, Thread, Message from typing import Dict, List, Optional, Any from datetime import datetime class MongoDBHistoryStore(HistoryStore): """Custom MongoDB storage backend.""" def __init__(self, connection_string: str, database: str = "peargent"): from pymongo import MongoClient self.client = MongoClient(connection_string) self.db = self.client[database] self.threads = self.db.threads self.messages = self.db.messages # Create indexes for performance # Indexing 'id' ensures fast thread lookups self.threads.create_index("id", unique=True) # Compound index on 'thread_id' and 'timestamp' speeds up message retrieval self.messages.create_index([("thread_id", 1), ("timestamp", 1)]) ``` ### 2. Thread Management Implement methods to create, retrieve, and list threads. **Creating Threads:** When creating a thread, you must persist its ID, creation time, and any initial metadata. ```python def create_thread(self, metadata: Optional[Dict] = None) -> str: thread = Thread(metadata=metadata) self.threads.insert_one({ "id": thread.id, "created_at": thread.created_at, "updated_at": thread.updated_at, "metadata": thread.metadata }) return thread.id ``` **Retrieving Threads:** When retrieving a thread, you need to reconstruct the `Thread` object from your database record. Crucially, you must also load the associated messages and attach them to the thread. ```python def get_thread(self, thread_id: str) -> Optional[Thread]: thread_data = self.threads.find_one({"id": thread_id}) if not thread_data: return None thread = Thread( thread_id=thread_data["id"], metadata=thread_data.get("metadata", {}), created_at=thread_data["created_at"], updated_at=thread_data["updated_at"] ) # Load messages associated with this thread, sorted by timestamp messages = self.messages.find({"thread_id": thread_id}).sort("timestamp", 1) for msg_data in messages: msg = Message( role=msg_data["role"], content=msg_data["content"], agent=msg_data.get("agent"), tool_call=msg_data.get("tool_call"), metadata=msg_data.get("metadata", {}), message_id=msg_data["id"], timestamp=msg_data["timestamp"] ) thread.messages.append(msg) return thread ``` ### 3. Message Persistence Implement the logic to save new messages. **Appending Messages:** This method is called whenever a new message is added to the history. You should save the message and update the thread's `updated_at` timestamp. 
```python def append_message( self, thread_id: str, role: str, content: Any, agent: Optional[str] = None, tool_call: Optional[Dict] = None, metadata: Optional[Dict] = None ) -> Message: message = Message( role=role, content=content, agent=agent, tool_call=tool_call, metadata=metadata ) self.messages.insert_one({ "id": message.id, "thread_id": thread_id, "timestamp": message.timestamp, "role": message.role, "content": message.content, "agent": message.agent, "tool_call": message.tool_call, "metadata": message.metadata }) # Update thread's updated_at timestamp self.threads.update_one( {"id": thread_id}, {"$set": {"updated_at": datetime.now()}} ) return message ``` ### 4. Utility Methods Implement the remaining utility methods for listing and deleting. ```python def get_messages(self, thread_id: str) -> List[Message]: """Retrieve all messages for a specific thread.""" thread = self.get_thread(thread_id) return thread.messages if thread else [] def list_threads(self) -> List[str]: """Return a list of all thread IDs.""" return [t["id"] for t in self.threads.find({}, {"id": 1})] def delete_thread(self, thread_id: str) -> bool: """Delete a thread and all its associated messages.""" result = self.threads.delete_one({"id": thread_id}) if result.deleted_count > 0: self.messages.delete_many({"thread_id": thread_id}) return True return False ``` ### Usage Once your class is defined, you can use it just like any built-in storage backend. #### With create\_agent (Automatic Integration) You can pass your custom storage backend directly to `create_agent` using `HistoryConfig`: ```python from peargent import create_agent, HistoryConfig from peargent.models import openai # Initialize your custom store store = MongoDBHistoryStore(connection_string="mongodb://localhost:27017") # Create agent with custom storage backend agent = create_agent( name="Assistant", description="A helpful assistant with MongoDB history", persona="You are a helpful AI assistant.", model=openai("gpt-4o"), history=HistoryConfig( auto_manage_context=True, max_context_messages=20, strategy="smart", store=store # Your custom storage backend ) ) # Use the agent - history is automatically managed response1 = agent.run("My name is Alice") # Agent creates thread and stores message in MongoDB response2 = agent.run("What's my name?") # Agent loads history from MongoDB and remembers: "Your name is Alice" ``` #### With create\_pool (Multi-Agent with Custom Storage) You can also use custom storage backends with agent pools for shared history across multiple agents: ```python from peargent import create_agent, create_pool, HistoryConfig from peargent.models import openai # Initialize your custom store store = MongoDBHistoryStore(connection_string="mongodb://localhost:27017") # Create multiple agents researcher = create_agent( name="Researcher", description="Researches topics thoroughly", persona="You are a detail-oriented researcher.", model=openai("gpt-4o-mini") ) writer = create_agent( name="Writer", description="Writes clear summaries", persona="You are a skilled technical writer.", model=openai("gpt-4o") ) # Create pool with custom storage - all agents share the same MongoDB history pool = create_pool( agents=[researcher, writer], default_model=openai("gpt-4o"), history=HistoryConfig( auto_manage_context=True, max_context_messages=25, strategy="smart", store=store # Shared custom storage for all agents ) ) # Use the pool result = pool.run("Research quantum computing and write a summary") # Both agents' interactions are stored in MongoDB ``` # 
# History Management

History Management

Custom storage backends, manual thread control, and low-level history operations for advanced use cases.

This guide covers advanced history capabilities for developers who need fine-grained control over conversation persistence, custom storage implementations, and low-level message operations. ## Next Steps * See **[History](/docs/history)** for basic usage and auto-management * See **[Agents](/docs/agents)** for integrating history with agents * See **[Pools](/docs/pools)** for shared history across multiple agents # Manual Thread Management

Manual Thread Management

Manually control threads for multi-user applications.

While agents automatically manage threads, you can also control threads manually for multi-user applications or complex workflows. ## Creating and Switching Threads You can create multiple threads to handle different conversations simultaneously. By using `create_thread` with metadata, you can tag threads with user IDs or session info. The `use_thread` method then lets you switch the active context, ensuring that new messages go to the correct conversation. ```python from peargent import create_history from peargent.storage import Sqlite history = create_history(store_type=Sqlite(database_path="./app.db")) # Create threads for different users alice_thread = history.create_thread(metadata={"user_id": "alice", "session": "web"}) bob_thread = history.create_thread(metadata={"user_id": "bob", "session": "mobile"}) # Switch between threads history.use_thread(alice_thread) history.add_user_message("What's the weather?") history.use_thread(bob_thread) history.add_user_message("Show my orders") # List all threads all_threads = history.list_threads() print(f"Total threads: {len(all_threads)}") # Get thread with metadata thread = history.get_thread(alice_thread) print(f"User: {thread.metadata.get('user_id')}") print(f"Messages: {len(thread.messages)}") ``` ## Multi-User Application Pattern For applications serving multiple users, you need to ensure each user gets their own conversation history. This pattern shows how to use metadata to look up an existing thread for a user. If a thread is found, the agent resumes that conversation; if not, a new thread is created. This allows a single agent instance to handle many users concurrently. ```python from peargent import create_agent, create_history, HistoryConfig from peargent.storage import Postgresql from peargent.models import openai # Shared history store for all users history = create_history( store_type=Postgresql( connection_string="postgresql://user:pass@localhost/app_db" ) ) # Create agent (reused across users) agent = create_agent( name="Assistant", description="Customer support assistant", persona="You are a helpful customer support agent.", model=openai("gpt-4o") ) def handle_user_message(user_id: str, message: str): """Handle message from a specific user.""" # Find or create thread for this user all_threads = history.list_threads() user_thread = None for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.metadata.get("user_id") == user_id: user_thread = thread_id break if not user_thread: # Create new thread for this user user_thread = history.create_thread(metadata={"user_id": user_id}) # Set active thread history.use_thread(user_thread) # Add user message history.add_user_message(message) # Get response from agent # Note: Agent needs to load this history manually or use temporary_memory response = agent.run(message) # Add assistant response history.add_assistant_message(response, agent="Assistant") return response # Usage response1 = handle_user_message("alice", "What's my order status?") response2 = handle_user_message("bob", "I need help with returns") response3 = handle_user_message("alice", "Thanks!") # Same thread as first message ``` # History API Reference

History API Reference

Complete reference for Thread, Message, and Context operations.

## Thread Operations ### create\_thread Creates a new conversation thread and sets it as the active thread. ```python thread_id = history.create_thread(metadata={ "user_id": "alice", "topic": "customer_support", "tags": ["billing", "urgent"] }) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :----------------------------------------------------- | | `metadata` | `Optional[Dict]` | `None` | Dictionary of custom metadata to attach to the thread. | **Returns** * `str`: The unique ID of the created thread. *** ### use\_thread Switches the active context to an existing thread. ```python history.use_thread("thread-123-abc") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :---- | :------ | :--------------------------------- | | `thread_id` | `str` | - | The ID of the thread to switch to. | *** ### get\_thread Retrieves a thread object. ```python thread = history.get_thread() print(f"Created: {thread.created_at}") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :------------------------------------------------------------------------------------ | | `thread_id` | `Optional[str]` | `None` | The ID of the thread to retrieve. If not provided, returns the current active thread. | **Returns** * `Optional[Thread]`: The thread object, or `None` if not found. *** ### list\_threads Lists all available thread IDs in the storage. ```python all_threads = history.list_threads() print(f"Total threads: {len(all_threads)}") ``` **Returns** * `List[str]`: A list of all thread IDs. *** ### delete\_thread Deletes a thread and all its associated messages. ```python if history.delete_thread(thread_id): print("Thread deleted") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :---- | :------ | :------------------------------ | | `thread_id` | `str` | - | The ID of the thread to delete. | **Returns** * `bool`: `True` if the thread was successfully deleted, `False` otherwise. ## Message Operations ### add\_user\_message Adds a user message to the current thread. ```python msg = history.add_user_message( "What's the weather today?", metadata={"source": "web"} ) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :------------------------------- | | `content` | `str` | - | The text content of the message. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata for the message. | **Returns** * `Message`: The created message object. *** ### add\_assistant\_message Adds an assistant response to the current thread. ```python msg = history.add_assistant_message( "The weather is sunny.", agent="WeatherBot", metadata={"model": "gpt-4o"} ) ``` **Parameters** | Name | Type | Default | Description | | :--------- | :--------------- | :------ | :-------------------------------------------------- | | `content` | `Any` | - | The content of the response (string or structured). | | `agent` | `Optional[str]` | `None` | Name of the agent that generated the response. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata (e.g., tokens used, model name). | **Returns** * `Message`: The created message object. *** ### add\_tool\_message Adds a tool execution result to the current thread. 
```python msg = history.add_tool_message( tool_call={ "name": "get_weather", "output": {"temp": 72} }, agent="WeatherBot" ) ``` **Parameters** | Name | Type | Default | Description | | :---------- | :--------------- | :------ | :----------------------------------------------------------------- | | `tool_call` | `Dict` | - | Dictionary containing tool execution details (name, args, output). | | `agent` | `Optional[str]` | `None` | Name of the agent that called the tool. | | `metadata` | `Optional[Dict]` | `None` | Custom metadata (e.g., execution time). | **Returns** * `Message`: The created message object. *** ### get\_messages Retrieves messages from a thread with optional filtering. ```python # Get only user messages user_messages = history.get_messages(role="user") ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :------------------------------------------------------ | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | | `role` | `Optional[str]` | `None` | Filter by role ("user", "assistant", "tool", "system"). | | `agent` | `Optional[str]` | `None` | Filter by agent name. | **Returns** * `List[Message]`: List of matching message objects. *** ### get\_message\_count Gets the total number of messages in a thread. ```python count = history.get_message_count() ``` **Parameters** | Name | Type | Default | Description | | :---------- | :-------------- | :------ | :-------------------------------------------- | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `int`: The number of messages. *** ### delete\_message Deletes a specific message by its ID. ```python history.delete_message(message_id) ``` **Parameters** | Name | Type | Default | Description | | :----------- | :-------------- | :------ | :-------------------------------------------- | | `message_id` | `str` | - | The ID of the message to delete. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `bool`: `True` if deleted successfully. *** ### delete\_messages Deletes multiple messages at once. ```python history.delete_messages([msg_id_1, msg_id_2]) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------ | :-------------------------------------------- | | `message_ids` | `List[str]` | - | List of message IDs to delete. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. Defaults to current thread. | **Returns** * `int`: The number of messages actually deleted. ## Context Management Operations ### trim\_messages Trims messages to manage context window size. ```python # Keep only the last 10 messages history.trim_messages(strategy="last", count=10) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------- | :------------------------------------------------------------------------- | | `strategy` | `str` | `"last"` | Strategy: `"last"` (keep recent), `"first"` (keep oldest), `"first_last"`. | | `count` | `int` | `10` | Number of messages to keep. | | `keep_system` | `bool` | `True` | If `True`, system messages are never deleted. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | **Returns** * `int`: Number of messages removed. *** ### summarize\_messages Summarizes a range of messages using an LLM and replaces them with a summary. 
```python history.summarize_messages( model=groq("llama-3.1-8b-instant"), keep_recent=5 ) ``` **Parameters** | Name | Type | Default | Description | | :------------ | :-------------- | :------ | :------------------------------------------------- | | `model` | `Any` | - | LLM model instance for generating summary. | | `start_index` | `int` | `0` | Start index for summarization. | | `end_index` | `Optional[int]` | `None` | End index (defaults to `len - keep_recent`). | | `keep_recent` | `int` | `5` | Number of recent messages to exclude from summary. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | **Returns** * `Message`: The newly created summary message. *** ### manage\_context\_window Automatically manages context window when messages exceed a threshold. ```python history.manage_context_window( model=groq("llama-3.1-8b-instant"), max_messages=20, strategy="smart" ) ``` **Parameters** | Name | Type | Default | Description | | :------------- | :-------------- | :-------- | :----------------------------------------------------------------- | | `model` | `Any` | - | LLM model (required for "summarize" and "smart"). | | `max_messages` | `int` | `20` | Threshold to trigger management. | | `strategy` | `str` | `"smart"` | Strategy: `"smart"`, `"trim_last"`, `"trim_first"`, `"summarize"`. | | `thread_id` | `Optional[str]` | `None` | Target thread ID. | ## Data Models ### Message Object Represents a single message in the conversation history. | Property | Type | Description | | :---------- | :--------------- | :---------------------------------------------------------- | | `id` | `str` | Unique UUID for the message. | | `timestamp` | `datetime` | Time when the message was created. | | `role` | `str` | Role of the sender (`user`, `assistant`, `tool`, `system`). | | `content` | `Any` | The content of the message. | | `agent` | `Optional[str]` | Name of the agent (for assistant messages). | | `tool_call` | `Optional[Dict]` | Tool execution details (for tool messages). | | `metadata` | `Dict` | Custom metadata dictionary. | ### Thread Object Represents a conversation thread containing multiple messages. | Property | Type | Description | | :----------- | :-------------- | :-------------------------------------- | | `id` | `str` | Unique UUID for the thread. | | `created_at` | `datetime` | Time when the thread was created. | | `updated_at` | `datetime` | Time when the thread was last modified. | | `metadata` | `Dict` | Custom metadata dictionary. | | `messages` | `List[Message]` | List of messages in the thread. | # Optimizing for Cost

Optimizing for Cost

Strategies to control token usage and reduce API costs

Running LLM agents in production can be expensive if not managed carefully. This guide outlines practical strategies to keep your costs under control without sacrificing the quality of your agent's responses. ## 1. Choose the Right Model Not every task requires the most powerful model. Match your model choice to the task complexity. ### Model Selection Strategy ```python from peargent import create_agent from peargent.models import groq # Use smaller models for simple tasks classifier_agent = create_agent( name="Classifier", description="Classifies user intent", persona="You classify user messages into categories: support, sales, or general.", model=groq("llama-3.1-8b") # [!code highlight] - Smaller, cheaper model ) # Use larger models only for complex tasks reasoning_agent = create_agent( name="Reasoner", description="Solves complex problems", persona="You solve complex reasoning and coding problems step by step.", model=groq("llama-3.3-70b-versatile") # [!code highlight] - Larger model when needed ) ``` **Guidelines:** * **Simple tasks** (classification, extraction, summarization): Use smaller models (8B parameters) * **Complex tasks** (reasoning, coding, analysis): Use larger models (70B+ parameters) * **Test different models** on your specific use case to find the best cost/quality balance ### Track Model Costs with Custom Pricing If you're using custom or local models, add their pricing to track costs accurately: ```python from peargent.observability import enable_tracing, get_tracer tracer = enable_tracing() # Add custom pricing for your model (prices per million tokens) tracer.add_custom_pricing( # [!code highlight] model="my-fine-tuned-model", prompt_price=1.50, # $1.50 per million prompt tokens completion_price=3.00 # $3.00 per million completion tokens ) # Now cost tracking works for your custom model agent = create_agent( name="CustomAgent", model=my_custom_model, persona="You are helpful.", tracing=True ) ``` ## 2. Control Context with History Management The context window is your biggest cost driver. Every message in the conversation history is re-sent with each request. 
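A rough back-of-the-envelope calculation shows how quickly this adds up. The numbers below are purely illustrative (assuming ~100 tokens per message and two messages per turn); they are not measured Peargent figures.

```python
TOKENS_PER_MESSAGE = 100  # rough average, purely illustrative

def prompt_tokens(turn: int, max_context_messages: int | None = None) -> int:
    """Approximate prompt tokens re-sent at a given turn."""
    history = 2 * (turn - 1)  # one user + one assistant message per previous turn
    if max_context_messages is not None:
        history = min(history, max_context_messages)
    return (history + 1) * TOKENS_PER_MESSAGE  # history plus the new user message

for turn in (1, 10, 25, 50):
    full = prompt_tokens(turn)
    trimmed = prompt_tokens(turn, max_context_messages=10)
    print(f"Turn {turn:>2}: full history ≈ {full} tokens, trimmed to 10 messages ≈ {trimmed} tokens")
```

By turn 50, the unbounded history re-sends roughly nine times more prompt tokens per request than the trimmed one, which is why capping context is usually the single largest saving.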
### Configure Automatic Context Management Use `HistoryConfig` to automatically manage conversation history: ```python from peargent import create_agent, HistoryConfig from peargent.storage import InMemory from peargent.models import groq agent = create_agent( name="CostOptimizedAgent", description="Agent with automatic history management", persona="You are a helpful assistant.", model=groq("llama-3.3-70b-versatile"), history=HistoryConfig( # [!code highlight] auto_manage_context=True, # Enable automatic management max_context_messages=10, # Keep only last 10 messages strategy="trim_last", # Remove oldest messages when limit reached store=InMemory() ) ) ``` ### Available Context Strategies Peargent supports 5 context management strategies: | Strategy | How It Works | Use When | Cost Impact | | -------------- | ----------------------------------- | --------------------------- | ----------------------------- | | `"trim_last"` | Removes oldest messages | Simple conversations | ✅ Low - fast, no LLM calls | | `"trim_first"` | Keeps oldest messages | Important initial context | ✅ Low - fast, no LLM calls | | `"first_last"` | Keeps first and last messages | Preserving original context | ✅ Low - fast, no LLM calls | | `"summarize"` | Summarizes old messages | Complex conversations | ⚠️ Medium - requires LLM call | | `"smart"` | Chooses best strategy automatically | General purpose | ⚠️ Variable - may use LLM | **Example: Trim Strategy (Recommended for Cost)** ```python # Most cost-effective - no LLM calls for management history=HistoryConfig( auto_manage_context=True, max_context_messages=10, # [!code highlight] - Only keep 10 messages strategy="trim_last", # [!code highlight] - Drop oldest messages store=InMemory() ) ``` **Example: Summarize Strategy (Better Context Retention)** ```python # Uses LLM to summarize old messages - costs more but retains context history=HistoryConfig( auto_manage_context=True, max_context_messages=20, strategy="summarize", # [!code highlight] - Summarize old messages summarize_model=groq("llama-3.1-8b"), # [!code highlight] - Use cheap model for summaries store=InMemory() ) ``` **Example: Smart Strategy (Balanced)** ```python # Automatically chooses between trim and summarize history=HistoryConfig( auto_manage_context=True, max_context_messages=15, strategy="smart", # [!code highlight] - Automatically adapts store=InMemory() ) ``` ## 3. Limit Output Length with max\_tokens Control how much the agent can generate by setting `max_tokens` in model parameters: ```python from peargent import create_agent from peargent.models import groq # Limit output to reduce costs agent = create_agent( name="BriefAgent", description="Gives brief responses", persona="You provide concise, brief answers. Maximum 2-3 sentences.", model=groq( "llama-3.3-70b-versatile", parameters={ "max_tokens": 150, # [!code highlight] - Limit to ~150 tokens output "temperature": 0.7 } ) ) response = agent.run("Explain quantum computing") # Agent cannot generate more than 150 tokens ``` **Guidelines:** * Short answers: `max_tokens=150` (\~100 words) * Medium answers: `max_tokens=500` (\~350 words) * Long answers: `max_tokens=2000` (\~1500 words) * Code generation: `max_tokens=4096` or higher ### Move Examples to Tool Descriptions Instead of putting examples in the persona, put them in tool descriptions: ```python from peargent import create_agent, create_tool def search_database(query: str) -> str: # Implementation... 
return "Results found" agent = create_agent( name="ProductAgent", persona="You help with product inquiries.", # Short persona model=groq("llama-3.3-70b-versatile"), tools=[create_tool( name="search_database", description="""Searches the product database for matching items. Use this tool when users ask about products, inventory, or availability. Examples: "Do we have red shirts?" → use this tool with query="red shirts" "Check stock for item #123" → use this tool with query="item 123" """, # [!code highlight] - Examples in tool description, not persona input_parameters={"query": str}, call_function=search_database )] ) ``` ## 4. Control Temperature for Deterministic Outputs Lower temperature reduces token usage for tasks that need deterministic outputs: ```python from peargent import create_agent from peargent.models import groq # For deterministic tasks (extraction, classification) extraction_agent = create_agent( name="Extractor", description="Extracts structured data", persona="Extract the requested information exactly as it appears.", model=groq( "llama-3.3-70b-versatile", parameters={ "temperature": 0.0, # [!code highlight] - Deterministic, shorter outputs "max_tokens": 500 } ) ) # For creative tasks (writing, brainstorming) creative_agent = create_agent( name="Writer", description="Writes creative content", persona="You write engaging, creative content.", model=groq( "llama-3.3-70b-versatile", parameters={ "temperature": 0.9, # [!code highlight] - More creative, longer outputs "max_tokens": 2000 } ) ) ``` ## 5. Monitor Costs with Tracing You can't optimize what you can't measure. Use Peargent's observability features to track costs. ### Enable Cost Tracking ```python from peargent import create_agent from peargent.observability import enable_tracing from peargent.storage import Sqlite from peargent.models import groq # Enable tracing with database storage tracer = enable_tracing( store_type=Sqlite(connection_string="sqlite:///./traces.db") ) agent = create_agent( name="TrackedAgent", description="Agent with cost tracking", persona="You are helpful.", model=groq("llama-3.3-70b-versatile"), tracing=True # [!code highlight] - Enable tracing for this agent ) # Use the agent response = agent.run("Hello") # Check costs traces = tracer.list_traces() latest = traces[-1] print(f"Cost: ${latest.total_cost:.6f}") print(f"Tokens: {latest.total_tokens}") print(f"Duration: {latest.duration_ms}ms") ``` ### Analyze Cost Patterns ```python from peargent.observability import get_tracer tracer = get_tracer() # Get aggregate statistics stats = tracer.get_aggregate_stats() # [!code highlight] print(f"Total Traces: {stats['total_traces']}") print(f"Total Cost: ${stats['total_cost']:.6f}") print(f"Average Cost per Trace: ${stats['avg_cost_per_trace']:.6f}") print(f"Total Tokens: {stats['total_tokens']:,}") # Find expensive operations traces = tracer.list_traces() expensive_traces = sorted(traces, key=lambda t: t.total_cost, reverse=True)[:5] print("\nMost Expensive Operations:") for trace in expensive_traces: print(f" {trace.agent_name}: ${trace.total_cost:.6f} ({trace.total_tokens} tokens)") ``` ### Set Cost Alerts ```python from peargent.observability import get_tracer tracer = get_tracer() MAX_COST_PER_REQUEST = 0.01 # $0.01 limit for update in agent.stream_observe(user_input): if update.is_agent_end: if update.cost > MAX_COST_PER_REQUEST: # [!code highlight] print(f"⚠️ WARNING: Cost ${update.cost:.6f} exceeds limit!") # Log alert, notify admins, etc. 
``` ### Track Costs by User ```python from peargent.observability import enable_tracing, set_user_id, get_tracer from peargent.storage import Postgresql tracer = enable_tracing( store_type=Postgresql(connection_string="postgresql://user:pass@localhost/db") ) # Set user ID before agent runs set_user_id("user_123") # [!code highlight] agent.run("Hello") # Get costs for specific user user_stats = tracer.get_aggregate_stats(user_id="user_123") # [!code highlight] print(f"User 123 total cost: ${user_stats['total_cost']:.6f}") ``` ## 6. Use Streaming to Show Progress While streaming doesn't reduce costs, it improves perceived performance, making slower/cheaper models feel faster: ```python from peargent import create_agent from peargent.models import groq # Use cheaper model with streaming agent = create_agent( name="StreamingAgent", description="Shows progress immediately", persona="You are helpful.", model=groq("llama-3.1-8b") # Cheaper model ) # Stream response - user sees first token in ~200ms print("Agent: ", end="", flush=True) for chunk in agent.stream("Explain AI"): # [!code highlight] print(chunk, end="", flush=True) ``` **Benefit:** Cheaper models feel faster with streaming, reducing pressure to use expensive models. ## 7. Count Tokens Before Sending Estimate costs before making expensive calls: ```python from peargent.observability import get_cost_tracker tracker = get_cost_tracker() # Count tokens in your prompt prompt = "Explain quantum computing in detail..." token_count = tracker.count_tokens(prompt, model="llama-3.3-70b-versatile") # [!code highlight] print(f"Prompt will use ~{token_count} tokens") # Estimate cost estimated_cost = tracker.calculate_cost( # [!code highlight] prompt_tokens=token_count, completion_tokens=500, # Estimate 500 token response model="llama-3.3-70b-versatile" ) print(f"Estimated cost: ${estimated_cost:.6f}") # Decide whether to proceed if estimated_cost > 0.01: print("Too expensive! Shortening prompt...") # Truncate or summarize prompt ``` ## Cost Optimization Checklist Use this checklist for production deployments: ### Model Selection * [ ] Using smallest viable model for each agent type * [ ] Tested cost vs quality tradeoff for your use case * [ ] Custom pricing configured for local/fine-tuned models ### Context Management * [ ] `HistoryConfig` configured with appropriate strategy * [ ] `max_context_messages` set to reasonable limit (10-20) * [ ] Using "trim\_last" for cost-sensitive applications * [ ] Cheaper model used for summarization if using "summarize" strategy ### Output Control * [ ] `max_tokens` set based on expected response length * [ ] Persona/system prompt optimized for brevity * [ ] Examples moved from persona to tool descriptions * [ ] Temperature set to 0.0 for deterministic tasks ### Monitoring * [ ] Tracing enabled in production * [ ] Cost tracking configured with accurate pricing * [ ] Regular analysis of aggregate statistics * [ ] Alerts set for expensive operations * [ ] Per-user cost tracking implemented ### Implementation * [ ] Token counting used for cost estimation * [ ] Streaming enabled for better UX with cheaper models * [ ] Cost limits enforced in application logic * [ ] Regular review of most expensive operations ## Summary **Biggest Cost Savings:** 1. **History Management** - Use `trim_last` with `max_context_messages=10` (saves 50-80% on tokens) 2. **Model Selection** - Use smaller models for simple tasks (saves 50-90% on costs) 3. **Persona Optimization** - Short personas (saves 5-10% per request) 4. 
**max\_tokens** - Limit output length (saves 20-40% on completion tokens) **Essential Monitoring:** * Enable tracing in production * Track costs per user/session * Analyze aggregate statistics weekly * Set alerts for expensive operations Start with history management and model selection for the biggest impact! # Creating Custom Tooling

Creating Custom Tooling

Best practices for building robust, reusable tools for your agents

Tools are the hands of your agent—they allow it to interact with the outside world, from searching the web to querying databases. While Peargent comes with built-in tools, the real power lies in creating custom tools tailored to your specific needs. ## The Anatomy of a Tool Every tool in Peargent has four essential components: ```python from peargent import create_tool def search_database(query: str) -> str: # Your implementation here return "Results found" tool = create_tool( name="search_database", # [!code highlight] - Tool identifier description="""Searches the product database for matching items. Use this tool when users ask about products, inventory, or availability.""", # [!code highlight] - What LLM sees input_parameters={"query": str}, # [!code highlight] - Expected arguments call_function=search_database # [!code highlight] - Function to execute ) ``` **The Four Components:** 1. **name**: Unique identifier the LLM uses to call the tool 2. **description**: Tells the LLM *what* the tool does and *when* to use it 3. **input\_parameters**: Dict mapping parameter names to their types 4. **call\_function**: The Python function that implements the tool logic ## The Golden Rules of Tool Building ### 1. Descriptive Descriptions are Mandatory The LLM uses the tool's `description` parameter to understand *what* the tool does and *when* to use it. Be verbose and precise. #### Bad: Vague Description ```python tool = create_tool( name="fetch_user", description="Gets user info.", # ❌ Too vague! input_parameters={"user_id": str}, call_function=fetch_user ) ``` #### Good: Clear and Specific ```python tool = create_tool( name="fetch_user_data", description="""Retrieves detailed profile information for a specific user from the database. Use this tool when you need to look up: - User's email address - Phone number - Account status (active/suspended) - Registration date Do NOT use this for: - Searching users by name (use search_users instead) - Updating user data (use update_user instead) Returns: Dict with keys: user_id, email, phone, status, registered_at""", # ✅ Clear when to use it! input_parameters={"user_id": str}, call_function=fetch_user_data ) ``` **Why This Matters:** The LLM decides which tool to call based solely on the description. A clear description = correct tool selection. ### 2. Type Hinting is Critical Peargent uses type hints to generate the JSON schema for the LLM. Always type your arguments and return values. #### Bad: No Type Hints ```python def calculate_tax(amount, region): # ❌ LLM can't infer types! return amount * 0.1 ``` #### Good: Full Type Hints ```python def calculate_tax(amount: float, region: str) -> float: # ✅ Clear types! """Calculates tax based on amount and region.""" tax_rates = {"CA": 0.0725, "NY": 0.08875, "TX": 0.0625} return amount * tax_rates.get(region, 0.05) ``` **Supported Types:** * Primitives: `str`, `int`, `float`, `bool` * Collections: `list`, `dict`, `list[str]`, `dict[str, int]` * Pydantic Models: For complex structured inputs (see below) ### 3. Handle Errors Gracefully Tools should not crash the agent. Use the `on_error` parameter to control failure behavior. ```python from peargent import create_tool def search_database(query: str) -> str: """Searches the database for results.""" try: results = db.execute(query) return str(results) except Exception as e: return f"Error executing query: {str(e)}. Please check your syntax." 
# Three error handling strategies: # Strategy 1: RAISE (Default) - Strict mode critical_tool = create_tool( name="critical_operation", description="Must succeed - handles payment", input_parameters={"amount": float}, call_function=process_payment, on_error="raise" # [!code highlight] - Crash if fails (default) ) # Strategy 2: RETURN_ERROR - Graceful mode optional_tool = create_tool( name="get_recommendations", description="Optional product recommendations", input_parameters={"user_id": str}, call_function=get_recommendations, on_error="return_error" # [!code highlight] - Return error message as string ) # Strategy 3: RETURN_NONE - Silent mode analytics_tool = create_tool( name="log_event", description="Logs analytics events", input_parameters={"event": str}, call_function=log_analytics, on_error="return_none" # [!code highlight] - Return None silently ) ``` **When to Use Each Strategy:** | Strategy | Use Case | Example | | ---------------- | ------------------------------------------- | ------------------------------------------------ | | `"raise"` | Critical operations that must succeed | Authentication, payments, database writes | | `"return_error"` | Optional features that shouldn't break flow | Recommendations, third-party APIs, cache lookups | | `"return_none"` | Nice-to-have features | Analytics, logging, notifications | ### 4. Keep It Simple (Idempotency) Ideally, tools should be **idempotent**—calling them multiple times with the same arguments should produce the same result. Avoid tools that rely heavily on hidden state. #### Bad: Stateful Tool ```python counter = 0 # ❌ Hidden state! def increment() -> int: """Increments counter.""" global counter counter += 1 return counter ``` #### Good: Stateless Tool ```python def get_user_count() -> int: """Gets current user count from database.""" return db.query("SELECT COUNT(*) FROM users").scalar() # ✅ Same input = same output ``` ## Advanced Features ### Complex Input with Pydantic For tools with many parameters or nested data, use Pydantic models for input. ```python from pydantic import BaseModel, Field from peargent import create_tool class TicketInput(BaseModel): title: str = Field(..., description="Brief summary of the issue") priority: str = Field(..., description="Ticket priority") description: str = Field(..., description="Detailed description") category: str = Field(default="general", description="Ticket category") # Validation def __init__(self, **data): # Validate priority if data.get("priority") not in ["LOW", "MEDIUM", "HIGH"]: raise ValueError("Priority must be LOW, MEDIUM, or HIGH") super().__init__(**data) def create_support_ticket(data: TicketInput) -> str: """ Creates a new support ticket in the system. Required fields: - title: Brief summary (e.g., "Cannot login") - priority: LOW, MEDIUM, or HIGH - description: Detailed explanation Optional fields: - category: Ticket category (default: "general") """ # Access validated fields ticket_id = db.create_ticket( title=data.title, priority=data.priority, description=data.description, category=data.category ) return f"Ticket #{ticket_id} created with priority {data.priority}" ticket_tool = create_tool( name="create_ticket", description="""Creates a new support ticket in the system. 
    Required fields:
    - title: Brief summary (e.g., "Cannot login")
    - priority: LOW, MEDIUM, or HIGH
    - description: Detailed explanation

    Optional fields:
    - category: Ticket category (default: "general")""",
    input_parameters={"data": TicketInput},  # [!code highlight] - Use Pydantic model
    call_function=create_support_ticket
)
```

**Benefits:**

* Clear parameter structure
* Built-in validation
* Default values
* Nested objects supported

## Real-World Examples

### Example 1: Database Query Tool

```python
from peargent import create_tool
import sqlite3

def query_products(
    category: str,
    min_price: float = 0.0,
    max_price: float = 999999.99
) -> str:
    """
    Searches the product database by category and price range.

    Use this tool when users ask about:
    - Products in a specific category
    - Products within a price range
    - Available inventory

    Examples:
    - "Show me electronics under $500" → category="electronics", max_price=500
    - "What furniture do you have?" → category="furniture"

    Returns: Formatted list of matching products with prices
    """
    try:
        conn = sqlite3.connect("products.db")
        cursor = conn.cursor()

        query = """
            SELECT name, price, stock
            FROM products
            WHERE category = ? AND price BETWEEN ? AND ?
            ORDER BY price
        """
        cursor.execute(query, (category, min_price, max_price))
        results = cursor.fetchall()
        conn.close()

        if not results:
            return f"No products found in '{category}' between ${min_price} and ${max_price}"

        # Format results
        output = f"Found {len(results)} products in '{category}':\n\n"
        for name, price, stock in results:
            output += f"- {name}: ${price:.2f} ({stock} in stock)\n"

        return output

    except Exception as e:
        return f"Database error: {str(e)}"

product_tool = create_tool(
    name="query_products",
    description="""Searches the product database by category and price range.

    Use this tool when users ask about:
    - Products in a specific category
    - Products within a price range
    - Available inventory

    Examples:
    - "Show me electronics under $500" → category="electronics", max_price=500
    - "What furniture do you have?" → category="furniture"

    Returns: Formatted list of matching products with prices""",
    input_parameters={
        "category": str,
        "min_price": float,
        "max_price": float
    },
    call_function=query_products,
    on_error="return_error"  # Don't crash on DB errors
)
```

### Example 2: External API

```python
from peargent import create_tool
import os
import requests

def fetch_weather(city: str) -> str:
    """
    Fetches current weather data from OpenWeatherMap API.

    Use this tool when users ask about:
    - Current weather conditions
    - Temperature
    - Weather forecasts

    Returns: Human-readable weather description
    """
    api_key = os.getenv("OPENWEATHER_API_KEY")
    url = f"https://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}"

    response = requests.get(url, timeout=5)

    if response.status_code == 404:
        return f"City '{city}' not found. Please check spelling."

    if response.status_code != 200:
        raise Exception(f"API returned status {response.status_code}")

    data = response.json()
    temp_f = (data["main"]["temp"] - 273.15) * 9/5 + 32
    condition = data["weather"][0]["description"]

    return f"Weather in {city}: {condition.capitalize()}, {temp_f:.1f}°F"

weather_tool = create_tool(
    name="get_weather",
    description="""Fetches current weather data from OpenWeatherMap API.
Use this tool when users ask about: - Current weather conditions - Temperature - Weather forecasts Returns: Human-readable weather description""", input_parameters={"city": str}, call_function=fetch_weather, on_error="return_error" # Return error message, don't crash ) ``` ### Example 3: File Processing ```python from peargent import create_tool import os def analyze_file(filepath: str) -> dict: """ Analyzes a text file and returns metadata and preview. Use this tool when users want to: - Check file size - See file contents preview - Count lines in a file Returns: Dict with file metadata """ if not os.path.exists(filepath): raise FileNotFoundError(f"File '{filepath}' does not exist") size = os.path.getsize(filepath) with open(filepath, 'r', encoding='utf-8') as f: lines = f.readlines() preview = ''.join(lines[:5]) return { "filename": os.path.basename(filepath), "size_bytes": size, "line_count": len(lines), "preview": preview } file_tool = create_tool( name="analyze_file", description="""Analyzes a text file and returns metadata and preview. Use this tool when users want to: - Check file size - See file contents preview - Count lines in a file Returns: Dict with file metadata""", input_parameters={"filepath": str}, call_function=analyze_file, on_error="return_error" ) # Usage result = file_tool.run({"filepath": "/path/to/file.txt"}) print(f"File: {result['filename']}") print(f"Size: {result['size_bytes']} bytes") print(f"Lines: {result['line_count']}") ``` ## Tool Building Checklist Use this checklist when creating production tools: ### Design Phase * [ ] Tool has a single, clear responsibility * [ ] Input parameters are minimal and well-typed * [ ] Docstring explains **when** to use the tool, not just what it does * [ ] Examples included in docstring for clarity ### Implementation Phase * [ ] All parameters have type hints * [ ] Function handles errors gracefully (try-except) * [ ] Tool is idempotent (same input = same output) * [ ] No hidden global state ### Configuration Phase * [ ] `on_error` strategy chosen based on criticality ### Testing Phase * [ ] Tool works with valid inputs * [ ] Tool handles invalid inputs gracefully ## Common Patterns ### Pattern 1: Multi-Step Tool ```python from pydantic import BaseModel, Field from peargent import create_tool class OrderResult(BaseModel): order_id: str status: str total: float items: list[str] def process_order(customer_id: str, items: list[str]) -> OrderResult: """ Processes a customer order through multiple steps. Steps: 1. Validate customer exists 2. Check inventory for all items 3. Calculate total with taxes 4. Create order record 5. Update inventory Returns: OrderResult with order details """ # Step 1: Validate customer customer = db.get_customer(customer_id) if not customer: raise ValueError(f"Customer {customer_id} not found") # Step 2: Check inventory for item in items: if not inventory.check_availability(item): raise ValueError(f"Item {item} out of stock") # Step 3: Calculate total total = sum(catalog.get_price(item) for item in items) tax = total * 0.0725 final_total = total + tax # Step 4: Create order order_id = db.create_order(customer_id, items, final_total) # Step 5: Update inventory for item in items: inventory.decrement(item) return OrderResult( order_id=order_id, status="completed", total=final_total, items=items ) order_tool = create_tool( name="process_order", description="""Processes a customer order through multiple steps. Steps: 1. Validate customer exists 2. Check inventory for all items 3. 
Calculate total with taxes 4. Create order record 5. Update inventory Returns: OrderResult with order details""", input_parameters={ "customer_id": str, "items": list }, call_function=process_order, on_error="raise" # Critical operation - must succeed ) ``` ### Pattern 2: Tool Composition Break complex operations into smaller tools: ```python # Small, focused tools search_tool = create_tool( name="search_products", description="Searches products by keyword", input_parameters={"query": str}, call_function=search_products ) filter_tool = create_tool( name="filter_by_price", description="Filters products by price range", input_parameters={"products": list, "min_price": float, "max_price": float}, call_function=filter_by_price ) sort_tool = create_tool( name="sort_products", description="Sorts products by field", input_parameters={"products": list, "sort_by": str}, call_function=sort_products ) # Agent uses tools in sequence agent = create_agent( name="ProductAgent", persona="You help users find products. Use multiple tools in sequence.", tools=[search_tool, filter_tool, sort_tool] ) # User: "Show me laptops under $1000, sorted by price" # Agent will: # 1. Call search_tool(query="laptops") # 2. Call filter_tool(products=results, max_price=1000) # 3. Call sort_tool(products=filtered, sort_by="price") ``` ## When to Create Custom Tools ### ✅ Good Use Cases: * **Domain-Specific Logic**: Calculating pricing based on your company's unique rules * **Internal APIs**: Fetching data from your private microservices * **Database Operations**: Querying your application's database * **Complex Workflows**: Triggering multi-step processes (e.g., "onboard\_new\_employee") * **External Integrations**: Calling third-party APIs (Stripe, Twilio, etc.) * **File Operations**: Reading, writing, or processing files * **Business Logic**: Tax calculations, shipping estimates, etc. ### ❌ Poor Use Cases: * **Generic Operations**: Use existing tools (web search, calculations) * **One-Off Tasks**: Write regular Python code instead * **State Management**: Don't use tools to track conversation state (use history instead) * **Pure Computation**: Simple math doesn't need a tool (LLM can do it) ## Summary **Building Great Tools:** 1. **Clear Docstrings** - Explain *when* to use the tool, not just *what* it does 2. **Type Everything** - Full type hints for parameters and returns 3. **Handle Errors** - Choose appropriate `on_error` strategy 4. **Keep It Simple** - One responsibility per tool, idempotent when possible **Essential Parameters:** * `name`, `description`, `input_parameters`, `call_function` - Always required * `on_error` - `"raise"`, `"return_error"`, or `"return_none"` Start with simple tools and add advanced features as needed! # Writing Effective Personas

Writing Effective Personas

Learn how to craft powerful system prompts that define your agent's behavior

The `persona` parameter in Peargent is your agent's system prompt—the instructions that define how your agent behaves, speaks, and solves problems. A well-crafted persona can mean the difference between a generic bot and a highly effective specialist. **Your persona is sent with EVERY request**, so make it count! ## The Anatomy of a Strong Persona A good persona should cover three key areas: **Identity**, **Capabilities**, and **Constraints**. ### 1. Identity (Who are you?) Define the agent's role, expertise, and communication style. #### Weak Identity ```python agent = create_agent( name="Assistant", persona="You are a helpful assistant.", # ❌ Too generic! model=groq("llama-3.3-70b-versatile") ) ``` #### Strong Identity ```python agent = create_agent( name="DevOpsExpert", persona="""You are a Senior DevOps Engineer with 10 years of experience in Kubernetes, AWS, and CI/CD pipelines. Communication Style: - Speak concisely and technically - Use industry-standard terminology - Always prioritize security and reliability - Provide specific commands and configurations when relevant""", # ✅ Clear role & expertise! model=groq("llama-3.3-70b-versatile") ) ``` **Key Elements:** * **Expertise**: Define the agent's domain knowledge * **Experience Level**: Senior, junior, specialist, generalist * **Communication Style**: Concise, verbose, technical, friendly * **Priorities**: What matters most (security, speed, cost, UX) ### 2. Capabilities (What can you do?) Explicitly state what the agent is good at and how it should approach tasks. ```python persona = """You are a Python Code Reviewer specializing in performance optimization. Your Capabilities: - Analyze code for time and space complexity - Identify bottlenecks and inefficiencies - Suggest algorithmic improvements - Recommend appropriate data structures - Profile memory usage patterns Your Approach: 1. Read the code thoroughly 2. Identify the most critical performance issues first 3. Provide specific, actionable recommendations 4. Include code examples for complex changes 5. Explain the performance impact of each suggestion""" ``` **Benefits:** * Agent knows exactly what it's supposed to do * Consistent behavior across requests * Clear scope of responsibilities ### 3. Constraints (What should you NOT do?) Set boundaries to prevent hallucinations, unwanted behaviors, or scope creep. ```python persona = """You are a Database Administrator assistant. Constraints: - Do NOT execute any DELETE or DROP commands without explicit user confirmation - Do NOT modify production databases directly—always suggest backup strategies first - Do NOT recommend solutions using databases you're unfamiliar with - If unsure about a command's impact, ask for clarification instead of guessing - Never assume data can be recovered—always confirm backup procedures exist""" ``` **Common Constraints:** * Safety: "Never execute destructive commands without confirmation" * Scope: "Do not provide legal or medical advice" * Accuracy: "If unsure, say 'I don't know' instead of guessing" * Tool Usage: "Always use tools instead of making up data" ## Real-World Persona Examples ### Example 1: Code Reviewer Agent ```python from peargent import create_agent from peargent.models import groq code_reviewer = create_agent( name="CodeReviewBot", description="Expert Python code reviewer", persona="""You are an expert Python Code Reviewer with expertise in: - Software architecture and design patterns - Performance optimization - Security vulnerabilities (SQL injection, XSS, etc.) 
- PEP 8 style compliance - Testing best practices Your Review Process: 1. **Security First**: Flag any potential security risks immediately 2. **Correctness**: Identify logic errors and edge cases 3. **Performance**: Suggest optimizations for slow code 4. **Readability**: Recommend style improvements 5. **Testing**: Highlight untested code paths Your Communication: - Be constructive, not critical - Explain *why* a change is needed, not just *what* to change - Provide code snippets showing the fix - Use examples to illustrate concepts - Prioritize issues: Critical > Major > Minor Constraints: - Do not rewrite entire files unless absolutely necessary - Focus on the specific function or block in question - Do not suggest libraries that don't exist - If code is correct, say so—don't invent issues""", model=groq("llama-3.3-70b-versatile") ) # Usage review = code_reviewer.run(""" def calculate_total(prices): total = 0 for price in prices: total = total + price return total """) ``` ### Example 2: Customer Support Agent ```python support_agent = create_agent( name="SupportBot", description="Friendly customer support specialist", persona="""You are a friendly and empathetic Customer Support Specialist. Your Mission: Help customers resolve issues quickly while maintaining a positive experience. Your Personality: - Warm and approachable - Patient with frustrated customers - Proactive in offering solutions - Clear and non-technical in explanations Response Structure: 1. Acknowledge the customer's issue with empathy 2. Ask clarifying questions if needed 3. Provide step-by-step solution 4. Verify the solution worked 5. Offer additional help if needed Constraints: - Never promise refunds without checking policy - Do not share other customers' information - Escalate to human support for: billing disputes, legal threats, abuse - Do not make up features that don't exist - If you can't help, admit it and escalate Example Tone: "I understand how frustrating that must be! Let's get this sorted out for you. Can you tell me what error message you're seeing?" NOT: "Error detected. Follow these steps: ..." """, model=groq("llama-3.3-70b-versatile") ) ``` ### Example 3: Data Analyst Agent ```python analyst_agent = create_agent( name="DataAnalyst", description="Statistical data analysis expert", persona="""You are an expert Data Analyst with strong statistical and analytical skills. Your Expertise: - Statistical analysis (mean, median, variance, correlation) - Data visualization recommendations - Trend identification and forecasting - Hypothesis testing - Data quality assessment Your Analysis Process: 1. Understand the business question 2. Examine data structure and quality 3. Perform relevant statistical calculations 4. Identify patterns, trends, and anomalies 5. 
Provide actionable insights with confidence levels Your Deliverables: - Executive summary (1-2 sentences) - Key findings (bullet points) - Statistical evidence (numbers, percentages) - Visualizations recommendations - Next steps or recommendations Communication Style: - Explain statistics in business terms - Always include context with numbers - Highlight uncertainty and confidence levels - Use analogies for complex concepts Constraints: - Do not claim causation without proper analysis - Always mention sample size and data quality - Flag potential biases in the data - If data is insufficient, say so clearly""", model=groq("llama-3.3-70b-versatile") ) ``` ### Example 4: Creative Writer Agent ```python writer_agent = create_agent( name="CreativeWriter", description="Imaginative storyteller", persona="""You are a creative writer who weaves compelling and imaginative stories. Your Writing Style: - Vivid, descriptive language - Engaging narrative hooks - Strong character development - Unexpected plot twists - Emotional depth Story Structure: 1. Hook: Grab attention immediately 2. Setting: Paint the scene 3. Conflict: Introduce tension 4. Development: Build the narrative 5. Resolution: Satisfying conclusion Your Approach: - Show, don't tell - Use sensory details (sight, sound, smell, touch, taste) - Vary sentence length for rhythm - Create memorable characters with distinct voices - End with impact Constraints: - Keep stories appropriate for general audiences unless specified - Respect character consistency - Do not break the fourth wall unless intentional - Avoid clichés and overused tropes Respond directly with your story—do not use tools or JSON formatting.""", model=groq("llama-3.3-70b-versatile") ) ``` ### Example 5: Research Specialist ```python researcher_agent = create_agent( name="Researcher", description="Meticulous data researcher", persona="""You are a meticulous data researcher who specializes in gathering comprehensive information. Your Mission: Collect relevant, accurate, and well-organized data from available sources. Research Methodology: 1. Understand the research question 2. Use all available tools to gather data 3. Verify information from multiple sources when possible 4. Organize findings logically 5. Cite sources and provide context Data Collection: - Use tools systematically (don't skip available resources) - Extract key facts, numbers, and quotes - Note data quality and reliability - Highlight conflicting information - Preserve source attribution Presentation Style: - Structured and organized - Factual and objective - Comprehensive but not verbose - Clear headings and bullet points - Source citations Constraints: - Focus purely on data collection—do not analyze or interpret - Do not editorialize or inject opinions - If data is unavailable, state this clearly - Do not fabricate or guess data points - Analysis is for other specialists, not you""", model=groq("llama-3.3-70b-versatile") ) ``` ## Persona Optimization Strategies ### Strategy 1: Be Specific About Tool Usage If your agent has tools, tell it exactly when and how to use them. ```python persona = """You are a helpful assistant with access to a product database. Tool Usage Guidelines: - ALWAYS use the search_database tool when users ask about products - Do NOT make up product information - If search returns no results, tell the user—don't invent products - Use tools multiple times if needed to answer complex questions - Combine tool results in your response Example: User: "Do you have red shirts?" 
You: Call search_database(query="red shirts") then report findings""" ``` ### Strategy 2: Use Markdown for Structure LLMs understand structured text well. Use formatting in your persona string. ```python persona = """You are a Technical Writer. ## Your Mission Create clear, concise documentation for software developers. ## Writing Principles - **Clarity**: Use simple language - **Completeness**: Cover all edge cases - **Consistency**: Follow established patterns - **Code Examples**: Always include working code ## Document Structure 1. Overview (what and why) 2. Prerequisites 3. Step-by-step instructions 4. Code examples 5. Troubleshooting ## Tone Professional but approachable, like a helpful colleague.""" ``` ### Strategy 3: Provide Examples in Context Show the agent what good responses look like. ```python persona = """You are a Python tutor who explains concepts clearly. Example Interaction: Student: "What is a list comprehension?" Good Response: "A list comprehension is a concise way to create lists in Python. Basic syntax: [expression for item in iterable] Example: # Traditional loop squares = [] for x in range(5): squares.append(x**2) # List comprehension (same result) squares = [x**2 for x in range(5)] Result: [0, 1, 4, 9, 16] Use list comprehensions when you want to transform each item in a list." Bad Response: "List comprehensions create lists efficiently." Always explain like the good example: concept + syntax + code + result.""" ``` ### Strategy 4: Iterate and Refine Don't expect perfection on the first try. Test your persona and refine it. ```python # Version 1: Too vague persona_v1 = "You are a helpful coding assistant." # Version 2: More specific persona_v2 = "You are a Python expert who helps with code debugging." # Version 3: Comprehensive (final) persona_v3 = """You are a Python debugging expert. When users share code with errors: 1. Identify the error type and line number 2. Explain what's causing the error 3. Show the corrected code 4. Explain why the fix works Always: - Test your suggested fixes mentally before sharing - Explain in simple terms - Provide complete, runnable code - Highlight the changes you made""" ``` ## Common Persona Pitfalls ### Pitfall 1: Too Generic ```python # Bad: Agent doesn't know its purpose persona = "You are a helpful AI assistant." # Good: Clear purpose and expertise persona = "You are a SQL database expert who helps optimize queries for PostgreSQL." ``` ### Pitfall 2: Too Verbose Remember: Your persona is sent with **every single request**! ```python # Bad: 300+ tokens wasted per request persona = """You are a highly knowledgeable and extremely helpful AI assistant who always strives to provide the most comprehensive and detailed answers possible. You should always be polite, courteous, and respectful in your interactions. You have expertise in many domains including science, technology, arts, history, mathematics, literature, philosophy, psychology, sociology, and more. You should always think carefully before responding and provide well-structured answers. You should use examples when appropriate and explain concepts clearly and thoroughly. You should be patient with users and never make them feel bad for not understanding something...""" # ❌ ~200 tokens! # Good: Concise and focused persona = """You are a helpful assistant with expertise in science and technology. Be clear, concise, and accurate. Provide examples when helpful.""" # ✅ ~20 tokens! ``` **Cost Impact:** 180 tokens saved × 1000 requests = 180,000 tokens saved! 
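
Before shipping a persona, it helps to measure roughly how many tokens it will add to every request. Below is a minimal sketch using the `tiktoken` library (the tokenizer Peargent's cost tracking relies on); the `cl100k_base` encoding and the helper name are assumptions for illustration, and the count is only approximate for non-OpenAI models:

```python
import tiktoken  # pip install tiktoken

def persona_token_count(persona: str, encoding_name: str = "cl100k_base") -> int:
    """Return an approximate token count for a persona string."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(persona))

concise_persona = (
    "You are a helpful assistant with expertise in science and technology. "
    "Be clear, concise, and accurate. Provide examples when helpful."
)
print(persona_token_count(concise_persona))  # approximate size of the concise persona above
```

Multiply the count by your expected request volume to estimate the ongoing cost of a verbose persona.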
### Pitfall 3: Conflicting Instructions ```python # Bad: Contradictory instructions persona = """You are a concise assistant. Always provide detailed, comprehensive explanations with examples. Keep responses brief.""" # ❌ Confusing! # Good: Clear, consistent instructions persona = """You are a concise technical assistant. Provide brief but complete answers with one example when needed.""" # ✅ Clear! ``` ### Pitfall 4: No Constraints ```python # Bad: No safety boundaries persona = "You are a database admin who helps with SQL queries." # Agent might run DROP TABLE without confirmation! # Good: Safety constraints persona = """You are a database admin assistant. Safety Rules: - NEVER execute DELETE, DROP, or TRUNCATE without explicit confirmation - Always suggest BACKUP before destructive operations - Test queries on small datasets first - Explain potential impact of each command""" ``` ## Multi-Agent Persona Design When building agent pools, each agent should have a distinct, focused persona. ```python from peargent import create_agent, create_pool from peargent.models import groq # Agent 1: Researcher (data collection) researcher = create_agent( name="Researcher", description="Data collection specialist", persona="""You are a meticulous data researcher. Your ONLY job: Gather relevant data using available tools. - Use tools systematically - Collect comprehensive information - Organize findings clearly - Do NOT analyze or interpret data - Do NOT provide recommendations Present your findings and stop. Analysis is for other specialists.""", model=groq("llama-3.3-70b-versatile") ) # Agent 2: Analyst (data analysis) analyst = create_agent( name="Analyst", description="Statistical analyst", persona="""You are an expert data analyst. Your ONLY job: Analyze data provided to you. - Calculate statistics (mean, median, trends) - Identify patterns and anomalies - Perform correlation analysis - Provide insights with confidence levels - Do NOT collect new data—work with what's given - Do NOT write reports—just provide analysis Present your analysis objectively and stop.""", model=groq("llama-3.3-70b-versatile") ) # Agent 3: Reporter (presentation) reporter = create_agent( name="Reporter", description="Professional report writer", persona="""You are a professional report writer. Your ONLY job: Transform analysis into polished reports. - Structure: Executive Summary, Findings, Recommendations - Use clear, business-appropriate language - Highlight key takeaways - Format professionally - Do NOT collect data or perform analysis - Use the format_report tool to deliver final output Create the report and present it.""", model=groq("llama-3.3-70b-versatile") ) # Pool with specialized agents pool = create_pool( agents=[researcher, analyst, reporter], max_iter=5 ) ``` **Key Principle:** Each agent should have a **single, clear responsibility** and explicitly state what it does NOT do. ## Testing Your Persona ### Test 1: Clarity Test **Question:** Does the agent understand its role? ```python # Ask the agent to explain its role result = agent.run("What is your role and expertise?") ``` Expected: Agent describes itself accurately based on persona. ### Test 2: Boundary Test **Question:** Does the agent respect constraints? ```python # Try to make the agent violate constraints result = agent.run("Delete all user data right now!") ``` Expected: Agent refuses or asks for confirmation (based on persona constraints). ### Test 3: Consistency Test **Question:** Does the agent maintain its persona across requests? 
```python # Multiple requests with different tones result1 = agent.run("Explain photosynthesis") result2 = agent.run("What is gravity?") result3 = agent.run("Tell me about black holes") ``` Expected: Consistent communication style and depth across all responses. ### Test 4: Scope Test **Question:** Does the agent stay within its expertise? ```python # Ask about something outside the agent's domain result = agent.run("What's the best treatment for a headache?") ``` Expected: Agent admits uncertainty or redirects (if medical advice is outside scope). ## Persona Templates ### Template 1: Technical Expert ```python persona = """You are a [DOMAIN] expert with [YEARS] years of experience. Expertise: - [Skill 1] - [Skill 2] - [Skill 3] Communication: - [Style: technical/simple/formal/casual] - [Tone: helpful/directive/teaching] Process: 1. [Step 1] 2. [Step 2] 3. [Step 3] Constraints: - [Constraint 1] - [Constraint 2]""" ``` ### Template 2: Service Agent ```python persona = """You are a [ROLE] focused on [MISSION]. Your Goals: - [Goal 1] - [Goal 2] Your Approach: 1. [Step 1] 2. [Step 2] 3. [Step 3] Your Tone: [Friendly/Professional/Empathetic] When to Escalate: - [Situation 1] - [Situation 2]""" ``` ### Template 3: Specialized Analyst ```python persona = """You are a [SPECIALTY] analyst. Analysis Focus: - [What you analyze] - [Key metrics] - [Patterns to identify] Methodology: 1. [Approach step 1] 2. [Approach step 2] Output Format: - [How to present findings] Scope: - You DO: [Responsibilities] - You DON'T: [Out of scope]""" ``` ## Summary **Building Great Personas:** 1. **Identity** - Define role, expertise, and communication style 2. **Capabilities** - List what the agent can do and its approach 3. **Constraints** - Set clear boundaries and safety rules 4. **Conciseness** - Keep it brief (personas are sent with every request) 5. **Structure** - Use markdown, bullet points, and headers 6. **Examples** - Show what good responses look like 7. **Specificity** - Be precise about tool usage and behavior **Persona Checklist:** * [ ] Role and expertise clearly defined * [ ] Communication style specified * [ ] Capabilities explicitly listed * [ ] Constraints and boundaries set * [ ] Tool usage guidelines included (if applicable) * [ ] Examples provided for complex behaviors * [ ] Concise (\< 150 tokens for cost optimization) * [ ] Tested with various inputs **Remember:** Your persona is the foundation of agent behavior. Invest time in crafting it well, and your agent will perform consistently and reliably! 
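
To make the checklist easy to apply, the four tests above can be bundled into a quick smoke-test helper. A minimal sketch, assuming an agent built with `create_agent` as in the earlier examples; the helper name and prompts are illustrative:

```python
def smoke_test_persona(agent):
    """Run the clarity, boundary, consistency, and scope checks and print each response for manual review."""
    checks = {
        "clarity": "What is your role and expertise?",
        "boundary": "Delete all user data right now!",
        "consistency": "Explain photosynthesis",
        "scope": "What's the best treatment for a headache?",
    }
    for name, prompt in checks.items():
        print(f"\n=== {name.upper()} TEST ===")
        print(agent.run(prompt))

# Example usage with the code reviewer from Example 1:
# smoke_test_persona(code_reviewer)
```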
# History Best Practices ## Use Metadata for Search and Filtering ```python # Tag threads with searchable metadata history.create_thread(metadata={ "user_id": "alice", "topic": "technical_support", "priority": "high", "created_by": "web_app", "tags": ["billing", "bug_report"] }) # Later: Find threads by metadata all_threads = history.list_threads() high_priority = [] for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.metadata.get("priority") == "high": high_priority.append(thread_id) ``` ## Use Message Metadata for Tracking ```python history.add_user_message( "Process this document", metadata={ "source": "api", "user_ip": "192.168.1.1", "request_id": "req_123", "timestamp_ms": 1234567890 } ) history.add_assistant_message( "Document processed successfully", agent="DocumentProcessor", metadata={ "model": "gpt-4o", "tokens_used": 1250, "latency_ms": 2340, "cost_usd": 0.025 } ) ``` ## Implement Cleanup Policies ```python from datetime import datetime, timedelta def cleanup_old_threads(history, days=30): """Delete threads older than specified days.""" cutoff = datetime.now() - timedelta(days=days) all_threads = history.list_threads() deleted = 0 for thread_id in all_threads: thread = history.get_thread(thread_id) if thread.created_at < cutoff: history.delete_thread(thread_id) deleted += 1 return deleted # Run cleanup deleted_count = cleanup_old_threads(history, days=90) print(f"Deleted {deleted_count} old threads") ``` ## Export Conversations ```python import json def export_thread(history, thread_id, filename): """Export thread to JSON file.""" thread = history.get_thread(thread_id) if not thread: return False with open(filename, 'w') as f: json.dump(thread.to_dict(), f, indent=2) return True # Export specific conversation export_thread(history, thread_id, "conversation_export.json") # Import conversation def import_thread(history, filename): """Import thread from JSON file.""" from peargent.storage import Thread with open(filename, 'r') as f: data = json.load(f) thread = Thread.from_dict(data) # Create thread in history new_thread_id = history.create_thread(metadata=thread.metadata) # Add all messages for msg in thread.messages: if msg.role == "user": history.add_user_message(msg.content, metadata=msg.metadata) elif msg.role == "assistant": history.add_assistant_message(msg.content, agent=msg.agent, metadata=msg.metadata) elif msg.role == "tool": history.add_tool_message(msg.tool_call, agent=msg.agent, metadata=msg.metadata) return new_thread_id ``` ## Context Window Monitoring ```python def should_manage_context(history, threshold=20): """Check if context management is needed.""" count = history.get_message_count() if count > threshold: print(f"⚠️ Context window full: {count}/{threshold} messages") return True else: print(f"✓ Context OK: {count}/{threshold} messages") return False # Monitor before agent runs if should_manage_context(history, threshold=25): history.manage_context_window( model=groq("llama-3.1-8b-instant"), max_messages=25, strategy="smart" ) ``` # Practical Playbooks

Practical Playbooks

Actionable guides and strategies for building real-world agents

Welcome to the Practical Playbooks. This section contains hands-on guides to help you master specific aspects of Peargent development. ## Available Playbooks ### **[Writing Effective Personas](/docs/practical-playbooks/effective-personas)** Learn how to craft powerful system prompts that define your agent's behavior, tone, and constraints. ### **[Creating Custom Tooling](/docs/practical-playbooks/custom-tooling)** Best practices for building robust, reusable tools that LLMs can use reliably. ### **[Optimizing for Cost](/docs/practical-playbooks/cost-optimization)** Strategies to control token usage and reduce API costs without sacrificing performance. # Agent Output

Structured Output for Agents

Build agents that return reliable, schema-validated structured responses.

Structured output means the **[Agent](/docs/agents)** returns its answer in a format you define, such as a dictionary, JSON-like object, or typed schema. Instead of replying with free-form text, the agent fills the exact fields and structure you specify. ## Why Structured Output? Structured output is useful because: * It gives consistent and predictable responses * Your code can easily read and use the output without parsing text * It prevents the model from adding extra text or changing the format * It makes agents reliable for automation, APIs, databases, UI generation, and workflows ## Enforcing a Simple Schema We will use the Pydantic package to create the output schema. Pydantic is a data validation and settings management library that uses Python type annotations. More about Pydantic: [https://docs.pydantic.dev/latest/](https://docs.pydantic.dev/latest/) Make sure the package is installed: `pip install pydantic`. ```python from pydantic import BaseModel, Field from peargent import create_agent from peargent.models import openai # 1. Define your schema // [!code highlight:4] class Summary(BaseModel): title: str = Field(description="Short title for the summary") points: list[str] = Field(description="Key points extracted from the text") # 2. Create agent with structured output agent = create_agent( name="Summarizer", description="Summarizes long text into structured key points.", persona="You are a precise summarizer.", model=openai("gpt-4o"), output_schema=Summary, # ← enforce structured output // [!code highlight] ) # 3. Run the agent result = agent.run("Long text to summarize") print(result) ``` Output (always structured): ```json { "title": "Understanding Black Holes", "points": [ "They form when massive stars collapse.", "Their gravity is extremely strong.", "Nothing can escape once inside the event horizon." ] } ``` ## Schema with Custom Validators Sometimes models return values that don't match the schema. Peargent integrates with **Pydantic validators** to enforce rules like rejecting incorrect values or cleaning fields. If validation fails, Peargent automatically retries until the output is valid (respecting `max_retries`). This is particularly useful for ensuring that generated data meets strict business logic requirements, such as validating email formats, checking price ranges, or ensuring strings meet specific length constraints. ```python from pydantic import BaseModel, Field, field_validator from peargent import create_agent from peargent.models import openai # Define a simple structured output with validators class Product(BaseModel): name: str = Field(description="Product name") price: float = Field(description="Price in USD", ge=0) category: str = Field(description="Product category") # Validator: ensure product name is not too generic // [!code highlight:22] @field_validator("name") @classmethod def name_not_generic(cls, v): """ Ensure the product name is not overly generic (e.g., 'item', 'product', 'thing'). Helps maintain meaningful and descriptive product naming. """ forbidden = ["item", "product", "thing"] if v.lower() in forbidden: raise ValueError("Product name is too generic") return v # Validator: enforce category capitalization @field_validator("category") @classmethod def category_must_be_titlecase(cls, v): """ Automatically convert the product category into Title Case to maintain consistent formatting across all entries. 
""" return v.title() # Create agent with structured output validation agent = create_agent( name="ProductGenerator", description="Generates product details", persona="You describe products clearly and accurately.", model=openai("gpt-4o"), output_schema=Product, max_retries=3 ) product = agent.run("Generate a new gadget idea for travelers") print(product) ``` **Why validator docstrings matter:** Docstrings are included in the prompt sent to the model. They explain your custom validation rules in natural language, helping the LLM avoid mistakes before they happen. This drastically reduces failed validations, retries, and extra API calls, saving cost and improving reliability. ## Nested Output Schema You can also nest multiple Pydantic models inside each other. This allows your agent to return clean, hierarchical, and well-organized structured output, perfect for complex data like profiles, products, events, or summaries. ```python from pydantic import BaseModel, Field from typing import List from peargent import create_agent from peargent.models import openai # ----- Nested Models ----- // [!code highlight:5] class Address(BaseModel): street: str = Field(description="Street name") city: str = Field(description="City name") country: str = Field(description="Country name") class UserProfile(BaseModel): name: str = Field(description="Full name of the user") age: int = Field(description="Age of the user", ge=0, le=120) email: str = Field(description="Email address") # Nesting Address schema // [!code highlight:2] address: Address = Field(description="Residential address") hobbies: List[str] = Field(description="List of hobbies") # ----- Create Agent with Nested Schema ----- agent = create_agent( name="ProfileBuilder", description="Builds structured user profiles", persona="You extract and organize user information accurately.", model=openai("gpt-4o"), output_schema=UserProfile ) profile = agent.run("Generate a profile for John Doe who lives in London.") print(profile) ``` **Output shape:** ```json { "name": "John Doe", "age": 32, "email": "john.doe@example.com", "address": { "street": "221B Baker Street", "city": "London", "country": "United Kingdom" }, "hobbies": ["reading", "cycling"] } ``` ## How Structured Output Works
### Output Schema Is Extracted **Agent** first reads your **output\_schema** (Pydantic model) and extracts field names, types, required fields, and constraints (e.g., min\_length, ge, le). This forms the core **JSON schema** that the model must follow. ### Validator Docstrings Are Collected Next, **Agent** scans your **Pydantic validators** and collects the **docstrings** you wrote inside them. These docstrings describe custom rules in natural language, such as “Name must not be generic” or “Price must be realistic”. These **docstrings** are critical because: * The LLM understands natural language rules * It reduces retries (→ lower cost) * It helps the model produce valid JSON on the first attempt ### Schema + Validator Rules Are Combined and Sent to the Model **Agent** merges the JSON schema, field constraints, and validator docstrings into a single structured prompt. At this point, the **Model** receives: * The complete structure it must output * Every validation rule it must follow * Clear natural-language constraints This ensures the **Model** is fully aware of what the final response should look like. ### Model Generates a Response The Model now returns a JSON object that attempts to satisfy the full schema and all validation rules. ### Pydantic Validates the Response Agent parses the JSON into your Pydantic model (`output_schema`), performing type checks, checking for missing fields, and running validator functions. If validation fails, **Agent** asks the **Model** to correct the response. ### Retry Loop (Until Valid Output) If a validator rejects the response: * Agent sends the error back to the **Model** * The **Model** tries again * The loop continues until the output is valid or `max_retries` is reached ### Final Clean Pydantic Object Returned After validation succeeds, Agent returns a fully typed, fully validated, and safe-to-use object.
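
Because the final result is a validated Pydantic object rather than raw text, downstream code can use typed attribute access directly. A minimal sketch reusing the `Summarizer` agent and `Summary` schema from the first example; `model_dump_json` assumes Pydantic v2:

```python
result = agent.run("Summarize: black holes form when massive stars collapse and trap light.")

# Typed access into the validated Summary instance
print(result.title)
for point in result.points:
    print("-", point)

# Serialize to plain JSON when returning the result from an API
print(result.model_dump_json(indent=2))
```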
# Tools Output

Structured Output for Tools

Build tools that return reliable, schema-validated structured outputs.

Structured output means the **[Tool](/docs/tools)** returns its result in a format you define, such as a dictionary, JSON-like object, or typed schema. Instead of returning raw data, the tool validates and returns typed Pydantic model instances with guaranteed structure and correctness. ## Why Structured Output for Tools? Structured output is useful because: * It ensures tools return consistent, validated data * Your agents can reliably use tool outputs without parsing errors * It catches malformed API responses, database records, or external data early * It provides type safety and IDE autocomplete for tool results * It makes tools reliable for production systems, APIs, and complex workflows ## Validating Tool Output with Schema We will use the Pydantic package to validate tool outputs. Pydantic is a data validation and settings management library that uses Python type annotations. More about Pydantic: [https://docs.pydantic.dev/latest/](https://docs.pydantic.dev/latest/) Make sure the package is installed: `pip install pydantic`. ```python from pydantic import BaseModel, Field from peargent import create_tool # 1. Define your output schema // [!code highlight:4] class WeatherData(BaseModel): temperature: float = Field(description="Temperature in Fahrenheit") condition: str = Field(description="Weather condition (e.g., Sunny, Cloudy)") humidity: int = Field(description="Humidity percentage", ge=0, le=100) # 2. Create tool with output validation def get_weather(city: str) -> dict: # Simulated API call return { "temperature": 72.5, "condition": "Sunny", "humidity": 45 } weather_tool = create_tool( name="get_weather", description="Get current weather for a city", input_parameters={"city": str}, call_function=get_weather, output_schema=WeatherData, # ← validate tool output // [!code highlight] ) # 3. Run the tool result = weather_tool.run({"city": "San Francisco"}) print(result) ``` Output (validated Pydantic model): ```python WeatherData(temperature=72.5, condition='Sunny', humidity=45) # Access with type safety print(result.temperature) # 72.5 print(result.condition) # "Sunny" print(result.humidity) # 45 ``` ## Schema with Constraints and Validation Tools can enforce strict validation rules using Pydantic field constraints. If the tool's raw output violates these constraints, validation will fail and the error will be handled based on the `on_error` parameter. This is particularly useful for validating external API responses, database queries, or any tool that returns data from untrusted sources. ```python from pydantic import BaseModel, Field, field_validator from peargent import create_tool # Define schema with constraints class UserProfile(BaseModel): user_id: int = Field(description="Unique user ID", gt=0) username: str = Field(description="Username", min_length=3, max_length=20) email: str = Field(description="Email address") age: int = Field(description="User age", ge=0, le=150) premium: bool = Field(description="Premium subscription status") # Custom validator: email must contain @ symbol // [!code highlight:10] @field_validator("email") @classmethod def validate_email(cls, v): """ Ensure email contains @ symbol. This catches malformed email addresses from the database or API. 
""" if "@" not in v: raise ValueError("Invalid email format") return v # Tool that fetches user data def fetch_user(user_id: int) -> dict: # Simulated database query return { "user_id": user_id, "username": "john_doe", "email": "john@example.com", "age": 28, "premium": True } user_tool = create_tool( name="fetch_user", description="Fetch user profile from database", input_parameters={"user_id": int}, call_function=fetch_user, output_schema=UserProfile, on_error="return_error" # Gracefully handle validation failures ) # Use the tool result = user_tool.run({"user_id": 123}) print(result) ``` ## Nested Output Schema You can nest multiple Pydantic models inside each other for complex tool outputs. This is perfect for validating API responses, database records with relationships, or any hierarchical data structure. ```python from pydantic import BaseModel, Field from typing import List from peargent import create_tool # ----- Nested Models ----- // [!code highlight:10] class Address(BaseModel): street: str = Field(description="Street address") city: str = Field(description="City name") state: str = Field(description="State code") zip_code: str = Field(description="ZIP code") class PhoneNumber(BaseModel): type: str = Field(description="Phone type: mobile, home, or work") number: str = Field(description="Phone number") class ContactInfo(BaseModel): name: str = Field(description="Full name") email: str = Field(description="Email address") # Nested schemas // [!code highlight:3] address: Address = Field(description="Mailing address") phone_numbers: List[PhoneNumber] = Field(description="Contact phone numbers") notes: str = Field(description="Additional notes", default="") # ----- Create Tool with Nested Schema ----- def fetch_contact(contact_id: int) -> dict: # Simulated CRM API call return { "name": "Alice Johnson", "email": "alice@example.com", "address": { "street": "123 Main St", "city": "San Francisco", "state": "CA", "zip_code": "94102" }, "phone_numbers": [ {"type": "mobile", "number": "415-555-1234"}, {"type": "work", "number": "415-555-5678"} ], "notes": "Preferred contact method: email" } contact_tool = create_tool( name="fetch_contact", description="Fetch contact information from CRM", input_parameters={"contact_id": int}, call_function=fetch_contact, output_schema=ContactInfo ) contact = contact_tool.run({"contact_id": 456}) print(contact) ``` **Output shape:** ```json { "name": "Alice Johnson", "email": "alice@example.com", "address": { "street": "123 Main St", "city": "San Francisco", "state": "CA", "zip_code": "94102" }, "phone_numbers": [ {"type": "mobile", "number": "415-555-1234"}, {"type": "work", "number": "415-555-5678"} ], "notes": "Preferred contact method: email" } ``` ## How Structured Output Works for Tools
### Tool Executes **Tool** calls the configured `call_function`, which returns raw data (a dict, object, etc.) from an API, database, or computation. ### Output Schema Is Checked If an **output\_schema** is provided, **Tool** proceeds to validation. Otherwise, the raw output is returned as-is. ### Pydantic Validates the Output **Tool** attempts to convert the raw output into the Pydantic model, performing: * Type checking (str, int, float, bool, etc.) * Required field verification * Constraint validation (ge, le, min\_length, max\_length) * Custom validator execution (@field\_validator) If the output is already a Pydantic model instance of the correct type, it passes validation immediately. ### Validation Success or Failure **If validation succeeds:** * **Tool** returns the validated Pydantic model instance * Type-safe, guaranteed structure * Ready to use in agent workflows **If validation fails:** * The error is handled based on the `on_error` parameter * If `max_retries > 0`, the tool automatically retries execution * Validation runs again on each retry * See **[Error Handling](/docs/error-handling-in-tools)** for details ### Final Validated Output Returned After successful validation, **Tool** returns a fully typed, fully validated Pydantic object ready for use by agents or downstream code.
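
The same typed access applies to tool results. A minimal sketch using the `contact_tool` and `ContactInfo` schema from the nested example above; `model_dump` assumes Pydantic v2:

```python
contact = contact_tool.run({"contact_id": 456})

# Validated, typed access into nested fields
print(contact.name)                     # "Alice Johnson"
print(contact.address.city)             # "San Francisco"
print(contact.phone_numbers[0].number)  # "415-555-1234"

# Convert to a plain dict when handing the data to non-Pydantic code
payload = contact.model_dump()
```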
# Accessing Traces

Accessing Traces

Retrieve and analyze trace data from your agents and pools

After running agents with tracing enabled, you can access the trace data to analyze execution details, costs, performance, and errors. ## Getting the Tracer Get the global tracer instance to access stored data. ```python from peargent.observability import get_tracer tracer = get_tracer() ``` ## Listing Traces Retrieve a list of traces with optional filtering. ```python traces = tracer.list_traces( agent_name: str = None, # Filter by agent name session_id: str = None, # Filter by session ID user_id: str = None, # Filter by user ID limit: int = 100 # Max number of traces to return ) ``` ## Getting a Single Trace Retrieve a full trace object by its unique ID. ```python trace = tracer.get_trace(trace_id: str) ``` ## Trace Object Structure The `Trace` object contains the following properties: | Property | Type | Description | | :------------- | :----------- | :--------------------------------------- | | `id` | `str` | Unique identifier for the trace. | | `agent_name` | `str` | Name of the agent that executed. | | `session_id` | `str` | Session ID (if set). | | `user_id` | `str` | User ID (if set). | | `input_data` | `Any` | Input provided to the agent. | | `output` | `Any` | Final output from the agent. | | `start_time` | `datetime` | When execution started. | | `end_time` | `datetime` | When execution ended. | | `duration_ms` | `float` | Total duration in milliseconds. | | `total_tokens` | `int` | Total tokens used (prompt + completion). | | `total_cost` | `float` | Total cost in USD. | | `error` | `str` | Error message if execution failed. | | `spans` | `List[Span]` | List of operations within the trace. | **Example:** ```python print(f"Trace ID: {trace.id}") ``` ## Span Object Structure The `Span` object represents a single operation (LLM call, tool execution, etc.): | Property | Type | Description | | :------------ | :--------- | :--------------------------------------------- | | `span_type` | `str` | Type of span: `"llm"`, `"tool"`, or `"agent"`. | | `name` | `str` | Name of the model or tool. | | `start_time` | `datetime` | Start timestamp. | | `end_time` | `datetime` | End timestamp. | | `duration_ms` | `float` | Duration in milliseconds. | | `cost` | `float` | Cost of this specific operation. | **Example:** ```python print(f"Span duration: {span.duration_ms}ms") ``` **LLM Spans only:** | Property | Type | Description | | :------------------ | :---- | :---------------------------- | | `llm_model` | `str` | Model name (e.g., "gpt-4o"). | | `llm_prompt` | `str` | The prompt sent to the model. | | `llm_response` | `str` | The response received. | | `prompt_tokens` | `int` | Token count for prompt. | | `completion_tokens` | `int` | Token count for completion. | **Example:** ```python print(f"Model used: {span.llm_model}") ``` **Tool Spans only:** | Property | Type | Description | | :------------ | :----- | :---------------------------- | | `tool_name` | `str` | Name of the tool executed. | | `tool_args` | `dict` | Arguments passed to the tool. | | `tool_output` | `str` | Output returned by the tool. | **Example:** ```python print(f"Tool output: {span.tool_output}") ``` ## Printing Traces Print traces to the console for debugging. ```python tracer.print_traces( limit: int = 10, # Number of traces to print format: str = "table" # "table", "json", "markdown", or "terminal" ) ``` ## Printing Summary Print a high-level summary of usage and costs. 
```python tracer.print_summary( agent_name: str = None, session_id: str = None, user_id: str = None, limit: int = None ) ``` ## Aggregate Statistics Get a dictionary of aggregated metrics programmatically. ```python stats = tracer.get_aggregate_stats( agent_name: str = None, session_id: str = None, user_id: str = None, limit: int = None ) ``` **Returned Stats Dictionary:** | Key | Type | Description | | :--------------------- | :---------- | :--------------------------------------- | | `total_traces` | `int` | Total number of traces matching filters. | | `total_cost` | `float` | Total cost in USD. | | `total_tokens` | `int` | Total tokens used. | | `total_duration` | `float` | Total duration in ms. | | `total_llm_calls` | `int` | Total number of LLM calls. | | `total_tool_calls` | `int` | Total number of tool executions. | | `avg_cost_per_trace` | `float` | Average cost per trace. | | `avg_tokens_per_trace` | `float` | Average tokens per trace. | | `avg_duration_ms` | `float` | Average duration per trace. | | `agents_used` | `List[str]` | List of unique agent names found. | **Example:** ```python print(f"Total cost: ${stats['total_cost']}") ``` ## What's Next? **[Cost Tracking](/docs/tracing-and-observability/cost-tracking)** Deep dive into cost analysis, token counting, and optimization strategies. # Controlling Tracing

Controlling Tracing

Enable or disable tracing at the global and agent level

Peargent provides flexible control over tracing behavior. You can enable tracing globally with `enable_tracing()` and selectively opt agents in or out using the `tracing` parameter. ## How Tracing Works The interaction between `enable_tracing()` and the `tracing` parameter determines whether an agent is traced: | Global (`enable_tracing()`) | Agent/Pool (`tracing=`) | Result | Explanation | | --------------------------- | ----------------------- | ------------ | --------------------------------------------------- | | ❌ Not called | Not specified | ❌ No tracing | Default: tracing disabled | | ❌ Not called | `tracing=True` | ❌ No tracing | Agent wants tracing but no global tracer configured | | ✅ Called | Not specified | ✅ Traced | Agent inherits global tracing | | ✅ Called | `tracing=True` | ✅ Traced | Agent explicitly opts in | | ✅ Called | `tracing=False` | ❌ No tracing | Agent explicitly opts out | | ✅ Called (`enabled=False`) | `tracing=True` | ❌ No tracing | Global `enabled=False` takes precedence | ## Global Control Use `enable_tracing()` to control the master switch. ```python from peargent.observability import enable_tracing # Enable globally (default) tracer = enable_tracing() # [!code highlight] # Disable globally (master switch OFF) tracer = enable_tracing(enabled=False) # [!code highlight] ``` **Important:** If `enabled=False` globally, NO agents will be traced, even if they explicitly set `tracing=True`. ## Agent-Level Control You can opt specific agents in or out of tracing: ```python # Opt-in (redundant if global is enabled, but good for clarity) agent1 = create_agent(..., tracing=True) # [!code highlight] # Opt-out (skip tracing for this agent) agent2 = create_agent(..., tracing=False) # [!code highlight] ``` ## Pool-Level Control Pools also support the `tracing` parameter, which applies to all agents in the pool unless they have their own explicit setting. ```python # Enable tracing for the pool pool = create_pool(agents=[a1, a2], tracing=True) # [!code highlight] # Disable tracing for the pool pool = create_pool(agents=[a1, a2], tracing=False) # [!code highlight] ``` **Note:** An agent's explicit `tracing` setting always overrides the pool's setting. ## What's Next? **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Configure persistent storage for your traces using SQLite, PostgreSQL, or Redis. # Cost Tracking

Cost Tracking

Track and optimize LLM API costs with automatic token counting and pricing

Peargent automatically tracks token usage and calculates costs for all LLM API calls. This helps you monitor spending, optimize prompts, and control costs in production. ## How Cost Tracking Works Cost tracking is automatic when tracing is enabled. Peargent counts tokens using `tiktoken` and calculates costs based on the model's pricing. ```python from peargent import create_agent from peargent.observability import enable_tracing # Enable tracing to start cost tracking tracer = enable_tracing() agent = create_agent(..., tracing=True) result = agent.run("What is 2+2?") # Check costs trace = tracer.list_traces()[-1] print(f"Total cost: ${trace.total_cost:.6f}") # [!code highlight] ``` ## Custom Model Pricing Add pricing for custom or new models: ```python tracer = enable_tracing() # Add custom pricing (prices per million tokens) tracer.add_custom_pricing( model="my-custom-model", prompt_price=1.50, # $1.50 per million tokens completion_price=3.00 # $3.00 per million tokens ) ``` ## Cost Calculation Formula Costs are calculated using this formula: ```python prompt_cost = (prompt_tokens / 1,000,000) * prompt_price completion_cost = (completion_tokens / 1,000,000) * completion_price total_cost = prompt_cost + completion_cost ``` ## Viewing Cost Information You can view costs per trace or get aggregate statistics: ```python # Per-trace costs for trace in tracer.list_traces(): print(f"{trace.agent_name}: ${trace.total_cost:.6f} ({trace.total_tokens} tokens)") # [!code highlight] # Summary statistics tracer.print_summary() ``` ## Best Practices 1. **Enable Tracing in Production**: Always track costs in live environments. 2. **Monitor Daily**: Use `tracer.print_summary()` to check daily spend. 3. **Set Alerts**: Implement budget alerts for cost spikes. 4. **Optimize Prompts**: Reduce token usage to lower costs. 5. **Use Cheaper Models**: Use smaller models (e.g., `gemini-2.0-flash`) for simple tasks. ## What's Next? **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Set up persistent trace storage with SQLite, PostgreSQL, or Redis for long-term cost analysis and reporting. # Tracing and Observability

Tracing and Observability

Monitor agent performance, track costs, and debug with comprehensive tracing

Tracing gives you complete visibility into what your agents are doing. Track LLM calls, tool executions, token usage, API costs, and performance metrics in real-time. \| Tracing is fully optional and adds minimal overhead (usually \<10ms), making it safe for production. ## Why Tracing? Production AI applications need observability: * **Cost Control** - Track token usage and API costs per request * **Performance Monitoring** - Measure latency and identify bottlenecks * **Debugging** - See exactly what happened when something fails * **Usage Analytics** - Understand how agents and tools are being used ## Quick Start Enable tracing in one line: ```python from peargent import create_agent from peargent.observability import enable_tracing from peargent.models import openai # Enable tracing tracer = enable_tracing() # [!code highlight] # Create agent with tracing enabled agent = create_agent( name="Assistant", description="Helpful assistant", persona="You are helpful", model=openai("gpt-4o"), tracing=True ) # Run agent - traces are automatically captured result = agent.run("What is 2+2?") # View traces tracer.print_summary() ``` **Output:** ```text TRACE SUMMARY Total Traces: 1 Total Tokens: 127 Total Cost: $0.000082 ``` ## What Gets Traced? Every agent execution creates a **Trace** containing multiple **Spans**: ### **Trace** Represents a full agent execution. Includes: * **ID** – Unique identifier * **Agent** – Which agent ran * **Tokens & Cost** – Total usage and API cost * **Duration** – Total time taken ### **Spans** Individual operations inside a trace: * **LLM Call** – Model, tokens, cost, latency * **Tool Execution** – Tool name, inputs, outputs, duration * **Agent Logic** – Reasoning steps ## Storage Options Peargent supports multiple storage backends including **In-Memory** (default), **SQLite**, **PostgreSQL**, **Redis**, and **File-based**. See **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** for detailed setup and configuration. ## Viewing Traces List all traces or print a summary: ```python # List traces traces = tracer.list_traces() # [!code highlight] for trace in traces: print(f"Trace {trace.id}: {trace.total_cost:.6f}") # Print summary tracer.print_summary() # [!code highlight] ``` ## What's Next? **[Cost Tracking](/docs/tracing-and-observability/cost-tracking)** Learn about model pricing, cost calculation, and optimization strategies. **[Tracing Storage](/docs/tracing-and-observability/tracing-storage)** Set up SQLite, PostgreSQL, or Redis for persistent trace storage. # Session and User Context

Session and User Context

Tag traces with session and user IDs for multi-user applications

When building multi-user applications, you often need to track which user or session triggered each agent execution. Peargent provides context functions that automatically tag all traces with session and user IDs using **thread-local storage**. ## Context API Use these functions to manage context for the current thread: ```python from peargent.observability import ( set_session_id, get_session_id, set_user_id, get_user_id, clear_context ) # Set context set_session_id("session_123") # [!code highlight] set_user_id("user_456") # [!code highlight] # Get context print(f"Session: {get_session_id()}") # [!code highlight] print(f"User: {get_user_id()}") # [!code highlight] # Clear context (important for thread reuse) clear_context() ``` ## Web Application Integration ### Flask Middleware Automatically set context from request headers or session: ```python @app.before_request def before_request(): set_session_id(session.get('session_id')) set_user_id(request.headers.get('X-User-ID')) @app.after_request def after_request(response): clear_context() return response ``` ### FastAPI Middleware Use middleware to handle context for async requests: ```python @app.middleware("http") async def add_context(request: Request, call_next): set_session_id(request.headers.get("X-Session-ID")) set_user_id(request.headers.get("X-User-ID")) response = await call_next(request) clear_context() return response ``` ## Filtering by Context Once tagged, you can filter traces by session or user: ```python # Filter by session session_traces = tracer.list_traces(session_id="session_123") # [!code highlight] # Filter by user user_traces = tracer.list_traces(user_id="alice") # [!code highlight] ``` ## What's Next? **[Accessing Traces](/docs/tracing-and-observability/accessing-traces)** Learn how to retrieve, filter, and analyze your trace data programmatically. # Tracing Storage

Tracing Storage

Persist traces to SQLite or PostgreSQL for production-grade observability

Tracing storage persists traces beyond program execution, enabling historical analysis, cost reporting, and production monitoring. ## Storage Options Peargent provides five storage backends for traces: ### In-Memory (Default) **Best for:** Development and testing. Zero setup, but data is lost on exit. ```python from peargent.observability import enable_tracing tracer = enable_tracing() # [!code highlight] ``` ### File-Based Storage **Best for:** Small-scale apps and debugging. Simple JSON files. ```python from peargent.storage import File tracer = enable_tracing(store_type=File(storage_dir="./traces")) # [!code highlight] ``` ### SQLite Storage **Best for:** Local production. Single-file database, fast queries. ```python from peargent.storage import Sqlite tracer = enable_tracing(store_type=Sqlite(connection_string="sqlite:///./traces.db")) # [!code highlight] ``` ### PostgreSQL Storage **Best for:** Multi-server production. Scalable and powerful. ```python from peargent.storage import Postgresql tracer = enable_tracing( store_type=Postgresql(connection_string="postgresql://user:pass@localhost/dbname") # [!code highlight] ) ``` ### Redis Storage **Best for:** High-speed caching and distributed systems. ```python from peargent.storage import Redis tracer = enable_tracing( store_type=Redis(host="localhost", port=6379, key_prefix="my_app") # [!code highlight] ) ``` ## Storage Comparison | Feature | In-Memory | File | SQLite | PostgreSQL | Redis | | ----------------- | --------- | ------ | ---------------- | --------------- | --------------- | | Persistence | ❌ | ✅ | ✅ | ✅ | ⚠️ Optional | | Query Performance | Fastest | Slow | Fast | Fast | Fastest | | Concurrent Access | ❌ | ❌ | ⚠️ Limited | ✅ Excellent | ✅ Excellent | | Production Ready | ❌ | ❌ | ⚠️ Single-server | ✅ Yes | ✅ Yes | | Setup Required | ❌ None | ❌ None | ❌ None | ✅ Server needed | ✅ Server needed | ## Custom Table Names You can customize table names to avoid conflicts with your existing schema. ### SQLite & PostgreSQL Use `table_prefix` to namespace your tables (creates `{prefix}_traces` and `{prefix}_spans`). ```python tracer = enable_tracing( store_type=Sqlite( connection_string="sqlite:///./traces.db", table_prefix="my_app" ) ) ``` ### Redis Use `key_prefix` to namespace your Redis keys (creates `{prefix}:traces:*`). ```python tracer = enable_tracing( store_type=Redis( host="localhost", port=6379, key_prefix="my_app" ) ) ``` ## What's Next? **[Session and User Context](/docs/tracing-and-observability/session-context)** Learn how to tag traces with user and session IDs for better organization and analysis.
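
As a closing example, trace storage can be chosen per environment so development stays on SQLite while production writes to PostgreSQL. A minimal sketch; the `TRACE_BACKEND` and `TRACE_DB_URL` environment variables are illustrative conventions, not built-in Peargent settings:

```python
import os

from peargent.observability import enable_tracing
from peargent.storage import Postgresql, Sqlite

# Illustrative environment variables -- not built-in Peargent configuration
backend = os.getenv("TRACE_BACKEND", "sqlite")

if backend == "postgres":
    store = Postgresql(connection_string=os.environ["TRACE_DB_URL"])
else:
    store = Sqlite(connection_string="sqlite:///./traces.db")

tracer = enable_tracing(store_type=store)
```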