A hands-on technical guide to building your first autonomous AI agent—from architecture patterns to production deployment.
Building an AI agent requires understanding three core concepts: perception (how agents observe their environment), reasoning (how they make decisions), and action (how they execute tasks). Modern agents leverage large language models (LLMs) as their reasoning engine, combined with tool-calling capabilities to interact with external systems.
This tutorial walks through building a research agent using the ReAct pattern (Reasoning + Acting), which has become the de facto standard for LLM-based agents. We'll use LangChain as our framework, but the principles apply to any agent framework including AutoGen, CrewAI, or custom implementations.
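Before introducing any framework, it helps to see that the ReAct loop itself fits in a few lines. The sketch below is a framework-free illustration: the "LLM" is a scripted stand-in, so no API key is needed, and the `search` tool, the `Thought:`/`Action:`/`Final Answer:` markers, and the crude parsing are simplifying assumptions, not LangChain's actual message format.

```python
# Minimal ReAct loop with a scripted stand-in for the LLM.
# Reasoning emits "Action: tool[input]" lines; acting runs the tool and
# appends an Observation; the loop stops on "Final Answer:" or a step cap.
from typing import Callable, Dict

def react_loop(llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)                    # reasoning step
        transcript += "\n" + reply
        if reply.startswith("Final Answer:"):      # termination condition
            return reply.removeprefix("Final Answer:").strip()
        # Expect "Action: tool_name[input]" on the reply's last line
        action = reply.splitlines()[-1].removeprefix("Action: ")
        name, arg = action.split("[", 1)
        observation = tools[name](arg.rstrip("]")) # acting step
        transcript += f"\nObservation: {observation}"
    return "Stopped: step limit reached"

# Scripted model: first turn searches, second turn answers.
replies = iter([
    "Thought: I should look this up.\nAction: search[LangChain]",
    "Final Answer: LangChain is an agent framework.",
])
demo_tools = {"search": lambda q: f"Top result for {q}"}
print(react_loop(lambda prompt: next(replies), demo_tools, "What is LangChain?"))
# → LangChain is an agent framework.
```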
Prerequisites
This tutorial assumes familiarity with Python, async/await patterns, and basic LLM concepts. You'll need Python 3.9+, an OpenAI API key (or compatible provider), and 30 minutes of focused time.
Production-grade agents follow a three-layer architecture that separates concerns and enables testing, debugging, and scaling. Understanding this architecture is crucial before writing any code.
- Orchestration layer: manages the agent loop (perception → reasoning → action → observation) and handles retries, error recovery, and termination conditions.
- Reasoning engine: the LLM that decides what to do next. It takes the current state plus available tools and outputs the next action, implementing ReAct, Chain-of-Thought, or custom prompting strategies.
- Tool layer: executable functions the agent can call: APIs, databases, file systems, or external services. Each tool has a schema describing its inputs and outputs.
This separation enables you to swap LLM providers, add new tools, or change orchestration logic without rewriting the entire system. It also makes testing dramatically easier—you can mock tools, test reasoning in isolation, or verify orchestration logic independently.
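To make the testability claim concrete, here is a toy version of the three layers in which the reasoning engine is a stub and the tool is an in-memory function, so the orchestration logic can be verified without an LLM or network. All names here (`Orchestrator`, `stub_reason`, `lookup`) are illustrative, not part of any framework.

```python
# Toy three-layer split: the orchestrator only knows it gets a reasoning
# callable and a dict of tools, so both can be swapped or mocked in tests.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Orchestrator:
    reason: Callable[[List[str]], Tuple[str, str]]  # state -> (tool, arg) or ("stop", answer)
    tools: Dict[str, Callable[[str], str]]
    trace: List[str] = field(default_factory=list)

    def run(self, goal: str, max_iters: int = 10) -> str:
        state = [goal]
        for _ in range(max_iters):
            tool, arg = self.reason(state)      # reasoning layer decides
            if tool == "stop":
                return arg
            obs = self.tools[tool](arg)         # tool layer executes
            self.trace.append(f"{tool}({arg}) -> {obs}")
            state.append(obs)
        return "iteration limit hit"

# Stub reasoning engine: call the fake tool once, then stop with the result.
def stub_reason(state):
    return ("lookup", "agents") if len(state) == 1 else ("stop", state[-1])

orch = Orchestrator(reason=stub_reason, tools={"lookup": lambda q: f"notes on {q}"})
assert orch.run("research agents") == "notes on agents"
assert orch.trace == ["lookup(agents) -> notes on agents"]
```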
Let's build a research agent that can search the web, read articles, and synthesize findings into a report. This demonstrates all three layers in action and introduces key patterns you'll use in every agent you build.
Tools are the agent's interface to the world. Each tool needs a clear name, description, and typed parameters. The LLM uses these descriptions to decide when and how to call each tool.
import os

import requests
from bs4 import BeautifulSoup
from langchain.tools import tool
from typing import List, Dict

@tool
def web_search(query: str, num_results: int = 5) -> List[Dict[str, str]]:
    """
    Search the web for information on a given topic.

    Args:
        query: The search query string
        num_results: Number of results to return (default: 5)

    Returns:
        List of dicts with 'title', 'url', and 'snippet' keys
    """
    # In production, use a real search API (Serper, Brave, etc.)
    api_key = os.getenv("SERPER_API_KEY")
    response = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": api_key},
        json={"q": query, "num": num_results}
    )
    results = response.json().get("organic", [])
    return [
        {
            "title": r.get("title"),
            "url": r.get("link"),
            "snippet": r.get("snippet", "")
        }
        for r in results
    ]
@tool
def read_webpage(url: str) -> str:
    """
    Fetch and extract main content from a webpage.

    Args:
        url: The URL to fetch

    Returns:
        Extracted text content from the page
    """
    try:
        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.content, 'html.parser')
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        # Get text and clean it
        text = soup.get_text()
        lines = (line.strip() for line in text.splitlines())
        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
        text = '\n'.join(chunk for chunk in chunks if chunk)
        # Truncate to avoid token limits
        return text[:8000]
    except Exception as e:
        return f"Error fetching {url}: {str(e)}"
@tool
def save_report(content: str, filename: str = "research_report.md") -> str:
    """
    Save research findings to a markdown file.

    Args:
        content: The report content in markdown format
        filename: Output filename (default: research_report.md)

    Returns:
        Confirmation message with file path
    """
    with open(filename, 'w') as f:
        f.write(content)
    return f"Report saved successfully to {filename}"

# Collect all tools
tools = [web_search, read_webpage, save_report]
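The names, docstrings, and type hints above are not decoration: they become the schema the LLM reads when deciding which tool to call. As a rough illustration of what a decorator like `@tool` derives from a function, here is a simplified stand-in (not LangChain's actual implementation):

```python
# Derive a tool schema from a function's signature and docstring.
# This is what (conceptually) gets serialized and shown to the model.
import inspect
from typing import get_type_hints

def tool_schema(fn):
    hints = get_type_hints(fn)
    params = {
        name: hints.get(name, str).__name__
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip().splitlines()[0],
        "parameters": params,
    }

def web_search(query: str, num_results: int = 5) -> list:
    """Search the web for information on a given topic."""
    ...

schema = tool_schema(web_search)
assert schema["name"] == "web_search"
assert schema["parameters"] == {"query": "str", "num_results": "int"}
print(schema["description"])  # → Search the web for information on a given topic.
```

A vague docstring or untyped parameter degrades this schema directly, which is why tool descriptions deserve as much care as prompts.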
The reasoning engine is your LLM configured with a system prompt that defines the agent's behavior, capabilities, and constraints. This is where you encode the agent's "personality" and decision-making strategy.
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4-turbo-preview",
    temperature=0.1,  # Low temperature for consistent reasoning
    streaming=True    # Enable streaming for real-time feedback
)

# Define system prompt
system_prompt = """You are a research agent specialized in gathering and synthesizing information from the web.

Your capabilities:
- Search the web for relevant information
- Read and analyze webpage content
- Synthesize findings into clear, well-structured reports

Your process:
1. Break down the research question into searchable sub-topics
2. Search for relevant sources (aim for 3-5 high-quality sources)
3. Read and extract key information from each source
4. Synthesize findings into a coherent report with citations
5. Save the final report to a file

Guidelines:
- Always cite your sources with URLs
- Verify information across multiple sources when possible
- If you encounter errors, try alternative approaches
- Be concise but thorough in your reports
- Use markdown formatting for readability

When you've completed the research and saved the report, respond with "RESEARCH COMPLETE" followed by a summary."""

# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create agent
agent = create_openai_functions_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)
The orchestration layer manages the agent's execution: calling the LLM, executing tools, handling errors, and deciding when to stop. LangChain's AgentExecutor provides this out of the box, but understanding what it does helps you customize behavior.
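To demystify what the executor does before configuring it, here is a stripped-down sketch of the loop it implements: call the model, run the chosen tool, feed the observation back, and stop on a final answer, an iteration cap, or a wall-clock timeout. The names `execute` and `step` and the scripted reasoning function are illustrative, not LangChain APIs.

```python
# Hand-rolled executor loop with the two safety limits discussed below:
# an iteration cap and a wall-clock timeout. step() stands in for one LLM turn.
import time
from typing import Callable, Dict, Tuple

def execute(step: Callable[[list], Tuple[str, str]],
            tools: Dict[str, Callable[[str], str]],
            max_iterations: int = 15,
            max_execution_time: float = 300.0) -> str:
    history: list = []
    deadline = time.monotonic() + max_execution_time
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            return "stopped: timeout"
        action, payload = step(history)           # one reasoning turn
        if action == "finish":
            return payload
        try:
            observation = tools[action](payload)  # tool call may fail
        except Exception as exc:
            observation = f"tool error: {exc}"    # surface errors back to the model
        history.append((action, payload, observation))
    return "stopped: max iterations"

# Scripted reasoning: search once, then finish with the last observation.
def scripted(history):
    if not history:
        return ("search", "multi-agent systems")
    return ("finish", history[-1][2])

result = execute(scripted, {"search": lambda q: f"3 articles about {q}"})
print(result)  # → 3 articles about multi-agent systems
```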
# Create executor with safety limits
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,                    # Print reasoning steps
    max_iterations=15,               # Prevent infinite loops
    max_execution_time=300,          # 5-minute timeout
    handle_parsing_errors=True,      # Gracefully handle LLM errors
    return_intermediate_steps=True   # Return full execution trace
)

# Run the agent
async def research(topic: str) -> Dict:
    """
    Execute research on a given topic.

    Args:
        topic: The research question or topic

    Returns:
        Dict with 'output' (final answer) and 'steps' (execution trace)
    """
    try:
        result = await agent_executor.ainvoke({
            "input": f"Research the following topic and create a comprehensive report: {topic}"
        })
        return {
            "output": result["output"],
            "steps": result.get("intermediate_steps", []),
            "success": True
        }
    except Exception as e:
        return {
            "output": f"Research failed: {str(e)}",
            "steps": [],
            "success": False
        }

# Example usage
if __name__ == "__main__":
    import asyncio

    result = asyncio.run(research(
        "What are the latest developments in multi-agent AI systems?"
    ))
    print("\n=== FINAL RESULT ===")
    print(result["output"])
    print(f"\nCompleted in {len(result['steps'])} steps")

Production Considerations
This example works for prototyping, but production agents need additional safeguards: persistent memory, robust error handling, human oversight for risky actions, and a scalable deployment architecture. The sections below cover each in turn.
The agent above is stateless—it forgets everything after each run. Real-world agents need memory to maintain context across interactions, learn from past experiences, and build on previous work.
LangChain provides several memory types. For most agents, ConversationBufferMemory (stores full history) or ConversationSummaryMemory (stores compressed summaries) work well. For long-running agents, consider vector store memory that retrieves relevant past interactions.
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory

# Option 1: In-memory (development)
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="output"
)

# Option 2: Persistent (production)
def get_session_history(session_id: str):
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379"
    )

memory = ConversationBufferMemory(
    chat_memory=get_session_history("user_123"),
    memory_key="chat_history",
    return_messages=True,
    output_key="output"
)

# Create agent with memory
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,  # Add memory here
    verbose=True,
    max_iterations=15
)

# Now the agent remembers previous interactions (run these inside an async function)
await agent_executor.ainvoke({
    "input": "Research quantum computing applications"
})
await agent_executor.ainvoke({
    "input": "Now compare those applications to classical computing"
})  # Agent remembers the previous research

Memory adds complexity—you need to manage session lifecycles, handle memory overflow (when history exceeds token limits), and decide what to remember vs. forget. For production systems, implement memory pruning strategies and consider using semantic search to retrieve only relevant past interactions.
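One possible pruning strategy for the overflow problem just described: keep the system message, then evict the oldest turns until the history fits a token budget. The 4-characters-per-token estimate below is a rough heuristic standing in for a real tokenizer, and the message format is the generic role/content dict, not any specific framework's type.

```python
# Sliding-window memory pruning: drop oldest non-system turns until the
# estimated token count of the history fits within a budget.
from typing import Dict, List

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic, not a real tokenizer

def prune_history(messages: List[Dict[str, str]], budget: int) -> List[Dict[str, str]]:
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    while turns and sum(estimate_tokens(m["content"]) for m in system + turns) > budget:
        turns.pop(0)  # evict the oldest non-system turn
    return system + turns

history = [
    {"role": "system", "content": "You are a research agent."},
    {"role": "user", "content": "Research quantum computing applications in depth."},
    {"role": "assistant", "content": "Here is a long report..." * 20},
    {"role": "user", "content": "Now compare to classical computing."},
]
pruned = prune_history(history, budget=30)
print([m["role"] for m in pruned])  # → ['system', 'user']
```

Summary memory replaces the evicted turns with an LLM-written digest instead of dropping them; vector-store memory retrieves only turns semantically relevant to the current input.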
Agents fail in unique ways: the LLM might hallucinate tool names, tools might return unexpected data, or the agent might get stuck in loops. Robust error handling is what separates prototypes from production systems.
Tool Hallucination
The LLM invents tool names or parameters that don't exist.
Solution: Set handle_parsing_errors=True and provide clear tool descriptions.
Infinite Loops
The agent repeats the same action without making progress.
Solution: Set max_iterations and implement loop detection in your orchestrator.
Tool Errors
External APIs fail, rate limits are hit, or network requests timeout.
Solution: Wrap tools in try/except, return error messages as strings, implement retries with exponential backoff.
Context Overflow
Agent history exceeds the LLM's context window, causing failures.
Solution: Use ConversationSummaryMemory or implement sliding window memory with pruning.
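The retry-with-exponential-backoff pattern recommended for tool errors above can be packaged as a decorator. In this sketch the delay is `base_delay * 2**attempt` and the sleep function is injectable so the behavior is testable without actually waiting; the names are illustrative.

```python
# Retry decorator with exponential backoff for flaky tool calls.
import time
from functools import wraps

def with_retries(max_attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise                         # out of attempts: propagate
                    sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        return wrapper
    return decorator

calls = {"n": 0}

@with_retries(max_attempts=3, sleep=lambda s: None)  # no real sleeping in the demo
def flaky_search(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("rate limited")
    return f"results for {query}"

print(flaky_search("AI agents"))  # → results for AI agents
```

Wrapping tools this way keeps transient API failures from surfacing to the agent as hard errors, while still failing loudly once the attempt budget is exhausted.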
# 1. Enable verbose logging
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,  # Prints every step
    return_intermediate_steps=True
)

# 2. Use LangSmith for tracing (production)
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your_key"
os.environ["LANGCHAIN_PROJECT"] = "research-agent"

# 3. Add custom callbacks for monitoring
from langchain.callbacks import StdOutCallbackHandler

class CustomCallback(StdOutCallbackHandler):
    def on_tool_start(self, serialized, input_str, **kwargs):
        print(f"\n🔧 Calling tool: {serialized['name']}")
        print(f"   Input: {input_str}")

    def on_tool_end(self, output, **kwargs):
        print(f"   Output: {output[:200]}...")

    def on_llm_error(self, error, **kwargs):
        print(f"\n❌ LLM Error: {error}")

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[CustomCallback()],
    verbose=True
)

# 4. Implement health checks
async def health_check():
    """Test agent with a simple query to verify it's working."""
    try:
        result = await agent_executor.ainvoke({
            "input": "What is 2+2? Answer with just the number."
        })
        return "4" in result["output"]
    except Exception as e:
        print(f"Health check failed: {e}")
        return False
For high-stakes decisions, pause agent execution and request human approval before proceeding. This is critical for agents that modify production systems, spend money, or interact with customers.
@tool
async def execute_database_query(query: str) -> str:
    """
    Execute a SQL query on the production database.
    REQUIRES HUMAN APPROVAL.
    """
    # Show query to human
    print(f"\n⚠️ APPROVAL REQUIRED ⚠️")
    print(f"Agent wants to execute: {query}")
    approval = input("Approve? (yes/no): ")

    if approval.lower() != "yes":
        return "Query rejected by human operator"

    # Execute query (db is your async database client)
    result = await db.execute(query)
    return f"Query executed successfully: {result}"

# Add to tools list
tools.append(execute_database_query)

Complex tasks often benefit from multiple specialized agents working together. One agent might handle research, another writes code, and a third reviews quality. Frameworks like CrewAI and AutoGen specialize in this pattern.
# Simple multi-agent pattern with LangChain
# (create_agent here is a helper that wraps agent creation + an executor)
researcher_agent = create_agent(
    llm=llm,
    tools=[web_search, read_webpage],
    system_prompt="You are a research specialist..."
)

writer_agent = create_agent(
    llm=llm,
    tools=[save_report],
    system_prompt="You are a technical writer..."
)

# Orchestrate agents
async def collaborative_research(topic: str):
    # Step 1: Research phase
    research_result = await researcher_agent.ainvoke({
        "input": f"Gather information on: {topic}"
    })

    # Step 2: Writing phase
    report_result = await writer_agent.ainvoke({
        "input": f"Write a report based on this research: {research_result['output']}"
    })

    return report_result["output"]

Instead of free-form text, force agents to return structured data (JSON, Pydantic models) for downstream processing. This is essential when agents feed into other systems or APIs.
from pydantic import BaseModel, Field
from typing import List

class ResearchReport(BaseModel):
    """Structured research report output."""
    title: str = Field(description="Report title")
    summary: str = Field(description="Executive summary (2-3 sentences)")
    key_findings: List[str] = Field(description="List of key findings")
    sources: List[str] = Field(description="List of source URLs")
    confidence: float = Field(description="Confidence score 0-1")

# Use with OpenAI function calling
llm_with_structure = llm.with_structured_output(ResearchReport)

# An agent built on llm_with_structure returns typed objects
result = await agent_executor.ainvoke({"input": "Research AI agents"})
report: ResearchReport = result["output"]
print(f"Confidence: {report.confidence}")
print(f"Found {len(report.sources)} sources")
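Pydantic is the usual choice, but the underlying idea (parse the model's reply, then validate fields and ranges before anything downstream consumes them) needs nothing beyond the standard library. This sketch validates a made-up model reply with a dataclass; the payload and helper names are illustrative.

```python
# Parse and validate a JSON reply from the model before downstream use.
import json
from dataclasses import dataclass, fields
from typing import List

@dataclass
class ResearchReport:
    title: str
    summary: str
    key_findings: List[str]
    sources: List[str]
    confidence: float

def parse_report(raw: str) -> ResearchReport:
    data = json.loads(raw)
    expected = {f.name for f in fields(ResearchReport)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    report = ResearchReport(**{k: data[k] for k in expected})
    if not 0.0 <= report.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return report

llm_reply = '''{"title": "AI Agents", "summary": "Short overview.",
  "key_findings": ["Agents use tools"], "sources": ["https://example.com"],
  "confidence": 0.8}'''
report = parse_report(llm_reply)
print(report.title, len(report.sources))  # → AI Agents 1
```

The key point either way: reject malformed output at the boundary, so a hallucinated or truncated reply fails fast instead of propagating into other systems.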
Moving from prototype to production requires addressing scalability, reliability, and observability. Here's a production-ready architecture pattern used by companies running agents at scale.
- API gateway (e.g., POST /api/research): FastAPI or Flask endpoint that receives requests, validates inputs, and queues agent tasks. Returns task IDs for async processing.
- Task queue (Redis + Celery): Celery or RQ for async task processing. Handles retries, rate limiting, and worker scaling. Critical for long-running agents.
- Agent workers (Kubernetes pods): Containerized agent executors (Docker/K8s) that pull tasks from the queue, execute agents, and store results. Scale horizontally.
- Observability (LangSmith + Prometheus): LangSmith, Datadog, or custom logging for tracing agent executions, monitoring costs, and debugging failures.

# api.py - FastAPI endpoint
from fastapi import FastAPI, BackgroundTasks
from celery_app import research_task

app = FastAPI()

@app.post("/api/research")
async def create_research_task(
    topic: str,
    background_tasks: BackgroundTasks
):
    # Queue task
    task = research_task.delay(topic)
    return {
        "task_id": task.id,
        "status": "queued",
        "message": "Research task started"
    }

@app.get("/api/research/{task_id}")
async def get_research_status(task_id: str):
    task = research_task.AsyncResult(task_id)
    if task.ready():
        return {
            "status": "complete",
            "result": task.result
        }
    else:
        return {
            "status": "processing",
            "progress": task.info.get("progress", 0)
        }

# celery_app.py - Worker
import asyncio

from celery import Celery
from agent import research  # Your agent code

celery = Celery(
    "tasks",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1"
)

@celery.task(bind=True, max_retries=3)
def research_task(self, topic: str):
    try:
        # Update progress
        self.update_state(
            state="PROGRESS",
            meta={"progress": 0, "status": "Starting research..."}
        )
        # Run agent
        result = asyncio.run(research(topic))
        return result
    except Exception as e:
        # Retry with exponential backoff
        raise self.retry(exc=e, countdown=60 * (2 ** self.request.retries))

# docker-compose.yml
version: '3.8'
services:
  redis:
    image: redis:alpine
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    deploy:
      replicas: 3  # Scale workers

You now have the foundation to build production-grade AI agents. Here's how to continue your learning journey:
- Browse 42+ agent frameworks in our directory. Compare features, see real-world examples, and find the right tool for your use case.
- Read our foundational article on AI agents to understand the theory behind autonomous systems and multi-agent architectures.
- Connect with other agent developers, share your projects, and get help troubleshooting. The Agent Town Square community is here to support you (coming soon).
- Discover how the ATS Protocol enables secure, zero-trust multi-agent orchestration for enterprise deployments (in testing, launching Feb 2026).