A hands-on technical guide to building your first autonomous AI agent—from architecture patterns to production deployment.
Building an AI agent requires understanding three core concepts: perception (how agents observe their environment), reasoning (how they make decisions), and action (how they execute tasks). Modern agents leverage large language models (LLMs) as their reasoning engine, combined with tool-calling capabilities to interact with external systems.
This tutorial walks through building a research agent using the ReAct pattern (Reasoning + Acting), which has become the de facto standard for LLM-based agents. We'll use LangChain as our framework, but the principles apply to any agent framework including AutoGen, CrewAI, or custom implementations.
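Before introducing any framework, it helps to see that the ReAct loop itself fits in a few lines. The sketch below is a framework-free illustration: the "LLM" is a scripted stand-in, so no API key is needed, and the `search` tool, the `Thought:`/`Action:`/`Final Answer:` markers, and the crude parsing are simplifying assumptions, not LangChain's actual message format.

```python
# Minimal ReAct loop with a scripted stand-in for the LLM.
# Reasoning emits "Action: tool[input]" lines; acting runs the tool and
# appends an Observation; the loop stops on "Final Answer:" or a step cap.
from typing import Callable, Dict

def react_loop(llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)                    # reasoning step
        transcript += "\n" + reply
        if reply.startswith("Final Answer:"):      # termination condition
            return reply.removeprefix("Final Answer:").strip()
        # Expect "Action: tool_name[input]" on the reply's last line
        action = reply.splitlines()[-1].removeprefix("Action: ")
        name, arg = action.split("[", 1)
        observation = tools[name](arg.rstrip("]")) # acting step
        transcript += f"\nObservation: {observation}"
    return "Stopped: step limit reached"

# Scripted model: first turn searches, second turn answers.
replies = iter([
    "Thought: I should look this up.\nAction: search[LangChain]",
    "Final Answer: LangChain is an agent framework.",
])
demo_tools = {"search": lambda q: f"Top result for {q}"}
print(react_loop(lambda prompt: next(replies), demo_tools, "What is LangChain?"))
# → LangChain is an agent framework.
```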
Prerequisites
This tutorial assumes familiarity with Python, async/await patterns, and basic LLM concepts. You'll need Python 3.9+, an OpenAI API key (or compatible provider), and 30 minutes of focused time.
Production-grade agents follow a three-layer architecture that separates concerns and enables testing, debugging, and scaling. Understanding this architecture is crucial before writing any code.
- Orchestration layer: manages the agent loop (perception → reasoning → action → observation) and handles retries, error recovery, and termination conditions.
- Reasoning engine: the LLM that decides what to do next. It takes the current state plus available tools and outputs the next action, implementing ReAct, Chain-of-Thought, or custom prompting strategies.
- Tool layer: executable functions the agent can call: APIs, databases, file systems, or external services. Each tool has a schema describing its inputs and outputs.
This separation enables you to swap LLM providers, add new tools, or change orchestration logic without rewriting the entire system. It also makes testing dramatically easier—you can mock tools, test reasoning in isolation, or verify orchestration logic independently.
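To make the testability claim concrete, here is a toy version of the three layers in which the reasoning engine is a stub and the tool is an in-memory function, so the orchestration logic can be verified without an LLM or network. All names here (`Orchestrator`, `stub_reason`, `lookup`) are illustrative, not part of any framework.

```python
# Toy three-layer split: the orchestrator only knows it gets a reasoning
# callable and a dict of tools, so both can be swapped or mocked in tests.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class Orchestrator:
    reason: Callable[[List[str]], Tuple[str, str]]  # state -> (tool, arg) or ("stop", answer)
    tools: Dict[str, Callable[[str], str]]
    trace: List[str] = field(default_factory=list)

    def run(self, goal: str, max_iters: int = 10) -> str:
        state = [goal]
        for _ in range(max_iters):
            tool, arg = self.reason(state)      # reasoning layer decides
            if tool == "stop":
                return arg
            obs = self.tools[tool](arg)         # tool layer executes
            self.trace.append(f"{tool}({arg}) -> {obs}")
            state.append(obs)
        return "iteration limit hit"

# Stub reasoning engine: call the fake tool once, then stop with the result.
def stub_reason(state):
    return ("lookup", "agents") if len(state) == 1 else ("stop", state[-1])

orch = Orchestrator(reason=stub_reason, tools={"lookup": lambda q: f"notes on {q}"})
assert orch.run("research agents") == "notes on agents"
assert orch.trace == ["lookup(agents) -> notes on agents"]
```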
Let's build a research agent that can search the web, read articles, and synthesize findings into a report. This demonstrates all three layers in action and introduces key patterns you'll use in every agent you build.
Tools are the agent's interface to the world. Each tool needs a clear name, description, and typed parameters. The LLM uses these descriptions to decide when and how to call each tool.
import os

import requests
from bs4 import BeautifulSoup
from langchain.tools import tool
from typing import List, Dict

@tool
def web_search(query: str, num_results: int = 5) -> List[Dict[str, str]]:
    """
    Search the web for information on a given topic.

    Args:
        query: The search query string
        num_results: Number of results to return (default: 5)

    Returns:
        List of dicts with 'title', 'url', and 'snippet' keys
    """
    # In production, use a real search API (Serper, Brave, etc.)
    api_key = os.getenv("SERPER_API_KEY")
    response = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": api_key},
        json={"q": query, "num": num_results}
    )
    results = response.json().get("organic", [])
    return [
        {
            "title": r.get("title"),
            "url": r.get("link"),
            "snippet": r.get("snippet", "")
        }
        for r in results
    ]
@tool
def read_webpage(url: str) -> str:
    """
    Fetch and extract main content from a webpage.

    Args:
        url: The URL to fetch

    Returns:
        Extracted text content from the page
    """
    try:
        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.content, 'html.parser')
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        # Get text and clean it
        text = soup.get_text()
        lines = (line.strip() for line in text.splitlines())
        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
        text = '\n'.join(chunk for chunk in chunks if chunk)
        # Truncate to avoid token limits
        return text[:8000]
    except Exception as e:
        return f"Error fetching {url}: {str(e)}"
@tool
def save_report(content: str, filename: str = "research_report.md") -> str:
    """
    Save research findings to a markdown file.

    Args:
        content: The report content in markdown format
        filename: Output filename (default: research_report.md)

    Returns:
        Confirmation message with file path
    """
    with open(filename, 'w') as f:
        f.write(content)
    return f"Report saved successfully to {filename}"

# Collect all tools
tools = [web_search, read_webpage, save_report]
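The names, docstrings, and type hints above are not decoration: they become the schema the LLM reads when deciding which tool to call. As a rough illustration of what a decorator like `@tool` derives from a function, here is a simplified stand-in (not LangChain's actual implementation):

```python
# Derive a tool schema from a function's signature and docstring.
# This is what (conceptually) gets serialized and shown to the model.
import inspect
from typing import get_type_hints

def tool_schema(fn):
    hints = get_type_hints(fn)
    params = {
        name: hints.get(name, str).__name__
        for name in inspect.signature(fn).parameters
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip().splitlines()[0],
        "parameters": params,
    }

def web_search(query: str, num_results: int = 5) -> list:
    """Search the web for information on a given topic."""
    ...

schema = tool_schema(web_search)
assert schema["name"] == "web_search"
assert schema["parameters"] == {"query": "str", "num_results": "int"}
print(schema["description"])  # → Search the web for information on a given topic.
```

A vague docstring or untyped parameter degrades this schema directly, which is why tool descriptions deserve as much care as prompts.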
The reasoning engine is your LLM configured with a system prompt that defines the agent's behavior, capabilities, and constraints. This is where you encode the agent's "personality" and decision-making strategy.
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4-turbo-preview",
    temperature=0.1,  # Low temperature for consistent reasoning
    streaming=True    # Enable streaming for real-time feedback
)

# Define system prompt
system_prompt = """You are a research agent specialized in gathering and synthesizing information from the web.

Your capabilities:
- Search the web for relevant information
- Read and analyze webpage content
- Synthesize findings into clear, well-structured reports

Your process:
1. Break down the research question into searchable sub-topics
2. Search for relevant sources (aim for 3-5 high-quality sources)
3. Read and extract key information from each source
4. Synthesize findings into a coherent report with citations
5. Save the final report to a file

Guidelines:
- Always cite your sources with URLs
- Verify information across multiple sources when possible
- If you encounter errors, try alternative approaches
- Be concise but thorough in your reports
- Use markdown formatting for readability

When you've completed the research and saved the report, respond with "RESEARCH COMPLETE" followed by a summary."""

# Create prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create agent
agent = create_openai_functions_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)
The orchestration layer manages the agent's execution: calling the LLM, executing tools, handling errors, and deciding when to stop. LangChain's AgentExecutor provides this out of the box, but understanding what it does helps you customize behavior.
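To demystify what the executor does before configuring it, here is a stripped-down sketch of the loop it implements: call the model, run the chosen tool, feed the observation back, and stop on a final answer, an iteration cap, or a wall-clock timeout. The names `execute` and `step` and the scripted reasoning function are illustrative, not LangChain APIs.

```python
# Hand-rolled executor loop with the two safety limits discussed below:
# an iteration cap and a wall-clock timeout. step() stands in for one LLM turn.
import time
from typing import Callable, Dict, Tuple

def execute(step: Callable[[list], Tuple[str, str]],
            tools: Dict[str, Callable[[str], str]],
            max_iterations: int = 15,
            max_execution_time: float = 300.0) -> str:
    history: list = []
    deadline = time.monotonic() + max_execution_time
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            return "stopped: timeout"
        action, payload = step(history)           # one reasoning turn
        if action == "finish":
            return payload
        try:
            observation = tools[action](payload)  # tool call may fail
        except Exception as exc:
            observation = f"tool error: {exc}"    # surface errors back to the model
        history.append((action, payload, observation))
    return "stopped: max iterations"

# Scripted reasoning: search once, then finish with the last observation.
def scripted(history):
    if not history:
        return ("search", "multi-agent systems")
    return ("finish", history[-1][2])

result = execute(scripted, {"search": lambda q: f"3 articles about {q}"})
print(result)  # → 3 articles about multi-agent systems
```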
# Create executor with safety limits
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,                    # Print reasoning steps
    max_iterations=15,               # Prevent infinite loops
    max_execution_time=300,          # 5-minute timeout
    handle_parsing_errors=True,      # Gracefully handle LLM errors
    return_intermediate_steps=True   # Return full execution trace
)

# Run the agent
async def research(topic: str) -> Dict:
    """
    Execute research on a given topic.

    Args:
        topic: The research question or topic

    Returns:
        Dict with 'output' (final answer) and 'steps' (execution trace)
    """
    try:
        result = await agent_executor.ainvoke({
            "input": f"Research the following topic and create a comprehensive report: {topic}"
        })
        return {
            "output": result["output"],
            "steps": result.get("intermediate_steps", []),
            "success": True
        }
    except Exception as e:
        return {
            "output": f"Research failed: {str(e)}",
            "steps": [],
            "success": False
        }

# Example usage
if __name__ == "__main__":
    import asyncio

    result = asyncio.run(research(
        "What are the latest developments in multi-agent AI systems?"
    ))
    print("\n=== FINAL RESULT ===")
    print(result["output"])
    print(f"\nCompleted in {len(result['steps'])} steps")

Production Considerations
This example works for prototyping, but production agents need additional safeguards: persistent memory, robust error handling, human oversight for risky actions, and a scalable deployment architecture. The sections below cover each in turn.
The agent above is stateless—it forgets everything after each run. Real-world agents need memory to maintain context across interactions, learn from past experiences, and build on previous work.
LangChain provides several memory types. For most agents, ConversationBufferMemory (stores full history) or ConversationSummaryMemory (stores compressed summaries) work well. For long-running agents, consider vector store memory that retrieves relevant past interactions.
from langchain.memory import ConversationBufferMemory
from langchain_community.chat_message_histories import RedisChatMessageHistory

# Option 1: In-memory (development)
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="output"
)

# Option 2: Persistent (production)
def get_session_history(session_id: str):
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379"
    )

memory = ConversationBufferMemory(
    chat_memory=get_session_history("user_123"),
    memory_key="chat_history",
    return_messages=True,
    output_key="output"
)

# Create agent with memory
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,  # Add memory here
    verbose=True,
    max_iterations=15
)

# Now the agent remembers previous interactions (run these inside an async function)
await agent_executor.ainvoke({
    "input": "Research quantum computing applications"
})
await agent_executor.ainvoke({
    "input": "Now compare those applications to classical computing"
})  # Agent remembers the previous research

Memory adds complexity—you need to manage session lifecycles, handle memory overflow (when history exceeds token limits), and decide what to remember vs. forget. For production systems, implement memory pruning strategies and consider using semantic search to retrieve only relevant past interactions.
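One possible pruning strategy for the overflow problem just described: keep the system message, then evict the oldest turns until the history fits a token budget. The 4-characters-per-token estimate below is a rough heuristic standing in for a real tokenizer, and the message format is the generic role/content dict, not any specific framework's type.

```python
# Sliding-window memory pruning: drop oldest non-system turns until the
# estimated token count of the history fits within a budget.
from typing import Dict, List

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic, not a real tokenizer

def prune_history(messages: List[Dict[str, str]], budget: int) -> List[Dict[str, str]]:
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    while turns and sum(estimate_tokens(m["content"]) for m in system + turns) > budget:
        turns.pop(0)  # evict the oldest non-system turn
    return system + turns

history = [
    {"role": "system", "content": "You are a research agent."},
    {"role": "user", "content": "Research quantum computing applications in depth."},
    {"role": "assistant", "content": "Here is a long report..." * 20},
    {"role": "user", "content": "Now compare to classical computing."},
]
pruned = prune_history(history, budget=30)
print([m["role"] for m in pruned])  # → ['system', 'user']
```

Summary memory replaces the evicted turns with an LLM-written digest instead of dropping them; vector-store memory retrieves only turns semantically relevant to the current input.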
Agents fail in unique ways: the LLM might hallucinate tool names, tools might return unexpected data, or the agent might get stuck in loops. Robust error handling is what separates prototypes from production systems.
Tool Hallucination
The LLM invents tool names or parameters that don't exist.
Solution: Set handle_parsing_errors=True and provide clear tool descriptions.
Infinite Loops
The agent repeats the same action without making progress.
Solution: Set max_iterations and implement loop detection in your orchestrator.
Tool Errors
External APIs fail, rate limits are hit, or network requests timeout.
Solution: Wrap tools in try/except, return error messages as strings, implement retries with exponential backoff.
Context Overflow
Agent history exceeds the LLM's context window, causing failures.
Solution: Use ConversationSummaryMemory or implement sliding window memory with pruning.
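The retry-with-exponential-backoff pattern recommended for tool errors above can be packaged as a decorator. In this sketch the delay is `base_delay * 2**attempt` and the sleep function is injectable so the behavior is testable without actually waiting; the names are illustrative.

```python
# Retry decorator with exponential backoff for flaky tool calls.
import time
from functools import wraps

def with_retries(max_attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise                         # out of attempts: propagate
                    sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
        return wrapper
    return decorator

calls = {"n": 0}

@with_retries(max_attempts=3, sleep=lambda s: None)  # no real sleeping in the demo
def flaky_search(query: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("rate limited")
    return f"results for {query}"

print(flaky_search("AI agents"))  # → results for AI agents
```

Wrapping tools this way keeps transient API failures from surfacing to the agent as hard errors, while still failing loudly once the attempt budget is exhausted.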
# 1. Enable verbose logging
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,  # Prints every step
    return_intermediate_steps=True
)

# 2. Use LangSmith for tracing (production)
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your_key"
os.environ["LANGCHAIN_PROJECT"] = "research-agent"

# 3. Add custom callbacks for monitoring
from langchain.callbacks import StdOutCallbackHandler

class CustomCallback(StdOutCallbackHandler):
    def on_tool_start(self, serialized, input_str, **kwargs):
        print(f"\n🔧 Calling tool: {serialized['name']}")
        print(f"   Input: {input_str}")

    def on_tool_end(self, output, **kwargs):
        print(f"   Output: {output[:200]}...")

    def on_llm_error(self, error, **kwargs):
        print(f"\n❌ LLM Error: {error}")

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[CustomCallback()],
    verbose=True
)

# 4. Implement health checks
async def health_check():
    """Test agent with a simple query to verify it's working."""
    try:
        result = await agent_executor.ainvoke({
            "input": "What is 2+2? Answer with just the number."
        })
        return "4" in result["output"]
    except Exception as e:
        print(f"Health check failed: {e}")
        return False
For high-stakes decisions, pause agent execution and request human approval before proceeding. This is critical for agents that modify production systems, spend money, or interact with customers.
@tool
async def execute_database_query(query: str) -> str:
    """
    Execute a SQL query on the production database.
    REQUIRES HUMAN APPROVAL.
    """
    # Show query to human
    print(f"\n⚠️ APPROVAL REQUIRED ⚠️")
    print(f"Agent wants to execute: {query}")
    approval = input("Approve? (yes/no): ")

    if approval.lower() != "yes":
        return "Query rejected by human operator"

    # Execute query (db is your async database client)
    result = await db.execute(query)
    return f"Query executed successfully: {result}"

# Add to tools list
tools.append(execute_database_query)

Complex tasks often benefit from multiple specialized agents working together. One agent might handle research, another writes code, and a third reviews quality. Frameworks like CrewAI and AutoGen specialize in this pattern.
# Simple multi-agent pattern with LangChain
# (create_agent here is a helper that wraps agent creation + an executor)
researcher_agent = create_agent(
    llm=llm,
    tools=[web_search, read_webpage],
    system_prompt="You are a research specialist..."
)

writer_agent = create_agent(
    llm=llm,
    tools=[save_report],
    system_prompt="You are a technical writer..."
)

# Orchestrate agents
async def collaborative_research(topic: str):
    # Step 1: Research phase
    research_result = await researcher_agent.ainvoke({
        "input": f"Gather information on: {topic}"
    })

    # Step 2: Writing phase
    report_result = await writer_agent.ainvoke({
        "input": f"Write a report based on this research: {research_result['output']}"
    })

    return report_result["output"]

Instead of free-form text, force agents to return structured data (JSON, Pydantic models) for downstream processing. This is essential when agents feed into other systems or APIs.
from pydantic import BaseModel, Field
from typing import List

class ResearchReport(BaseModel):
    """Structured research report output."""
    title: str = Field(description="Report title")
    summary: str = Field(description="Executive summary (2-3 sentences)")
    key_findings: List[str] = Field(description="List of key findings")
    sources: List[str] = Field(description="List of source URLs")
    confidence: float = Field(description="Confidence score 0-1")

# Use with OpenAI function calling
llm_with_structure = llm.with_structured_output(ResearchReport)

# An agent built on llm_with_structure returns typed objects
result = await agent_executor.ainvoke({"input": "Research AI agents"})
report: ResearchReport = result["output"]
print(f"Confidence: {report.confidence}")
print(f"Found {len(report.sources)} sources")
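Pydantic is the usual choice, but the underlying idea (parse the model's reply, then validate fields and ranges before anything downstream consumes them) needs nothing beyond the standard library. This sketch validates a made-up model reply with a dataclass; the payload and helper names are illustrative.

```python
# Parse and validate a JSON reply from the model before downstream use.
import json
from dataclasses import dataclass, fields
from typing import List

@dataclass
class ResearchReport:
    title: str
    summary: str
    key_findings: List[str]
    sources: List[str]
    confidence: float

def parse_report(raw: str) -> ResearchReport:
    data = json.loads(raw)
    expected = {f.name for f in fields(ResearchReport)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    report = ResearchReport(**{k: data[k] for k in expected})
    if not 0.0 <= report.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return report

llm_reply = '''{"title": "AI Agents", "summary": "Short overview.",
  "key_findings": ["Agents use tools"], "sources": ["https://example.com"],
  "confidence": 0.8}'''
report = parse_report(llm_reply)
print(report.title, len(report.sources))  # → AI Agents 1
```

The key point either way: reject malformed output at the boundary, so a hallucinated or truncated reply fails fast instead of propagating into other systems.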
Moving from prototype to production requires addressing scalability, reliability, and observability. Here's a production-ready architecture pattern used by companies running agents at scale.
- API gateway (e.g., POST /api/research): FastAPI or Flask endpoint that receives requests, validates inputs, and queues agent tasks. Returns task IDs for async processing.
- Task queue (Redis + Celery): Celery or RQ for async task processing. Handles retries, rate limiting, and worker scaling. Critical for long-running agents.
- Agent workers (Kubernetes pods): Containerized agent executors (Docker/K8s) that pull tasks from the queue, execute agents, and store results. Scale horizontally.
- Observability (LangSmith + Prometheus): LangSmith, Datadog, or custom logging for tracing agent executions, monitoring costs, and debugging failures.

# api.py - FastAPI endpoint
from fastapi import FastAPI, BackgroundTasks
from celery_app import research_task

app = FastAPI()

@app.post("/api/research")
async def create_research_task(
    topic: str,
    background_tasks: BackgroundTasks
):
    # Queue task
    task = research_task.delay(topic)
    return {
        "task_id": task.id,
        "status": "queued",
        "message": "Research task started"
    }

@app.get("/api/research/{task_id}")
async def get_research_status(task_id: str):
    task = research_task.AsyncResult(task_id)
    if task.ready():
        return {
            "status": "complete",
            "result": task.result
        }
    else:
        return {
            "status": "processing",
            "progress": task.info.get("progress", 0)
        }

# celery_app.py - Worker
import asyncio

from celery import Celery
from agent import research  # Your agent code

celery = Celery(
    "tasks",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1"
)

@celery.task(bind=True, max_retries=3)
def research_task(self, topic: str):
    try:
        # Update progress
        self.update_state(
            state="PROGRESS",
            meta={"progress": 0, "status": "Starting research..."}
        )
        # Run agent
        result = asyncio.run(research(topic))
        return result
    except Exception as e:
        # Retry with exponential backoff
        raise self.retry(exc=e, countdown=60 * (2 ** self.request.retries))

# docker-compose.yml
version: '3.8'
services:
  redis:
    image: redis:alpine
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  worker:
    build: .
    command: celery -A celery_app worker --loglevel=info
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    deploy:
      replicas: 3  # Scale workers

You now have the foundation to build production-grade AI agents. Here's how to continue your learning journey:
- Browse 42+ agent frameworks in our directory. Compare features, see real-world examples, and find the right tool for your use case.
- Read our foundational article on AI agents to understand the theory behind autonomous systems and multi-agent architectures.
- Connect with other agent developers, share your projects, and get help troubleshooting. The Agent Town Square community is here to support you (coming soon).
- Discover how the ATS Protocol enables secure, zero-trust multi-agent orchestration for enterprise deployments (in testing, launching Feb 2026).