How to Build AI Agents with LangChain: Complete 2026 Tutorial

Mouhssine Lakhili
January 30, 2026 · 12 min read

Step-by-step tutorial to build production-ready AI agents with LangChain. From setup to deployment with tools, memory, evaluation, and error handling.


What You'll Build

By the end of this tutorial, you'll have a working AI agent that can:

  • Search the web for real-time information
  • Query a database for structured data
  • Maintain conversation memory across sessions
  • Handle errors gracefully with retry logic
  • Deploy to production with proper monitoring

This is not a "hello world" demo. This is a production-ready agent pattern you can adapt for real applications.

"The best way to learn AI agents is to build one that solves a real problem."

Prerequisites:

  • Python 3.10+
  • Basic understanding of LLMs and prompts
  • OpenAI API key (or Anthropic/other providers)
  • 30 minutes of focused time

Step 1: Project Setup

First, create your project structure and install dependencies.

# Create project directory
mkdir langchain-agent && cd langchain-agent

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install langchain langchain-openai langchain-community
pip install python-dotenv tavily-python chromadb

Create your .env file for API keys:

# .env
OPENAI_API_KEY=your-openai-key
TAVILY_API_KEY=your-tavily-key  # For web search

Create the main project structure:

langchain-agent/
├── .env
├── agent.py           # Main agent logic
├── tools.py           # Custom tools
├── memory.py          # Memory configuration
├── evaluation.py      # Testing and evaluation
└── requirements.txt

Step 2: Your First Agent

Let's build a simple agent that can answer questions using web search.

# agent.py
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import PromptTemplate

load_dotenv()

# Initialize the LLM
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    max_tokens=1000
)

# Initialize tools
search_tool = TavilySearchResults(
    max_results=3,
    search_depth="advanced"
)
tools = [search_tool]

# Define the agent prompt
template = """You are a helpful AI assistant with access to web search.

Answer the user's question using the tools available. Always cite your sources.

You have access to the following tools:
{tools}

Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
{agent_scratchpad}
"""

prompt = PromptTemplate.from_template(template)

# Create the agent
agent = create_react_agent(llm, tools, prompt)

# Create the executor with error handling
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)

# Test the agent
if __name__ == "__main__":
    result = agent_executor.invoke({
        "input": "What are the latest developments in AI agents as of January 2026?"
    })
    print(result["output"])

Run your agent:

python agent.py

You should see the agent reasoning through the problem, searching the web, and providing a sourced answer.
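If you're curious what the LLM actually receives at each step, you can render the ReAct template by hand. Plain `str.format` stands in for `PromptTemplate` here, and the tool description string is a placeholder for illustration; the agent fills `agent_scratchpad` with the running Thought/Action/Observation trace on every iteration:

```python
# A condensed copy of the ReAct template from agent.py, rendered with
# plain str.format so you can inspect the final prompt text.
TEMPLATE = """You are a helpful AI assistant with access to web search.

You have access to the following tools:
{tools}

Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action

Question: {input}
{agent_scratchpad}"""

rendered = TEMPLATE.format(
    tools="tavily_search_results_json: Search the web for current information.",
    tool_names="tavily_search_results_json",
    input="What are the latest developments in AI agents?",
    agent_scratchpad="",  # grows with each Thought/Action/Observation cycle
)
print(rendered)
```

Seeing the rendered prompt makes the parsing errors in Step 6 easier to debug: the agent fails exactly when the LLM's reply stops matching this format.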

Step 3: Add Custom Tools

Real agents need custom tools for specific tasks. Let's add a database query tool.

# tools.py
from langchain.tools import tool
from typing import Optional
import json

# Simulated database (replace with real database connection)
MOCK_DATABASE = {
    "users": [
        {"id": 1, "name": "Alice", "role": "engineer", "projects": 5},
        {"id": 2, "name": "Bob", "role": "designer", "projects": 3},
        {"id": 3, "name": "Charlie", "role": "manager", "projects": 8},
    ],
    "projects": [
        {"id": 1, "name": "AI Dashboard", "status": "active", "team_size": 4},
        {"id": 2, "name": "Data Pipeline", "status": "completed", "team_size": 3},
        {"id": 3, "name": "Mobile App", "status": "planning", "team_size": 2},
    ]
}

@tool
def query_database(query: str) -> str:
    """
    Query the internal database for user and project information.

    Args:
        query: Natural language query like "list all engineers" or "active projects"

    Returns:
        JSON string with query results
    """
    query_lower = query.lower()

    # Simple query routing (in production, use SQL or proper query parsing)
    if "user" in query_lower or "engineer" in query_lower or "designer" in query_lower:
        if "engineer" in query_lower:
            results = [u for u in MOCK_DATABASE["users"] if u["role"] == "engineer"]
        elif "designer" in query_lower:
            results = [u for u in MOCK_DATABASE["users"] if u["role"] == "designer"]
        else:
            results = MOCK_DATABASE["users"]
        return json.dumps({"type": "users", "count": len(results), "data": results})

    if "project" in query_lower:
        if "active" in query_lower:
            results = [p for p in MOCK_DATABASE["projects"] if p["status"] == "active"]
        elif "completed" in query_lower:
            results = [p for p in MOCK_DATABASE["projects"] if p["status"] == "completed"]
        else:
            results = MOCK_DATABASE["projects"]
        return json.dumps({"type": "projects", "count": len(results), "data": results})

    return json.dumps({"error": "Query not understood. Try asking about users or projects."})


@tool
def calculate_metrics(data_type: str) -> str:
    """
    Calculate metrics for users or projects.

    Args:
        data_type: Either "users" or "projects"

    Returns:
        JSON string with calculated metrics
    """
    if data_type == "users":
        users = MOCK_DATABASE["users"]
        total_projects = sum(u["projects"] for u in users)
        avg_projects = total_projects / len(users)
        return json.dumps({
            "total_users": len(users),
            "total_projects_assigned": total_projects,
            "avg_projects_per_user": round(avg_projects, 2),
            "roles": list(set(u["role"] for u in users))
        })

    if data_type == "projects":
        projects = MOCK_DATABASE["projects"]
        total_team = sum(p["team_size"] for p in projects)
        statuses = {}
        for p in projects:
            statuses[p["status"]] = statuses.get(p["status"], 0) + 1
        return json.dumps({
            "total_projects": len(projects),
            "total_team_members": total_team,
            "status_breakdown": statuses
        })

    return json.dumps({"error": "Unknown data type. Use 'users' or 'projects'."})
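Under the `@tool` decorator these are plain functions, so you can unit-test the logic before the agent ever touches it. Here's a standalone check of the metrics math, with the decorator omitted so it runs without LangChain installed (data copied from `MOCK_DATABASE`):

```python
import json

# Same user records as MOCK_DATABASE["users"] in tools.py
users = [
    {"id": 1, "name": "Alice", "role": "engineer", "projects": 5},
    {"id": 2, "name": "Bob", "role": "designer", "projects": 3},
    {"id": 3, "name": "Charlie", "role": "manager", "projects": 8},
]

def user_metrics(users: list) -> dict:
    """Mirror of the calculate_metrics 'users' branch, minus the @tool wrapper."""
    total = sum(u["projects"] for u in users)
    return {
        "total_users": len(users),
        "total_projects_assigned": total,
        "avg_projects_per_user": round(total / len(users), 2),
    }

metrics = user_metrics(users)
print(json.dumps(metrics))  # 16 projects across 3 users -> avg 5.33
```

Catching a bad branch here is far cheaper than discovering it mid-conversation when the agent quietly returns wrong numbers.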

Update your agent to use these tools:

# Updated agent.py
from tools import query_database, calculate_metrics

# Add custom tools to the tools list
tools = [search_tool, query_database, calculate_metrics]

Step 4: Implement Memory

Agents without memory forget everything between conversations. Let's add persistent memory.

# memory.py
from langchain.memory import ConversationBufferWindowMemory
from langchain_community.chat_message_histories import SQLChatMessageHistory

def create_memory(session_id: str, window_size: int = 10):
    """
    Create a memory instance with conversation history.

    Args:
        session_id: Unique identifier for the conversation
        window_size: Number of recent messages to keep in context

    Returns:
        Configured memory instance
    """
    # Use SQLite for persistent storage (can use PostgreSQL, Redis, etc.)
    message_history = SQLChatMessageHistory(
        session_id=session_id,
        connection_string="sqlite:///chat_history.db"
    )

    memory = ConversationBufferWindowMemory(
        chat_memory=message_history,
        k=window_size,
        return_messages=True,
        memory_key="chat_history"
    )

    return memory


def clear_memory(session_id: str):
    """Clear conversation history for a session."""
    message_history = SQLChatMessageHistory(
        session_id=session_id,
        connection_string="sqlite:///chat_history.db"
    )
    message_history.clear()
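The window behavior is easy to reason about: only the last `k` exchanges stay in context, and older ones fall off the front. A plain-Python sketch of the same idea using `collections.deque` (no LangChain required; the message tuples here are just illustrative):

```python
from collections import deque

# Keep only the last window_size exchanges, like
# ConversationBufferWindowMemory with k=window_size.
window_size = 3
window = deque(maxlen=window_size * 2)  # 2 messages per exchange

for turn in range(5):
    window.append(("human", f"question {turn}"))
    window.append(("ai", f"answer {turn}"))

# Only exchanges 2, 3, and 4 survive; 0 and 1 were evicted.
print(list(window))
```

This is also why windowed memory bounds your token costs: the prompt can never grow beyond `2 * k` messages of history.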

Update your agent to use memory:

# agent_with_memory.py
from memory import create_memory

# Create memory for this session
memory = create_memory(session_id="user_123")

# Update the prompt to include chat history
template_with_memory = """You are a helpful AI assistant with access to web search and internal databases.

Previous conversation:
{chat_history}

Answer the user's question using the tools available. Always cite your sources.

You have access to the following tools:
{tools}

Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
{agent_scratchpad}
"""

# Create executor with memory
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)

Step 5: Add RAG (Retrieval-Augmented Generation)

For agents that need to answer questions from documents, add RAG capabilities.

# rag.py
from langchain_community.document_loaders import TextLoader, PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.tools import tool

# Initialize embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = None

def initialize_vectorstore(documents_path: str):
    """Load documents and create vector store."""
    global vectorstore

    # Load documents (adjust loader based on file type)
    loader = TextLoader(documents_path)
    documents = loader.load()

    # Split into chunks
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200
    )
    splits = text_splitter.split_documents(documents)

    # Create vector store
    vectorstore = Chroma.from_documents(
        documents=splits,
        embedding=embeddings,
        persist_directory="./chroma_db"
    )
    return vectorstore


@tool
def search_documents(query: str) -> str:
    """
    Search internal documents for relevant information.

    Args:
        query: The search query

    Returns:
        Relevant document excerpts
    """
    if vectorstore is None:
        return "Document store not initialized."

    results = vectorstore.similarity_search(query, k=3)

    if not results:
        return "No relevant documents found."

    formatted_results = []
    for i, doc in enumerate(results, 1):
        formatted_results.append(f"[{i}] {doc.page_content[:500]}...")

    return "\n\n".join(formatted_results)
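`chunk_size` and `chunk_overlap` are the two knobs that matter most for retrieval quality. To build intuition for what they do, here's a simplified fixed-stride chunker; the real `RecursiveCharacterTextSplitter` additionally tries to break on paragraph and sentence boundaries rather than at exact character offsets:

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Fixed-stride chunking: each chunk shares chunk_overlap characters
    with the previous one, so content at boundaries isn't lost."""
    stride = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), stride)]

doc = "x" * 2500
chunks = chunk_text(doc)
print(len(chunks), [len(c) for c in chunks])
```

Larger overlap means more duplication in the vector store (higher cost) but less risk of splitting an answer across two chunks; 10–20% of `chunk_size` is a common starting point.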

Step 6: Error Handling and Retry Logic

Production agents need robust error handling.

# error_handling.py
import time
from functools import wraps
from typing import Callable, Any

def retry_with_backoff(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    exceptions: tuple = (Exception,)
):
    """
    Decorator for retry logic with exponential backoff.
    """
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            retries = 0
            while retries < max_retries:
                try:
                    return func(*args, **kwargs)
                except exceptions as e:
                    retries += 1
                    if retries == max_retries:
                        raise e

                    delay = min(base_delay * (2 ** retries), max_delay)
                    print(f"Attempt {retries} failed: {e}. Retrying in {delay}s...")
                    time.sleep(delay)

            return None
        return wrapper
    return decorator


class AgentError(Exception):
    """Base exception for agent errors."""
    pass


class ToolError(AgentError):
    """Error during tool execution."""
    pass


class AgentMemoryError(AgentError):
    """Error with memory operations (named to avoid shadowing Python's built-in MemoryError)."""
    pass


def safe_tool_execution(tool_func: Callable, *args, **kwargs) -> dict:
    """
    Safely execute a tool with error capture.
    """
    try:
        result = tool_func(*args, **kwargs)
        return {"success": True, "result": result}
    except Exception as e:
        return {
            "success": False,
            "error": str(e),
            "error_type": type(e).__name__
        }
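Here's the retry decorator in action against a flaky function that fails twice before succeeding. This is a condensed copy of `retry_with_backoff` above so the demo runs standalone, with delays shrunk so it finishes instantly:

```python
import time
from functools import wraps

def retry_with_backoff(max_retries=3, base_delay=0.01, max_delay=1.0,
                       exceptions=(Exception,)):
    """Condensed copy of the decorator above, for a standalone run."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            retries = 0
            while retries < max_retries:
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    retries += 1
                    if retries == max_retries:
                        raise
                    time.sleep(min(base_delay * (2 ** retries), max_delay))
        return wrapper
    return decorator

attempts = []

@retry_with_backoff(max_retries=3)
def flaky_search(query: str) -> str:
    # Simulate a transient network failure on the first two calls
    attempts.append(query)
    if len(attempts) < 3:
        raise ConnectionError("transient network error")
    return f"results for {query!r}"

result = flaky_search("langchain agents")
print(result, "after", len(attempts), "attempts")
```

In production you'd narrow `exceptions` to transient errors only (timeouts, rate limits); retrying an authentication failure three times just wastes time.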

Step 7: Evaluation and Testing

Never ship an agent without testing. Here's how to evaluate your agent.

# evaluation.py
import json
from typing import List, Dict
from dataclasses import dataclass

@dataclass
class TestCase:
    input: str
    expected_action: str  # Expected tool to be called
    expected_keywords: List[str]  # Keywords expected in output
    min_confidence: float = 0.7

TEST_CASES = [
    TestCase(
        input="How many engineers do we have?",
        expected_action="query_database",
        expected_keywords=["engineer", "user"]
    ),
    TestCase(
        input="What is the latest news about LangChain?",
        expected_action="tavily_search",
        expected_keywords=["LangChain"]
    ),
    TestCase(
        input="Calculate user metrics",
        expected_action="calculate_metrics",
        expected_keywords=["total", "average"]
    ),
]


def run_evaluation(agent_executor, test_cases: List[TestCase]) -> Dict:
    """
    Run evaluation suite on the agent.
    """
    results = {
        "total": len(test_cases),
        "passed": 0,
        "failed": 0,
        "details": []
    }

    for i, test in enumerate(test_cases):
        print(f"\nRunning test {i+1}/{len(test_cases)}: {test.input[:50]}...")

        try:
            response = agent_executor.invoke({"input": test.input})
            output = response.get("output", "").lower()

            # Check if expected keywords are present
            keywords_found = sum(
                1 for kw in test.expected_keywords
                if kw.lower() in output
            )
            keyword_score = keywords_found / len(test.expected_keywords)

            passed = keyword_score >= test.min_confidence

            results["details"].append({
                "test": test.input,
                "passed": passed,
                "keyword_score": keyword_score,
                "output_preview": output[:200]
            })

            if passed:
                results["passed"] += 1
            else:
                results["failed"] += 1

        except Exception as e:
            results["failed"] += 1
            results["details"].append({
                "test": test.input,
                "passed": False,
                "error": str(e)
            })

    results["pass_rate"] = results["passed"] / results["total"]
    return results


if __name__ == "__main__":
    from agent import agent_executor

    results = run_evaluation(agent_executor, TEST_CASES)
    print(f"\n{'='*50}")
    print(f"Evaluation Results:")
    print(f"Pass Rate: {results['pass_rate']*100:.1f}%")
    print(f"Passed: {results['passed']}/{results['total']}")

Step 8: Deploy to Production

Here's a simple FastAPI deployment.

# api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional
import uvicorn

from agent import agent_executor
from memory import create_memory, clear_memory

app = FastAPI(title="AI Agent API")

class QueryRequest(BaseModel):
    query: str
    session_id: Optional[str] = "default"

class QueryResponse(BaseModel):
    response: str
    session_id: str

@app.post("/query", response_model=QueryResponse)
async def query_agent(request: QueryRequest):
    """Send a query to the AI agent."""
    try:
        result = agent_executor.invoke({
            "input": request.query
        })
        return QueryResponse(
            response=result["output"],
            session_id=request.session_id
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/clear-memory/{session_id}")
async def clear_session_memory(session_id: str):
    """Clear conversation memory for a session."""
    clear_memory(session_id)
    return {"message": f"Memory cleared for session {session_id}"}

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Deploy with:

# Local testing
python api.py

# Production (use gunicorn)
pip install gunicorn
gunicorn api:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8000

Common Errors and Fixes

Error: "Agent stopped due to max iterations"

Cause: The agent is stuck in a loop or taking too many steps.

Fix: Increase max_iterations or improve your prompt to be more specific:

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    max_iterations=10  # Raise from the 5 we set earlier
)

Note: with agents built via create_react_agent, leave early_stopping_method at its default "force"; the "generate" option only works with the legacy agent classes and raises an error here.

Error: "Could not parse LLM output"

Cause: The LLM response doesn't match the expected format.

Fix: Enable error handling and add better prompt instructions:

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    handle_parsing_errors=True,  # Auto-retry on parse errors
    verbose=True
)

Error: "Rate limit exceeded"

Cause: Too many API calls in a short time.

Fix: Add rate limiting and caching:

from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache

# Enable caching
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

Error: "Context length exceeded"

Cause: Too much text in the prompt (memory + tools + query).

Fix: Use windowed memory and summarization:

from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(
    llm=llm,
    max_token_limit=2000,  # Summarize when exceeding this
    return_messages=True
)

FAQ

Q: Which LLM should I use?

For agents, use models with strong reasoning: GPT-4o, Claude 3.5 Sonnet, or Gemini 1.5 Pro. Avoid smaller models for complex agent tasks.

Q: How do I reduce costs?

  1. Cache responses with set_llm_cache()
  2. Use smaller models for simple tools
  3. Limit memory window size
  4. Set max_iterations appropriately

Q: Can I use local models?

Yes! Use Ollama or vLLM:

from langchain_community.llms import Ollama
llm = Ollama(model="llama3.1:70b")

Q: How do I add authentication?

Use FastAPI middleware:

from fastapi import Depends, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
import os

security = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
    # Compare against a token from the environment (swap in your real auth check)
    if credentials.credentials != os.environ.get("API_TOKEN"):
        raise HTTPException(status_code=401, detail="Invalid token")

@app.post("/query")
async def query_agent(request: QueryRequest, _=Depends(verify_token)):
    ...

Complete GitHub Template

Get the full code with additional features:

git clone https://github.com/LMouhssine/langchain-agent-template
cd langchain-agent-template
pip install -r requirements.txt
cp .env.example .env
# Add your API keys to .env
python agent.py

What's Next?

You now have a production-ready AI agent. Here's where to go from here:

  1. Add more tools - Connect to your specific APIs and databases
  2. Implement guardrails - Add safety checks for sensitive operations
  3. Set up monitoring - Track usage, errors, and costs
  4. Build a UI - Connect to a frontend or Slack/Discord

If this tutorial helped you build your first agent, share it with another developer. The best way to learn is to build.

Build with AI and ship with confidence

Need a developer who can turn ideas into production work?

I help teams ship React, Next.js, Node.js, AI, and automation work with clear scope, practical guardrails, and fast execution.
