The Claude Agent SDK gives you programmatic access to the same agent loop that powers Claude Code. Your agents can read files, execute shell commands, search the web, edit code, call external APIs through MCP servers, and orchestrate sub-agents - all from a few lines of TypeScript or Python.
Unlike the standard Anthropic Client SDK where you build your own tool loop, the Agent SDK handles tool execution, context management, retries, and orchestration internally. You describe what you want, provide the tools, and the agent figures out the rest.
Architecture
The SDK follows a simple loop: gather context, take action, verify, repeat.
The core entry point is query(), which returns an async iterator that streams messages as the agent works. Each message tells you what the agent is doing: reasoning, calling a tool, receiving a result, or delivering the final output.
Getting Started
Installation
```bash
# TypeScript
npm install @anthropic-ai/claude-agent-sdk

# Python
pip install claude-agent-sdk
```
You need an Anthropic API key set as ANTHROPIC_API_KEY in your environment.
Your First Agent
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

const conversation = query({
  prompt: "Find all TODO comments in the codebase and create a summary",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
  },
});

for await (const message of conversation) {
  if (message.type === "assistant") {
    process.stdout.write(message.content);
  }
  if (message.type === "result" && message.subtype === "success") {
    console.log("\nDone:", message.result);
  }
}
```
That's it. The agent will use Glob to find files, Grep to search for TODO patterns, Read to inspect matches, and return a structured summary. You don't write the orchestration logic - the SDK handles it.
Python Equivalent
```python
from claude_agent_sdk import query

async for message in query(
    prompt="Find all TODO comments in the codebase and create a summary",
    options={"allowed_tools": ["Read", "Glob", "Grep"]},
):
    if message.type == "assistant":
        print(message.content, end="")
    if message.type == "result" and message.subtype == "success":
        print(f"\nDone: {message.result}")
```
Built-in Tools
The SDK ships with the same tools available in Claude Code:
| Tool | Description |
|---|---|
| Read | Read file contents |
| Write | Create new files |
| Edit | Make targeted edits to existing files |
| Bash | Execute shell commands |
| Glob | Find files by pattern |
| Grep | Search file contents with regex |
| WebSearch | Search the web |
| WebFetch | Fetch a URL and return its contents |
| AskUserQuestion | Prompt the user for input |
You control which tools the agent can use through allowedTools. If a tool is not in the list, the agent cannot call it.
Permission Modes
Since agents execute real commands on real systems, permissions matter.
| Mode | Behavior | Use Case |
|---|---|---|
| default | Custom canUseTool callback decides per-call | Fine-grained control |
| acceptEdits | Auto-approve file operations, prompt for Bash | Development workflows |
| dontAsk | Deny anything not in allowedTools | Restricted agents |
| bypassPermissions | Approve everything automatically | Trusted sandboxed environments |
| auto | Model classifier decides safety | Balanced automation |
```typescript
const conversation = query({
  prompt: "Refactor the auth module to use JWT",
  options: {
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
    permissionMode: "acceptEdits",
  },
});
```
For production use, always run agents in sandboxed environments (containers, VMs) and use the most restrictive permission mode that still allows the agent to do its job.
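The "default" mode's per-call control comes from a canUseTool callback. Here is a minimal sketch; the result shape (allow with updatedInput, or deny with a message) is an assumption to verify against your SDK version's types:

```typescript
// Sketch of a per-call permission callback for "default" mode.
// Assumed result shape: { behavior: "allow", updatedInput } or
// { behavior: "deny", message } — check your SDK version's types.
type PermissionResult =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

async function canUseTool(
  toolName: string,
  input: Record<string, unknown>
): Promise<PermissionResult> {
  // Block obviously destructive shell commands, allow everything else
  if (toolName === "Bash") {
    const cmd = String(input.command ?? "");
    if (/\brm\b|\bsudo\b|\bmkfs\b/.test(cmd)) {
      return { behavior: "deny", message: `Blocked dangerous command: ${cmd}` };
    }
  }
  return { behavior: "allow", updatedInput: input };
}
```

Pass the callback in options alongside your allowedTools; a deny result blocks that single call without ending the conversation.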
Building Custom Tools with MCP
The real power of the SDK comes from extending agents with your own tools. Custom tools are defined as in-process MCP servers - no subprocess management, no network overhead.
Example: Weather Tool
```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const getTemperature = tool(
  "get_temperature",
  "Get the current temperature at a location",
  {
    latitude: z.number().describe("Latitude"),
    longitude: z.number().describe("Longitude"),
  },
  async ({ latitude, longitude }) => {
    const res = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m&temperature_unit=celsius`
    );
    const data = await res.json();
    return {
      content: [
        {
          type: "text",
          text: `Current temperature: ${data.current.temperature_2m}°C`,
        },
      ],
    };
  }
);

const weatherServer = createSdkMcpServer({
  name: "weather",
  version: "1.0.0",
  tools: [getTemperature],
});

for await (const message of query({
  prompt: "What's the weather like in Rome?",
  options: {
    mcpServers: { weather: weatherServer },
    allowedTools: ["mcp__weather__get_temperature"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```
Custom tools follow the naming convention mcp__{server_name}__{tool_name}. You can use wildcards in allowedTools: "mcp__weather__*" allows all tools from the weather server.
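Since the convention is mechanical, it can be wrapped in small string helpers. These functions are illustrative, not part of the SDK:

```typescript
// Helpers for the mcp__{server_name}__{tool_name} convention.
// Not part of the SDK — just string builders for allowedTools entries.
function mcpToolName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

function mcpWildcard(serverName: string): string {
  return `mcp__${serverName}__*`;
}

// mcpToolName("weather", "get_temperature") → "mcp__weather__get_temperature"
// mcpWildcard("weather") → "mcp__weather__*"
```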
Example: Database Query Tool
```typescript
import { Pool } from "pg"; // assumes a PostgreSQL connection pool

const pool = new Pool();

const queryDb = tool(
  "query_database",
  "Run a read-only SQL query against the application database",
  {
    sql: z.string().describe("SQL SELECT query to execute"),
  },
  async ({ sql }) => {
    // Validate: only allow SELECT queries
    if (!sql.trim().toUpperCase().startsWith("SELECT")) {
      return {
        content: [{ type: "text", text: "Error: Only SELECT queries are allowed." }],
      };
    }
    const result = await pool.query(sql);
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(result.rows, null, 2),
        },
      ],
    };
  }
);
```
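A bare startsWith("SELECT") check is a thin guard: it rejects legitimate WITH ... SELECT queries and does nothing about stacked statements such as SELECT 1; DROP TABLE users. A slightly stronger sketch, still no substitute for connecting with a read-only database role:

```typescript
// Illustrative read-only guard — not a SQL parser. For real safety,
// connect with a database role that only has SELECT privileges.
function isReadOnlyQuery(sql: string): boolean {
  const trimmed = sql.trim().replace(/;\s*$/, ""); // tolerate one trailing semicolon
  if (trimmed.includes(";")) return false; // reject stacked statements
  const firstWord = trimmed.split(/\s+/)[0]?.toUpperCase();
  return firstWord === "SELECT" || firstWord === "WITH";
}
```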
Connecting External MCP Servers
Beyond in-process tools, you can connect to any existing MCP server - the same servers that work with Claude Desktop, Cursor, and other MCP clients.
```typescript
for await (const message of query({
  prompt: "Check the latest issues in the frontend repo and summarize them",
  options: {
    mcpServers: {
      github: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-github"],
        env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN },
      },
    },
    allowedTools: ["mcp__github__*"],
  },
})) {
  // ...
}
```
You can combine multiple MCP servers. The agent sees all tools from all connected servers and uses them as needed.
Multi-Agent Orchestration
For complex workflows, you can define specialized sub-agents that the parent agent delegates to. Each sub-agent has its own prompt, tools, and focus area.
```typescript
for await (const message of query({
  prompt: "Review the PR, check for security issues, and update the changelog",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep", "Agent"],
    agents: [
      {
        name: "security-reviewer",
        description: "Reviews code for security vulnerabilities",
        prompt: "You are a security expert. Analyze code for OWASP Top 10 vulnerabilities.",
        allowedTools: ["Read", "Glob", "Grep"],
      },
      {
        name: "changelog-writer",
        description: "Updates the CHANGELOG.md file based on recent changes",
        prompt: "You maintain the project changelog. Follow Keep a Changelog format.",
        allowedTools: ["Read", "Edit", "Bash"],
      },
    ],
  },
})) {
  // The parent agent will:
  // 1. Read the PR diff
  // 2. Delegate security review to security-reviewer
  // 3. Delegate changelog update to changelog-writer
  // 4. Synthesize results
}
```
Include "Agent" in the parent's allowedTools to enable delegation. Sub-agents run with their own tools and cannot access the parent's tools unless explicitly granted.
Sessions and Continuity
Agents can maintain context across multiple queries using sessions. Capture the session_id from the first interaction and pass it in resume for subsequent queries.
```typescript
let sessionId: string | undefined;

// First query
for await (const message of query({
  prompt: "Read the project structure and understand the architecture",
  options: { allowedTools: ["Read", "Glob", "Grep"] },
})) {
  if (message.type === "init") {
    sessionId = message.session_id;
  }
}

// Follow-up query (same session, full context preserved)
for await (const message of query({
  prompt: "Now refactor the auth module based on what you learned",
  resume: sessionId,
  options: { allowedTools: ["Read", "Edit", "Bash"] },
})) {
  // Agent remembers the full project context from the first query
}
```
Claude Managed Agents
If you don't want to host the agent infrastructure yourself, Claude Managed Agents (launched April 2026) provides a fully managed cloud service. Anthropic runs the containers, handles scaling, and provides a streaming API.
The key difference: with the Agent SDK, you run the agent loop in your own infrastructure. With Managed Agents, Anthropic hosts and runs the agent for you. You interact through a session-based API and receive events via Server-Sent Events.
Pricing:
- Agent SDK: standard Claude API token rates only. You handle hosting.
- Managed Agents: token rates plus $0.08 per session-hour (billed per millisecond).
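Under those numbers, the infrastructure portion of a session's cost is simple arithmetic. A sketch assuming the $0.08/session-hour rate quoted above (token charges come on top):

```typescript
// Estimate the Managed Agents infrastructure charge for one session.
// Assumes the $0.08/session-hour rate above; tokens are billed separately.
function sessionInfraCost(durationMs: number, ratePerHour = 0.08): number {
  return (durationMs / 3_600_000) * ratePerHour;
}

// A 30-minute session: sessionInfraCost(30 * 60 * 1000) → $0.04
```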
Production Best Practices
1. Always Sandbox
Never run agents with unrestricted permissions on a production machine. Use containers (Docker, Fly.io, Modal) or sandboxed environments (E2B, Vercel Sandbox).
2. Limit Tool Access
Follow the principle of least privilege. An agent that generates reports does not need Bash or Write.
```typescript
// Too permissive
allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]

// Better: only what's needed
allowedTools: ["Read", "Glob", "Grep"]
```
3. Use Hooks for Guardrails
Hooks let you intercept tool calls before and after execution. Use them for logging, validation, and rate limiting.
```typescript
const conversation = query({
  prompt: "Analyze the codebase",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
    hooks: {
      PreToolUse: async (toolName, input) => {
        console.log(`Tool call: ${toolName}`, input);
        // Return false to block the call
        if (toolName === "Bash" && input.command.includes("rm")) {
          return false;
        }
        return true;
      },
    },
  },
});
```
4. Handle Errors Gracefully
The agent loop can produce errors - tool failures, API rate limits, context window overflow. Always check message types.
```typescript
for await (const message of conversation) {
  switch (message.type) {
    case "assistant":
      // Agent reasoning
      break;
    case "tool_use":
      // Agent is calling a tool
      break;
    case "result":
      if (message.subtype === "error") {
        console.error("Agent failed:", message.error);
      }
      break;
  }
}
```
5. Monitor Token Usage
Agent loops can consume significant tokens, especially with large codebases. The SDK includes automatic context compaction, but you should still monitor usage.
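A lightweight way to monitor is to sum usage as messages stream by. The sketch below assumes messages expose a usage object with input_tokens/output_tokens fields, mirroring the Anthropic Messages API; verify the exact field names against your SDK version:

```typescript
// Accumulate token counts across streamed messages. Field names are
// assumed to mirror the Anthropic Messages API's usage object —
// verify against your SDK version.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

function addUsage(total: Usage, delta?: Usage): Usage {
  if (!delta) return total; // not every message carries usage
  return {
    input_tokens: total.input_tokens + delta.input_tokens,
    output_tokens: total.output_tokens + delta.output_tokens,
  };
}
```

In the message loop, fold each message's usage into a running total and alert or abort the query once the total crosses your budget.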
Conclusion
The Claude Agent SDK turns an LLM from a question-answering machine into something closer to a junior developer. Your agents can read, write, execute, verify, and iterate - the same workflow a human follows.
Start small: build an agent with a few built-in tools. Then add custom MCP tools for your specific domain. Scale up to multi-agent orchestration when your workflows require specialization.
The agent loop is the same one that powers Claude Code. If it can build software, your agents can too.
Getting Started Checklist:
- Install the SDK (npm install @anthropic-ai/claude-agent-sdk)
- Set ANTHROPIC_API_KEY in your environment
- Build a simple agent with built-in tools (Read, Glob, Grep)
- Add a custom tool via in-process MCP server
- Connect an external MCP server (GitHub, PostgreSQL, etc.)
- Implement multi-agent orchestration with sub-agents
- Set up a sandboxed environment for production
- Add hooks for logging and guardrails