The Claude Agent SDK gives you programmatic access to the same agent loop that powers Claude Code. Your agents can read files, execute shell commands, search the web, edit code, call external APIs through MCP servers, and orchestrate sub-agents - all from a few lines of TypeScript or Python.
Unlike the standard Anthropic Client SDK where you build your own tool loop, the Agent SDK handles tool execution, context management, retries, and orchestration internally. You describe what you want, provide the tools, and the agent figures out the rest.
Architecture
The SDK follows a simple loop: gather context, take action, verify, repeat.
The core entry point is query(), which returns an async iterator that streams messages as the agent works. Each message tells you what the agent is doing: reasoning, calling a tool, receiving a result, or delivering the final output.
Getting Started
Installation
```bash
# TypeScript
npm install @anthropic-ai/claude-agent-sdk

# Python
pip install claude-agent-sdk
```
You need an Anthropic API key set as ANTHROPIC_API_KEY in your environment.
Your First Agent
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

const conversation = query({
  prompt: "Find all TODO comments in the codebase and create a summary",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
  },
});

for await (const message of conversation) {
  if (message.type === "assistant") {
    process.stdout.write(message.content);
  }
  if (message.type === "result" && message.subtype === "success") {
    console.log("\nDone:", message.result);
  }
}
```
That's it. The agent will use Glob to find files, Grep to search for TODO patterns, Read to inspect matches, and return a structured summary. You don't write the orchestration logic - the SDK handles it.
Python Equivalent
```python
from claude_agent_sdk import query

async for message in query(
    prompt="Find all TODO comments in the codebase and create a summary",
    options={"allowed_tools": ["Read", "Glob", "Grep"]},
):
    if message.type == "assistant":
        print(message.content, end="")
    if message.type == "result" and message.subtype == "success":
        print(f"\nDone: {message.result}")
```
Built-in Tools
The SDK ships with the same tools available in Claude Code:
| Tool | Description |
|---|---|
| Read | Read file contents |
| Write | Create new files |
| Edit | Make targeted edits to existing files |
| Bash | Execute shell commands |
| Glob | Find files by pattern |
| Grep | Search file contents with regex |
| WebSearch | Search the web |
| WebFetch | Fetch a URL and return its contents |
| AskUserQuestion | Prompt the user for input |
You control which tools the agent can use through allowedTools. If a tool is not in the list, the agent cannot call it.
Permission Modes
Since agents execute real commands on real systems, permissions matter.
| Mode | Behavior | Use Case |
|---|---|---|
| default | Custom canUseTool callback decides per-call | Fine-grained control |
| acceptEdits | Auto-approve file operations, prompt for Bash | Development workflows |
| dontAsk | Deny anything not in allowedTools | Restricted agents |
| bypassPermissions | Approve everything automatically | Trusted sandboxed environments |
| auto | Model classifier decides safety | Balanced automation |
```typescript
const conversation = query({
  prompt: "Refactor the auth module to use JWT",
  options: {
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
    permissionMode: "acceptEdits",
  },
});
```
For production use, always run agents in sandboxed environments (containers, VMs) and use the most restrictive permission mode that still allows the agent to do its job.
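The "default" mode's per-call control comes from a canUseTool callback. Here is a minimal sketch; the result shape (allow with updatedInput, or deny with a message) is an assumption to verify against your SDK version's types:

```typescript
// Sketch of a per-call permission callback for "default" mode.
// Assumed result shape: { behavior: "allow", updatedInput } or
// { behavior: "deny", message } — check your SDK version's types.
type PermissionResult =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

async function canUseTool(
  toolName: string,
  input: Record<string, unknown>
): Promise<PermissionResult> {
  // Block obviously destructive shell commands, allow everything else
  if (toolName === "Bash") {
    const cmd = String(input.command ?? "");
    if (/\brm\b|\bsudo\b|\bmkfs\b/.test(cmd)) {
      return { behavior: "deny", message: `Blocked dangerous command: ${cmd}` };
    }
  }
  return { behavior: "allow", updatedInput: input };
}
```

Pass the callback in options alongside your allowedTools; a deny result blocks that single call without ending the conversation.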
Building Custom Tools with MCP
The real power of the SDK comes from extending agents with your own tools. Custom tools are defined as in-process MCP servers - no subprocess management, no network overhead.
Example: Weather Tool
```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const getTemperature = tool(
  "get_temperature",
  "Get the current temperature at a location",
  {
    latitude: z.number().describe("Latitude"),
    longitude: z.number().describe("Longitude"),
  },
  async ({ latitude, longitude }) => {
    const res = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m&temperature_unit=celsius`
    );
    const data = await res.json();
    return {
      content: [
        {
          type: "text",
          text: `Current temperature: ${data.current.temperature_2m}°C`,
        },
      ],
    };
  }
);

const weatherServer = createSdkMcpServer({
  name: "weather",
  version: "1.0.0",
  tools: [getTemperature],
});

for await (const message of query({
  prompt: "What's the weather like in Rome?",
  options: {
    mcpServers: { weather: weatherServer },
    allowedTools: ["mcp__weather__get_temperature"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```
Custom tools follow the naming convention mcp__{server_name}__{tool_name}. You can use wildcards in allowedTools: "mcp__weather__*" allows all tools from the weather server.
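Since the convention is mechanical, it can be wrapped in small string helpers. These functions are illustrative, not part of the SDK:

```typescript
// Helpers for the mcp__{server_name}__{tool_name} convention.
// Not part of the SDK — just string builders for allowedTools entries.
function mcpToolName(serverName: string, toolName: string): string {
  return `mcp__${serverName}__${toolName}`;
}

function mcpWildcard(serverName: string): string {
  return `mcp__${serverName}__*`;
}

// mcpToolName("weather", "get_temperature") → "mcp__weather__get_temperature"
// mcpWildcard("weather") → "mcp__weather__*"
```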
Example: Database Query Tool
```typescript
import { Pool } from "pg"; // assumes a PostgreSQL connection pool

const pool = new Pool();

const queryDb = tool(
  "query_database",
  "Run a read-only SQL query against the application database",
  {
    sql: z.string().describe("SQL SELECT query to execute"),
  },
  async ({ sql }) => {
    // Validate: only allow SELECT queries
    if (!sql.trim().toUpperCase().startsWith("SELECT")) {
      return {
        content: [{ type: "text", text: "Error: Only SELECT queries are allowed." }],
      };
    }
    const result = await pool.query(sql);
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(result.rows, null, 2),
        },
      ],
    };
  }
);
```
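A bare startsWith("SELECT") check is a thin guard: it rejects legitimate WITH ... SELECT queries and does nothing about stacked statements such as SELECT 1; DROP TABLE users. A slightly stronger sketch, still no substitute for connecting with a read-only database role:

```typescript
// Illustrative read-only guard — not a SQL parser. For real safety,
// connect with a database role that only has SELECT privileges.
function isReadOnlyQuery(sql: string): boolean {
  const trimmed = sql.trim().replace(/;\s*$/, ""); // tolerate one trailing semicolon
  if (trimmed.includes(";")) return false; // reject stacked statements
  const firstWord = trimmed.split(/\s+/)[0]?.toUpperCase();
  return firstWord === "SELECT" || firstWord === "WITH";
}
```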
Connecting External MCP Servers
Beyond in-process tools, you can connect to any existing MCP server - the same servers that work with Claude Desktop, Cursor, and other MCP clients.
```typescript
for await (const message of query({
  prompt: "Check the latest issues in the frontend repo and summarize them",
  options: {
    mcpServers: {
      github: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-github"],
        env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN },
      },
    },
    allowedTools: ["mcp__github__*"],
  },
})) {
  // ...
}
```
You can combine multiple MCP servers. The agent sees all tools from all connected servers and uses them as needed.
Multi-Agent Orchestration
For complex workflows, you can define specialized sub-agents that the parent agent delegates to. Each sub-agent has its own prompt, tools, and focus area.
```typescript
for await (const message of query({
  prompt: "Review the PR, check for security issues, and update the changelog",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep", "Agent"],
    agents: [
      {
        name: "security-reviewer",
        description: "Reviews code for security vulnerabilities",
        prompt: "You are a security expert. Analyze code for OWASP Top 10 vulnerabilities.",
        allowedTools: ["Read", "Glob", "Grep"],
      },
      {
        name: "changelog-writer",
        description: "Updates the CHANGELOG.md file based on recent changes",
        prompt: "You maintain the project changelog. Follow Keep a Changelog format.",
        allowedTools: ["Read", "Edit", "Bash"],
      },
    ],
  },
})) {
  // The parent agent will:
  // 1. Read the PR diff
  // 2. Delegate security review to security-reviewer
  // 3. Delegate changelog update to changelog-writer
  // 4. Synthesize results
}
```
Include "Agent" in the parent's allowedTools to enable delegation. Sub-agents run with their own tools and cannot access the parent's tools unless explicitly granted.
Sessions and Continuity
Agents can maintain context across multiple queries using sessions. Capture the session_id from the first interaction and pass it in resume for subsequent queries.
```typescript
let sessionId: string | undefined;

// First query
for await (const message of query({
  prompt: "Read the project structure and understand the architecture",
  options: { allowedTools: ["Read", "Glob", "Grep"] },
})) {
  if (message.type === "init") {
    sessionId = message.session_id;
  }
}

// Follow-up query (same session, full context preserved)
for await (const message of query({
  prompt: "Now refactor the auth module based on what you learned",
  resume: sessionId,
  options: { allowedTools: ["Read", "Edit", "Bash"] },
})) {
  // Agent remembers the full project context from the first query
}
```
Claude Managed Agents
If you don't want to host the agent infrastructure yourself, Claude Managed Agents (launched April 2026) provides a fully managed cloud service. Anthropic runs the containers, handles scaling, and provides a streaming API.
The key difference: with the Agent SDK, you run the agent loop in your own infrastructure. With Managed Agents, Anthropic hosts and runs the agent for you. You interact through a session-based API and receive events via Server-Sent Events.
Pricing:
- Agent SDK: standard Claude API token rates only. You handle hosting.
- Managed Agents: token rates plus $0.08 per session-hour (billed per millisecond).
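Under those numbers, the infrastructure portion of a session's cost is simple arithmetic. A sketch assuming the $0.08/session-hour rate quoted above (token charges come on top):

```typescript
// Estimate the Managed Agents infrastructure charge for one session.
// Assumes the $0.08/session-hour rate above; tokens are billed separately.
function sessionInfraCost(durationMs: number, ratePerHour = 0.08): number {
  return (durationMs / 3_600_000) * ratePerHour;
}

// A 30-minute session: sessionInfraCost(30 * 60 * 1000) → $0.04
```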
Production Best Practices
1. Always Sandbox
Never run agents with unrestricted permissions on a production machine. Use containers (Docker, Fly.io, Modal) or sandboxed environments (E2B, Vercel Sandbox).
2. Limit Tool Access
Follow the principle of least privilege. An agent that generates reports does not need Bash or Write.
```typescript
// Too permissive
allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]

// Better: only what's needed
allowedTools: ["Read", "Glob", "Grep"]
```
3. Use Hooks for Guardrails
Hooks let you intercept tool calls before and after execution. Use them for logging, validation, and rate limiting.
```typescript
const conversation = query({
  prompt: "Analyze the codebase",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
    hooks: {
      PreToolUse: async (toolName, input) => {
        console.log(`Tool call: ${toolName}`, input);
        // Return false to block the call
        if (toolName === "Bash" && input.command.includes("rm")) {
          return false;
        }
        return true;
      },
    },
  },
});
```
4. Handle Errors Gracefully
The agent loop can produce errors - tool failures, API rate limits, context window overflow. Always check message types.
```typescript
for await (const message of conversation) {
  switch (message.type) {
    case "assistant":
      // Agent reasoning
      break;
    case "tool_use":
      // Agent is calling a tool
      break;
    case "result":
      if (message.subtype === "error") {
        console.error("Agent failed:", message.error);
      }
      break;
  }
}
```
5. Monitor Token Usage
Agent loops can consume significant tokens, especially with large codebases. The SDK includes automatic context compaction, but you should still monitor usage.
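A lightweight way to monitor is to sum usage as messages stream by. The sketch below assumes messages expose a usage object with input_tokens/output_tokens fields, mirroring the Anthropic Messages API; verify the exact field names against your SDK version:

```typescript
// Accumulate token counts across streamed messages. Field names are
// assumed to mirror the Anthropic Messages API's usage object —
// verify against your SDK version.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

function addUsage(total: Usage, delta?: Usage): Usage {
  if (!delta) return total; // not every message carries usage
  return {
    input_tokens: total.input_tokens + delta.input_tokens,
    output_tokens: total.output_tokens + delta.output_tokens,
  };
}
```

In the message loop, fold each message's usage into a running total and alert or abort the query once the total crosses your budget.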
Conclusion
The Claude Agent SDK turns an LLM from a question-answering machine into something closer to a junior developer. Your agents can read, write, execute, verify, and iterate - the same workflow a human follows.
Start small: build an agent with a few built-in tools. Then add custom MCP tools for your specific domain. Scale up to multi-agent orchestration when your workflows require specialization.
The agent loop is the same one that powers Claude Code. If it can build software, your agents can too.
Getting Started Checklist:
- Install the SDK (npm install @anthropic-ai/claude-agent-sdk)
- Set ANTHROPIC_API_KEY in your environment
- Build a simple agent with built-in tools (Read, Glob, Grep)
- Add a custom tool via in-process MCP server
- Connect an external MCP server (GitHub, PostgreSQL, etc.)
- Implement multi-agent orchestration with sub-agents
- Set up a sandboxed environment for production
- Add hooks for logging and guardrails