# Building AI Agents with the Claude Agent SDK: A Practical Guide

The Claude Agent SDK gives you programmatic access to the same agent loop that powers Claude Code. Your agents can read files, execute shell commands, search the web, edit code, call external APIs through MCP servers, and orchestrate sub-agents - all from a few lines of TypeScript or Python.

Unlike the standard Anthropic Client SDK, where you build your own tool loop, the Agent SDK handles tool execution, context management, retries, and orchestration internally. You describe what you want, provide the tools, and the agent figures out the rest.

## Architecture

The SDK follows a simple loop: **gather context, take action, verify, repeat**.

```mermaid
graph TD
    Input[User Prompt] --> Loop[Agent Loop]
    Loop --> Reason[Claude Reasons]
    Reason --> Tool[Call Tool]
    Tool --> Result[Tool Result]
    Result --> Loop
    Reason --> Done[Final Response]

    subgraph "Built-in Tools"
        T1[Read / Write / Edit]
        T2[Bash / Terminal]
        T3[Glob / Grep]
        T4[WebSearch / WebFetch]
    end

    subgraph "Custom Tools via MCP"
        M1[Your API]
        M2[Database]
        M3[Slack / GitHub / etc.]
    end

    Tool --> T1
    Tool --> M1
```

The core entry point is `query()`, which returns an async iterator that streams messages as the agent works.
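
To make the streaming shape concrete, here is a toy async generator that yields one message per step of a pretend run. It is a sketch of the pattern only: `ToyMessage`, `toyGrep`, `toyAgentRun`, `describeMessage`, and `runToyAgent` are hypothetical names invented for this illustration, the tool output is canned, and none of it is SDK API.

```typescript
// Toy model of a streamed agent run. Nothing here is SDK API.
type ToyMessage =
  | { type: "assistant"; text: string }
  | { type: "tool_call"; tool: string; input: string }
  | { type: "tool_result"; tool: string; output: string }
  | { type: "result"; text: string };

// Hypothetical stand-in for a built-in search tool (canned output).
function toyGrep(pattern: string): string {
  return `2 matches for "${pattern}"`;
}

// An async generator hands each step to the caller as soon as it exists,
// so progress can be shown before the final answer is ready.
async function* toyAgentRun(prompt: string): AsyncGenerator<ToyMessage> {
  yield { type: "assistant", text: `Planning: ${prompt}` };
  yield { type: "tool_call", tool: "Grep", input: "TODO" };
  const output = toyGrep("TODO");
  yield { type: "tool_result", tool: "Grep", output };
  yield { type: "result", text: `Summary based on ${output}` };
}

// Turn one message into a log line; the switch covers every variant.
function describeMessage(message: ToyMessage): string {
  switch (message.type) {
    case "assistant":
      return `thinking: ${message.text}`;
    case "tool_call":
      return `calling ${message.tool}(${message.input})`;
    case "tool_result":
      return `${message.tool} returned: ${message.output}`;
    case "result":
      return `done: ${message.text}`;
  }
}

// Consume the stream the same way a caller consumes any async iterator.
async function runToyAgent(prompt: string): Promise<string[]> {
  const lines: string[] = [];
  for await (const message of toyAgentRun(prompt)) {
    lines.push(describeMessage(message));
  }
  return lines;
}
```
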
Each message tells you what the agent is doing: reasoning, calling a tool, receiving a result, or delivering the final output.

## Getting Started

### Installation

```bash
# TypeScript
npm install @anthropic-ai/claude-agent-sdk

# Python
pip install claude-agent-sdk
```

You need an Anthropic API key set as `ANTHROPIC_API_KEY` in your environment.

### Your First Agent

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

const conversation = query({
  prompt: "Find all TODO comments in the codebase and create a summary",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
  },
});

for await (const message of conversation) {
  if (message.type === "assistant") {
    // Assistant messages wrap an API message whose content is a list of blocks
    for (const block of message.message.content) {
      if (block.type === "text") {
        process.stdout.write(block.text);
      }
    }
  }
  if (message.type === "result" && message.subtype === "success") {
    console.log("\nDone:", message.result);
  }
}
```

That's it. The agent will use Glob to find files, Grep to search for TODO patterns, Read to inspect matches, and return a structured summary.
You don't write the orchestration logic - the SDK handles it.

### Python Equivalent

```python
import asyncio
from claude_agent_sdk import AssistantMessage, ClaudeAgentOptions, ResultMessage, TextBlock, query

async def main() -> None:
    async for message in query(
        prompt="Find all TODO comments in the codebase and create a summary",
        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]),
    ):
        if isinstance(message, AssistantMessage):
            # Assistant content is a list of blocks; print only the text ones
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(block.text, end="")
        elif isinstance(message, ResultMessage) and message.subtype == "success":
            print(f"\nDone: {message.result}")

asyncio.run(main())
```

## Built-in Tools

The SDK ships with the same tools available in Claude Code:

| Tool | Description |
|------|-------------|
| **Read** | Read file contents |
| **Write** | Create new files |
| **Edit** | Make targeted edits to existing files |
| **Bash** | Execute shell commands |
| **Glob** | Find files by pattern |
| **Grep** | Search file contents with regex |
| **WebSearch** | Search the web |
| **WebFetch** | Fetch a URL and return its contents |
| **AskUserQuestion** | Prompt the user for input |

You control which tools the agent can use through `allowedTools`.
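
As a rough mental model, an allow list is a membership test on the tool name. The sketch below is a toy version, assuming exact-name entries plus a trailing `*` wildcard like the `mcp__weather__*` entry this guide uses for MCP tools. `isToolAllowed` is a hypothetical helper written for this illustration, and the SDK's real matching rules may differ.

```typescript
// Toy allow-list check; this is not how the SDK is implemented.
function isToolAllowed(toolName: string, allowedTools: string[]): boolean {
  return allowedTools.some((entry) => {
    if (entry.endsWith("*")) {
      // Trailing wildcard: compare only the prefix before the "*".
      return toolName.startsWith(entry.slice(0, -1));
    }
    // Otherwise the entry must match the tool name exactly.
    return entry === toolName;
  });
}

// A read-only agent: Bash is rejected because it is not listed.
const readOnlyTools = ["Read", "Glob", "Grep", "mcp__weather__*"];
console.log(isToolAllowed("Grep", readOnlyTools));
console.log(isToolAllowed("Bash", readOnlyTools));
```
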
If a tool is not in the list, the agent cannot call it.

## Permission Modes

Since agents execute real commands on real systems, permissions matter.

| Mode | Behavior | Use Case |
|------|----------|----------|
| `default` | Custom `canUseTool` callback decides per-call | Fine-grained control |
| `acceptEdits` | Auto-approve file operations, prompt for Bash | Development workflows |
| `dontAsk` | Deny anything not in allowedTools | Restricted agents |
| `bypassPermissions` | Approve everything automatically | Trusted sandboxed environments |
| `auto` | Model classifier decides safety | Balanced automation |

```typescript
const conversation = query({
  prompt: "Refactor the auth module to use JWT",
  options: {
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
    permissionMode: "acceptEdits",
  },
});
```

For production use, always run agents in sandboxed environments (containers, VMs) and use the most restrictive permission mode that still allows the agent to do its job.

## Building Custom Tools with MCP

The real power of the SDK comes from extending agents with your own tools.
Custom tools are defined as in-process MCP servers - no subprocess management, no network overhead.

### Example: Weather Tool

```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const getTemperature = tool(
  "get_temperature",
  "Get the current temperature at a location",
  {
    latitude: z.number().describe("Latitude"),
    longitude: z.number().describe("Longitude"),
  },
  async ({ latitude, longitude }) => {
    const res = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m&temperature_unit=celsius`
    );
    const data = await res.json();
    return {
      content: [
        {
          type: "text",
          text: `Current temperature: ${data.current.temperature_2m}C`,
        },
      ],
    };
  }
);

const weatherServer = createSdkMcpServer({
  name: "weather",
  version: "1.0.0",
  tools: [getTemperature],
});

for await (const message of query({
  prompt: "What's the weather like in Rome?",
  options: {
    mcpServers: { weather: weatherServer },
    allowedTools: ["mcp__weather__get_temperature"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```

Custom tools follow the naming convention `mcp__{server_name}__{tool_name}`. You can use wildcards in `allowedTools`: `"mcp__weather__*"` allows all tools from the weather server.

### Example: Database Query Tool

```typescript
import { tool } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";
import { Pool } from "pg";

// Assumption: `pool` is a node-postgres pool configured via PG* environment variables
const pool = new Pool();

const queryDb = tool(
  "query_database",
  "Run a read-only SQL query against the application database",
  {
    sql: z.string().describe("SQL SELECT query to execute"),
  },
  async ({ sql }) => {
    // Validate: only allow SELECT queries.
    // A prefix check is a first line of defense; also connect with a read-only role.
    if (!sql.trim().toUpperCase().startsWith("SELECT")) {
      return {
        content: [{ type: "text", text: "Error: Only SELECT queries are allowed." }],
      };
    }

    const result = await pool.query(sql);
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(result.rows, null, 2),
        },
      ],
    };
  }
);
```

## Connecting External MCP Servers

Beyond in-process tools, you can connect to any existing MCP server - the same servers that work with Claude Desktop, Cursor, and other MCP clients.

```typescript
for await (const message of query({
  prompt: "Check the latest issues in the frontend repo and summarize them",
  options: {
    mcpServers: {
      github: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-github"],
        // Fall back to an empty string so the env map contains only strings
        env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN ?? "" },
      },
    },
    allowedTools: ["mcp__github__*"],
  },
})) {
  // ...
}
```

You can combine multiple MCP servers. The agent sees all tools from all connected servers and uses them as needed.

```mermaid
graph LR
    Agent[Your Agent] --> SDK[Agent SDK]
    SDK --> InProcess[In-process MCP\nCustom Tools]
    SDK --> GitHub[GitHub MCP Server]
    SDK --> Postgres[PostgreSQL MCP Server]
    SDK --> Slack[Slack MCP Server]
```

## Multi-Agent Orchestration

For complex workflows, you can define specialized sub-agents that the parent agent delegates to. Each sub-agent has its own prompt, tools, and focus area.

```typescript
for await (const message of query({
  prompt: "Review the PR, check for security issues, and update the changelog",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep", "Agent"],
    // Sub-agents are keyed by name; each entry lists its own tools
    agents: {
      "security-reviewer": {
        description: "Reviews code for security vulnerabilities",
        prompt: "You are a security expert. Analyze code for OWASP Top 10 vulnerabilities.",
        tools: ["Read", "Glob", "Grep"],
      },
      "changelog-writer": {
        description: "Updates the CHANGELOG.md file based on recent changes",
        prompt: "You maintain the project changelog. Follow Keep a Changelog format.",
        tools: ["Read", "Edit", "Bash"],
      },
    },
  },
})) {
  // The parent agent will:
  // 1. Read the PR diff
  // 2. Delegate security review to security-reviewer
  // 3. Delegate changelog update to changelog-writer
  // 4. Synthesize results
}
```

Include `"Agent"` in the parent's `allowedTools` to enable delegation. Sub-agents run with their own tools and cannot access the parent's tools unless explicitly granted.

```mermaid
graph TD
    Parent[Parent Agent] --> SR[Security Reviewer\nRead, Glob, Grep]
    Parent --> CW[Changelog Writer\nRead, Edit, Bash]
    SR --> Report[Security Report]
    CW --> Updated[Updated CHANGELOG]
    Report --> Parent
    Updated --> Parent
    Parent --> Final[Final Summary]
```

## Sessions and Continuity

Agents can maintain context across multiple queries using sessions. Capture the `session_id` from the init message of the first query and pass it as the `resume` option in subsequent queries.

```typescript
let sessionId: string | undefined;

// First query
for await (const message of query({
  prompt: "Read the project structure and understand the architecture",
  options: { allowedTools: ["Read", "Glob", "Grep"] },
})) {
  // The session ID arrives on the system message with subtype "init"
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id;
  }
}

// Follow-up query (same session, full context preserved)
for await (const message of query({
  prompt: "Now refactor the auth module based on what you learned",
  options: { resume: sessionId, allowedTools: ["Read", "Edit", "Bash"] },
})) {
  // Agent remembers the full project context from the first query
}
```

## Claude Managed Agents

If you don't want to host the agent infrastructure yourself, **Claude Managed Agents** (launched April 2026) provides a fully managed cloud service.
Anthropic runs the containers, handles scaling, and provides a streaming API.

```mermaid
graph LR
    subgraph "Self-hosted (Agent SDK)"
        Code[Your Code] --> SDK2[Agent SDK]
        SDK2 --> API[Claude API]
        SDK2 --> Tools[Your Tools]
    end

    subgraph "Managed Agents"
        App[Your App] --> MAPI[Managed Agents API]
        MAPI --> Container[Anthropic-hosted Container]
        Container --> API2[Claude API]
        Container --> Tools2[Tools in Container]
    end
```

The key difference: with the Agent SDK, you run the agent loop in your own infrastructure. With Managed Agents, Anthropic hosts and runs the agent for you. You interact through a session-based API and receive events via Server-Sent Events.

**Pricing:**
- **Agent SDK**: standard Claude API token rates only. You handle hosting.
- **Managed Agents**: token rates plus $0.08 per session-hour (billed per millisecond).

## Production Best Practices

### 1. Always Sandbox

Never run agents with unrestricted permissions on a production machine. Use containers (Docker, Fly.io, Modal) or sandboxed environments (E2B, Vercel Sandbox).

### 2. Limit Tool Access

Follow the principle of least privilege. An agent that generates reports does not need `Bash` or `Write`.

```typescript
// Too permissive
allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]

// Better: only what's needed
allowedTools: ["Read", "Glob", "Grep"]
```

### 3. Use Hooks for Guardrails

Hooks let you intercept tool calls before and after execution.
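
The interception pattern itself is small enough to sketch without the SDK. In this toy version, pre-hooks can veto a call and post-hooks observe its output. `ToolCall`, `runWithHooks`, `denyBash`, `logResult`, and `auditLog` are hypothetical names for illustration; the SDK's own hook signatures are not modeled here.

```typescript
// Toy before/after interception around a tool call. Not SDK API.
type ToolCall = { toolName: string; input: Record<string, unknown> };
type PreHook = (call: ToolCall) => boolean; // returning false blocks the call
type PostHook = (call: ToolCall, output: string) => void;

function runWithHooks(
  call: ToolCall,
  execute: (call: ToolCall) => string,
  preHooks: PreHook[],
  postHooks: PostHook[],
): { blocked: boolean; output?: string } {
  // Any pre-hook can veto; the tool never runs if one does.
  for (const pre of preHooks) {
    if (!pre(call)) {
      return { blocked: true };
    }
  }
  const output = execute(call);
  // Post-hooks only observe the output; they cannot change it here.
  for (const post of postHooks) {
    post(call, output);
  }
  return { blocked: false, output };
}

// Example guardrail: refuse every Bash call.
const denyBash: PreHook = (call) => call.toolName !== "Bash";

// Example logger: record the tool name and output size.
const auditLog: string[] = [];
const logResult: PostHook = (call, output) => {
  auditLog.push(`${call.toolName}: ${output.length} chars`);
};
```
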
Use them for logging, validation, and rate limiting.

```typescript
const conversation = query({
  prompt: "Analyze the codebase",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
    hooks: {
      // Each event maps to a list of matchers, each with its own callbacks
      PreToolUse: [
        {
          hooks: [
            async (input) => {
              if (input.hook_event_name !== "PreToolUse") return {};
              console.log(`Tool call: ${input.tool_name}`, input.tool_input);
              const { command = "" } = input.tool_input as { command?: string };
              // Deny Bash commands that mention rm; an empty object lets the call proceed
              if (input.tool_name === "Bash" && command.includes("rm")) {
                return {
                  hookSpecificOutput: {
                    hookEventName: "PreToolUse",
                    permissionDecision: "deny",
                    permissionDecisionReason: "rm is blocked by policy",
                  },
                };
              }
              return {};
            },
          ],
        },
      ],
    },
  },
});
```

### 4. Handle Errors Gracefully

The agent loop can produce errors - tool failures, API rate limits, context window overflow. Always check message types.

```typescript
for await (const message of conversation) {
  switch (message.type) {
    case "assistant":
      // Agent reasoning; tool calls arrive as tool_use blocks in the content
      for (const block of message.message.content) {
        if (block.type === "tool_use") {
          console.log("Calling tool:", block.name);
        }
      }
      break;
    case "result":
      // Any subtype other than "success" signals a failed run
      if (message.subtype !== "success") {
        console.error("Agent failed:", message.subtype);
      }
      break;
  }
}
```

### 5. Monitor Token Usage

Agent loops can consume significant tokens, especially with large codebases. The SDK includes automatic context compaction, but you should still monitor usage.

## Conclusion

The Claude Agent SDK turns an LLM from a question-answering machine into something closer to a junior developer. Your agents can read, write, execute, verify, and iterate - the same workflow a human follows.

Start small: build an agent with a few built-in tools. Then add custom MCP tools for your specific domain. Scale up to multi-agent orchestration when your workflows require specialization.

The agent loop is the same one that powers Claude Code.
If it can build software, your agents can too.

> **Getting Started Checklist:**
>
> - [x] Install the SDK (`npm install @anthropic-ai/claude-agent-sdk`)
> - [x] Set `ANTHROPIC_API_KEY` in your environment
> - [x] Build a simple agent with built-in tools (Read, Glob, Grep)
> - [x] Add a custom tool via in-process MCP server
> - [x] Connect an external MCP server (GitHub, PostgreSQL, etc.)
> - [x] Implement multi-agent orchestration with sub-agents
> - [x] Set up a sandboxed environment for production
> - [x] Add hooks for logging and guardrails