# Building AI Agents with the Claude Agent SDK: A Practical Guide

The Claude Agent SDK gives you programmatic access to the same agent loop that powers Claude Code. Your agents can read files, execute shell commands, search the web, edit code, call external APIs through MCP servers, and orchestrate sub-agents, all from a few lines of TypeScript or Python.

Unlike the standard Anthropic Client SDK, where you build your own tool loop, the Agent SDK handles tool execution, context management, retries, and orchestration internally. You describe what you want, provide the tools, and the agent figures out the rest.

## Architecture

The SDK follows a simple loop: **gather context, take action, verify, repeat**.

```mermaid
graph TD
    Input[User Prompt] --> Loop[Agent Loop]
    Loop --> Reason[Claude Reasons]
    Reason --> Tool[Call Tool]
    Tool --> Result[Tool Result]
    Result --> Loop
    Reason --> Done[Final Response]

    subgraph "Built-in Tools"
        T1[Read / Write / Edit]
        T2[Bash / Terminal]
        T3[Glob / Grep]
        T4[WebSearch / WebFetch]
    end

    subgraph "Custom Tools via MCP"
        M1[Your API]
        M2[Database]
        M3[Slack / GitHub / etc.]
    end

    Tool --> T1
    Tool --> M1
```

The core entry point is `query()`, which returns an async iterator that streams messages as the agent works. Each message tells you what the agent is doing: reasoning, calling a tool, receiving a result, or delivering the final output.

## Getting Started

### Installation

```bash
# TypeScript
npm install @anthropic-ai/claude-agent-sdk

# Python
pip install claude-agent-sdk
```

You need an Anthropic API key set as `ANTHROPIC_API_KEY` in your environment.

### Your First Agent

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

const conversation = query({
  prompt: "Find all TODO comments in the codebase and create a summary",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
  },
});

for await (const message of conversation) {
  if (message.type === "assistant") {
    // An assistant message wraps an API message whose content is a list of blocks
    for (const block of message.message.content) {
      if (block.type === "text") {
        process.stdout.write(block.text);
      }
    }
  }
  if (message.type === "result" && message.subtype === "success") {
    console.log("\nDone:", message.result);
  }
}
```

That's it. The agent will use Glob to find files, Grep to search for TODO patterns, Read to inspect matches, and return a structured summary. You don't write the orchestration logic; the SDK handles it.

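### Python Equivalent

```python
import asyncio

from claude_agent_sdk import AssistantMessage, ClaudeAgentOptions, ResultMessage, TextBlock, query


async def main() -> None:
    options = ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"])
    async for message in query(
        prompt="Find all TODO comments in the codebase and create a summary",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            # Assistant content is a list of blocks; print only the text blocks
            for block in message.content:
                if isinstance(block, TextBlock):
                    print(block.text, end="")
        elif isinstance(message, ResultMessage) and message.subtype == "success":
            print(f"\nDone: {message.result}")


asyncio.run(main())
```
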
## Built-in Tools

The SDK ships with the same tools available in Claude Code:

| Tool | Description |
|------|-------------|
| **Read** | Read file contents |
| **Write** | Create new files |
| **Edit** | Make targeted edits to existing files |
| **Bash** | Execute shell commands |
| **Glob** | Find files by pattern |
| **Grep** | Search file contents with regex |
| **WebSearch** | Search the web |
| **WebFetch** | Fetch a URL and return its contents |
| **AskUserQuestion** | Prompt the user for input |

You control which tools the agent can use through `allowedTools`. If a tool is not in the list, the agent cannot call it.

## Permission Modes

Since agents execute real commands on real systems, permissions matter.

| Mode | Behavior | Use Case |
|------|----------|----------|
| `default` | Custom `canUseTool` callback decides per-call | Fine-grained control |
| `acceptEdits` | Auto-approve file operations, prompt for Bash | Development workflows |
| `dontAsk` | Deny anything not in allowedTools | Restricted agents |
| `bypassPermissions` | Approve everything automatically | Trusted sandboxed environments |
| `auto` | Model classifier decides safety | Balanced automation |

```typescript
const conversation = query({
  prompt: "Refactor the auth module to use JWT",
  options: {
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
    permissionMode: "acceptEdits",
  },
});
```

For production use, always run agents in sandboxed environments (containers, VMs) and use the most restrictive permission mode that still allows the agent to do its job.

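The per-call decision described for the `default` mode can be sketched as a plain policy function, kept separate from the SDK so it is easy to test. This is only an illustration: `Decision`, `decideToolUse`, the `file_path` input field, and the workspace path are assumptions made for the example, not the SDK's own types, and the result would still need to be mapped onto whatever your `canUseTool` callback is expected to return.

```typescript
// Hypothetical decision shape; map it onto whatever your SDK version expects.
type Decision = { allow: true } | { allow: false; reason: string };

const READ_ONLY_TOOLS = new Set(["Read", "Glob", "Grep"]);
const FILE_WRITE_TOOLS = new Set(["Edit", "Write"]);

// Split an absolute POSIX path into segments, resolving "." and "..".
function toSegments(path: string): string[] {
  const segments: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") segments.pop();
    else segments.push(part);
  }
  return segments;
}

// True when target lies strictly inside root.
function isInside(root: string, target: string): boolean {
  const r = toSegments(root);
  const t = toSegments(target);
  return t.length > r.length && r.every((segment, i) => t[i] === segment);
}

function decideToolUse(
  toolName: string,
  input: Record<string, unknown>,
  workspaceRoot: string,
): Decision {
  if (READ_ONLY_TOOLS.has(toolName)) return { allow: true };

  if (FILE_WRITE_TOOLS.has(toolName)) {
    // Assumption: file tools carry the target path in a `file_path` field.
    const filePath = input["file_path"];
    if (
      typeof filePath === "string" &&
      filePath.startsWith("/") &&
      isInside(workspaceRoot, filePath)
    ) {
      return { allow: true };
    }
    return { allow: false, reason: `${toolName} is limited to ${workspaceRoot}` };
  }

  // Default deny: anything not named above is refused.
  return { allow: false, reason: `${toolName} is not permitted by this policy` };
}

// A path that escapes the workspace through ".." is refused.
console.log(decideToolUse("Edit", { file_path: "/work/app/../secrets.txt" }, "/work/app"));
```

Denying by default keeps the policy restrictive when new tools are added later: a tool has to be named explicitly before it is allowed.
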
## Building Custom Tools with MCP

The real power of the SDK comes from extending agents with your own tools. Custom tools are defined as in-process MCP servers, so there is no subprocess management and no network overhead.

### Example: Weather Tool

```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const getTemperature = tool(
  "get_temperature",
  "Get the current temperature at a location",
  {
    latitude: z.number().describe("Latitude"),
    longitude: z.number().describe("Longitude"),
  },
  async ({ latitude, longitude }) => {
    const res = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m&temperature_unit=celsius`
    );
    // Report HTTP failures to the agent instead of throwing on a bad payload
    if (!res.ok) {
      return {
        content: [{ type: "text", text: `Weather API error: HTTP ${res.status}` }],
        isError: true,
      };
    }
    const data = await res.json();
    return {
      content: [
        {
          type: "text",
          text: `Current temperature: ${data.current.temperature_2m}°C`,
        },
      ],
    };
  }
);

const weatherServer = createSdkMcpServer({
  name: "weather",
  version: "1.0.0",
  tools: [getTemperature],
});

for await (const message of query({
  prompt: "What's the weather like in Rome?",
  options: {
    mcpServers: { weather: weatherServer },
    allowedTools: ["mcp__weather__get_temperature"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```

Custom tools follow the naming convention `mcp__{server_name}__{tool_name}`. You can use wildcards in `allowedTools`: `"mcp__weather__*"` allows all tools from the weather server.

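To make the naming convention concrete, here is a small sketch that builds and parses `mcp__{server_name}__{tool_name}` names and checks a name against an allow-list entry with a trailing `*`. The helpers `mcpToolName`, `parseMcpToolName`, and `matchesAllowed` are hypothetical and written for this example; how the SDK matches wildcards internally is not described here, so treat the matching rule as an assumption.

```typescript
// Hypothetical helpers for the mcp__{server_name}__{tool_name} convention.
const MCP_PREFIX = "mcp__";

function mcpToolName(server: string, tool: string): string {
  return `${MCP_PREFIX}${server}__${tool}`;
}

function parseMcpToolName(name: string): { server: string; tool: string } | null {
  if (!name.startsWith(MCP_PREFIX)) return null;
  const rest = name.slice(MCP_PREFIX.length);
  // The first "__" after the prefix separates the server name from the tool name.
  const sep = rest.indexOf("__");
  if (sep <= 0 || sep + 2 >= rest.length) return null;
  return { server: rest.slice(0, sep), tool: rest.slice(sep + 2) };
}

// Assumption: a trailing "*" means "any name with this prefix".
function matchesAllowed(pattern: string, toolName: string): boolean {
  return pattern.endsWith("*")
    ? toolName.startsWith(pattern.slice(0, -1))
    : pattern === toolName;
}

const name = mcpToolName("weather", "get_temperature");
console.log(name, parseMcpToolName(name), matchesAllowed("mcp__weather__*", name));
```

Because the first `__` after the prefix is treated as the separator, a server name that itself contains `__` would be parsed ambiguously; single-word server names avoid the problem.
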
### Example: Database Query Tool

```typescript
import { tool } from "@anthropic-ai/claude-agent-sdk";
import { Pool } from "pg";
import { z } from "zod";

// Assumes node-postgres; connection settings come from the PG* environment variables
const pool = new Pool();

const queryDb = tool(
  "query_database",
  "Run a read-only SQL query against the application database",
  {
    sql: z.string().describe("SQL SELECT query to execute"),
  },
  async ({ sql }) => {
    // First-line filter only: connect with a read-only database role as the real safeguard
    if (!sql.trim().toUpperCase().startsWith("SELECT")) {
      return {
        content: [{ type: "text", text: "Error: Only SELECT queries are allowed." }],
        isError: true,
      };
    }

    const result = await pool.query(sql);
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(result.rows, null, 2),
        },
      ],
    };
  }
);
```

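The prefix check in the tool above accepts input such as `SELECT 1; DROP TABLE users`, because it only looks at the first word. A somewhat stricter heuristic is sketched below; `isReadOnlySelect` is a hypothetical helper, it is deliberately conservative (it may reject valid queries whose string literals contain a semicolon or a blocked keyword), and it is not a SQL parser. A database role without write privileges remains the dependable control.

```typescript
// Hypothetical helper: heuristic check that a string is a single read-only SELECT.
function isReadOnlySelect(sql: string): boolean {
  // Drop block comments and line comments before inspecting the statement.
  const stripped = sql
    .replace(/\/\*[\s\S]*?\*\//g, " ")
    .replace(/--[^\n]*/g, " ")
    .trim();

  // Allow one trailing semicolon, but reject stacked statements.
  const body = stripped.replace(/;\s*$/, "");
  if (body.includes(";")) return false;

  if (!/^select\b/i.test(body)) return false;

  // Reject keywords that write, change schema, or lock rows (e.g. SELECT ... INTO, FOR UPDATE).
  const blocked = /\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|into)\b/i;
  return !blocked.test(body);
}

console.log(isReadOnlySelect("SELECT id, updated_at FROM users;"));
console.log(isReadOnlySelect("SELECT 1; DROP TABLE users"));
```

Word boundaries keep column names such as `updated_at` or `created_at` from tripping the keyword filter.
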
## Connecting External MCP Servers

Beyond in-process tools, you can connect to any existing MCP server, including the same servers that work with Claude Desktop, Cursor, and other MCP clients.

```typescript
for await (const message of query({
  prompt: "Check the latest issues in the frontend repo and summarize them",
  options: {
    mcpServers: {
      github: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-github"],
        // Fall back to an empty string so the value is always a string
        env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN ?? "" },
      },
    },
    allowedTools: ["mcp__github__*"],
  },
})) {
  // ...
}
```

You can combine multiple MCP servers. The agent sees all tools from all connected servers and uses them as needed.

```mermaid
graph LR
    Agent[Your Agent] --> SDK[Agent SDK]
    SDK --> InProcess[In-process MCP\nCustom Tools]
    SDK --> GitHub[GitHub MCP Server]
    SDK --> Postgres[PostgreSQL MCP Server]
    SDK --> Slack[Slack MCP Server]
```

## Multi-Agent Orchestration

For complex workflows, you can define specialized sub-agents that the parent agent delegates to. Each sub-agent has its own prompt, tools, and focus area.

```typescript
for await (const message of query({
  prompt: "Review the PR, check for security issues, and update the changelog",
  options: {
    allowedTools: ["Read", "Edit", "Bash", "Glob", "Grep", "Agent"],
    // Sub-agents are keyed by name; each one lists its own tools
    agents: {
      "security-reviewer": {
        description: "Reviews code for security vulnerabilities",
        prompt: "You are a security expert. Analyze code for OWASP Top 10 vulnerabilities.",
        tools: ["Read", "Glob", "Grep"],
      },
      "changelog-writer": {
        description: "Updates the CHANGELOG.md file based on recent changes",
        prompt: "You maintain the project changelog. Follow Keep a Changelog format.",
        tools: ["Read", "Edit", "Bash"],
      },
    },
  },
})) {
  // The parent agent will:
  // 1. Read the PR diff
  // 2. Delegate security review to security-reviewer
  // 3. Delegate changelog update to changelog-writer
  // 4. Synthesize results
}
```

Include `"Agent"` in the parent's `allowedTools` to enable delegation. Sub-agents run with their own tools and cannot access the parent's tools unless explicitly granted.

```mermaid
graph TD
    Parent[Parent Agent] --> SR[Security Reviewer\nRead, Glob, Grep]
    Parent --> CW[Changelog Writer\nRead, Edit, Bash]
    SR --> Report[Security Report]
    CW --> Updated[Updated CHANGELOG]
    Report --> Parent
    Updated --> Parent
    Parent --> Final[Final Summary]
```

## Sessions and Continuity

Agents can maintain context across multiple queries using sessions. Capture the `session_id` from the initial system message of the first query and pass it as the `resume` option in subsequent queries.

```typescript
let sessionId: string | undefined;

// First query
for await (const message of query({
  prompt: "Read the project structure and understand the architecture",
  options: { allowedTools: ["Read", "Glob", "Grep"] },
})) {
  // The session ID arrives on the system message with subtype "init"
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id;
  }
}

// Follow-up query (same session, full context preserved)
for await (const message of query({
  prompt: "Now refactor the auth module based on what you learned",
  options: { resume: sessionId, allowedTools: ["Read", "Edit", "Bash"] },
})) {
  // Agent remembers the full project context from the first query
}
```

## Claude Managed Agents

If you don't want to host the agent infrastructure yourself, **Claude Managed Agents** (launched April 2026) provides a fully managed cloud service. Anthropic runs the containers, handles scaling, and provides a streaming API.

```mermaid
graph LR
    subgraph "Self-hosted (Agent SDK)"
        Code[Your Code] --> SDK2[Agent SDK]
        SDK2 --> API[Claude API]
        SDK2 --> Tools[Your Tools]
    end

    subgraph "Managed Agents"
        App[Your App] --> MAPI[Managed Agents API]
        MAPI --> Container[Anthropic-hosted Container]
        Container --> API2[Claude API]
        Container --> Tools2[Tools in Container]
    end
```

The key difference: with the Agent SDK, you run the agent loop in your own infrastructure. With Managed Agents, Anthropic hosts and runs the agent for you. You interact through a session-based API and receive events via Server-Sent Events.

**Pricing:**
- **Agent SDK**: standard Claude API token rates only. You handle hosting.
- **Managed Agents**: token rates plus $0.08 per session-hour (billed per millisecond).

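As a rough worked example of the Managed Agents pricing above, the runtime charge scales linearly with session duration in milliseconds. The function names are hypothetical, and the token cost is left as an input because token rates depend on the model and are not given here.

```typescript
const SESSION_HOUR_RATE_USD = 0.08; // session-hour rate quoted in the pricing above
const MS_PER_HOUR = 3_600_000;

// Runtime charge for a session, prorated per millisecond.
function sessionRuntimeCostUsd(durationMs: number): number {
  if (durationMs < 0) throw new RangeError("durationMs must be non-negative");
  return (durationMs / MS_PER_HOUR) * SESSION_HOUR_RATE_USD;
}

// Total cost: token spend (supplied by the caller) plus the runtime charge.
function managedAgentCostUsd(durationMs: number, tokenCostUsd: number): number {
  return tokenCostUsd + sessionRuntimeCostUsd(durationMs);
}

// A 90-minute session is 1.5 session-hours of runtime.
const ninetyMinutesMs = 90 * 60 * 1000;
console.log(sessionRuntimeCostUsd(ninetyMinutesMs).toFixed(2));
```

At this rate the runtime charge for a 90-minute session is $0.12, so for most workloads token spend is likely to dominate the bill.
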
## Production Best Practices

### 1. Always Sandbox

Never run agents with unrestricted permissions on a production machine. Use containers (Docker, Fly.io, Modal) or sandboxed environments (E2B, Vercel Sandbox).

### 2. Limit Tool Access

Follow the principle of least privilege. An agent that generates reports does not need `Bash` or `Write`.

```typescript
// Too permissive
allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]

// Better: only what's needed
allowedTools: ["Read", "Glob", "Grep"]
```

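### 3. Use Hooks for Guardrails

Hooks let you intercept tool calls before and after execution. Use them for logging, validation, and rate limiting.

```typescript
import { query, type HookCallback } from "@anthropic-ai/claude-agent-sdk";

// Log every tool call and deny Bash commands that mention "rm"
const guardBash: HookCallback = async (input) => {
  if (input.hook_event_name !== "PreToolUse") return {};
  console.log(`Tool call: ${input.tool_name}`, input.tool_input);

  const command = (input.tool_input as { command?: string }).command ?? "";
  if (input.tool_name === "Bash" && command.includes("rm")) {
    return {
      hookSpecificOutput: {
        hookEventName: "PreToolUse",
        permissionDecision: "deny",
        permissionDecisionReason: "rm is not allowed",
      },
    };
  }
  return {}; // An empty result lets the call proceed
};

const conversation = query({
  prompt: "Analyze the codebase",
  options: {
    // Bash is not allowed here; the guard is defense in depth if it is added later
    allowedTools: ["Read", "Glob", "Grep"],
    hooks: {
      PreToolUse: [{ hooks: [guardBash] }],
    },
  },
});
```
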
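A substring test such as `includes("rm")` also matches harmless commands like `npm run format`. One possible refinement, sketched here, is to look only at the program name that starts each segment of the command line. `findBlockedCommand` and the `BLOCKED_COMMANDS` list are hypothetical; the function is not a shell parser and does not see through subshells, `$(...)`, `xargs`, or aliases, so it complements sandboxing rather than replacing it.

```typescript
// Hypothetical deny-list of program names; adjust to your environment.
const BLOCKED_COMMANDS = new Set(["rm", "rmdir", "mkfs", "dd", "shutdown"]);

// Returns the first blocked program found in a shell command line, or null.
function findBlockedCommand(command: string): string | null {
  // Split on the common separators: &&, ||, ;, | and newlines.
  const segments = command.split(/&&|\|\||[;|\n]/);
  for (const segment of segments) {
    const words = segment.trim().split(/\s+/).filter(Boolean);
    // Skip leading environment assignments (FOO=bar) and sudo.
    const program = words.find((w) => !/^\w+=/.test(w) && w !== "sudo");
    if (program === undefined) continue;
    // Compare on the basename so /bin/rm is treated like rm.
    const base = program.slice(program.lastIndexOf("/") + 1);
    if (BLOCKED_COMMANDS.has(base)) return base;
  }
  return null;
}

console.log(findBlockedCommand("npm run format"));
console.log(findBlockedCommand("cd build && sudo /bin/rm -rf ."));
```

A guard like this could be called from a `PreToolUse` hook in place of the substring check, returning a deny decision whenever the result is not `null`.
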
### 4. Handle Errors Gracefully

The agent loop can produce errors: tool failures, API rate limits, context window overflow. Always check message types and the subtype of the final result.

```typescript
for await (const message of conversation) {
  switch (message.type) {
    case "assistant":
      // Agent reasoning and tool calls (tool_use blocks inside message.message.content)
      break;
    case "user":
      // Tool results are fed back to the agent as user messages
      break;
    case "result":
      if (message.subtype !== "success") {
        // Error subtypes include "error_max_turns" and "error_during_execution"
        console.error("Agent failed:", message.subtype);
      }
      break;
  }
}
```

### 5. Monitor Token Usage

Agent loops can consume significant tokens, especially with large codebases. The SDK includes automatic context compaction, but you should still monitor usage.

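One way to monitor usage is to keep a running total against a budget and stop or alert when it is exceeded. The sketch below is independent of the SDK: `TokenUsage` and `TokenBudget` are hypothetical names, and it assumes you can read input and output token counts from the messages the agent emits (for example from the final result), which you should confirm against your SDK version.

```typescript
// Hypothetical usage record; fill it from whatever usage data your SDK version reports.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

class TokenBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    if (limit <= 0) throw new RangeError("limit must be positive");
    this.limit = limit;
  }

  // Add one usage record to the running total.
  record(usage: TokenUsage): void {
    this.used += usage.inputTokens + usage.outputTokens;
  }

  get total(): number {
    return this.used;
  }

  get remaining(): number {
    return Math.max(0, this.limit - this.used);
  }

  get exceeded(): boolean {
    return this.used > this.limit;
  }
}

const budget = new TokenBudget(1000);
budget.record({ inputTokens: 600, outputTokens: 300 });
budget.record({ inputTokens: 150, outputTokens: 50 });
console.log(budget.total, budget.remaining, budget.exceeded);
```

Checking `exceeded` after each recorded message gives you a simple place to abort a run or raise an alert before costs grow further.
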
## Conclusion

The Claude Agent SDK turns an LLM from a question-answering machine into something closer to a junior developer. Your agents can read, write, execute, verify, and iterate, following the same workflow a human does.

Start small: build an agent with a few built-in tools. Then add custom MCP tools for your specific domain. Scale up to multi-agent orchestration when your workflows require specialization.

The agent loop is the same one that powers Claude Code. If it can build software, your agents can too.

> **Getting Started Checklist:**
>
> - [x] Install the SDK (`npm install @anthropic-ai/claude-agent-sdk`)
> - [x] Set `ANTHROPIC_API_KEY` in your environment
> - [x] Build a simple agent with built-in tools (Read, Glob, Grep)
> - [x] Add a custom tool via in-process MCP server
> - [x] Connect an external MCP server (GitHub, PostgreSQL, etc.)
> - [x] Implement multi-agent orchestration with sub-agents
> - [x] Set up a sandboxed environment for production
> - [x] Add hooks for logging and guardrails