AI agents have moved from research demos to production systems. Over 60% of enterprise AI applications are expected to include agentic components by 2026. But building agents from scratch - managing tool loops, state, memory, error handling, and multi-agent coordination - is complex. That's where frameworks come in.

Four frameworks dominate in 2026: **LangGraph**, **CrewAI**, **OpenAI Agents SDK**, and **Claude Agent SDK**. Each takes a fundamentally different approach to the same problem: giving LLMs the ability to reason, plan, use tools, and collaborate.

## At a Glance

| Aspect | LangGraph | CrewAI | OpenAI Agents SDK | Claude Agent SDK |
|--------|-----------|--------|-------------------|-----------------|
| **By** | LangChain | CrewAI Inc. | OpenAI | Anthropic |
| **Architecture** | Graph-based | Role-based | Handoff-based | Autonomous loop |
| **Philosophy** | Maximum control | Team collaboration | Minimal abstraction | Give agent a computer |
| **Languages** | Python, TypeScript | Python | Python, TypeScript | Python, TypeScript |
| **Model support** | Any (OpenAI, Claude, local) | Any | Any (despite the name) | Claude only |
| **GitHub stars** | ~29k | ~40k | ~21k | ~6k |
| **Best for** | Complex stateful workflows | Multi-agent specialization | Routing and triage | Coding and file-heavy tasks |

## LangGraph: The Graph Builder

LangGraph models agent workflows as **directed graphs that can contain cycles**. You define nodes (functions that do work) and edges (transitions between them, optionally conditional).
State flows through the graph and persists via checkpointing.

This is the most explicit and controllable framework - you wire every step yourself.

```mermaid
graph LR
    Start --> Router[Router Node]
    Router -->|needs research| Research[Research Node]
    Router -->|needs code| Code[Code Node]
    Research --> Synthesize[Synthesize Node]
    Code --> Synthesize
    Synthesize --> End
```

### Core Concepts

- **StateGraph**: the graph definition with typed state
- **Nodes**: Python functions that transform state
- **Edges**: connections between nodes, can be conditional
- **Checkpointing**: built-in persistence for long-running workflows

### Code Example

```python
from langchain_core.messages import ToolMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState, START, END

# Placeholder tool so the example is self-contained; swap in a real weather lookup
@tool
def get_weather(city: str) -> str:
    """Return a canned weather report for a city."""
    return f"It is sunny in {city}."

tools_by_name = {get_weather.name: get_weather}

# Bind the tools so the model can actually emit tool calls
llm = ChatOpenAI(model="gpt-4o").bind_tools([get_weather])

def call_agent(state: MessagesState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState):
    # Route to the tool node while the model keeps requesting tools
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

def call_tools(state: MessagesState):
    # Execute each requested tool call and wrap the output in a ToolMessage
    results = []
    for tool_call in state["messages"][-1].tool_calls:
        output = tools_by_name[tool_call["name"]].invoke(tool_call["args"])
        results.append(ToolMessage(content=str(output), tool_call_id=tool_call["id"]))
    return {"messages": results}

graph = StateGraph(MessagesState)
graph.add_node("agent", call_agent)
graph.add_node("tools", call_tools)
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")

app = graph.compile()
result = app.invoke({"messages": [{"role": "user", "content": "What's the weather?"}]})
```

### Strengths

- Fine-grained control over every step and transition
- Built-in checkpointing and human-in-the-loop
- Full TypeScript parity
- Works with any LLM provider
- Best for complex workflows with conditional branching and loops

### Weaknesses

- Steep learning curve - you need to be comfortable thinking in graph terms (nodes, edges, shared state)
- Verbose for simple use cases - a basic agent requires more boilerplate than other frameworks
- Debugging graph flows can be challenging without LangSmith

### Pricing

Open-source (MIT). LangSmith (managed observability platform) has paid tiers for production monitoring.

## CrewAI: The Team Assembler

CrewAI takes a human metaphor: you assemble a **crew** of specialized agents, each with a **role**, **goal**, and **backstory**. Agents collaborate on **tasks** using **tools**, coordinated by a **process** (sequential or hierarchical; a consensual process is listed as planned in the CrewAI docs).

Think of it as hiring a team where each member has a specific job title and specialty.

```mermaid
graph TD
    Crew[Crew Manager] --> R[Researcher\nRole: Find data\nTools: WebSearch]
    Crew --> W[Writer\nRole: Write content\nTools: FileWrite]
    Crew --> E[Editor\nRole: Review quality\nTools: FileRead]
    R --> Task1[Research Task]
    W --> Task2[Writing Task]
    E --> Task3[Review Task]
    Task1 --> Task2 --> Task3
```

### Core Concepts

- **Agent**: a persona with role, goal, backstory, and tools
- **Task**: an assignment with description, expected output, and assigned agent
- **Crew**: a group of agents working together
- **Process**: execution strategy (sequential or hierarchical; consensual is planned)
- **Flow**: event-driven orchestration layer for connecting multiple crews

### Code Example

```python
from crewai import Agent, Task, Crew, Process
from crewai_tools import FileWriterTool, SerperDevTool

# Tool instances (SerperDevTool expects a SERPER_API_KEY environment variable)
web_search_tool = SerperDevTool()
file_tool = FileWriterTool()

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data about the given topic",
    backstory="You have 10 years of experience in technology research. "
              "You are thorough and always verify facts from multiple sources.",
    tools=[web_search_tool],
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, engaging technical content",
    backstory="You write for a developer audience. "
              "Your articles are practical and include code examples.",
    tools=[file_tool],
    verbose=True,
)

research_task = Task(
    description="Research the latest developments in WebAssembly in 2026. "
                "Focus on WASI, Component Model, and production use cases.",
    expected_output="A structured research document with key findings and sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a blog post based on the research. "
                "Include code examples and Mermaid diagrams.",
    expected_output="A complete blog post in Markdown format.",
    agent=writer,
    context=[research_task],  # Writer receives researcher's output
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result.raw)
```

### Strengths

- Intuitive role-based abstraction - easy to reason about
- 100+ built-in tool integrations
- Shared memory across agents (short-term, long-term, entity)
- Largest community (~40k GitHub stars)
- Hierarchical process with a "manager" agent that delegates and validates

### Weaknesses

- Less fine-grained control than LangGraph - you define roles, not exact execution paths
- Hierarchical process can be unpredictable when agents disagree
- Debugging multi-agent conversations is harder than single-agent flows

### Pricing

Open-source core (free). CrewAI Platform: $99/month (Teams) to $120k/year (Enterprise). Pricing based on live crews and monthly executions.

## OpenAI Agents SDK: The Router

The OpenAI Agents SDK (spiritual successor to Swarm) focuses on **handoffs** - agents transferring conversations to other specialized agents. It is the most minimal framework: agents, tools, handoffs, and guardrails.
That's it.

```mermaid
graph LR
    User --> Triage[Triage Agent]
    Triage -->|billing question| Billing[Billing Agent]
    Triage -->|refund request| Refund[Refund Agent]
    Triage -->|technical issue| Support[Support Agent]
    Billing --> Response[Response]
    Refund --> Response
    Support --> Response
```

### Core Concepts

- **Agent**: model + instructions + tools + handoffs
- **Handoff**: a transfer to another agent (modeled as a tool the LLM can call)
- **Guardrail**: input/output validation that runs in parallel with the agent
- **Runner**: executes the agent loop
- **Tracing**: built-in observability for all LLM calls, tool invocations, and handoffs

### Code Example

```python
import asyncio

from agents import Agent, GuardrailFunctionOutput, InputGuardrail, Runner, function_tool
from pydantic import BaseModel

class SafetyCheck(BaseModel):
    is_safe: bool
    reason: str

# Dedicated guardrail agent; output_type makes it return a structured SafetyCheck
safety_agent = Agent(
    name="Safety",
    instructions="Check if input is safe. No PII.",
    output_type=SafetyCheck,
)

async def content_safety(ctx, agent, input_text):
    result = await Runner.run(safety_agent, input_text, context=ctx.context)
    output = result.final_output_as(SafetyCheck)
    return GuardrailFunctionOutput(
        output_info=output, tripwire_triggered=not output.is_safe
    )

# Placeholder tools; real implementations would call your billing and order systems
@function_tool
def lookup_invoice(invoice_id: str) -> str:
    """Look up an invoice by ID."""
    return f"Invoice {invoice_id}: paid"

@function_tool
def process_payment(invoice_id: str) -> str:
    """Process a payment for an invoice."""
    return f"Payment recorded for invoice {invoice_id}"

@function_tool
def lookup_order(order_id: str) -> str:
    """Look up an order by ID."""
    return f"Order {order_id}: delivered"

@function_tool
def issue_refund(order_id: str) -> str:
    """Issue a refund for an order."""
    return f"Refund issued for order {order_id}"

billing_agent = Agent(
    name="Billing Agent",
    instructions="You handle billing inquiries. Be precise with numbers.",
    tools=[lookup_invoice, process_payment],
)

refund_agent = Agent(
    name="Refund Agent",
    instructions="You process refund requests. Always verify the order first.",
    tools=[lookup_order, issue_refund],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the customer to the right specialist. "
                 "Ask clarifying questions if needed.",
    handoffs=[billing_agent, refund_agent],
    input_guardrails=[InputGuardrail(guardrail_function=content_safety)],
)

async def main():
    result = await Runner.run(triage_agent, "I need a refund for order #4521")
    # The triage agent should hand off to refund_agent, which processes the refund
    print(result.final_output)

asyncio.run(main())
```

### Strengths

- Clean handoff pattern - natural for routing/triage workflows
- Guardrails run in parallel with execution (fail-fast, not blocking)
- Built-in tracing dashboard for debugging
- Despite the name, supports non-OpenAI models
- Minimal abstraction - easy to understand and extend

### Weaknesses

- Less mature state management than LangGraph
- No graph-style checkpointing; built-in sessions persist conversation history only
- Ecosystem of third-party tools is smaller
- Handoff-centric design may not fit every architecture

### Pricing

Open-source (MIT). You pay per-token for whatever model you use.

## Claude Agent SDK: The Developer

The Claude Agent SDK takes a different approach: instead of defining workflows or roles, you give the agent a **set of tools and let it figure out how to accomplish the task**.
It uses the same autonomous loop that powers Claude Code - read, act, verify, iterate.

```mermaid
graph TD
    Prompt[User Prompt] --> Loop[Autonomous Agent Loop]
    Loop --> Reason[Reason about next step]
    Reason --> Act[Execute tool]
    Act --> Verify[Check result]
    Verify -->|not done| Loop
    Verify -->|done| Output[Final output]
```

### Core Concepts

- **query()**: the main entry point that starts the agent loop
- **Built-in tools**: Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch
- **Custom tools via MCP**: define tools as in-process MCP servers
- **Sub-agents**: specialized agents the parent can delegate to
- **Sessions**: maintain context across multiple interactions

### Code Example

```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

// Assumed to exist elsewhere in your app (e.g. a LangChain.js vector store);
// declared here only so the snippet type-checks.
declare const vectorStore: {
  similaritySearch(query: string, k: number): Promise<{ pageContent: string }[]>;
};

// Custom tool: name, description, zod input schema, async handler
const searchDocs = tool(
  "search_docs",
  "Search the internal documentation for relevant information",
  { query: z.string().describe("Search query") },
  async ({ query }) => {
    const results = await vectorStore.similaritySearch(query, 5);
    return {
      content: [{ type: "text", text: results.map(r => r.pageContent).join("\n\n") }],
    };
  }
);

// Expose the tool through an in-process MCP server
const docsServer = createSdkMcpServer({
  name: "docs",
  version: "1.0.0",
  tools: [searchDocs],
});

for await (const message of query({
  prompt: "Find how authentication works in our system and write a summary",
  options: {
    mcpServers: { docs: docsServer },
    allowedTools: ["Read", "Glob", "Grep", "mcp__docs__search_docs"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```

### Strengths

- First-class MCP integration - connect to any MCP server ecosystem
- Built-in tools for file operations, terminal, and web access
- Automatic context compaction for large codebases
- Sub-agent parallelism for complex tasks
- Same engine as Claude Code - battle-tested on real development workflows

### Weaknesses

- Claude models only - no multi-provider support
- Newer framework with a smaller community
- Requires Node.js runtime even for the Python SDK
- Less explicit workflow control compared to LangGraph

### Pricing

Open-source. Standard Claude API token rates. Managed Agents (hosted version): $0.08 per session-hour in addition to token costs.

## When to Choose Which

```mermaid
graph TD
    Start{What's your priority?}
    Start -->|Full control over workflow| LG[LangGraph]
    Start -->|Multi-agent collaboration| CA[CrewAI]
    Start -->|Routing and triage| OA[OpenAI Agents SDK]
    Start -->|Coding and file automation| CS[Claude Agent SDK]

    LG --> LGU[Complex stateful workflows\nConditional branching\nHuman-in-the-loop]
    CA --> CAU[Team of specialized agents\nResearch + writing pipelines\nContent generation]
    OA --> OAU[Customer service routing\nMulti-step handoffs\nInput validation]
    CS --> CSU[Code generation and review\nFile-heavy automation\nMCP tool ecosystem]
```

### Choose LangGraph if:
- You need precise control over every step of the workflow
- Your use case involves complex conditional logic and loops
- You want built-in persistence and human-in-the-loop checkpoints
- You need to use multiple LLM providers in the same workflow

### Choose CrewAI if:
- You want an intuitive, role-based abstraction
- Your task involves multiple agents with distinct specialties
- You need agents to collaborate and pass context between each other
- You value the largest community and most built-in integrations

### Choose OpenAI Agents SDK if:
- Your primary pattern is routing conversations to specialists
- You need guardrails that validate input/output in parallel
- You want the simplest possible abstraction with minimal boilerplate
- Built-in tracing and observability are important

### Choose Claude Agent SDK if:
- Your agents need to read, write, and execute code
- You want first-class MCP server integration
- You need autonomous agents that iterate and self-correct
- You are already using Claude and want the deepest integration

## Can You Combine Frameworks?

Yes. A common pattern is using one framework for orchestration and another for individual agents:

- **LangGraph** for the overall workflow graph
- **CrewAI** for a specific node that requires multi-agent collaboration
- **Claude Agent SDK** for coding-related sub-tasks via MCP
- **OpenAI Agents SDK** for customer-facing triage and routing

The frameworks are not mutually exclusive. Use what fits each part of your system.

## Conclusion

Each framework makes a clear bet:

- **LangGraph** optimizes for control - you decide every transition
- **CrewAI** optimizes for collaboration - agents work as a team
- **OpenAI Agents SDK** optimizes for simplicity - minimal abstraction, clean handoffs
- **Claude Agent SDK** optimizes for autonomy - give it tools and let it work

The right choice depends on your workflow, your team, and your existing stack. Pick the one that matches your primary use case, learn it well, and pull in others when you hit their sweet spot.
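
As a rough illustration of pulling in a second framework for specific sub-tasks, here is a minimal, framework-agnostic sketch. It assumes each framework's agent has been wrapped behind the same string-in, string-out callable. `AgentBackend`, `make_orchestrator`, and the stub backends are hypothetical names for illustration only, not part of any SDK discussed here.

```python
from typing import Callable, Dict

# Hypothetical adapter type: each callable would wrap one framework's agent
# (for example a compiled LangGraph app or a CrewAI crew) behind a common
# string-in, string-out interface. None of these names come from a real SDK.
AgentBackend = Callable[[str], str]


def make_orchestrator(
    backends: Dict[str, AgentBackend], default: str
) -> Callable[[str, str], str]:
    """Return a dispatcher that sends each sub-task to the backend for its kind."""
    if default not in backends:
        raise ValueError(f"unknown default backend: {default!r}")

    def dispatch(kind: str, task: str) -> str:
        # Fall back to the default backend when no specialist is registered.
        backend = backends.get(kind, backends[default])
        return backend(task)

    return dispatch


# Stub backends standing in for real framework calls.
def coding_backend(task: str) -> str:
    return f"[coding] {task}"


def triage_backend(task: str) -> str:
    return f"[triage] {task}"


dispatch = make_orchestrator(
    {"coding": coding_backend, "triage": triage_backend},
    default="triage",
)

print(dispatch("coding", "fix the failing test"))
print(dispatch("billing", "why was I charged twice?"))
```

In a real system each stub would be replaced by a thin wrapper around the framework call shown in the sections above. Keeping the interface this narrow is one way to swap frameworks per sub-task without touching the orchestration code.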