AI agents have moved from research demos to production systems. By some industry estimates, over 60% of enterprise AI applications will include agentic components by 2026. But building agents from scratch - managing tool loops, state, memory, error handling, and multi-agent coordination - is complex. That's where frameworks come in.
Four frameworks dominate in 2026: LangGraph, CrewAI, OpenAI Agents SDK, and Claude Agent SDK. Each takes a fundamentally different approach to the same problem: giving LLMs the ability to reason, plan, use tools, and collaborate.
At a Glance
| Aspect | LangGraph | CrewAI | OpenAI Agents SDK | Claude Agent SDK |
|---|---|---|---|---|
| By | LangChain | CrewAI Inc. | OpenAI | Anthropic |
| Architecture | Graph-based | Role-based | Handoff-based | Autonomous loop |
| Philosophy | Maximum control | Team collaboration | Minimal abstraction | Give agent a computer |
| Languages | Python, TypeScript | Python | Python | Python, TypeScript |
| Model support | Any (OpenAI, Claude, local) | Any | Any (despite the name) | Claude only |
| GitHub stars | ~29k | ~40k | ~21k | ~6k |
| Best for | Complex stateful workflows | Multi-agent specialization | Routing and triage | Coding and file-heavy tasks |
LangGraph: The Graph Builder
LangGraph models agent workflows as directed graphs that may contain cycles - unlike DAG-only pipeline tools, loops are first-class. You define nodes (functions that do work) and edges (transitions between them, optionally conditional). State flows through the graph and persists via checkpointing.
This is the most explicit and controllable framework - you wire every step yourself.
Core Concepts
- StateGraph: the graph definition with typed state
- Nodes: Python functions that transform state
- Edges: connections between nodes, can be conditional
- Checkpointing: built-in persistence for long-running workflows
Code Example
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_openai import ChatOpenAI

# Bind your tool definitions so the model knows it can call them
llm = ChatOpenAI(model="gpt-4o").bind_tools(tools)

def call_agent(state: MessagesState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState):
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

def call_tools(state: MessagesState):
    # Execute tool calls and return results (execute_tool is a user-defined helper)
    results = []
    for tool_call in state["messages"][-1].tool_calls:
        results.append(execute_tool(tool_call))
    return {"messages": results}

graph = StateGraph(MessagesState)
graph.add_node("agent", call_agent)
graph.add_node("tools", call_tools)
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")

app = graph.compile()
result = app.invoke({"messages": [{"role": "user", "content": "What's the weather?"}]})
```
Strengths
- Fine-grained control over every step and transition
- Built-in checkpointing and human-in-the-loop
- Full TypeScript parity
- Works with any LLM provider
- Best for complex workflows with conditional branching and loops
Weaknesses
- Steep learning curve - you need to be comfortable thinking in terms of graphs, typed state, and explicit transitions
- Verbose for simple use cases - a basic agent requires more boilerplate than other frameworks
- Debugging graph flows can be challenging without LangSmith
Pricing
Open-source (MIT). LangSmith (managed observability platform) has paid tiers for production monitoring.
CrewAI: The Team Assembler
CrewAI takes a human metaphor: you assemble a crew of specialized agents, each with a role, goal, and backstory. Agents collaborate on tasks using tools, coordinated by a process (sequential, hierarchical, or consensual).
Think of it as hiring a team where each member has a specific job title and specialty.
Core Concepts
- Agent: a persona with role, goal, backstory, and tools
- Task: an assignment with description, expected output, and assigned agent
- Crew: a group of agents working together
- Process: execution strategy (sequential, hierarchical, consensual)
- Flow: event-driven orchestration layer for connecting multiple crews
Code Example
```python
from crewai import Agent, Task, Crew, Process

# web_search_tool and file_tool are assumed to be defined elsewhere
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data about the given topic",
    backstory="You have 10 years of experience in technology research. "
              "You are thorough and always verify facts from multiple sources.",
    tools=[web_search_tool],
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, engaging technical content",
    backstory="You write for a developer audience. "
              "Your articles are practical and include code examples.",
    tools=[file_tool],
    verbose=True,
)

research_task = Task(
    description="Research the latest developments in WebAssembly in 2026. "
                "Focus on WASI, Component Model, and production use cases.",
    expected_output="A structured research document with key findings and sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a blog post based on the research. "
                "Include code examples and Mermaid diagrams.",
    expected_output="A complete blog post in Markdown format.",
    agent=writer,
    context=[research_task],  # Writer receives the researcher's output
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result.raw)
```
Strengths
- Intuitive role-based abstraction - easy to reason about
- 100+ built-in tool integrations
- Shared memory across agents (short-term, long-term, entity)
- Largest community (~40k GitHub stars)
- Hierarchical process with a "manager" agent that delegates and validates
Weaknesses
- Less fine-grained control than LangGraph - you define roles, not exact execution paths
- Hierarchical process can be unpredictable when agents disagree
- Debugging multi-agent conversations is harder than single-agent flows
Pricing
Open-source core (free). CrewAI Platform: $99/month (Teams) to $120k/year (Enterprise). Pricing based on live crews and monthly executions.
OpenAI Agents SDK: The Router
The OpenAI Agents SDK (spiritual successor to Swarm) focuses on handoffs - agents transferring conversations to other specialized agents. It is the most minimal framework: agents, tools, handoffs, and guardrails. That's it.
Core Concepts
- Agent: model + instructions + tools + handoffs
- Handoff: a transfer to another agent (modeled as a tool the LLM can call)
- Guardrail: input/output validation that runs in parallel with the agent
- Runner: executes the agent loop
- Tracing: built-in observability for all LLM calls, tool invocations, and handoffs
Code Example
```python
import asyncio

from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput
from pydantic import BaseModel

class SafetyCheck(BaseModel):
    is_safe: bool
    reason: str

async def content_safety(ctx, agent, input_text):
    # output_type makes the safety agent return a typed SafetyCheck
    # instead of free text we would have to parse ourselves
    result = await Runner.run(
        Agent(
            name="Safety",
            instructions="Check if input is safe. No PII.",
            output_type=SafetyCheck,
        ),
        input_text,
        context=ctx.context,
    )
    output = result.final_output
    return GuardrailFunctionOutput(
        output_info=output,
        tripwire_triggered=not output.is_safe,
    )

# lookup_invoice, process_payment, lookup_order, and issue_refund are
# assumed to be function tools defined elsewhere
billing_agent = Agent(
    name="Billing Agent",
    instructions="You handle billing inquiries. Be precise with numbers.",
    tools=[lookup_invoice, process_payment],
)

refund_agent = Agent(
    name="Refund Agent",
    instructions="You process refund requests. Always verify the order first.",
    tools=[lookup_order, issue_refund],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the customer to the right specialist. "
                 "Ask clarifying questions if needed.",
    handoffs=[billing_agent, refund_agent],
    input_guardrails=[InputGuardrail(guardrail_function=content_safety)],
)

async def main():
    result = await Runner.run(triage_agent, "I need a refund for order #4521")
    print(result.final_output)
    # The triage agent routes to refund_agent, which processes the refund

asyncio.run(main())
```
Strengths
- Clean handoff pattern - natural for routing/triage workflows
- Guardrails run in parallel with execution (fail-fast, not blocking)
- Built-in tracing dashboard for debugging
- Despite the name, supports non-OpenAI models
- Minimal abstraction - easy to understand and extend
Weaknesses
- Less mature state management than LangGraph
- No built-in persistence or checkpointing
- Ecosystem of third-party tools is smaller
- Handoff-centric design may not fit every architecture
Pricing
Open-source (MIT). You pay per-token for whatever model you use.
Claude Agent SDK: The Developer
The Claude Agent SDK takes a different approach: instead of defining workflows or roles, you give the agent a set of tools and let it figure out how to accomplish the task. It uses the same autonomous loop that powers Claude Code - read, act, verify, iterate.
Core Concepts
- query(): the main entry point that starts the agent loop
- Built-in tools: Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch
- Custom tools via MCP: define tools as in-process MCP servers
- Sub-agents: specialized agents the parent can delegate to
- Sessions: maintain context across multiple interactions
Code Example
```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const searchDocs = tool(
  "search_docs",
  "Search the internal documentation for relevant information",
  { query: z.string().describe("Search query") },
  async ({ query }) => {
    // vectorStore is an existing embedding store, defined elsewhere
    const results = await vectorStore.similaritySearch(query, 5);
    return {
      content: [
        { type: "text", text: results.map((r) => r.pageContent).join("\n\n") },
      ],
    };
  }
);

const docsServer = createSdkMcpServer({
  name: "docs",
  version: "1.0.0",
  tools: [searchDocs],
});

for await (const message of query({
  prompt: "Find how authentication works in our system and write a summary",
  options: {
    mcpServers: { docs: docsServer },
    allowedTools: ["Read", "Glob", "Grep", "mcp__docs__search_docs"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```
Strengths
- First-class MCP integration - connect to any MCP server ecosystem
- Built-in tools for file operations, terminal, and web access
- Automatic context compaction for large codebases
- Sub-agent parallelism for complex tasks
- Same engine as Claude Code - battle-tested on real development workflows
Weaknesses
- Claude models only - no multi-provider support
- Newer framework with a smaller community
- Requires Node.js runtime even for the Python SDK
- Less explicit workflow control compared to LangGraph
Pricing
Open-source. Standard Claude API token rates. Managed Agents (hosted version): $0.08 per session-hour in addition to token costs.
When to Choose Which
Choose LangGraph if:
- You need precise control over every step of the workflow
- Your use case involves complex conditional logic and loops
- You want built-in persistence and human-in-the-loop checkpoints
- You need to use multiple LLM providers in the same workflow
Choose CrewAI if:
- You want an intuitive, role-based abstraction
- Your task involves multiple agents with distinct specialties
- You need agents to collaborate and pass context between each other
- You value the largest community and most built-in integrations
Choose OpenAI Agents SDK if:
- Your primary pattern is routing conversations to specialists
- You need guardrails that validate input/output in parallel
- You want the simplest possible abstraction with minimal boilerplate
- Built-in tracing and observability are important
Choose Claude Agent SDK if:
- Your agents need to read, write, and execute code
- You want first-class MCP server integration
- You need autonomous agents that iterate and self-correct
- You are already using Claude and want the deepest integration
Can You Combine Frameworks?
Yes. A common pattern is using one framework for orchestration and another for individual agents:
- LangGraph for the overall workflow graph
- CrewAI for a specific node that requires multi-agent collaboration
- Claude Agent SDK for coding-related sub-tasks via MCP
- OpenAI Agents SDK for customer-facing triage and routing
The frameworks are not mutually exclusive. Use what fits each part of your system.
Conclusion
Each framework makes a clear bet:
- LangGraph optimizes for control - you decide every transition
- CrewAI optimizes for collaboration - agents work as a team
- OpenAI Agents SDK optimizes for simplicity - minimal abstraction, clean handoffs
- Claude Agent SDK optimizes for autonomy - give it tools and let it work
The right choice depends on your workflow, your team, and your existing stack. Pick the one that matches your primary use case, learn it well, and pull in others when you hit their sweet spot.