AI agents have moved from research demos to production systems. By some industry estimates, over 60% of enterprise AI applications will include agentic components by 2026. But building agents from scratch - managing tool loops, state, memory, error handling, and multi-agent coordination - is complex. That's where frameworks come in.
Four frameworks dominate in 2026: LangGraph, CrewAI, OpenAI Agents SDK, and Claude Agent SDK. Each takes a fundamentally different approach to the same problem: giving LLMs the ability to reason, plan, use tools, and collaborate.
At a Glance
| Aspect | LangGraph | CrewAI | OpenAI Agents SDK | Claude Agent SDK |
|---|---|---|---|---|
| By | LangChain | CrewAI Inc. | OpenAI | Anthropic |
| Architecture | Graph-based | Role-based | Handoff-based | Autonomous loop |
| Philosophy | Maximum control | Team collaboration | Minimal abstraction | Give agent a computer |
| Languages | Python, TypeScript | Python | Python | Python, TypeScript |
| Model support | Any (OpenAI, Claude, local) | Any | Any (despite the name) | Claude only |
| GitHub stars | ~29k | ~40k | ~21k | ~6k |
| Best for | Complex stateful workflows | Multi-agent specialization | Routing and triage | Coding and file-heavy tasks |
LangGraph: The Graph Builder
LangGraph models agent workflows as directed graphs that may contain cycles - unlike DAG-only pipeline tools, loops are first-class. You define nodes (functions that do work) and edges (transitions between them, optionally conditional). State flows through the graph and persists via checkpointing.
This is the most explicit and controllable framework - you wire every step yourself.
Core Concepts
- StateGraph: the graph definition with typed state
- Nodes: Python functions that transform state
- Edges: connections between nodes, can be conditional
- Checkpointing: built-in persistence for long-running workflows
Code Example
```python
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_openai import ChatOpenAI

# Bind your tool definitions so the model knows it can call them
llm = ChatOpenAI(model="gpt-4o").bind_tools(tools)

def call_agent(state: MessagesState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState):
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

def call_tools(state: MessagesState):
    # Execute tool calls and return results (execute_tool is a user-defined helper)
    results = []
    for tool_call in state["messages"][-1].tool_calls:
        results.append(execute_tool(tool_call))
    return {"messages": results}

graph = StateGraph(MessagesState)
graph.add_node("agent", call_agent)
graph.add_node("tools", call_tools)
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")

app = graph.compile()
result = app.invoke({"messages": [{"role": "user", "content": "What's the weather?"}]})
```
Strengths
- Fine-grained control over every step and transition
- Built-in checkpointing and human-in-the-loop
- Full TypeScript parity
- Works with any LLM provider
- Best for complex workflows with conditional branching and loops
Weaknesses
- Steep learning curve - you need to be comfortable thinking in terms of graphs, typed state, and explicit transitions
- Verbose for simple use cases - a basic agent requires more boilerplate than other frameworks
- Debugging graph flows can be challenging without LangSmith
Pricing
Open-source (MIT). LangSmith (managed observability platform) has paid tiers for production monitoring.
CrewAI: The Team Assembler
CrewAI takes a human metaphor: you assemble a crew of specialized agents, each with a role, goal, and backstory. Agents collaborate on tasks using tools, coordinated by a process (sequential, hierarchical, or consensual).
Think of it as hiring a team where each member has a specific job title and specialty.
Core Concepts
- Agent: a persona with role, goal, backstory, and tools
- Task: an assignment with description, expected output, and assigned agent
- Crew: a group of agents working together
- Process: execution strategy (sequential, hierarchical, consensual)
- Flow: event-driven orchestration layer for connecting multiple crews
Code Example
```python
from crewai import Agent, Task, Crew, Process

# web_search_tool and file_tool are assumed to be defined elsewhere
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data about the given topic",
    backstory="You have 10 years of experience in technology research. "
              "You are thorough and always verify facts from multiple sources.",
    tools=[web_search_tool],
    verbose=True,
)

writer = Agent(
    role="Technical Writer",
    goal="Create clear, engaging technical content",
    backstory="You write for a developer audience. "
              "Your articles are practical and include code examples.",
    tools=[file_tool],
    verbose=True,
)

research_task = Task(
    description="Research the latest developments in WebAssembly in 2026. "
                "Focus on WASI, Component Model, and production use cases.",
    expected_output="A structured research document with key findings and sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a blog post based on the research. "
                "Include code examples and Mermaid diagrams.",
    expected_output="A complete blog post in Markdown format.",
    agent=writer,
    context=[research_task],  # Writer receives the researcher's output
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

result = crew.kickoff()
print(result.raw)
```
Strengths
- Intuitive role-based abstraction - easy to reason about
- 100+ built-in tool integrations
- Shared memory across agents (short-term, long-term, entity)
- Largest community (~40k GitHub stars)
- Hierarchical process with a "manager" agent that delegates and validates
Weaknesses
- Less fine-grained control than LangGraph - you define roles, not exact execution paths
- Hierarchical process can be unpredictable when agents disagree
- Debugging multi-agent conversations is harder than single-agent flows
Pricing
Open-source core (free). CrewAI Platform: $99/month (Teams) to $120k/year (Enterprise). Pricing based on live crews and monthly executions.
OpenAI Agents SDK: The Router
The OpenAI Agents SDK (spiritual successor to Swarm) focuses on handoffs - agents transferring conversations to other specialized agents. It is the most minimal framework: agents, tools, handoffs, and guardrails. That's it.
Core Concepts
- Agent: model + instructions + tools + handoffs
- Handoff: a transfer to another agent (modeled as a tool the LLM can call)
- Guardrail: input/output validation that runs in parallel with the agent
- Runner: executes the agent loop
- Tracing: built-in observability for all LLM calls, tool invocations, and handoffs
Code Example
```python
import asyncio

from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput
from pydantic import BaseModel

class SafetyCheck(BaseModel):
    is_safe: bool
    reason: str

async def content_safety(ctx, agent, input_text):
    # output_type makes the safety agent return a typed SafetyCheck
    # instead of free text we would have to parse ourselves
    result = await Runner.run(
        Agent(
            name="Safety",
            instructions="Check if input is safe. No PII.",
            output_type=SafetyCheck,
        ),
        input_text,
        context=ctx.context,
    )
    output = result.final_output
    return GuardrailFunctionOutput(
        output_info=output,
        tripwire_triggered=not output.is_safe,
    )

# lookup_invoice, process_payment, lookup_order, and issue_refund are
# assumed to be function tools defined elsewhere
billing_agent = Agent(
    name="Billing Agent",
    instructions="You handle billing inquiries. Be precise with numbers.",
    tools=[lookup_invoice, process_payment],
)

refund_agent = Agent(
    name="Refund Agent",
    instructions="You process refund requests. Always verify the order first.",
    tools=[lookup_order, issue_refund],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the customer to the right specialist. "
                 "Ask clarifying questions if needed.",
    handoffs=[billing_agent, refund_agent],
    input_guardrails=[InputGuardrail(guardrail_function=content_safety)],
)

async def main():
    result = await Runner.run(triage_agent, "I need a refund for order #4521")
    print(result.final_output)
    # The triage agent routes to refund_agent, which processes the refund

asyncio.run(main())
```
Strengths
- Clean handoff pattern - natural for routing/triage workflows
- Guardrails run in parallel with execution (fail-fast, not blocking)
- Built-in tracing dashboard for debugging
- Despite the name, supports non-OpenAI models
- Minimal abstraction - easy to understand and extend
Weaknesses
- Less mature state management than LangGraph
- No built-in persistence or checkpointing
- Ecosystem of third-party tools is smaller
- Handoff-centric design may not fit every architecture
Pricing
Open-source (MIT). You pay per-token for whatever model you use.
Claude Agent SDK: The Developer
The Claude Agent SDK takes a different approach: instead of defining workflows or roles, you give the agent a set of tools and let it figure out how to accomplish the task. It uses the same autonomous loop that powers Claude Code - read, act, verify, iterate.
Core Concepts
- query(): the main entry point that starts the agent loop
- Built-in tools: Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch
- Custom tools via MCP: define tools as in-process MCP servers
- Sub-agents: specialized agents the parent can delegate to
- Sessions: maintain context across multiple interactions
Code Example
```typescript
import { tool, createSdkMcpServer, query } from "@anthropic-ai/claude-agent-sdk";
import { z } from "zod";

const searchDocs = tool(
  "search_docs",
  "Search the internal documentation for relevant information",
  { query: z.string().describe("Search query") },
  async ({ query }) => {
    // vectorStore is an existing embedding store, defined elsewhere
    const results = await vectorStore.similaritySearch(query, 5);
    return {
      content: [
        { type: "text", text: results.map((r) => r.pageContent).join("\n\n") },
      ],
    };
  }
);

const docsServer = createSdkMcpServer({
  name: "docs",
  version: "1.0.0",
  tools: [searchDocs],
});

for await (const message of query({
  prompt: "Find how authentication works in our system and write a summary",
  options: {
    mcpServers: { docs: docsServer },
    allowedTools: ["Read", "Glob", "Grep", "mcp__docs__search_docs"],
  },
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
  }
}
```
Strengths
- First-class MCP integration - connect to any MCP server ecosystem
- Built-in tools for file operations, terminal, and web access
- Automatic context compaction for large codebases
- Sub-agent parallelism for complex tasks
- Same engine as Claude Code - battle-tested on real development workflows
Weaknesses
- Claude models only - no multi-provider support
- Newer framework with a smaller community
- Requires Node.js runtime even for the Python SDK
- Less explicit workflow control compared to LangGraph
Pricing
Open-source. Standard Claude API token rates. Managed Agents (hosted version): $0.08 per session-hour in addition to token costs.
When to Choose Which
Choose LangGraph if:
- You need precise control over every step of the workflow
- Your use case involves complex conditional logic and loops
- You want built-in persistence and human-in-the-loop checkpoints
- You need to use multiple LLM providers in the same workflow
Choose CrewAI if:
- You want an intuitive, role-based abstraction
- Your task involves multiple agents with distinct specialties
- You need agents to collaborate and pass context between each other
- You value the largest community and most built-in integrations
Choose OpenAI Agents SDK if:
- Your primary pattern is routing conversations to specialists
- You need guardrails that validate input/output in parallel
- You want the simplest possible abstraction with minimal boilerplate
- Built-in tracing and observability are important
Choose Claude Agent SDK if:
- Your agents need to read, write, and execute code
- You want first-class MCP server integration
- You need autonomous agents that iterate and self-correct
- You are already using Claude and want the deepest integration
Can You Combine Frameworks?
Yes. A common pattern is using one framework for orchestration and another for individual agents:
- LangGraph for the overall workflow graph
- CrewAI for a specific node that requires multi-agent collaboration
- Claude Agent SDK for coding-related sub-tasks via MCP
- OpenAI Agents SDK for customer-facing triage and routing
The frameworks are not mutually exclusive. Use what fits each part of your system.
Conclusion
Each framework makes a clear bet:
- LangGraph optimizes for control - you decide every transition
- CrewAI optimizes for collaboration - agents work as a team
- OpenAI Agents SDK optimizes for simplicity - minimal abstraction, clean handoffs
- Claude Agent SDK optimizes for autonomy - give it tools and let it work
The right choice depends on your workflow, your team, and your existing stack. Pick the one that matches your primary use case, learn it well, and pull in others when you hit their sweet spot.