Comparing Multi-Agent Framework SDKs
HeetVekariya (@heetvekariya) · Published Jun 18

We explored A2A and MCP in the previous blogs, and how they can be used together. Now let's shift focus from protocols to building actual multi-agent systems.

Several SDKs and frameworks are in wide use today, including CrewAI, LangChain, the OpenAI Agent SDK, and the Google Agent Development Kit (ADK).

While exploring the options, I couldn't find a clear comparison of these frameworks, so I made one.

Each SDK brings something different to the table, from simplicity and flexibility to enterprise-grade scalability. In this blog, I'll compare these four on practical parameters, using a simple, consistent example to highlight their strengths and trade-offs.


How We'll Compare These Agent Development Kits

The goal here is to help developers (like you and me) pick the right multi-agent framework.

We'll look at:

  • Ease of use
  • Model support
  • Agent collaboration
  • Scalability

To make it fair, we’ll use the same example across all frameworks: a system with three teaching agents (Math, Science, History) and one Principal agent, all powered by Gemini 2.0 Flash.

This way, we can actually see how each framework performs when solving the same problem.


Exploring the Four Agent Development Kits (ADKs)

1. LangChain

LangChain is one of the most widely adopted frameworks for LLM apps, with over 110k GitHub stars. It’s known for its flexibility, community support, and massive ecosystem.

  • It uses a chain-and-agent architecture: you can combine tools, agents, and prompts to build complex logic.
  • Multi-agent support is possible but requires manual coordination.
  • It supports a wide range of models, vector stores, and retrieval strategies, making it great for both RAG pipelines and agent workflows.

If you want full control over your agent logic, LangChain is solid.

2. CrewAI

CrewAI is purpose-built for multi-agent systems.

  • Agents are defined with roles and tasks, and they operate as a coordinated “crew.”
  • It simplifies collaboration and delegation between agents.
  • Used by companies like Oracle and Deloitte for enterprise AI tasks.

If your use case requires multiple agents working as a team, CrewAI feels very natural and productive.

3. OpenAI Agent SDK

OpenAI's Agent SDK is still evolving, and it does support models outside OpenAI's own.

  • Tightly integrated with OpenAI APIs and tool calling.
  • Agents can automatically access tools or pass off tasks to other agents.
  • Good for early experimentation and small-scale agent setups, though it may be limited outside the OpenAI ecosystem.

If you're building with OpenAI tools and want fast prototyping, this SDK can be a quick start.

4. Google Agent Development Kit (ADK)

Google ADK, launched in April 2025, is a newcomer but powerful if you're in the Google ecosystem.

  • Built specifically for multi-agent setups.
  • Has a built-in handoff system: agents can pass tasks to each other naturally.
  • Native integration with Gemini models and Vertex AI, while supporting other models as well.
  • Lacks the community size of LangChain or CrewAI, but it's growing (count me among the new adopters).

If you want a clean developer experience with tight Gemini model integration, Google ADK is worth exploring.


Now that we’ve got a sense of each framework, let’s implement the same example with each and compare how they differ in practice.

Example: Teaching Agents System

To make this comparison hands-on, we’ll build a small system using each framework. The system has:

  • Three Teaching Agents
    • Math Teacher – answers math queries (e.g., “Solve 2x + 3 = 7”)
    • Science Teacher – explains science concepts (e.g., “What is photosynthesis?”)
    • History Teacher – responds to history questions (e.g., “Who was the second president of the United States?”)
  • One Principal Agent
    • Routes student questions to the appropriate teacher agent
  • Model
    • All agents use Gemini 2.0 Flash—a fast, multimodal model

This example stays consistent across frameworks, helping us focus purely on the developer experience and capability differences.
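Before running any of the implementations below, you'll need a Gemini API key. Here is a minimal setup sketch; the environment variable names are my assumptions about what each library reads, so check the docs for your installed versions:

# Hedged setup sketch: which environment variable each framework is assumed to read.
# langchain-google-genai and Google ADK are assumed to read GOOGLE_API_KEY, while
# CrewAI and the OpenAI Agent SDK route Gemini calls through LiteLLM, which is
# assumed to read GEMINI_API_KEY for "gemini/..." model strings.
import os

os.environ["GOOGLE_API_KEY"] = "<your-gemini-api-key>"  # LangChain, Google ADK
os.environ["GEMINI_API_KEY"] = "<your-gemini-api-key>"  # CrewAI, OpenAI Agent SDK (via LiteLLM)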


LangChain Implementation

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import initialize_agent, Tool, AgentType

# Initialize Gemini 2.0 Flash LLM
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash", 
    convert_system_message_to_human=True,
    temperature=0.7
)

# Define Teaching Agent Tools
def ask_math_agent(query: str) -> str:
    prompt = f"You are a math teacher. Answer this math question: {query}"
    return llm.invoke(prompt).content

def ask_science_agent(query: str) -> str:
    prompt = f"You are a science teacher. Answer this science question: {query}"
    return llm.invoke(prompt).content

def ask_history_agent(query: str) -> str:
    prompt = f"You are a history teacher. Answer this history question: {query}"
    return llm.invoke(prompt).content

# Define Tools for Principal Agent
tools = [
    Tool(name="AskMath", func=ask_math_agent, description="Use for math questions."),
    Tool(name="AskScience", func=ask_science_agent, description="Use for science questions."),
    Tool(name="AskHistory", func=ask_history_agent, description="Use for history questions."),
]

# Initialize Principal Agent (Router)
principal_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True
)

# Run the System
student_query = "What is the capital of France?"
response = principal_agent.run(student_query)
print(response)

student_query = "What is the square root of 144? also explain the process."
response = principal_agent.run(student_query)
print(response)

student_query = "Who was the second president of the United States?"
response = principal_agent.run(student_query)
print(response)
  • LLM Initialization: We initialize Gemini 2.0 Flash from Google using LangChain’s ChatGoogleGenerativeAI. It serves as the core reasoning engine for all agents.

  • Teaching Agents (Tools): Instead of building separate agents, we wrap each teaching function (Math, Science, History) inside a tool, a callable function paired with a description. Each tool receives a query, adds a role-specific prompt (e.g., "You are a math teacher..."), and gets a response from the LLM.

  • Principal Agent (Router): LangChain’s initialize_agent() is used to create a zero-shot reactive agent. This agent sees the student’s question, chooses the appropriate tool based on the descriptions, and delegates the question to that tool. For example, if the question is about “photosynthesis,” it will select AskScience.

> Entering new AgentExecutor chain...

I need to find out who the second president of the United States was.
Action: AskHistory
Action Input: Who was the second president of the United States?
Observation: Ah, a classic! The second president of the United States was **John Adams**. He served from 1797 to 1801. A fascinating and often overlooked figure in American history, Adams was a key player in the American Revolution and a staunch advocate for independence. However, his presidency was marked by controversy and challenges, particularly surrounding the Alien and Sedition Acts. Definitely a period worth further exploration!
Thought: I have the answer.
Final Answer: John Adams

> Finished chain.
John Adams
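One caveat on the code above: initialize_agent and AgentType are deprecated in recent LangChain releases in favor of LangGraph's prebuilt ReAct agent. Here is a rough equivalent as a hedged sketch (it assumes langgraph is installed and reuses the llm and tools defined above):

# Hedged sketch: the same routing behavior via LangGraph's prebuilt ReAct agent.
from langgraph.prebuilt import create_react_agent

graph = create_react_agent(llm, tools)
result = graph.invoke({"messages": [("user", "What is the capital of France?")]})
print(result["messages"][-1].content)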

CrewAI Implementation

CrewAI follows a specific directory structure for its implementation, so we will do the same as defined in the docs.

  • agents.yaml
math_teacher:
  role: "Math Teacher"
  goal: "Answer students' math questions clearly."
  backstory: >
    A passionate math educator skilled in algebra, calculus, and problem-solving.

science_teacher:
  role: "Science Teacher"
  goal: "Answer students' science questions clearly."
  backstory: >
    A curious scientist passionate about physics, chemistry, and biology.

history_teacher:
  role: "History Teacher"
  goal: "Answer students' history questions clearly."
  backstory: >
    A knowledgeable historian focused on events from ancient to modern times.

principal:
  role: "School Principal to route query to the right teacher"
  goal: "Route student queries to the appropriate teacher."
  backstory: >
    A wise principal who knows where each question belongs.

Agent Configuration (agents.yaml): Each teaching agent (Math, Science, History) and the Principal is defined with a clear role, goal, and backstory. This information guides the behavior of each agent during execution.

For instance, the math_teacher is framed as an algebra expert with a passion for problem-solving.

  • tasks.yaml
route_task:
  description: "Receive student query, i.e. {topic}, and delegate to correct teacher agent."
  expected_output: "The final answer from the most relevant teacher."
  agent: principal

math_task:
  description: "Answer the math question of {topic}."
  expected_output: "A detailed math answer."
  agent: math_teacher

science_task:
  description: "Answer the science question of {topic}."
  expected_output: "A clear science answer."
  agent: science_teacher

history_task:
  description: "Answer the history question of {topic}."
  expected_output: "A descriptive history answer."
  agent: history_teacher

Task Configuration (tasks.yaml): Each agent is assigned a task tailored to their subject expertise. The route_task is given to the Principal, who acts as the system's router. It receives the student query and determines which teaching agent should handle it.

  • crew.py

from dotenv import load_dotenv
from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task

# Load environment variables (e.g., the Gemini API key) from the .env file
# before constructing the LLM, so the key is available at initialization
load_dotenv()

llm = LLM(
    model="gemini/gemini-2.0-flash",
    temperature=0.7,  # Adjust temperature for response variability
)

@CrewBase
class SchoolCrew:
    """Crew to route student queries to subject teachers"""

    @agent
    def principal(self) -> Agent:
        return Agent(
            config=self.agents_config['principal'],
            verbose=True,
            llm=llm,
        )

    @agent
    def math_teacher(self) -> Agent:
        return Agent(config=self.agents_config['math_teacher'], verbose=True, llm=llm)

    @agent
    def science_teacher(self) -> Agent:
        return Agent(config=self.agents_config['science_teacher'], verbose=True, llm=llm)

    @agent
    def history_teacher(self) -> Agent:
        return Agent(config=self.agents_config['history_teacher'], verbose=True, llm=llm)

    @task
    def route_task(self) -> Task:
        return Task(config=self.tasks_config['route_task'])

    @task
    def math_task(self) -> Task:
        return Task(config=self.tasks_config['math_task'])

    @task
    def science_task(self) -> Task:
        return Task(config=self.tasks_config['science_task'])

    @task
    def history_task(self) -> Task:
        return Task(config=self.tasks_config['history_task'])

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=[self.math_teacher(), self.science_teacher(), self.history_teacher()],
            manager_agent=self.principal(),
            tasks=[self.route_task(), self.math_task(), self.science_task(), self.history_task()],
            process=Process.hierarchical,  # Dynamic routing
            verbose=True
        )

System Setup (crew.py): Using the CrewBase class, agents and tasks are instantiated and grouped into a Crew. The principal acts as the manager agent, responsible for delegating tasks. The Process.hierarchical mode enables dynamic task routing based on the query context, mimicking decision-making in an actual school environment.
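If you don't need dynamic routing, CrewAI also supports a sequential process. As a hedged variation (not part of the original project), the same class could expose a crew where tasks simply run in listed order with no manager agent:

    # Hedged alternative: tasks run in listed order, no manager-based routing.
    @crew
    def sequential_crew(self) -> Crew:
        return Crew(
            agents=[self.math_teacher(), self.science_teacher(), self.history_teacher()],
            tasks=[self.math_task(), self.science_task(), self.history_task()],
            process=Process.sequential,
            verbose=True
        )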

  • main.py
import os
from crew import SchoolCrew

os.makedirs('output', exist_ok=True)

def run():
    inputs = {"topic": "What is the capital of France?"}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

    inputs = {"topic": "What is the square root of 144? also explain the process."}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

    inputs = {"topic": "Who was the first president of the United States?"}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

if __name__ == "__main__":
    run()

Execution (main.py): The system is run with various student queries passed as topic inputs. The crew evaluates the query, the principal agent routes it to the relevant teacher, and the appropriate teacher provides the answer using Gemini 2.0 Flash.

Note: CrewAI's output is long because it logs each agent's thinking process, so I can't show it here. You can always run it yourself and check the output.

OpenAI Agent SDK Implementation

import asyncio
from agents import Agent, Runner

math_agent = Agent(
    name="MathAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a math expert. Provide clear and concise answers to math-related questions.",
)
science_agent = Agent(
    name="ScienceAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a science expert. Provide clear and concise answers to science-related questions.",
)
history_agent = Agent(
    name="HistoryAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a history expert. Provide clear and concise answers to history-related questions.",
)

principal_agent = Agent(
    name="TriggerAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a trigger agent. You will delegate questions to the appropriate subject expert based on the topic.",
    handoffs=[math_agent, science_agent, history_agent]
)

async def main():
    result = await Runner.run(principal_agent, "What is the capital of France?")
    print(result.final_output)

    result = await Runner.run(principal_agent, "What is the square root of 144? also explain the process.")
    print(result.final_output)

    result = await Runner.run(principal_agent, "Who was the first president of the United States?")
    print(result.final_output)

asyncio.run(main())
  • Agent Creation with Specialization: Using OpenAI's Agent SDK, we define multiple subject-specific agents (Math, Science, History). Each agent is initialized using the Agent class and powered by Gemini 2.0 Flash (via LiteLLM). Every agent is assigned domain-specific instructions, guiding it to act as an expert in its respective field.

  • Central Delegator Agent: The TriggerAgent plays the role of a principal or router agent. It uses the same LLM but with a distinct instruction: identify the topic of the incoming question and hand off the task to the appropriate subject expert agent. This delegation logic is defined using the handoffs parameter in the agent configuration.

  • Execution with Runner: The Runner.run() function is used to pass user queries to the TriggerAgent. Based on the query context, the agent determines which expert to forward the question to. The responses are collected and printed, simulating an asynchronous interaction loop.

  • OpenAI Agent SDK's Edge: The SDK provides a clean and modular agent composition interface. It abstracts the routing and execution mechanics, allowing developers to define multi-agent collaboration using simple Python classes and async runners, without manually managing LLM context or chaining logic.
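The SDK also lets you attach plain Python functions as tools alongside handoffs. A small hedged sketch (the function_tool decorator is from the same agents package; the square_root helper is an illustrative addition, not part of the example above):

# Hedged sketch: giving a math agent a callable tool via @function_tool.
from agents import Agent, function_tool

@function_tool
def square_root(x: float) -> float:
    """Return the square root of x."""
    return x ** 0.5

math_agent_with_tools = Agent(
    name="MathAgentWithTools",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a math expert. Use tools when exact computation helps.",
    tools=[square_root],
)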

Query: Who was the first president of India?

Dr. Rajendra Prasad was the first president of India.

Google ADK Implementation

import asyncio
from google.adk.agents import Agent
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types # For creating message Content/Parts

# Initialize Agents with Gemini 2.0 Flash
math_agent = Agent(
    name="MathTeacher", 
    model="gemini-2.0-flash",
    description="A math teacher who answers math-related questions.",
    instruction="You are a helpful math teacher. You only answer math-related questions."
)
science_agent = Agent(
    name="ScienceTeacher", 
    model="gemini-2.0-flash",
    description="A science teacher who answers science-related questions.",
    instruction="You are a helpful science teacher. You only answer science-related questions."
)
history_agent = Agent(
    name="HistoryTeacher", 
    model="gemini-2.0-flash",
    description="A history teacher who answers history-related questions.",
    instruction="You are a helpful history teacher. You only answer history-related questions."
)

principal_agent = Agent(
    name="Principal", 
    model="gemini-2.0-flash",
    description="A principal who manages the school system and delegates queries to appropriate teachers.",
    instruction="You are the principal of the school. You delegate questions to the appropriate teacher based on the subject.",
    sub_agents=[math_agent, science_agent, history_agent]
)

# Set Up Session Service
session_service = InMemorySessionService()
app_name = "SchoolSystem"
user_id = "Student_042"
session_id = "Session_001"

async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")

    content = types.Content(role='user', parts=[types.Part(text=query)])

    final_response_text = "Agent did not produce a final response."  # Default

    async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
        print(f"  [Event] Author: {event.author}, Type: {type(event).__name__}, Final: {event.is_final_response()}, Content: {event.content}")

        if event.is_final_response():
            if event.content and event.content.parts:
                final_response_text = event.content.parts[0].text
            elif event.actions and event.actions.escalate:  # Handle potential errors/escalations
                final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
            break

    print(f"<<< Agent Response: {final_response_text}")

async def get_agent_response():
    session = await session_service.create_session(app_name=app_name, user_id=user_id, session_id=session_id)
    runner_agent_team = Runner(
        agent=principal_agent,
        app_name=app_name,
        session_service=session_service
    )

    print("Runner Agent Team initialized.")

    await call_agent_async(
        "What is the Pythagorean theorem?",
        runner_agent_team, 
        user_id, 
        session_id
    )

    await call_agent_async(
        "What is the chemical formula for water?",
        runner_agent_team, 
        user_id, 
        session_id
    )

    await call_agent_async(
        "Who was the first president of the United States?",
        runner_agent_team, 
        user_id, 
        session_id
    )

    await call_agent_async(
        "Which agents are available in the system?",
        runner_agent_team,
        user_id,
        session_id
    )

asyncio.run(get_agent_response())
  • Subject Agents Setup: In this setup, we define three specialized agents (MathTeacher, ScienceTeacher, and HistoryTeacher) using Google's ADK. Each agent is initialized with Gemini 2.0 Flash and has clear role-based instructions. These agents are only allowed to answer questions from their subject domain, which helps maintain accuracy and specialization.

  • Principal Agent (Coordinator): The Principal agent acts like a school head. It doesn't answer questions itself—instead, it routes student queries to the right subject expert using the sub_agents configuration. This delegation is automatic and based on the query content.

  • Session Handling: Google ADK introduces a session-based memory system. Using the InMemorySessionService, the conversation state is preserved across multiple queries from the same user. This makes the interaction feel more continuous and context-aware (see the inspection sketch after this list).

  • Runner Execution: The Runner object manages the overall flow. It takes care of delivering the user’s question, collecting streaming responses (via events), and printing the final reply. The interaction loop is asynchronous, and the ADK supports rich content handling and escalation when agents face uncertainty.

  • Google ADK’s Strength: Google’s ADK provides a structured and event-driven way to build agent teams. It focuses on modular design, session memory, and hierarchical agent coordination—making it great for scalable multi-agent applications that mimic real-world roles.
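Because the session service stores every event, you can replay the conversation afterwards. A hedged sketch (it assumes get_session is async and the returned session exposes an events list; both may differ across ADK versions):

# Hedged sketch: replaying the stored conversation from the session service.
async def inspect_session():
    session = await session_service.get_session(
        app_name=app_name, user_id=user_id, session_id=session_id
    )
    for event in session.events:
        print(f"[{event.author}] {event.content}")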

>>> User Query: What is the Pythagorean theorem?
<<< Agent Response: The Pythagorean theorem states that in a right triangle, the square of the length of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the lengths of the other two sides. This can be written as:

a² + b² = c²

where:
*   a and b are the lengths of the two shorter sides (legs) of the right triangle
*   c is the length of the hypotenuse

Comparison Table

| Parameter | LangChain | CrewAI | OpenAI Agent SDK | Google ADK |
| --- | --- | --- | --- | --- |
| Ease of Use | High-level APIs and many templates simplify development, but multi-agent setups still require manual orchestration | Very easy to bootstrap multi-agent workflows with minimal boilerplate | Lightweight primitives (Agents, Handoffs, Guardrails); straightforward to set up and debug | Clear sub-agent structure makes delegation intuitive; event-driven runner adds minimal complexity |
| Documentation Quality | Extensive guides, API references, tutorials, and examples | Friendly and clear, though lacking depth in advanced scenarios | Concise, example-rich, focused on agent patterns and production use | Fresh, detailed, and beginner-friendly, especially around agent lifecycle and orchestration |
| Community Support | Massive: 110k+ GitHub stars, Discord, Slack, third-party content | Growing user base backed by Oracle/Deloitte; active but smaller | Quickly growing; broad help across Reddit, GitHub, and blog posts | Emerging community with strong Google Cloud backing, but not as mature as the others |
| Model Support | Native first-class support for OpenAI, Gemini, Anthropic, HuggingFace, etc. | Supports major LLMs including Gemini and OpenAI; good model flexibility | Designed for OpenAI models via the Chat API; can be extended to other providers using LiteLLM | Built for the Gemini series with seamless Vertex AI integration; extends to other providers via LiteLLM |
| Multi-Agent Capabilities | Tool-based multi-agent support via chains and routers; DIY orchestration | Crew metaphor simplifies role-based agent collaboration | Native support for handoffs makes delegation clean and traceable | Sub-agents and handoffs offer robust, built-in multi-agent coordination |
| Tool Integration | 50+ official integrations (APIs, databases, scraping, computation, etc.) | Integrated via the LangChain backend; strong but slightly less extensive | Easy to add via Python functions; tool wrappers and guardrails supported | Full integration with Vertex AI tools, function calling, and cloud APIs |
| Performance | Generally responsive; depends on chain optimization | Optimized for multi-agent workflows; smooth in practice | Fast execution with built-in tracing; minimal overhead for basic use | Native Gemini plus Google Cloud infrastructure gives low latency and high throughput |
| Scalability | Scales with LangSmith and distributed chains, but needs customization | Enterprise-ready orchestration; designed for scaling agent workflows | Built-in tracing; scalable on OpenAI infrastructure; easy deployment | Cloud-native via Vertex AI; handles scale out of the box |
| Cost | Open source; pay only for model/API usage and hosting | Free core; API usage and enterprise features incur costs | Free SDK; pay for OpenAI model usage and any custom deployments | Free SDK; pay for the Gemini API (free tier available) and Vertex AI compute |

Code for all the implementations is available here: GitHub


Conclusion

So, who’s the winner?
It depends on what you're working on.

LangChain is your all-purpose tool, perfect for coders who want flexibility and a huge community to back them up; great for startups or solo projects.

CrewAI is the multi-agent master, making it easy to build collaborative AI teams, especially for enterprise gigs.

OpenAI Agent SDK is sleek and fast for OpenAI fans, but using Gemini 2.0 Flash requires some DIY, which might slow you down.

Google ADK is the newcomer, ideal for Google Cloud users who want seamless Gemini integration and cloud scalability, though its community is still growing.


If you found this helpful, don’t forget to share and follow for more agent-powered insights. Got an idea or workflow in mind? Join the discussion in the comments or reach out on Twitter or LinkedIn.

Comments

  • Dotallio · Jun 18, 2025

    This is exactly the kind of side-by-side, example-driven breakdown I wish I had when first trying to pick a multi-agent framework - it really cuts through the hype.

    Did you run into any surprising limitations or workarounds in session handling or agent state between frameworks?

    • HeetVekariya · Jun 19, 2025

      Yes, so true.

      I was looking for a clear comparison before building multi-agent systems, but wasn't able to find one, so I made one myself.

      To be honest, I haven't tested each framework in depth, but from my experience the OpenAI SDK is the easiest to start with in terms of minimal code and clear documentation. (I'm stating this based on my experience so far with the simple example shown in the blog.)

      The CrewAI setup took the most time, as I couldn't figure out how the input would be provided to the agent. It was difficult to understand on the first go for me.

      If you are interested, we can discuss it further.

  • Calm Matter · Jun 19, 2025

    Good stuff @heetvekariya. Google ADK comes with an integrated FastAPI server, which gives it a huge advantage over the other frameworks. It's new, but it will grow fast. Also, we already have a router like LiteLLM, which adds a bit of latency, but since it sits in front of every provider here, it's the default for now, unless we only call tools built natively on the same LLM and don't call other LLMs (say, multimodal ones only). Thanks for this comparison; it's a good head start for beginners like me.
