In previous blogs, we explored A2A and MCP and how we can make them work together. Now let's shift focus from protocols to building actual multi-agent systems.
There are several SDKs and frameworks available that the community uses today like CrewAI, LangChain, OpenAI Agent SDK, and Google Agent Development Kit (ADK).
While exploring options, I couldn't find a clear comparison of these frameworks, so I made one.
Each SDK brings something different to the table, from simplicity and flexibility to enterprise-grade scalability. In this blog, I’ll compare these four based on real parameters, using a simple and consistent example to highlight their strengths and trade-offs.
How We'll Compare These Agent Development Kits
The goal here is to help developers (like you and me) pick the right multi-agent framework.
We'll look at:
- Ease of use
- Model support
- Agent collaboration
- Scalability
To make it fair, we’ll use the same example across all frameworks: a system with three teaching agents (Math, Science, History) and one Principal agent, all powered by Gemini 2.0 Flash.
This way, we can actually see how each framework performs when solving the same problem.
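One practical note before we start: every implementation below reads a Gemini API key from the environment. Here's a minimal sketch of the shared setup I used; the package names and environment-variable names are based on each library's docs at the time of writing, so treat them as assumptions and verify against your versions:

```python
# Shared setup sketch (assumed package/env-var names; check your versions):
#   pip install langchain langchain-google-genai crewai "openai-agents[litellm]" google-adk python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # expects a .env file containing your Gemini API key

# LangChain's Gemini wrapper and Google ADK read GOOGLE_API_KEY;
# the LiteLLM-backed setups (CrewAI, OpenAI Agent SDK) read GEMINI_API_KEY.
assert os.getenv("GOOGLE_API_KEY") or os.getenv("GEMINI_API_KEY"), \
    "Set GOOGLE_API_KEY or GEMINI_API_KEY before running the examples"
```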
Exploring the Four Agent Development Kits (ADKs)
1. LangChain
LangChain is one of the most widely adopted frameworks for LLM apps, with over 110k GitHub stars. It’s known for its flexibility, community support, and massive ecosystem.
- It uses a chain-and-agent architecture: you combine tools, agents, and prompts to build complex logic.
- Multi-agent support is possible but requires manual coordination.
- It supports a wide range of models, vector stores, and retrieval strategies, making it great for both RAG pipelines and agent workflows.
If you want full control over your agent logic, LangChain is solid.
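To get a feel for the chain idea before the full example, here's a minimal sketch of my own (not part of the teaching-agents system) that composes a prompt, the Gemini model, and an output parser with LCEL's pipe operator:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI

# prompt -> model -> parser, composed into one runnable chain
prompt = ChatPromptTemplate.from_template("Explain {topic} in one sentence.")
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
chain = prompt | llm | StrOutputParser()

print(chain.invoke({"topic": "multi-agent systems"}))
```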
2. CrewAI
CrewAI is purpose-built for multi-agent systems.
- Agents are defined with roles and tasks, and they operate as a coordinated “crew.”
- It simplifies collaboration and delegation between agents.
- Used by companies like Oracle and Deloitte for enterprise AI tasks.
If your use case requires multiple agents working as a team, CrewAI feels very natural and productive.
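Here's a taste of the role/task model as a minimal single-agent sketch (my own illustration; the full multi-agent crew comes later in this post):

```python
from crewai import Agent, Task, Crew

# An agent is a role + goal + backstory; a task binds a prompt to an agent
tutor = Agent(role="Tutor",
              goal="Answer study questions clearly.",
              backstory="A patient teacher who loves simple explanations.")
question = Task(description="Answer this question: {topic}",
                expected_output="A short, clear answer.",
                agent=tutor)

crew = Crew(agents=[tutor], tasks=[question])
print(crew.kickoff(inputs={"topic": "What is gravity?"}).raw)
```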
3. OpenAI Agent SDK
OpenAI's Agent SDK is still evolving, and it does support models outside OpenAI's (via LiteLLM).
- Tightly integrated with OpenAI APIs and tool calling.
- Agents can automatically access tools or pass off tasks to other agents.
- Good for early experimentation and small-scale agent setups, though it may be limited outside the OpenAI ecosystem.
If you're building with OpenAI tools and want fast prototyping, this SDK can be a quick start.
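For reference, the smallest possible agent looks something like this sketch (with no model argument the SDK defaults to an OpenAI model, so this one assumes OPENAI_API_KEY is set):

```python
from agents import Agent, Runner

assistant = Agent(
    name="Assistant",
    instructions="You are a concise, helpful assistant.",
)

# Runner.run_sync is the synchronous counterpart of the async Runner.run
result = Runner.run_sync(assistant, "Summarize what an AI agent is in one line.")
print(result.final_output)
```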
4. Google Agent Development Kit (ADK)
Google ADK, launched in April 2025, is a newcomer but powerful if you're in the Google ecosystem.
- Built specifically for multi-agent setups.
- Has a built-in handoff system: agents can pass tasks to each other naturally.
- Native integration with Gemini models and Vertex AI, while supporting other models as well.
- Lacks the community size of LangChain or CrewAI, but it's growing, and I count myself among its new adopters.
If you want a clean developer experience with tight Gemini model integration, Google ADK is worth exploring.
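The handoff idea takes only a couple of lines of configuration; here's a tiny sketch of my own (the full teaching-agents version follows later):

```python
from google.adk.agents import Agent

# The parent lists children as sub_agents; ADK transfers the conversation to
# whichever child's description best matches the user's request.
greeter = Agent(name="Greeter", model="gemini-2.0-flash",
                description="Handles greetings.",
                instruction="Greet the user warmly.")
root = Agent(name="Root", model="gemini-2.0-flash",
             instruction="Delegate greetings to the Greeter agent.",
             sub_agents=[greeter])
```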
Now that we’ve got a sense of each framework, let’s implement the same example with each and compare how they differ in practice.
Example: Teaching Agents System
To make this comparison hands-on, we'll build a small system using each framework. The system has:
- Three Teaching Agents:
  - Math Teacher – answers math queries (e.g., "Solve 2x + 3 = 7")
  - Science Teacher – explains science concepts (e.g., "What is photosynthesis?")
  - History Teacher – responds to history questions (e.g., "What is the capital of France?")
- One Principal Agent:
  - Routes student questions to the appropriate teacher agent
- Model:
  - All agents use Gemini 2.0 Flash, a fast, multimodal model
This example stays consistent across frameworks, helping us focus purely on the developer experience and capability differences.
LangChain Implementation
```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import initialize_agent, Tool, AgentType

# Initialize Gemini 2.0 Flash LLM
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    convert_system_message_to_human=True,
    temperature=0.7
)

# Define Teaching Agent Tools
def ask_math_agent(query: str) -> str:
    prompt = f"You are a math teacher. Answer this math question: {query}"
    return llm.invoke(prompt).content

def ask_science_agent(query: str) -> str:
    prompt = f"You are a science teacher. Answer this science question: {query}"
    return llm.invoke(prompt).content

def ask_history_agent(query: str) -> str:
    prompt = f"You are a history teacher. Answer this history question: {query}"
    return llm.invoke(prompt).content

# Define Tools for Principal Agent
tools = [
    Tool(name="AskMath", func=ask_math_agent, description="Use for math questions."),
    Tool(name="AskScience", func=ask_science_agent, description="Use for science questions."),
    Tool(name="AskHistory", func=ask_history_agent, description="Use for history questions."),
]

# Initialize Principal Agent (Router)
principal_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True
)

# Run the System
student_query = "What is the capital of France?"
response = principal_agent.run(student_query)
print(response)

student_query = "What is the square root of 144? also explain the process."
response = principal_agent.run(student_query)
print(response)

student_query = "Who was the second president of the United States?"
response = principal_agent.run(student_query)
print(response)
```
LLM Initialization: We initialize Gemini 2.0 Flash using LangChain's `ChatGoogleGenerativeAI` wrapper. It serves as the core reasoning engine for all agents.

Teaching Agents (Tools): Instead of building separate agents, we wrap each teaching function (Math, Science, History) inside a tool: a callable function paired with a description. Each tool receives a query, adds a role-specific prompt (e.g., "You are a math teacher..."), and gets a response from the LLM.

Principal Agent (Router): LangChain's `initialize_agent()` creates a zero-shot ReAct agent. This agent reads the student's question, chooses the appropriate tool based on the tool descriptions, and delegates the question to it. For example, if the question is about "photosynthesis," it will select `AskScience`. (As an aside, newer LangChain releases deprecate `initialize_agent` in favor of LangGraph-based agents, so you may see a deprecation warning.)
```
> Entering new AgentExecutor chain...
I need to find out who the second president of the United States was.
Action: AskHistory
Action Input: Who was the second president of the United States?
Observation: Ah, a classic! The second president of the United States was **John Adams**. He served from 1797 to 1801. A fascinating and often overlooked figure in American history, Adams was a key player in the American Revolution and a staunch advocate for independence. However, his presidency was marked by controversy and challenges, particularly surrounding the Alien and Sedition Acts. Definitely a period worth further exploration!
Thought: I have the answer.
Final Answer: John Adams

> Finished chain.
John Adams
```
CrewAI Implementation
CrewAI follows a specific directory structure for its implementation, so we'll do the same as defined in the docs.
- agents.yaml
```yaml
math_teacher:
  role: "Math Teacher"
  goal: "Answer students' math questions clearly."
  backstory: >
    A passionate math educator skilled in algebra, calculus, and problem-solving.

science_teacher:
  role: "Science Teacher"
  goal: "Answer students' science questions clearly."
  backstory: >
    A curious scientist passionate about physics, chemistry, and biology.

history_teacher:
  role: "History Teacher"
  goal: "Answer students' history questions clearly."
  backstory: >
    A knowledgeable historian focused on events from ancient to modern times.

principal:
  role: "School Principal to route query to the right teacher"
  goal: "Route student queries to the appropriate teacher."
  backstory: >
    A wise principal who knows where each question belongs.
```
Agent Configuration (agents.yaml): Each teaching agent (Math, Science, History) and the Principal is defined with a clear role, goal, and backstory. This information guides each agent's behavior during execution. For instance, the `math_teacher` is framed as an algebra expert with a passion for problem-solving.
- tasks.yaml
```yaml
route_task:
  description: "Receive student query, i.e. {topic}, and delegate to correct teacher agent."
  expected_output: "The final answer from the most relevant teacher."
  agent: principal

math_task:
  description: "Answer the math question of {topic}."
  expected_output: "A detailed math answer."
  agent: math_teacher

science_task:
  description: "Answer the science question of {topic}."
  expected_output: "A clear science answer."
  agent: science_teacher

history_task:
  description: "Answer the history question of {topic}."
  expected_output: "A descriptive history answer."
  agent: history_teacher
```
Task Configuration (tasks.yaml): Each agent is assigned a task tailored to its subject expertise. The `route_task` is given to the Principal, who acts as the system's router: it receives the student query and determines which teaching agent should handle it.
- crew.py
```python
from dotenv import load_dotenv
from crewai import Agent, Crew, Process, Task, LLM
from crewai.project import CrewBase, agent, crew, task

# Load environment variables (including the Gemini API key) from a .env file
load_dotenv()

llm = LLM(
    model="gemini/gemini-2.0-flash",
    temperature=0.7,  # Adjust temperature for response variability
)

@CrewBase
class SchoolCrew:
    """Crew to route student queries to subject teachers"""

    @agent
    def principal(self) -> Agent:
        return Agent(
            config=self.agents_config['principal'],
            verbose=True,
            llm=llm,
        )

    @agent
    def math_teacher(self) -> Agent:
        return Agent(config=self.agents_config['math_teacher'], verbose=True, llm=llm)

    @agent
    def science_teacher(self) -> Agent:
        return Agent(config=self.agents_config['science_teacher'], verbose=True, llm=llm)

    @agent
    def history_teacher(self) -> Agent:
        return Agent(config=self.agents_config['history_teacher'], verbose=True, llm=llm)

    @task
    def route_task(self) -> Task:
        return Task(config=self.tasks_config['route_task'])

    @task
    def math_task(self) -> Task:
        return Task(config=self.tasks_config['math_task'])

    @task
    def science_task(self) -> Task:
        return Task(config=self.tasks_config['science_task'])

    @task
    def history_task(self) -> Task:
        return Task(config=self.tasks_config['history_task'])

    @crew
    def crew(self) -> Crew:
        return Crew(
            # Only the teacher agents go in the agents list;
            # the principal is passed separately as the manager agent.
            agents=[self.math_teacher(), self.science_teacher(), self.history_teacher()],
            manager_agent=self.principal(),
            tasks=[self.route_task(), self.math_task(), self.science_task(), self.history_task()],
            process=Process.hierarchical,  # Dynamic routing
            verbose=True
        )
```
System Setup (crew.py): Using the `CrewBase` class, agents and tasks are instantiated and grouped into a Crew. The `principal` acts as the manager agent, responsible for delegating tasks. The `Process.hierarchical` mode enables dynamic task routing based on the query context, mimicking decision-making in an actual school environment.
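For contrast, if you didn't need dynamic routing you could swap in a fixed-order process. This is a hypothetical variant of the crew() method above, not part of the blog's system: Process.sequential runs the listed tasks in order with no manager agent, which is exactly why the hierarchical process is the better fit for this routing use case:

```python
@crew
def crew(self) -> Crew:
    # Hypothetical sequential variant: no manager_agent; every listed task
    # runs in order regardless of the query's topic.
    return Crew(
        agents=[self.math_teacher(), self.science_teacher(), self.history_teacher()],
        tasks=[self.math_task(), self.science_task(), self.history_task()],
        process=Process.sequential,
        verbose=True,
    )
```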
- main.py
```python
import os
from crew import SchoolCrew

os.makedirs('output', exist_ok=True)

def run():
    inputs = {"topic": "What is the capital of France?"}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

    inputs = {"topic": "What is the square root of 144? also explain the process."}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

    inputs = {"topic": "Who was the first president of the United States?"}
    result = SchoolCrew().crew().kickoff(inputs=inputs)
    print("\n=== FINAL RESPONSE ===\n", result.raw)

if __name__ == "__main__":
    run()
```
Execution (main.py): The system is run with several student queries passed as the `topic` input. The crew evaluates each query, the principal agent routes it to the relevant teacher, and that teacher answers using Gemini 2.0 Flash.

Note: CrewAI's output is long because it logs every step of each agent's thinking, so I haven't included it here. You can always run it and check the output yourself.
OpenAI Agent SDK Implementation
```python
import asyncio
from agents import Agent, Runner

math_agent = Agent(
    name="MathAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a math expert. Provide clear and concise answers to math-related questions.",
)

science_agent = Agent(
    name="ScienceAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a science expert. Provide clear and concise answers to science-related questions.",
)

history_agent = Agent(
    name="HistoryAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a history expert. Provide clear and concise answers to history-related questions.",
)

principal_agent = Agent(
    name="TriggerAgent",
    model="litellm/gemini/gemini-2.0-flash",
    instructions="You are a trigger agent. You will delegate questions to the appropriate subject expert based on the topic.",
    handoffs=[math_agent, science_agent, history_agent]
)

async def main():
    result = await Runner.run(principal_agent, "What is the capital of France?")
    print(result.final_output)

    result = await Runner.run(principal_agent, "What is the square root of 144? also explain the process.")
    print(result.final_output)

    result = await Runner.run(principal_agent, "Who was the first president of the United States?")
    print(result.final_output)

asyncio.run(main())
```
Agent Creation with Specialization: Using OpenAI's Agent SDK, we define multiple subject-specific agents (Math, Science, History). Each agent is created with the `Agent` class and powered by Gemini 2.0 Flash (via LiteLLM). Every agent is assigned domain-specific instructions, guiding it to act as an expert in its field.

Central Delegator Agent: The TriggerAgent plays the role of a principal or router agent. It uses the same LLM but with a distinct instruction: identify the topic of the incoming question and hand off the task to the appropriate subject expert. This delegation logic is defined via the `handoffs` parameter in the agent configuration.

Execution with Runner: The `Runner.run()` function passes user queries to the TriggerAgent. Based on the query context, the agent determines which expert to forward the question to. The responses are collected and printed in an asynchronous interaction loop.

OpenAI Agent SDK's Edge: The SDK provides a clean, modular agent-composition interface. It abstracts the routing and execution mechanics, letting developers define multi-agent collaboration with simple Python classes and async runners, without manually managing LLM context or chaining logic.
```
Query: Who was the first president of India?
Dr. Rajendra Prasad was the first president of India.
```
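One small usage note: if you're not already inside an async context, the SDK also exposes a synchronous entry point. A sketch reusing principal_agent from the code above:

```python
# Runner.run_sync wraps the event loop for you
result = Runner.run_sync(principal_agent, "What is Newton's second law?")
print(result.final_output)
```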
Google ADK Implementation
```python
import asyncio
from google.adk.agents import Agent
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types  # For creating message Content/Parts

# Initialize Agents with Gemini 2.0 Flash
math_agent = Agent(
    name="MathTeacher",
    model="gemini-2.0-flash",
    description="A math teacher who answers math-related questions.",
    instruction="You are a helpful math teacher. You only answer math-related questions."
)

science_agent = Agent(
    name="ScienceTeacher",
    model="gemini-2.0-flash",
    description="A science teacher who answers science-related questions.",
    instruction="You are a helpful science teacher. You only answer science-related questions."
)

history_agent = Agent(
    name="HistoryTeacher",
    model="gemini-2.0-flash",
    description="A history teacher who answers history-related questions.",
    instruction="You are a helpful history teacher. You only answer history-related questions."
)

principal_agent = Agent(
    name="Principal",
    model="gemini-2.0-flash",
    description="A principal who manages the school system and delegates queries to appropriate teachers.",
    instruction="You are the principal of the school. You delegate questions to the appropriate teacher based on the subject.",
    sub_agents=[math_agent, science_agent, history_agent]
)

# Set Up Session Service
session_service = InMemorySessionService()
app_name = "SchoolSystem"
user_id = "Student_042"
session_id = "Session_001"

async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")
    content = types.Content(role='user', parts=[types.Part(text=query)])
    final_response_text = "Agent did not produce a final response."  # Default

    async for event in runner.run_async(user_id=user_id, session_id=session_id, new_message=content):
        print(f"  [Event] Author: {event.author}, Type: {type(event).__name__}, Final: {event.is_final_response()}, Content: {event.content}")
        if event.is_final_response():
            if event.content and event.content.parts:
                final_response_text = event.content.parts[0].text
            elif event.actions and event.actions.escalate:  # Handle potential errors/escalations
                final_response_text = f"Agent escalated: {event.error_message or 'No specific message.'}"
            break

    print(f"<<< Agent Response: {final_response_text}")

async def get_agent_response():
    session = await session_service.create_session(app_name=app_name, user_id=user_id, session_id=session_id)
    runner_agent_team = Runner(
        agent=principal_agent,
        app_name=app_name,
        session_service=session_service
    )
    print("Runner Agent Team initialized.")

    await call_agent_async("What is the Pythagorean theorem?", runner_agent_team, user_id, session_id)
    await call_agent_async("What is the chemical formula for water?", runner_agent_team, user_id, session_id)
    await call_agent_async("Who was the first president of the United States?", runner_agent_team, user_id, session_id)
    await call_agent_async("Which agents are available in the system?", runner_agent_team, user_id, session_id)

asyncio.run(get_agent_response())
```
Subject Agents Setup: We define three specialized agents (MathTeacher, ScienceTeacher, and HistoryTeacher) using Google's ADK. Each is initialized with Gemini 2.0 Flash and given clear role-based instructions. These agents only answer questions from their own subject domain, which helps maintain accuracy and specialization.

Principal Agent (Coordinator): The Principal agent acts like a school head. It doesn't answer questions itself; instead, it routes student queries to the right subject expert via the `sub_agents` configuration. This delegation is automatic and based on the query content.

Session Handling: Google ADK introduces a session-based memory system. Using `InMemorySessionService`, conversation state is preserved across multiple queries from the same user, making the interaction feel continuous and context-aware.

Runner Execution: The `Runner` object manages the overall flow. It delivers the user's question, collects streaming responses (via events), and prints the final reply. The interaction loop is asynchronous, and the ADK supports rich content handling and escalation when agents face uncertainty.

Google ADK's Strength: Google's ADK provides a structured, event-driven way to build agent teams. Its focus on modular design, session memory, and hierarchical agent coordination makes it well suited to scalable multi-agent applications that mimic real-world roles.
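Since state is keyed by app, user, and session IDs, giving a second student an independent conversation is just a matter of creating another session. A small sketch reusing the module above (the second student's IDs are hypothetical):

```python
# Hypothetical second student: a fresh session means fresh conversation state
async def second_student():
    await session_service.create_session(
        app_name=app_name, user_id="Student_043", session_id="Session_002"
    )
    runner = Runner(agent=principal_agent, app_name=app_name,
                    session_service=session_service)
    await call_agent_async("What is 7 times 8?", runner, "Student_043", "Session_002")
```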
```
>>> User Query: What is the Pythagorean theorem?
<<< Agent Response: The Pythagorean theorem states that in a right triangle, the square of the length of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the lengths of the other two sides. This can be written as:

a² + b² = c²

where:
* a and b are the lengths of the two shorter sides (legs) of the right triangle
* c is the length of the hypotenuse
```
Comparison Table
| Parameter | LangChain | CrewAI | OpenAI Agent SDK | Google ADK |
|---|---|---|---|---|
| Ease of Use | High-level APIs and many templates simplify development, but multi-agent setups still require manual orchestration | Very easy to bootstrap multi-agent workflows with minimal boilerplate | Lightweight primitives (Agents, Handoffs, Guardrails); straightforward to set up and debug | Clear sub-agent structure makes delegation intuitive; event-driven runner adds minimal complexity |
| Documentation Quality | Extensive guides, API refs, tutorials, and examples | Friendly and clear, though lacking depth in advanced scenarios | Concise, example-rich, focused on agent patterns and production use | Fresh, detailed, and beginner-friendly, especially around agent lifecycle and orchestration |
| Community Support | Massive: 110k+ GitHub stars, Discord, Slack, third-party content | Growing user base backed by Oracle/Deloitte; active but smaller | Quickly growing; broad help across Reddit, GitHub, and blog posts | Emerging community with strong Google Cloud backing, but not as mature as the others |
| Model Support | Native first-class support for OpenAI, Gemini, Anthropic, HuggingFace, etc. | Supports major LLMs including Gemini and OpenAI; good model flexibility | Designed for OpenAI models via the Chat API; extendable to other providers using LiteLLM | Built for the Gemini series with seamless Vertex AI integration; extendable to other providers using LiteLLM |
| Multi-Agent Capabilities | Tool-based multi-agent support via chains and routers; DIY orchestration | Crew metaphor simplifies role-based agent collaboration | Native support for handoffs makes delegation clean and traceable | Sub-agents and handoffs offer robust, built-in multi-agent coordination |
| Tool Integration | 50+ official integrations (APIs, DBs, scraping, computation, etc.) | Integrated via the LangChain backend; strong but slightly less extensive | Easy to add via Python functions; tool wrappers and guardrails supported | Full integration with Vertex AI tools, function calling, and cloud APIs |
| Performance | Generally responsive; depends on chain optimization | Optimized for multi-agent workflows; smooth in practice | Fast execution with built-in tracing; minimal overhead for basic use | Native Gemini plus cloud infra means low latency and high throughput |
| Scalability | Scales with LangSmith and distributed chains, but needs customization | Enterprise-ready orchestration; designed for scaling agent workflows | Built-in tracing; scalable with OpenAI infra; easy deployment | Cloud-native via Vertex AI; handles scale out of the box |
| Cost | Open source; pay only for model/API usage and hosting | Free core; API and enterprise features incur costs | Free SDK; pay for OpenAI model usage and any custom deployments | Free SDK; pay for the Gemini API (free tier available) and Vertex AI compute |
Code for all implementations is available here: GitHub
Conclusion
So, who’s the winner?
It depends on what you're working on.
LangChain is your all-purpose tool, perfect for coders who want flexibility and a huge community to back them up; great for startups or solo projects.
CrewAI is the multi-agent master, making it easy to build collaborative AI teams, especially for enterprise gigs.
OpenAI Agent SDK is sleek and fast for OpenAI fans, but using Gemini 2.0 Flash requires some DIY, which might slow you down.
Google ADK is the newcomer, ideal for Google Cloud users who want seamless Gemini integration and cloud scalability, though its community is still growing.
If you found this helpful, don’t forget to share and follow for more agent-powered insights. Got an idea or workflow in mind? Join the discussion in the comments or reach out on Twitter or LinkedIn.