TL;DR
I love speaking at technical conferences. But in order to get selected to speak at the event, you need to submit a strong talk proposal or abstract—one that clearly shows relevance, technical depth, and actionable takeaways for the audience attending your talk. A good abstract isn’t just about the idea itself; it needs to show why the topic matters right now and how the talk will benefit attendees. At the same time, you want to avoid repeating something that’s already been presented.
To solve this, I built an AI-powered agentic application that helps me ideate and draft compelling talk abstracts. It uses a research agent to do deep research on a topic—finding the latest trends, developments, and active discussions—and combines that with fast vector search using Couchbase over previous talks on the same subject from past conferences. In this case, the system is specifically designed for KubeCon, and in this post, I’ll walk you through how I built the full pipeline to create a conference talk brainstorming AI tool.
You can find the code for this project here.
Important Note 🚨
The goal of the agent is only to provide a well-structured abstract idea. Don't copy the AI-generated abstract and submit it as-is. Instead, use it as a reference and draft an original, handcrafted proposal.
Tech Stack
I used a mix of tools to build this project, each handling a different part of the process. Google ADK runs the AI agents, Couchbase stores past KubeCon talk data and performs the vector search, a Nebius embedding model generates the embeddings, and Nebius-hosted LLMs (for example, Qwen) generate the summaries and talk abstracts.
Complete Pipeline Flow / Architecture Deep Dive
The system is built as a modular, multi-stage pipeline that combines historical data with real-time research to generate high-quality talk proposal ideas.
Step 1: URL Extraction / Data Collection (extract_events.py)
Purpose: Scrape and extract all available KubeCon talk URLs from official conference schedule pages.
# Save the KubeCon schedule HTML to a file, then run:
python extract_events.py < schedule.html
What it does:
- Parses HTML content from stdin
- Extracts all event URLs with pattern event/
- Merges with existing URLs in event_urls.txt
- Outputs the count of new URLs discovered
Output: event_urls.txt
- Contains all unique talk URLs
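To make the extract-and-merge behaviour concrete, here is a minimal sketch of the idea. It is not the actual extract_events.py: the regex, the function names, and the assumption that talk links appear as href attributes containing event/ are mine.

```python
import re
from pathlib import Path

# Assumption: talk links show up as href attributes whose value contains "event/".
EVENT_HREF = re.compile(r'href=["\']([^"\']*event/[^"\']+)["\']')


def extract_event_urls(html: str) -> set[str]:
    """Return every href value in the HTML that contains 'event/'."""
    return set(EVENT_HREF.findall(html))


def merge_urls(existing: set[str], found: set[str]) -> tuple[list[str], int]:
    """Merge found URLs into the existing ones; return (sorted URLs, count of new URLs)."""
    new_urls = found - existing
    return sorted(existing | found), len(new_urls)


def update_url_file(html: str, path: Path) -> int:
    """Merge URLs from the HTML into the file at `path` and return how many were new."""
    existing = set(path.read_text().split()) if path.exists() else set()
    merged, new_count = merge_urls(existing, extract_event_urls(html))
    path.write_text("\n".join(merged) + "\n")
    return new_count
```

With this sketch, calling update_url_file(sys.stdin.read(), Path("event_urls.txt")) would mirror the command shown above, and re-running it on the same schedule page would report zero new URLs.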
Step 2: Talk Data Crawling / Data Ingestion (couchbase_utils.py)
Purpose: Crawl each talk page, extract structured metadata (title, description, speakers, tags, etc.), and store it in Couchbase using well-defined document schemas.
python couchbase_utils.py
What it does:
- Reads URLs from event_urls.txt
- Uses AsyncWebCrawler to fetch talk pages in batches
- Extracts structured data:
  - Title
  - Description
  - Speaker(s)
  - Category
  - Date
  - Location
- Stores directly to Couchbase with document keys like talk_<event_id>
Features:
- Batch processing (5 URLs at a time)
- Error handling and retry logic
- Progress tracking with success/failure counts
- Automatic document key generation
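The batching, retry, and progress-tracking features can be sketched roughly as follows. This is an illustration rather than the code in couchbase_utils.py: the fetch callable stands in for the real crawler call, the retry count is arbitrary, and the key helper assumes the event ID is the path segment right after event/ in the URL.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Fetch = Callable[[str], Awaitable[Dict[str, Any]]]


def doc_key_from_url(url: str) -> str:
    """Build a 'talk_<event_id>' key (assumes the ID follows 'event/' in the URL)."""
    event_id = url.split("event/", 1)[1].split("/", 1)[0]
    return f"talk_{event_id}"


async def fetch_with_retry(fetch: Fetch, url: str, retries: int = 2) -> Dict[str, Any]:
    """Call fetch(url), retrying on any exception up to `retries` extra times."""
    for attempt in range(retries + 1):
        try:
            return await fetch(url)
        except Exception:
            if attempt == retries:
                raise
    raise RuntimeError("unreachable")


async def crawl_in_batches(
    urls: List[str], fetch: Fetch, batch_size: int = 5
) -> Tuple[Dict[str, Dict[str, Any]], List[str]]:
    """Fetch URLs in fixed-size batches; return (docs by document key, failed URLs)."""
    docs: Dict[str, Dict[str, Any]] = {}
    failed: List[str] = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        # URLs within one batch are fetched concurrently; failures come back as exceptions.
        results = await asyncio.gather(
            *(fetch_with_retry(fetch, u) for u in batch), return_exceptions=True
        )
        for url, result in zip(batch, results):
            if isinstance(result, Exception):
                failed.append(url)
            else:
                docs[doc_key_from_url(url)] = result
        print(f"Processed {start + len(batch)}/{len(urls)} URLs, {len(failed)} failed")
    return docs, failed
```

Keeping the batch size small bounds how many pages are requested at once, and collecting failures instead of raising means one bad page does not stop the whole crawl.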
Step 3: Embedding Generation (embeddinggeneration.py)
Purpose: Generate semantic vector embeddings from talk content (title + description + category) using the intfloat/e5-mistral-7b-instruct model from Nebius AI Studio, and store them back in Couchbase for fast vector search.
python embeddinggeneration.py
What it does:
- Queries all documents from Couchbase
- Combines title, description, and category into searchable text
- Generates embeddings using the intfloat/e5-mistral-7b-instruct model
- Updates documents with embedding vectors
- Enables vector search functionality
Model: Uses Nebius AI's embedding endpoint for high-quality vectors
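Here is a small sketch of the "combine fields, embed, write back" loop. It is an assumption-laden illustration, not embeddinggeneration.py itself: the embed argument is a placeholder for whatever function calls the Nebius embedding endpoint, and the docs dictionary stands in for the documents queried from Couchbase. The embedding field name matches the one used in the vector search code later in this post.

```python
from typing import Callable, Dict, List, Optional

Doc = Dict[str, Optional[str]]


def build_embedding_text(doc: Doc) -> str:
    """Join title, description, and category into one string, skipping empty fields."""
    fields = (doc.get("title"), doc.get("description"), doc.get("category"))
    return "\n".join(f.strip() for f in fields if f and f.strip())


def add_embeddings(
    docs: Dict[str, Doc], embed: Callable[[str], List[float]]
) -> Dict[str, dict]:
    """Return copies of the docs with an 'embedding' field; docs with no text are skipped."""
    updated: Dict[str, dict] = {}
    for key, doc in docs.items():
        text = build_embedding_text(doc)
        if text:
            # The original document is left untouched; the copy carries the vector.
            updated[key] = {**doc, "embedding": embed(text)}
    return updated
```

Passing the embedding function in as a parameter keeps the text-building logic easy to check without a network call.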
Step 4: Agent + RAG Application (talk_suggestions_app.py)
Purpose: The user inputs a rough topic idea via the Streamlit interface. The system runs both the research agent and vector search in parallel, then combines the outputs using a Nebius AI LLM to generate a unique, well-structured abstract with key takeaways.
streamlit run kubecon-talk-agent/talk_suggestions_app.py
Core Features:
On one side, the application performs a vector search through a database (Couchbase) of past KubeCon talks to understand what’s already been covered. On the other, it leverages a web research agent powered by Google ADK to gather the latest trends, technical developments, and community discussions around the topic. This leads into a three-stage generation process: the Research Phase, where the agent collects up-to-date context; the Retrieval Phase, where similar historical talks are surfaced; and the Synthesis Phase, where an LLM merges both streams into a compelling proposal.
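As a rough sketch of how the research and retrieval steps could run side by side before synthesis, assuming the two are independent of each other: the brainstorm function and its parameter names are hypothetical, and the thread pool is my choice for the illustration, not necessarily what the app uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def brainstorm(
    topic: str,
    research: Callable[[str], str],
    retrieve: Callable[[str], List[Dict[str, Any]]],
    synthesize: Callable[[str, List[Dict[str, Any]], str], str],
) -> str:
    """Run research and retrieval concurrently, then hand both results to synthesis."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        research_future = pool.submit(research, topic)
        retrieval_future = pool.submit(retrieve, topic)
        # result() blocks until that phase finishes and re-raises its exception, if any.
        summary = research_future.result()
        similar_talks = retrieval_future.result()
    return synthesize(topic, similar_talks, summary)
```

In terms of the functions shown later in this post, this would be called as brainstorm(topic, run_adk_research, CouchbaseConnection().get_similar_talks, generate_talk_suggestion).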
Let's look a bit deeper into the 3 step process:
Research Agent Execution
I created a custom multi-agent research system using Google ADK (Agent Development Kit). This system is designed to autonomously research emerging trends across the CNCF ecosystem in real time from trusted sources.
Here's how it works under the hood:
Parallel Execution for Web Research
The first step involves spinning up multiple research agents that gather insights independently from different web sources. I use a ParallelAgent to run all of these at the same time:
- ExaAgent: Leverages the Exa API to search for recent high-quality blogs, articles, and summaries published in the past 90 days.
- TavilyAgent (optional): Pulls developer sentiment and discussion threads from platforms like Reddit, X (formerly Twitter), and Dev.to.
- LinkupAgent (optional): Surfaces curated technical posts and deep-dives from sites like GitHub and Hacker News.
Each of these tools is wrapped in its own LlmAgent, configured with dynamic instructions based on the user’s topic. Because they operate independently, they don’t interfere with one another and collectively reduce total response time.
Once all the raw data is collected, it is passed to a SummaryAgent, which synthesizes the results into a clean, structured summary using an LLM (nebius/Qwen/Qwen3-235B-A22B).
Sequential Reasoning for Synthesis and Insight
Once the search agents inside the ParallelAgent complete their respective searches, I combine their outputs into a single structured flow.
Unlike the search agents, the remaining steps (summarization and analysis) run one after another, managed by ADK’s SequentialAgent:
- SummaryAgent: This agent synthesizes the raw research results into a cohesive, structured Markdown summary. It filters the results, highlights common themes, and stitches together the key insights from the research agents.
- AnalysisAgent: This agent reviews the summary and delivers deeper insights, including:
  - Key Trends – Major developments or patterns observed
  - Novel Angles – Unique viewpoints or underexplored ideas
  - Unanswered Questions – What the community is still trying to figure out
  - Contrarian Viewpoints – Active debates or non-mainstream takes
This sequential setup is intentional: the AnalysisAgent depends on the clean output from the SummaryAgent. Running them in parallel would reduce quality and coherence.
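The dependency order can be modelled in plain Python, independent of ADK. In this sketch the searchers fan out concurrently and each writes its result into a shared state under its own key, after which summary and analysis run strictly in order. The run_pipeline function and the analysis key are hypothetical names for illustration; exa_results and final_summary mirror the output keys used in the ADK code in this post.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Searcher = Callable[[str], Awaitable[str]]


async def run_pipeline(
    topic: str,
    searchers: Dict[str, Searcher],
    summarize: Callable[[Dict[str, Any]], Awaitable[str]],
    analyze: Callable[[str], Awaitable[str]],
) -> Dict[str, Any]:
    """Fan out to searchers concurrently, then run summary and analysis in order."""
    state: Dict[str, Any] = {"topic": topic}
    # Parallel stage: every searcher runs at once and stores its output under its key.
    results = await asyncio.gather(*(fn(topic) for fn in searchers.values()))
    state.update(zip(searchers.keys(), results))
    # Sequential stage: analysis only starts once the summary exists.
    state["final_summary"] = await summarize(state)
    state["analysis"] = await analyze(state["final_summary"])
    return state
```

Because the analysis step receives the finished summary as its input, it can never start from partial or unsummarized search results.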
The Orchestration Layer
The full pipeline is managed through ADK’s orchestration features:
- ParallelAgent → for running web search agents
- SequentialAgent → for dependent reasoning steps
- Runner → to execute the pipeline
- InMemorySessionService → for fast, stateless execution
Here's a simplified breakdown of the pipeline:
import os

from google.adk.agents import LlmAgent, ParallelAgent, SequentialAgent
from google.adk.models.lite_llm import LiteLlm
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# exa_search_ai is the Exa search tool function, defined elsewhere in the project.


def run_adk_research(topic: str) -> str:
    # 1. Setup Models
    nebius_base_model = LiteLlm(model="nebius/Qwen/Qwen3-235B-A22B", api_key=os.getenv("NEBIUS_API_KEY"))

    # 2. Define Agents
    exa_agent = LlmAgent(
        name="ExaAgent",
        model=nebius_base_model,
        instruction=f"Use the exa_search_ai tool to fetch the latest news and developments about '{topic}'.",
        tools=[exa_search_ai],
        output_key="exa_results"
    )

    # 3. Summarize Results
    summary_agent = LlmAgent(
        name="SummaryAgent",
        model=nebius_base_model,
        instruction="""
        You are a meticulous research summarizer. Combine the results from 'exa_results'
        into a cohesive markdown summary. Focus on trends, notable discussions, and
        community sentiment.
        """,
        output_key="final_summary"
    )

    # 4. Execute Pipeline: parallel search first, then the summary step
    pipeline = SequentialAgent(
        name="AIPipelineAgent",
        sub_agents=[
            ParallelAgent(name="ParallelSearch", sub_agents=[exa_agent]),
            summary_agent
        ]
    )
    runner = Runner(agent=pipeline, app_name="adk_research_app", session_service=InMemorySessionService())
    # Note: ADK expects this session to exist already (created via the session
    # service); that setup is omitted from this simplified version.
    content = types.Content(role="user", parts=[types.Part(text=f"Start analysis for {topic}")])
    events = runner.run(user_id="streamlit_user", session_id="session_xyz", new_message=content)
    for event in events:
        # Return the text of the first final response, if it carries any content
        if event.is_final_response() and event.content and event.content.parts:
            return event.content.parts[0].text
    return "Failed to generate summary."
Retrieval Agent Execution
Once real-time research is complete, the system retrieves historical context from past KubeCon talks. It uses Couchbase vector search to measure how semantically similar the user's idea is to previous talks.
What happens here?
We take the user’s query and generate an embedding using intfloat/e5-mistral-7b-instruct via Nebius' embedding API.
We then perform a vector search against a kubecontalks index in Couchbase that stores embeddings of historical talks.
Finally, we fetch the metadata (title, speaker, category, description) for the top matching talks.
This helps us:
- Understand what’s already been covered.
- Avoid duplicate proposals.
- Borrow inspiration from successful submissions
Here's the sample code for the same:
import os
from typing import Any, Dict, List

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from couchbase.search import MatchNoneQuery, SearchRequest
from couchbase.vector_search import VectorQuery, VectorSearch
from openai import OpenAI


class CouchbaseConnection:
    def __init__(self):
        # Connection settings come from environment variables
        connection_string = os.getenv('CB_CONNECTION_STRING')
        username = os.getenv('CB_USERNAME')
        password = os.getenv('CB_PASSWORD')
        bucket_name = os.getenv('CB_BUCKET')
        collection_name = os.getenv('CB_COLLECTION')
        auth = PasswordAuthenticator(username, password)
        options = ClusterOptions(auth)
        self.cluster = Cluster(connection_string, options)
        self.bucket = self.cluster.bucket(bucket_name)
        self.scope = self.bucket.scope("_default")
        self.collection = self.bucket.collection(collection_name)
        self.search_index_name = os.getenv('CB_SEARCH_INDEX', "kubecontalks")

    def generate_embedding(self, text: str) -> List[float]:
        # Nebius exposes an OpenAI-compatible endpoint, so the OpenAI client is reused
        client = OpenAI(base_url=os.getenv("NEBIUS_API_BASE"), api_key=os.getenv("NEBIUS_API_KEY"))
        response = client.embeddings.create(
            model="intfloat/e5-mistral-7b-instruct",
            input=text,
            timeout=30
        )
        return response.data[0].embedding

    def get_similar_talks(self, query: str, num_results: int = 5) -> List[Dict[str, Any]]:
        embedding = self.generate_embedding(query)
        # Pure vector search: MatchNoneQuery means no text query contributes matches
        search_req = SearchRequest.create(MatchNoneQuery()).with_vector_search(
            VectorSearch.from_vector_query(
                VectorQuery("embedding", embedding, num_candidates=num_results)
            )
        )
        result = self.scope.search(self.search_index_name, search_req)
        rows = list(result.rows())
        similar_talks = []
        for row in rows:
            # Fetch the full document for each hit to get its metadata
            doc = self.collection.get(row.id)
            if doc and doc.value:
                talk = doc.value
                similar_talks.append({
                    "title": talk.get("title", "N/A"),
                    "description": talk.get("description", "N/A"),
                    "category": talk.get("category", "N/A"),
                    "speaker": talk.get("speaker", "N/A"),
                    "score": row.score
                })
        return similar_talks
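For intuition about what the vector search is ranking, here is a brute-force sketch of nearest-neighbour lookup over a handful of in-memory vectors. Couchbase does this through its search index rather than a linear scan, and the similarity metric depends on how the index is configured; cosine similarity is assumed here purely for illustration, and the function names are mine.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], talks: Dict[str, List[float]], k: int = 5) -> List[Tuple[str, float]]:
    """Score every talk against the query and return the k best (key, score) pairs."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in talks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A high score for an existing talk is a signal that the idea may already be well covered, which is exactly what the duplicate check relies on.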
The results of this phase are then passed into the final synthesis stage.
Synthesis Phase
The final phase brings everything together: the user’s idea, the ADK-generated real-time insights, and the similar historical talks.
The goal is to produce a talk proposal idea that is:
- Timely – aligned with current trends.
- Unique – not duplicating past talks.
- Actionable – with clear learning objectives and audience fit.
How does it work?
We use an LLM (Qwen/Qwen3-235B-A22B) to analyze:
- User’s raw idea
- Web analysis from the research agent
- Historical KubeCon talks from vector search
We then ask the model to synthesize all of this into a structured format containing:
- Title
- Abstract
- Key Learning Objectives
- Target Audience
- Why this talk is unique
import os
from typing import Any, Dict, List

from openai import OpenAI


def generate_talk_suggestion(query: str, similar_talks: List[Dict[str, Any]], adk_research: str) -> str:
    # Flatten the retrieved talks into a plain-text block for the prompt
    historical_context = "\n\n".join([
        f"Title: {talk['title']}\nDescription: {talk['description']}\nCategory: {talk['category']}"
        for talk in similar_talks
    ]) if similar_talks else "No similar talks found."

    prompt = f"""
You are an expert in cloud-native conference planning.
User's Idea:
{query}
PART 1: Historical Talks
{historical_context}
PART 2: Web Research
{adk_research}
Your task is to generate a fresh and compelling talk proposal. Follow this structure:
**Title:**
*A catchy title that grabs attention.*
**Abstract:**
*2–3 paragraphs outlining the core idea, approach, and takeaways.*
**Key Learning Objectives:**
- Bullet 1
- Bullet 2
- Bullet 3
**Target Audience:**
*Beginner SREs? Advanced Platform Engineers?*
**Why This Talk is Unique:**
*Explain how it differs from existing talks and addresses a fresh trend or gap.*
"""

    # Nebius exposes an OpenAI-compatible endpoint, so the OpenAI client is reused
    client = OpenAI(api_key=os.getenv("NEBIUS_API_KEY"), base_url=os.getenv("NEBIUS_API_BASE"))
    response = client.chat.completions.create(
        model="Qwen/Qwen3-235B-A22B",
        messages=[
            {"role": "system", "content": "You are a cloud-native conference program advisor."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7,
        max_tokens=2048
    )
    return response.choices[0].message.content
This is where the magic happens. The model takes a dual-context approach, combining fresh insights with past data, to recommend a proposal that is:
- grounded in reality,
- informed by what’s already been done, and
- backed by real-world use cases.
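Since the prompt asks for a fixed set of sections, one optional sanity check is to confirm the model's reply actually contains them before showing it. This helper is not part of the project as described; it is a hypothetical sketch that assumes the reply keeps the bold "Heading:" markers from the prompt.

```python
import re
from typing import List

# Section names taken from the prompt structure shown above.
EXPECTED_SECTIONS = [
    "Title",
    "Abstract",
    "Key Learning Objectives",
    "Target Audience",
    "Why This Talk is Unique",
]


def missing_sections(proposal: str) -> List[str]:
    """Return the expected headings that do not appear as '**Heading:**' in the text."""
    found = {name.strip().lower() for name in re.findall(r"\*\*(.+?):\*\*", proposal)}
    return [section for section in EXPECTED_SECTIONS if section.lower() not in found]
```

An empty list means every section is present; anything else could trigger a retry or a warning in the UI.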
Final thoughts
Building this made me realize that talk ideation is just another AI use case. Blending historical talk data with up-to-the-minute research cuts the time and effort spent gathering the latest information and tracking down previous talks on the topic.
AI Agents help simplify tasks and can orchestrate complex workflows with ease.
Curious to try this for your own conference? Drop me a note—I’d love to hear your ideas and evolve this further with the community!