How I Built an Agentic RAG Application to Brainstorm Conference Talk Ideas

TL;DR

I love speaking at technical conferences. But in order to get selected to speak at the event, you need to submit a strong talk proposal or abstract—one that clearly shows relevance, technical depth, and actionable takeaways for the audience attending your talk. A good abstract isn’t just about the idea itself; it needs to show why the topic matters right now and how the talk will benefit attendees. At the same time, you want to avoid repeating something that’s already been presented.

To solve this, I built an AI-powered agentic application that helps me ideate and draft compelling talk abstracts. It uses a research agent to do deep research on a topic—finding the latest trends, developments, and active discussions—and combines that with fast vector search using Couchbase over previous talks on the same subject from past conferences. In this case, the system is specifically designed for KubeCon, and in this post, I’ll walk you through how I built the full pipeline to create a conference talk brainstorming AI tool.

You can find the code for this project here.

Important Note 🚨

The agent's goal is only to provide a well-structured abstract idea. Don't copy the AI-generated abstract and submit it as is; use it as a reference and draft an original, handcrafted proposal.

Tech Stack

I used a mix of tools to build this project, each handling a different part of the process. Google ADK runs the AI agents, Couchbase stores data from past KubeCon talks and performs the vector search, a Nebius embedding model generates the embeddings, and LLMs served through Nebius (for example, Qwen) generate the summaries and talk abstracts.

Complete Pipeline Flow / Architecture Deep Dive

The system is built as a modular, multi-stage pipeline that combines historical data with real-time research to generate high-quality talk proposal ideas.

Step 1: URL Extraction / Data Collection (`extract_events.py`)

Purpose: Scrape and extract all available KubeCon talk URLs from official conference schedule pages.

# Save the KubeCon schedule HTML to a file, then run:
python extract_events.py < schedule.html

What it does:

Parses HTML content from stdin
Extracts all event URLs matching the pattern event/
Merges with existing URLs in event_urls.txt
Outputs the count of new URLs discovered

Output: event_urls.txt - Contains all unique talk URLs
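To make the extraction and merge steps concrete, here is a minimal sketch of what such a script could look like. It is not the project's actual `extract_events.py`: the regular expression, the helper names, and the assumption that schedule links appear as `href` attributes containing `event/` are all illustrative.

```python
import re
import sys
from pathlib import Path

# Assumed link shape: any href whose value contains "event/". Adjust to the real markup.
EVENT_HREF = re.compile(r'href=["\']([^"\']*event/[^"\']+)["\']')


def extract_event_urls(html: str) -> set[str]:
    """Return every distinct href in the HTML that contains 'event/'."""
    return set(EVENT_HREF.findall(html))


def merge_urls(new_urls: set[str], path: Path) -> int:
    """Merge new_urls into the file at path and return how many were not there before."""
    existing = set(path.read_text().split()) if path.exists() else set()
    added = new_urls - existing
    path.write_text("\n".join(sorted(existing | new_urls)) + "\n")
    return len(added)


def main() -> None:
    """Read schedule HTML from stdin and update event_urls.txt."""
    found = extract_event_urls(sys.stdin.read())
    count = merge_urls(found, Path("event_urls.txt"))
    print(f"{count} new URLs discovered")
```

Calling `main()` from a script entry point would reproduce the `python extract_events.py < schedule.html` usage shown above. Keeping the merge idempotent means the script can be re-run for every schedule page without creating duplicates.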

Step 2: Talk Data Crawling / Data Ingestion (`couchbase_utils.py`)

Purpose: Crawl each talk page, extract structured metadata (title, description, speakers, tags, etc.), and store it in Couchbase using well-defined document schemas.

python couchbase_utils.py

What it does:

Reads URLs from event_urls.txt
Uses AsyncWebCrawler to fetch talk pages in batches
Extracts structured data:
- Title
- Description
- Speaker(s)
- Category
- Date
- Location
Stores directly to Couchbase with document keys like talk_<event_id>

Features:

Batch processing (5 URLs at a time)
Error handling and retry logic
Progress tracking with success/failure counts
Automatic document key generation
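The batching, retry, and progress-tracking features can be sketched with plain `asyncio`. This is an illustration under stated assumptions, not the code in `couchbase_utils.py`: the `fetch` callable stands in for the AsyncWebCrawler call, the retry count is made up, and `doc_key` assumes URLs of the form `.../event/<event_id>/<slug>`.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

BATCH_SIZE = 5    # matches the batch size described above
MAX_RETRIES = 2   # assumed value, not taken from the project


def doc_key(url: str) -> str:
    """Build a key like talk_<event_id>, assuming URLs look like .../event/<event_id>/<slug>."""
    event_id = url.split("event/")[1].split("/")[0]
    return f"talk_{event_id}"


async def fetch_with_retry(url: str, fetch: Callable[[str], Awaitable[dict]]) -> dict:
    """Call fetch(url), retrying up to MAX_RETRIES extra times on any exception."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return await fetch(url)
        except Exception:
            if attempt == MAX_RETRIES:
                raise
            await asyncio.sleep(0.01 * (attempt + 1))  # tiny linear backoff for the sketch


async def crawl_all(
    urls: List[str], fetch: Callable[[str], Awaitable[dict]]
) -> Tuple[Dict[str, dict], List[str]]:
    """Crawl urls in batches; return the documents keyed by doc_key and the failed URLs."""
    docs: Dict[str, dict] = {}
    failed: List[str] = []
    for start in range(0, len(urls), BATCH_SIZE):
        batch = urls[start:start + BATCH_SIZE]
        results = await asyncio.gather(
            *(fetch_with_retry(u, fetch) for u in batch), return_exceptions=True
        )
        for url, result in zip(batch, results):
            if isinstance(result, Exception):
                failed.append(url)
            else:
                docs[doc_key(url)] = result
        print(f"processed {start + len(batch)}/{len(urls)}: {len(docs)} ok, {len(failed)} failed")
    return docs, failed
```

In a real run, each entry in `docs` would be written to Couchbase under its key instead of being collected in memory.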

Step 3: Embedding Generation (`embeddinggeneration.py`)

Purpose: Generate semantic vector embeddings from talk content (title + description + category) using the intfloat/e5-mistral-7b-instruct model from Nebius AI Studio, and store them back in Couchbase for fast vector search.

python embeddinggeneration.py

What it does:

Queries all documents from Couchbase
Combines title, description, and category into searchable text
Generates embeddings using intfloat/e5-mistral-7b-instruct model
Updates documents with embedding vectors
Enables vector search functionality

Model: Uses Nebius AI's embedding endpoint for high-quality vectors
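As a rough sketch of the "combine fields, embed, update" loop: the helper names below are hypothetical, and `embed` is a stand-in for the call to the Nebius embedding endpoint, injected so the logic can be shown without network access.

```python
from typing import Callable, Dict, List


def build_embedding_text(talk: Dict) -> str:
    """Join title, description, and category into one string, skipping empty fields."""
    parts = [(talk.get(field) or "").strip() for field in ("title", "description", "category")]
    return "\n".join(p for p in parts if p)


def add_embeddings(talks: Dict[str, Dict], embed: Callable[[str], List[float]]) -> int:
    """Attach an 'embedding' field to each talk that lacks one; return how many were updated."""
    updated = 0
    for talk in talks.values():
        if talk.get("embedding"):
            continue  # already embedded, so re-runs stay cheap
        text = build_embedding_text(talk)
        if not text:
            continue  # nothing to embed
        talk["embedding"] = embed(text)
        updated += 1
    return updated
```

In the actual pipeline the `talks` mapping would come from a Couchbase query and each updated document would be written back to the collection. Skipping documents that already carry an embedding is a design choice of this sketch, not something the original script is known to do.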

Step 4: Agent + RAG Application (`talk_suggestions_app.py`)

Purpose: The user inputs a rough topic idea via the Streamlit interface. The system runs both the research agent and vector search in parallel, then combines the outputs using a Nebius AI LLM to generate a unique, well-structured abstract with key takeaways.

streamlit run kubecon-talk-agent/talk_suggestions_app.py

Core Features:

On one side, the application performs a vector search through a database (Couchbase) of past KubeCon talks to understand what’s already been covered. On the other, it leverages a web research agent powered by Google ADK to gather the latest trends, technical developments, and community discussions around the topic. This leads into a three-stage generation process: the Research Phase, where the agent collects up-to-date context; the Retrieval Phase, where similar historical talks are surfaced; and the Synthesis Phase, where an LLM merges both streams into a compelling proposal.
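The three-stage flow can be outlined in a few lines. This is a sketch, assuming the research and retrieval steps are independent blocking calls that can run on two threads; the `brainstorm` function and its parameter names are hypothetical and are not taken from `talk_suggestions_app.py`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def brainstorm(
    topic: str,
    research: Callable[[str], str],
    retrieve: Callable[[str], List[Dict[str, Any]]],
    synthesize: Callable[[str, List[Dict[str, Any]], str], str],
) -> str:
    """Run research and retrieval concurrently, then hand both results to synthesis."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        research_future = pool.submit(research, topic)    # Research Phase
        retrieval_future = pool.submit(retrieve, topic)   # Retrieval Phase
        summary = research_future.result()
        similar_talks = retrieval_future.result()
    return synthesize(topic, similar_talks, summary)      # Synthesis Phase
```

With the functions shown later in this post, the three callables would roughly correspond to `run_adk_research`, `CouchbaseConnection.get_similar_talks`, and `generate_talk_suggestion`.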

Let's look a bit deeper into the three-stage process:

Research Agent Execution

I created a custom multi-agent research system using Google ADK (Agent Development Kit). This system is designed to autonomously research emerging trends across the CNCF ecosystem in real time from trusted sources.

Here's how it works under the hood:

Parallel Execution for Web Research

The first step involves spinning up multiple research agents that gather insights independently from different web sources. I use a ParallelAgent to run all of these at the same time:

ExaAgent: Leverages the Exa API to search for recent high-quality blogs, articles, and summaries published in the past 90 days.

TavilyAgent (optional): Pulls developer sentiment and discussion threads from platforms like Reddit, X (formerly Twitter), and Dev.to.

LinkupAgent (optional): Surfaces curated technical posts and deep dives from sites like GitHub and Hacker News.

Each of these tools is wrapped in its own LlmAgent, configured with dynamic instructions based on the user’s topic. Because they operate independently, they don’t interfere with one another and collectively reduce total response time.

Once all the raw data is collected, it is passed to a SummaryAgent, which synthesizes the results into a clean, structured summary using a large LLM (nebius/Qwen/Qwen3-235B-A22B).

Sequential Reasoning for Synthesis and Insight

Once all agents (ParallelAgent) complete their respective searches, I combine their outputs into a single structured flow.

Apart from the search agents, the rest of the pipeline (summarization and analysis) runs sequentially, managed by ADK’s SequentialAgent:

SummaryAgent: This agent synthesizes the raw research results into a cohesive, structured Markdown summary. It filters the results, highlights common themes, and stitches together the key insights from the research agents.

AnalysisAgent: This agent reviews the summary and delivers deeper insights including:

Key Trends – Major developments or patterns observed
Novel Angles – Unique viewpoints or underexplored ideas
Unanswered Questions – What the community is still trying to figure out
Contrarian Viewpoints – Active debates or non-mainstream takes

This sequential setup is intentional: the AnalysisAgent depends on the clean output from the SummaryAgent. Running them in parallel would reduce quality and coherence.
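The AnalysisAgent's instruction could be assembled from the four headings above. The sketch below is an assumption about how such a prompt might be built, not the project's actual prompt; the `final_summary` key matches the SummaryAgent's `output_key` in the pipeline code in this post, while `final_analysis` and the commented wiring are hypothetical.

```python
# Section names and hints taken from the list above.
ANALYSIS_SECTIONS = {
    "Key Trends": "Major developments or patterns observed",
    "Novel Angles": "Unique viewpoints or underexplored ideas",
    "Unanswered Questions": "What the community is still trying to figure out",
    "Contrarian Viewpoints": "Active debates or non-mainstream takes",
}


def build_analysis_instruction(summary_key: str = "final_summary") -> str:
    """Build an instruction asking for one Markdown section per analysis heading."""
    lines = [
        f"You are a research analyst. Review the summary stored under '{summary_key}'",
        "and respond in Markdown with exactly these sections:",
    ]
    lines += [f"## {name}\n{hint}" for name, hint in ANALYSIS_SECTIONS.items()]
    return "\n".join(lines)


# Hypothetical wiring, mirroring how the other LlmAgent instances in this post are declared:
# analysis_agent = LlmAgent(
#     name="AnalysisAgent",
#     model=nebius_base_model,
#     instruction=build_analysis_instruction(),
#     output_key="final_analysis",
# )
# It would then be listed after summary_agent in the SequentialAgent's sub_agents.
```

Placing the agent last in the sequence is what enforces the dependency: it only runs once the summary exists.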

The Orchestration Layer

The full pipeline is managed through ADK’s orchestration features:

ParallelAgent → for running web search agents
SequentialAgent → for dependent reasoning steps
Runner → to execute the pipeline
InMemorySessionService → for fast, in-memory session handling with no persistence

Here's a simplified breakdown of the pipeline:

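import os

from google.adk.agents import LlmAgent, ParallelAgent, SequentialAgent
from google.adk.models.lite_llm import LiteLlm
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# exa_search_ai is the Exa search tool function, defined elsewhere in the project.

def run_adk_research(topic: str) -> str: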
    # 1. Setup Models
    nebius_base_model = LiteLlm(model="nebius/Qwen/Qwen3-235B-A22B", api_key=os.getenv("NEBIUS_API_KEY"))

    # 2. Define Agents
    exa_agent = LlmAgent(
        name="ExaAgent",
        model=nebius_base_model,
        instruction=f"Use the exa_search_ai tool to fetch the latest news and developments about '{topic}'.",
        tools=[exa_search_ai],
        output_key="exa_results"
    )

    # 3. Summarize Results
    summary_agent = LlmAgent(
        name="SummaryAgent",
        model=nebius_base_model,
        instruction="""
            You are a meticulous research summarizer. Combine the results from 'exa_results' 
            into a cohesive markdown summary. Focus on trends, notable discussions, and 
            community sentiment.
        """,
        output_key="final_summary"
    )

    # 4. Execute Pipeline
    pipeline = SequentialAgent(
        name="AIPipelineAgent",
        sub_agents=[
            ParallelAgent(name="ParallelSearch", sub_agents=[exa_agent]),
            summary_agent
        ]
    )

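    # Note: runner.run() typically expects the session to exist already; creating it
    # on the session service is omitted in this simplified version.
    runner = Runner(agent=pipeline, app_name="adk_research_app", session_service=InMemorySessionService())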

    content = types.Content(role="user", parts=[types.Part(text=f"Start analysis for {topic}")])
    events = runner.run(user_id="streamlit_user", session_id="session_xyz", new_message=content)

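    # Each sub-agent can emit its own final response, so keep the last one,
    # which should come from the final agent in the sequence (SummaryAgent).
    final_text = None
    for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            final_text = event.content.parts[0].text

    return final_text or "Failed to generate summary."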

Retrieval Agent Execution

Once real-time research is complete, the system retrieves historical context from past KubeCon talks. This is done using Couchbase vector search, which lets us measure the semantic similarity between the user's idea and previous talk proposals.

What happens here?

We take the user’s query and generate an embedding using intfloat/e5-mistral-7b-instruct via Nebius' embedding API.

We then perform a vector search against a kubecontalks index in Couchbase that stores embeddings of historical talks.

Finally, we fetch the metadata (title, speaker, category, description) for the top matching talks.

This helps us:

Understand what’s already been covered.
Avoid duplicate proposals.
Borrow inspiration from successful submissions.

Here's sample code for this step:

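import os
from typing import Any, Dict, List

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions
from couchbase.search import MatchNoneQuery, SearchRequest
from couchbase.vector_search import VectorQuery, VectorSearch
from openai import OpenAI

class CouchbaseConnection: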
    def __init__(self):
        connection_string = os.getenv('CB_CONNECTION_STRING')
        username = os.getenv('CB_USERNAME')
        password = os.getenv('CB_PASSWORD')
        bucket_name = os.getenv('CB_BUCKET')
        collection_name = os.getenv('CB_COLLECTION')

        auth = PasswordAuthenticator(username, password)
        options = ClusterOptions(auth)
        self.cluster = Cluster(connection_string, options)
        self.bucket = self.cluster.bucket(bucket_name)
        self.scope = self.bucket.scope("_default")
        self.collection = self.bucket.collection(collection_name)
        self.search_index_name = os.getenv('CB_SEARCH_INDEX', "kubecontalks")

    def generate_embedding(self, text: str) -> List[float]:
        client = OpenAI(base_url=os.getenv("NEBIUS_API_BASE"), api_key=os.getenv("NEBIUS_API_KEY"))
        response = client.embeddings.create(
            model="intfloat/e5-mistral-7b-instruct",
            input=text,
            timeout=30
        )
        return response.data[0].embedding

    def get_similar_talks(self, query: str, num_results: int = 5) -> List[Dict[str, Any]]:
        embedding = self.generate_embedding(query)
        search_req = SearchRequest.create(MatchNoneQuery()).with_vector_search(
            VectorSearch.from_vector_query(
                VectorQuery("embedding", embedding, num_candidates=num_results)
            )
        )
        result = self.scope.search(self.search_index_name, search_req)
        rows = list(result.rows())

        similar_talks = []
        for row in rows:
            doc = self.collection.get(row.id)
            if doc and doc.value:
                talk = doc.value
                similar_talks.append({
                    "title": talk.get("title", "N/A"),
                    "description": talk.get("description", "N/A"),
                    "category": talk.get("category", "N/A"),
                    "speaker": talk.get("speaker", "N/A"),
                    "score": row.score
                })
        return similar_talks

The results of this phase are then passed into the final synthesis stage.
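To give a feel for what the `score` on each row represents, here is a small, self-contained illustration of ranking by cosine similarity. It is only an analogy: Couchbase computes scores inside the search index, and the metric it uses depends on how the index is configured. The function names and toy vectors are made up for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_talks(
    query_vec: Sequence[float],
    talks: List[Tuple[str, Sequence[float]]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top_k (title, score) pairs, most similar first."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in talks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

A talk whose embedding points in nearly the same direction as the query scores close to 1, which is a useful signal that the idea may already have been covered.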

Synthesis Phase

The final phase brings everything together: the user’s idea, the ADK-generated real-time insights, and the similar historical talks.

The goal is to produce a talk proposal idea that is:

Timely – aligned with current trends.
Unique – not duplicating past talks.
Actionable – with clear learning objectives and audience fit.

How does it work?

We use an LLM (Qwen/Qwen3-235B-A22B) to analyze:

User’s raw idea
Web analysis from the research agent
Historical KubeCon talks from vector search

We then ask the model to synthesize all of this into a structured format containing:

Title
Abstract
Key Learning Objectives
Target Audience
Why this talk is unique

def generate_talk_suggestion(query: str, similar_talks: List[Dict[str, Any]], adk_research: str) -> str:
    historical_context = "\n\n".join([
        f"Title: {talk['title']}\nDescription: {talk['description']}\nCategory: {talk['category']}"
        for talk in similar_talks
    ]) if similar_talks else "No similar talks found."

    prompt = f"""
You are an expert in cloud-native conference planning.

User's Idea:
{query}

PART 1: Historical Talks
{historical_context}

PART 2: Web Research
{adk_research}

Your task is to generate a fresh and compelling talk proposal. Follow this structure:

**Title:**  
*A catchy title that grabs attention.*

**Abstract:**  
*2–3 paragraphs outlining the core idea, approach, and takeaways.*

**Key Learning Objectives:**  
- Bullet 1  
- Bullet 2  
- Bullet 3  

**Target Audience:**  
*Beginner SREs? Advanced Platform Engineers?*

**Why This Talk is Unique:**  
*Explain how it differs from existing talks and addresses a fresh trend or gap.*
"""

    client = OpenAI(api_key=os.getenv("NEBIUS_API_KEY"), base_url=os.getenv("NEBIUS_API_BASE"))
    response = client.chat.completions.create(
        model="Qwen/Qwen3-235B-A22B",
        messages=[
            {"role": "system", "content": "You are a cloud-native conference program advisor."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7,
        max_tokens=2048
    )
    return response.choices[0].message.content

This is where the magic happens. The model takes a dual-context approach, combining fresh insights with past data, to recommend a proposal that is:

grounded in reality,
informed by what’s already been done, and
backed by real-world use cases.
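Because the prompt asks for fixed bold headings, the response can be split into fields for display or further checks. The parser below is a hypothetical helper, not part of the project, and it assumes the model follows the requested `**Heading:**` format, which is not guaranteed.

```python
import re
from typing import Dict

# Matches a bold heading such as "**Title:**" at the start of a line.
HEADING = re.compile(r"^\*\*([^*\n]+?):\*\*", re.MULTILINE)


def split_sections(markdown: str) -> Dict[str, str]:
    """Map each '**Heading:**' to the text that follows it, up to the next heading."""
    matches = list(HEADING.finditer(markdown))
    sections: Dict[str, str] = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections[match.group(1).strip()] = markdown[match.end():end].strip()
    return sections
```

If the model deviates from the format, the function simply returns fewer sections, so the caller can fall back to showing the raw text.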

Final thoughts

Building this made me realize that talk ideation is just another AI use case. Blending historical talk data with up-to-the-minute research cuts the time and effort spent gathering the latest information and tracking down previous talks on a topic.

AI agents help simplify tasks and can orchestrate complex workflows with ease.

Curious to try this for your own conference? Drop me a note—I’d love to hear your ideas and evolve this further with the community!

Shivay Lamba @shivaylamba