TechMentor Voice is the first Domain Expert Voice Agent specifically designed for developers. It's a real-time AI voice assistant that provides instant, accurate programming help through natural conversation - transforming how developers access documentation and solve technical problems.
// Just speak naturally to get instant help:
🎤 "How do I implement authentication in Next.js 14?"
🤖 "For Next.js 14 authentication, I recommend using NextAuth.js v5..."
🎤 "Show me React Server Components best practices"
🤖 "React Server Components should be used for data fetching..."
Domain Expert Features
Voice-Optimized Technical Knowledge:
Live Documentation Integration: Real-time access to up-to-date library docs via Context7 MCP
Intelligent Context Understanding: Automatically detects frameworks, libraries, and technical concepts
Code-Aware Responses: Provides practical examples with proper syntax highlighting
Natural Developer Conversations: Optimized for technical terminology and programming concepts
Technical Knowledge Access:
Access to extensive documentation via Context7 MCP
Real-time retrieval from thousands of libraries and frameworks
Covers popular technologies like React, Python, databases, and more
Automatically detects relevant documentation based on your questions
Why Voice for Developer Documentation?
Traditional documentation browsing interrupts the coding flow. Developers constantly switch between IDE, browser, and Stack Overflow. TechMentor Voice eliminates this friction by providing instant, spoken answers while you code - like having a senior developer pair programming with you.
Demo
Live Demo:
Sample Voice Interactions
# Real conversations with TechMentor Voice:
🎤 "How do I optimize my Next.js app for production?"
🤖 "For Next.js production optimization, focus on these key areas:
First, enable static generation where possible using
generateStaticParams. Second, implement proper image
optimization with next/image component..."
🎤 "Debug TypeScript interface inheritance issues"
🤖 "TypeScript interface inheritance issues often stem from
property conflicts. Use intersection types with & operator
for combining interfaces, or extend with proper overrides..."
🎤 "Best practices for React useEffect cleanup"
🤖 "useEffect cleanup prevents memory leaks. Return a cleanup
function for subscriptions, timers, and event listeners.
Here's the pattern: useEffect(() => { const subscription =
subscribe(); return () => subscription.unsubscribe(); }, []);"
Real-time AI voice assistant for developers - Built for the AssemblyAI Voice Agents Challenge using Universal-Streaming, Context7 MCP, and Gemini 2.0 Flash.
What I Built
TechMentor Voice is the first voice-driven documentation assistant that provides instant, accurate programming help through natural conversation. Ask any technical question and get real-time answers with current documentation and code examples.
Key Features
Ultra-Fast Voice Input: AssemblyAI Universal-Streaming with 300ms latency
Live Documentation: Context7 MCP integration for up-to-date library docs
Smart AI Processing: Gemini 2.0 Flash for accurate, conversational responses
Premium Voice Output: ElevenLabs TTS with Web Speech fallback
Real-Time Performance: End-to-end latency under 1 second
Beautiful UI: Modern, responsive design with live transcription
components/ConversationHistory.tsx - Chat history display
Technical Implementation & AssemblyAI Integration
AssemblyAI Universal-Streaming: The Voice Foundation
The core of TechMentor Voice leverages AssemblyAI's Universal-Streaming v3 for ultra-low latency voice processing, specifically optimized for technical conversations.
// Real-time WebSocket connection to Universal-Streaming v3
const wsUrl = `wss://streaming.assemblyai.com/v3/ws?api_key=${apiKey}`;
const ws = new WebSocket(wsUrl);

// Configure for optimal voice agent performance
const config = {
  type: 'configure',
  format_turns: true,                     // Enhanced turn detection
  punctuate: true,                        // Automatic punctuation
  end_utterance_silence_threshold: 1500,  // Smart endpointing
  voice_activity_detection: true          // Advanced VAD
};

// Process immutable transcripts with intelligent turn detection
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  // Critical: Prevent audio feedback loops
  if (isAISpeakingRef.current) {
    console.log('Ignoring transcript - AI is speaking');
    return;
  }

  // Optional chaining guards against messages that carry no transcript
  if (data.end_of_turn && data.transcript?.trim()) {
    // Process complete developer questions
    processVoiceQuery(data.transcript);
  }
};
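The turn check in the handler above can also be pulled out into a pure function, which makes it testable without a live socket. The sketch below relies only on the two fields the handler already reads (`transcript` and `end_of_turn`). `TurnMessage` and `extractCompletedTurn` are hypothetical names, and the real message shape should be confirmed against the Universal-Streaming documentation.

```typescript
// Hypothetical message shape: only the two fields used by the handler above.
interface TurnMessage {
  transcript?: unknown;
  end_of_turn?: unknown;
}

// Returns the finished question, or null when the frame should be ignored.
function extractCompletedTurn(raw: string, aiIsSpeaking: boolean): string | null {
  // Drop everything heard while the assistant itself is talking.
  if (aiIsSpeaking) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // malformed frame: ignore it instead of crashing the handler
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const message = parsed as TurnMessage;
  const text = typeof message.transcript === "string" ? message.transcript.trim() : "";

  // Only a finished turn with non-empty text counts as a question.
  return message.end_of_turn === true && text.length > 0 ? text : null;
}
```

Inside `ws.onmessage`, the result can be passed straight to `processVoiceQuery` whenever it is not null.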
Smart Audio State Management
Critical Innovation: Preventing infinite feedback loops between AI speech and microphone input.
// Audio feedback prevention system
const isAISpeakingRef = useRef(false);

const speakResponse = async (text: string) => {
  console.log('Starting AI response');
  isAISpeakingRef.current = true;

  // CRITICAL: Stop listening while AI speaks
  await stopMicrophoneTemporarily();

  try {
    // ElevenLabs TTS with proper cleanup
    const audioBlob = await generateSpeech(text);
    await playAudioWithCallback(audioBlob);
  } finally {
    // Resume listening after AI finishes
    isAISpeakingRef.current = false;
    setTimeout(resumeListening, 500); // Prevent echo
  }
};

// WebSocket message filtering during AI speech
ws.onmessage = (event) => {
  if (isAISpeakingRef.current) return; // Feedback protection
  processTranscript(event.data);
};
Context7 MCP Integration: Live Documentation
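The same idea can be expressed as a small state object that folds the speaking flag and the 500 ms resume delay into one check. This is a sketch only: `SpeechGate` is a hypothetical helper, not part of the project, and the clock is injected so the timing can be exercised without real delays.

```typescript
// Hypothetical helper: decides whether microphone input should be accepted.
class SpeechGate {
  private speaking = false;
  private reopenAt = 0;
  private readonly guardMs: number;
  private readonly now: () => number;

  constructor(guardMs = 500, now: () => number = () => Date.now()) {
    this.guardMs = guardMs;
    this.now = now;
  }

  // Call right before TTS playback starts.
  beginSpeaking(): void {
    this.speaking = true;
  }

  // Call when playback ends; input stays blocked for guardMs to let echo die out.
  endSpeaking(): void {
    this.speaking = false;
    this.reopenAt = this.now() + this.guardMs;
  }

  // True only when the assistant is silent and the guard window has passed.
  acceptsInput(): boolean {
    return !this.speaking && this.now() >= this.reopenAt;
  }
}
```

A handler would then test `gate.acceptsInput()` before forwarding audio or transcripts, in place of reading the ref directly.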
Domain Expertise comes from real-time documentation retrieval using Context7's Model Context Protocol.
// Smart library detection and documentation retrieval
async function getRelevantDocumentation(query: string) {
  // 1. Detect frameworks/libraries from voice query
  const detectedLibraries = extractTechnicalTerms(query);

  // 2. Query Context7 MCP for live documentation
  const mcpResponse = await fetch('https://mcp.context7.com/mcp', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: 1, // JSON-RPC requests need an id to receive a response
      method: 'tools/call',
      params: {
        name: 'get-library-docs',
        arguments: {
          context7CompatibleLibraryID: detectedLibraries[0],
          tokens: 3000,
          topic: extractTechnicalTopic(query)
        }
      }
    })
  });
  // Assumes the endpoint answers with a plain JSON body
  const documentation = await mcpResponse.json();

  // 3. Score and rank documentation chunks
  return scoreDocumentRelevance(query, documentation);
}

// Technical term extraction optimized for voice
function extractTechnicalTerms(voiceQuery: string): string[] {
  const techPatterns: Record<string, RegExp> = {
    'next.js': /\b(next\.?js|nextjs)\b/i,
    'react': /\breact\b/i,
    'typescript': /\b(typescript|ts)\b/i,
    'node.js': /\b(node\.?js|nodejs)\b/i,
    'python': /\bpython\b/i
  };
  return Object.keys(techPatterns).filter(lib => techPatterns[lib].test(voiceQuery));
}
Gemini 2.0 Flash: Voice-Optimized AI Processing
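The snippet above calls `scoreDocumentRelevance` without showing it. As a stand-in, here is one simple way such a ranking step could work: score each documentation chunk by the fraction of distinct query terms it contains. `rankChunks`, `tokenize`, and `ScoredChunk` are hypothetical names, and term overlap is an assumed technique, not necessarily what the project uses.

```typescript
interface ScoredChunk {
  chunk: string;
  score: number; // fraction of distinct query terms found in the chunk (0..1)
}

// Lowercase tokens; keeps characters common in tech names (".", "#", "+").
function tokenize(text: string): string[] {
  const rawTokens = text.toLowerCase().match(/[a-z0-9.#+]+/g) ?? [];
  return rawTokens
    .map((token) => token.replace(/^\.+|\.+$/g, "")) // drop sentence punctuation
    .filter((token) => token.length > 0);
}

// Hypothetical ranking: keep the topK chunks that share terms with the query.
function rankChunks(query: string, chunks: string[], topK = 3): ScoredChunk[] {
  const queryTerms = new Set(tokenize(query));
  if (queryTerms.size === 0) return [];

  return chunks
    .map((chunk) => {
      const chunkTerms = new Set(tokenize(chunk));
      let hits = 0;
      queryTerms.forEach((term) => {
        if (chunkTerms.has(term)) hits += 1;
      });
      return { chunk, score: hits / queryTerms.size };
    })
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Plain term overlap ignores synonyms and word order, so it is best read as a baseline that an embedding-based scorer could replace later.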
Domain Expert System Prompt specifically designed for technical conversations:
const DOMAIN_EXPERT_PROMPT = `
You are TechMentor Voice, a specialized AI assistant for developers.
EXPERTISE AREAS:
- Modern JavaScript/TypeScript development
- React, Next.js, Node.js ecosystems
- Python, Django, FastAPI backends
- Database design and optimization
- DevOps, Docker, Kubernetes
- Cloud platforms (AWS, Vercel, Cloudflare)
VOICE-OPTIMIZED RESPONSES:
1. **Conversational**: Speak naturally as if pair programming
2. **Concise**: 100-200 words maximum for voice delivery
3. **Practical**: Include actionable code examples
4. **Current**: Focus on modern best practices
5. **Structured**: Clear transitions between concepts
TECHNICAL RESPONSE FORMAT:
- Start with direct answer
- Provide brief code example if relevant
- Explain reasoning behind recommendations
- Suggest next steps or related concepts
Remember: Users are SPEAKING to you and will HEAR your response.
Make it conversational yet technically accurate.
`;
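The 100-200 word limit in the prompt is only a request to the model, so a reply can still run long. A small guard before text-to-speech can enforce the budget. The sketch below is an assumption about how that could be done: `limitForSpeech` is a hypothetical helper that cuts at the last sentence fitting within the budget, and it normalizes whitespace as a side effect.

```typescript
// Hypothetical safety net: keep a reply within a word budget before TTS.
function limitForSpeech(reply: string, maxWords = 200): string {
  const words = reply.trim().split(/\s+/).filter((word) => word.length > 0);
  if (words.length <= maxWords) return words.join(" ");

  // Walk back from the budget to the last word that ends a sentence.
  let cut = maxWords; // fall back to a hard cut when no sentence end is found
  for (let i = maxWords - 1; i >= 0; i--) {
    if (/[.!?]$/.test(words[i] ?? "")) {
      cut = i + 1;
      break;
    }
  }
  return words.slice(0, cut).join(" ");
}
```

Cutting at a sentence boundary keeps the spoken answer from stopping mid-thought, at the cost of sometimes using fewer words than the budget allows.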
Advanced Web Audio Processing
High-Quality Audio Pipeline for professional developer interactions:
// Professional audio configuration for clear technical discussions
const audioConfig = {
  sampleRate: 16000,        // Optimal for speech recognition
  channelCount: 1,          // Mono for efficiency
  echoCancellation: true,   // Prevent feedback
  noiseSuppression: true,   // Clear technical terms
  autoGainControl: true     // Consistent volume
};

// Real-time PCM16 conversion for Universal-Streaming
const convertFloat32ToPCM16 = (float32Array: Float32Array): ArrayBuffer => {
  const pcm16Array = new Int16Array(float32Array.length);
  for (let i = 0; i < float32Array.length; i++) {
    pcm16Array[i] = Math.max(-32768, Math.min(32767, float32Array[i] * 32768));
  }
  return pcm16Array.buffer;
};

// Audio processing with technical term optimization
processorRef.current.onaudioprocess = (event) => {
  if (wsRef.current?.readyState === WebSocket.OPEN && !isAISpeakingRef.current) {
    const inputData = event.inputBuffer.getChannelData(0);
    const pcmData = convertFloat32ToPCM16(inputData);
    wsRef.current.send(pcmData); // Send to AssemblyAI
  }
};
Performance Optimizations
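Browsers often capture audio at 44.1 or 48 kHz even when 16 kHz is requested, so the samples may need resampling before the PCM16 conversion above. The sketch below assumes that situation and uses simple block averaging. `downsample` is a hypothetical helper, and a production pipeline might prefer a proper low-pass filter or an `AudioContext` created at the target rate.

```typescript
// Hypothetical helper: reduce the sample rate by averaging blocks of samples.
function downsample(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  // Nothing to do when the source is already at or below the target rate.
  if (toRate >= fromRate) return input;

  const ratio = fromRate / toRate;
  const outLength = Math.floor(input.length / ratio);
  const output = new Float32Array(outLength);

  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));

    // Averaging the block acts as a crude low-pass filter against aliasing.
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j] ?? 0;
    output[i] = end > start ? sum / (end - start) : 0;
  }
  return output;
}
```

In the `onaudioprocess` callback, the input would pass through `downsample(inputData, audioContext.sampleRate, 16000)` before `convertFloat32ToPCM16`, where `audioContext` stands for whatever context the app created.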
Sub-Second Response Pipeline achieved through:
// Parallel processing for minimal latency
async function processVoiceQuery(transcript: string) {
  const startTime = Date.now();

  // allSettled keeps a failed documentation lookup from rejecting the whole query
  const [contextResult] = await Promise.allSettled([
    getRelevantDocumentation(transcript), // ~200ms
    // Pre-warm Gemini connection during context fetch
  ]);
  // Unwrap the settled result; null means "answer from general knowledge"
  const context = contextResult.status === 'fulfilled' ? contextResult.value : null;
  const contextTime = Date.now() - startTime;

  // Process with Gemini using retrieved context
  const aiResponse = await processWithGemini(transcript, context);
  const totalTime = Date.now() - startTime;

  // Performance logging for optimization
  console.log(`Context: ${contextTime}ms, total processing: ${totalTime}ms`);
  return aiResponse;
}
Error Handling & Fallbacks
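To see where the time goes against the one-second target, it helps to record each stage separately instead of only the total. The sketch below shows one way to do that. `LatencyTracker` is a hypothetical helper with an injected clock, and the stage names in the usage note are illustrative.

```typescript
// Hypothetical helper: records when each pipeline stage finishes.
class LatencyTracker {
  private readonly marks: Array<{ stage: string; at: number }> = [];
  private readonly now: () => number;
  private readonly startedAt: number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
    this.startedAt = now();
  }

  // Call when a stage has just completed.
  mark(stage: string): void {
    this.marks.push({ stage, at: this.now() });
  }

  // Duration of each stage in milliseconds, in the order they were marked.
  breakdown(): Array<{ stage: string; ms: number }> {
    let previous = this.startedAt;
    return this.marks.map(({ stage, at }) => {
      const ms = at - previous;
      previous = at;
      return { stage, ms };
    });
  }

  // Time from construction to the last recorded mark (0 if nothing was marked).
  totalMs(): number {
    const last = this.marks[this.marks.length - 1];
    return last ? last.at - this.startedAt : 0;
  }
}
```

A query handler could create a tracker on entry, call `mark("docs")`, `mark("llm")`, and `mark("tts")` as each step returns, and log `breakdown()` to find the slowest stage.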
Production-Ready Reliability:
// Graceful fallbacks for each component
const errorHandling = {
  universalStreaming: "Auto-reconnection with status indicators",
  context7MCP: "Graceful fallback to general knowledge",
  geminiAPI: "Comprehensive error responses with retry logic",
  ttsServices: "Automatic fallback from ElevenLabs to Web Speech"
};
What Makes This Project Unique
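The auto-reconnection and retry entries above need a schedule for how long to wait between attempts. A common choice is exponential backoff with a ceiling, sketched below. `reconnectDelayMs` and its default values are assumptions for illustration, not the project's actual settings.

```typescript
// Hypothetical reconnect schedule: the delay doubles per attempt up to maxMs.
function reconnectDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  // Treat negative or fractional attempt numbers defensively.
  const safeAttempt = Math.max(0, Math.floor(attempt));
  return Math.min(maxMs, baseMs * Math.pow(2, safeAttempt));
}
```

A WebSocket `onclose` handler could call `setTimeout(connect, reconnectDelayMs(attempt))`, incrementing `attempt` on each failure and resetting it to zero after a successful open, where `connect` is whatever function opens the socket.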
1. Specialized for Developer Workflows
Live Documentation Access: Real-time retrieval from Context7's extensive library database
Voice-First Design: Built specifically for spoken technical conversations
Code-Aware Responses: Understands programming context and provides relevant examples
2. Technical Innovation
Audio Feedback Prevention: Solved the critical challenge of voice loops in AI assistants
Intelligent Document Relevance: Smart scoring system to find the most relevant documentation chunks
Multi-Modal Pipeline: Seamless integration of voice, documentation, and AI processing
3. Developer-Focused Experience
Natural Technical Conversations: Handles programming terminology and framework-specific questions
Instant Context Switching: No need to leave your coding environment
Production-Ready Architecture: Built with proper error handling and fallback mechanisms
4. Real-World Problem Solving
Eliminates Documentation Friction: Reduces context switching during development
Accelerates Learning: Provides instant explanations for new concepts
Improves Accessibility: Voice interface benefits developers with different needs
Developer Testimonial: "Finally, a voice assistant that actually understands when I say 'useState hook' vs 'use state hook' - the difference matters!"
IDE Integration: VS Code extension for in-editor voice queries
Team Knowledge: Company-specific documentation integration
Voice Code Generation: Speak algorithms, get implementation
// The future of developer assistance is here
const developer = new TechMentorVoice();
await developer.ask("How do I optimize this React component?");
// 🎤 → 🧠 → 💬
TechMentor Voice isn't just another chatbot - it's your AI pair programming partner that understands code, speaks developer, and thinks in frameworks. The future of technical assistance is conversational, intelligent, and always available.