🎓🧠 Grasp, Articulate & Refine: Your Real-Time Voice Coach for Smarter Recall & Academic Mastery 🎤📚⚡
Divya @divyasinghdev

About: A curious lifelong learner, currently a full-time Master's student pursuing Computer Science. Enthusiastic about development.

Joined:
Jul 9, 2022

Publish Date: Jul 28

This is a submission for the AssemblyAI Voice Agents Challenge

💡 What I Built

A real-time, AI-powered academic listening coach designed to help you:

  • Grasp any concept
  • Articulate it in your own words
  • Get real-time feedback from an AI mentor trained to respond like a domain-specific educator

project landing page

Imagine a personalized Listening Toastmasters for academics, one that:

✅ Listens while you speak
✅ Transcribes in real-time
✅ Analyzes your response
✅ Gives constructive feedback
✅ Grades your clarity, tone, and structure, not you 😅

Perfect for viva prep, thesis defenses, placement interviews, explaining tough concepts out loud, or simply getting a better grasp of a topic.


✨ Why I Built It

I've always craved a mentor who could truly adapt to me -
One who listens without judgment.
Who cares how I speak, not just what I say.
Who waits when I pause. And helps me find the words when I blank out.

As a student juggling placements, exams, hackathons, and life, I often find myself:

  • Mumbling under pressure
  • Rambling mid-response
  • Or going completely blank during interviews

So, I built this for that version of me.
The nervous student. The silent developer.
The person who knows the answer, but just can't say it clearly.

This is more than a tool.
It's a gentle, nerdy best friend in your laptop, reminding you:

"You've got this. Just speak, I'll help you shape it."

Oh, and if you're overusing filler words?
It'll lovingly 💘 roast you:

"Bestie, you just said 'umm' 27 times. Let's fix that."


🛠️ Features

1️⃣ 🎙️ Mic On, Brain On

Live voice input straight from your browser (no app install needed!)

2️⃣ Real-Time Whispering

Instant speech-to-text via ⚡ AssemblyAI's Streaming API

3️⃣ 📊 Instant Report Card

Get scored out of 10 on key communication metrics:

  • 🗣️ Fluency
  • 🧩 Coherence
  • Redundancy
  • 🧠 Technical Depth
  • 💪 Confidence Markers

📈 Delivered in real-time → your growth, visualized.

4️⃣ 🎓 AI Educator Mode

Your speech gets evaluated like you're explaining to a domain expert (Groq)

5️⃣ Retry Until It's Right

Stumble? Speak again. Smarter each time.

6️⃣ 🎯 Focused Solo Practice

A quiet dojo to train your mind-mouth connection 🧘‍♀️

7️⃣ 🧪 Built for the Serious Learners

Ideal for:

- 🧬 Viva / Thesis prep
- 🧑‍💻 Tech interviews
- 📚 Academic presentations
- 🎤 Fluency drills
- ✨ Better grasping any topic

8️⃣ 💻 Minimal UI, Max Results

No distractions. Just you, your thoughts, and your growth 💥
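To make the report card from feature 3 a bit more concrete, here is a minimal sketch of how the five scores could be held and summarized. The ReportCard class, its field names, and the plain average in overall() are assumptions for illustration, not the project's actual data model; the post only says that each metric is scored out of 10.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReportCard:
    """Hypothetical container for the five metrics, each scored 0-10."""
    fluency: float
    coherence: float
    redundancy: float
    technical_depth: float
    confidence: float

    def __post_init__(self) -> None:
        # Reject anything outside the 0-10 scale described in the post.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 10:
                raise ValueError(f"{f.name} must be between 0 and 10, got {value}")

    def overall(self) -> float:
        # Assumption: an unweighted mean; the real app may weight metrics differently.
        values = [getattr(self, f.name) for f in fields(self)]
        return round(sum(values) / len(values), 1)


card = ReportCard(fluency=8, coherence=7, redundancy=6, technical_depth=9, confidence=5)
print(card.overall())  # 7.0
```

Validating the range up front means a malformed score from the grading model fails loudly instead of quietly skewing the summary.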


🎬 Demo

Here's my live project: Grasp Articulate Refine

⚠️ Works best in Chrome. Firefox sulks. Brave is brave. Safari is... shy.

You can watch me showcasing my project here:


GitHub Repository

👉👇

🧠✨📈 Grasp Articulate Refine

Your smart study coach, powered by AI - designed to help you truly understand what you learn, speak it with confidence, and get thoughtful feedback so you grow smarter, faster.

My project at a glance:

screencapture-localhost-5000-2025-07-28-02_44_24

Check it out live here: Grasp Articulate Refine


✨ Features

  • Adaptive Content Generation: Creates 2000-3000 word educational content tailored to your academic level
  • Voice-Based Assessment: Uses AssemblyAI for speech-to-text transcription
  • AI-Powered Analysis: Acts as a globally renowned educator providing detailed feedback
  • Intelligent Grading: Grades responses out of 10 with detailed explanations
  • Progress Tracking: Students must score 9+ to advance to the next topic
  • Celebration System: 3-second emoji overlay for excellent performance (🥳🎉🎊)
  • Mobile Responsive: Darker blue theme with high contrast design
  • Real References: Provides working, relevant reference links for the explanation provided
  • Multiple Academic Levels: High School, Undergraduate, Graduate, Professional
  • Custom Subject…

You can check out my repo above if you are more of a code person, or want to analyse my code 🤔, get inspiration, fork it, clone it, and work on it on your device locally.


Technical Implementation & AssemblyAI Integration

Here are the code snippets demonstrating the technical implementation and AssemblyAI integration in this project:

🎯 1. AssemblyAI Initialization & Configuration

python
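# utils/voice_manager.py - AssemblyAI Setup
from typing import Dict

# Assumption: ASSEMBLYAI_AVAILABLE records whether the SDK import succeeded;
# the original snippet uses the flag without showing where it is defined.
try:
    import assemblyai as aai
    ASSEMBLYAI_AVAILABLE = True
except ImportError:
    ASSEMBLYAI_AVAILABLE = False

class VoiceManager:
    def __init__(self, api_keys: Dict[str, str]):
        self.api_keys = api_keys
        self.assemblyai_available = False
        self._init_assemblyai()

    def _init_assemblyai(self):
        if ASSEMBLYAI_AVAILABLE and self.api_keys.get('ASSEMBLYAI_API_KEY'):
            try:
                aai.settings.api_key = self.api_keys['ASSEMBLYAI_API_KEY']

                # Build a config once to confirm the SDK accepts these options
                test_config = aai.TranscriptionConfig(
                    language_detection=True,   # detect the spoken language
                    punctuate=True,            # add punctuation
                    format_text=True,          # apply casing and number formatting
                    speaker_labels=False,      # single speaker, no diarization
                    auto_highlights=False      # no key-phrase extraction
                )

                self.assemblyai_available = True
                print("✅ AssemblyAI initialized successfully")

            except Exception as e:
                print(f"❌ AssemblyAI initialization failed: {e}")
                self.assemblyai_available = False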

Sets the AssemblyAI API key on the SDK and builds a transcription config with language detection, punctuation, and text formatting turned on (speaker labels and auto highlights stay off). The assemblyai_available flag becomes True only if this setup finishes without an exception.
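VoiceManager expects a dictionary of API keys. A minimal sketch of building that dictionary from environment variables follows; the helper name load_api_keys and the GROQ_API_KEY variable name are assumptions for illustration, and only ASSEMBLYAI_API_KEY appears in the snippets in this post.

```python
import os
from typing import Dict, Mapping, Sequence


def load_api_keys(
    env: Mapping[str, str] = os.environ,
    names: Sequence[str] = ("ASSEMBLYAI_API_KEY", "GROQ_API_KEY"),
) -> Dict[str, str]:
    """Return only the keys that are actually set, with whitespace trimmed."""
    return {
        name: env[name].strip()
        for name in names
        if env.get(name, "").strip()
    }


# A blank value is treated the same as a missing one.
keys = load_api_keys({"ASSEMBLYAI_API_KEY": " abc123 ", "GROQ_API_KEY": ""})
print(keys)  # {'ASSEMBLYAI_API_KEY': 'abc123'}
```

Dropping empty values keeps checks like api_keys.get('ASSEMBLYAI_API_KEY') honest: a key that is present but blank will not be mistaken for a configured one.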

🎤 2. Core Audio Transcription Implementation

python
# utils/voice_manager.py - Core transcription (method of VoiceManager)
import os

def transcribe_audio(self, audio_file_path: str) -> str:
    if not os.path.exists(audio_file_path):
        return "❌ Audio file not found"

    # Primary Method: AssemblyAI SDK
    if self.assemblyai_available:
        try:
            print("🔄 Trying AssemblyAI SDK...")

            config = aai.TranscriptionConfig(
                language_detection=True,
                punctuate=True,
                format_text=True,
                speaker_labels=False,
                auto_highlights=False
            )

            transcriber = aai.Transcriber(config=config)
            transcript = transcriber.transcribe(audio_file_path)

            if transcript.status == "completed":
                print("✅ AssemblyAI SDK transcription successful")
                return self._clean_transcription(transcript.text)
            elif transcript.status == "error":
                # An error reported by the service is returned as-is (no fallback)
                print(f"❌ AssemblyAI SDK error: {transcript.error}")
                return f"❌ Transcription error: {transcript.error}"

        except Exception as e:
            # An exception in the SDK path falls through to the direct API
            print(f"❌ AssemblyAI SDK error: {e}")

    # Fallback Method: Direct API
    if self.api_keys.get('ASSEMBLYAI_API_KEY'):
        try:
            print("🔄 Trying AssemblyAI Direct API...")
            result = self._transcribe_with_api(audio_file_path)
            if result and not result.startswith("❌"):
                print("✅ AssemblyAI API transcription successful")
                return self._clean_transcription(result)
        except Exception as e:
            print(f"❌ AssemblyAI API error: {e}")

    return "❌ Transcription failed. Please check API configuration."

Main transcription function using a dual-mode approach: the AssemblyAI SDK is tried first, with the direct API as a fallback. It checks that the audio file exists, transcribes with the configuration shown above, and catches exceptions on both paths, so the caller always gets a string back: either the cleaned transcript or an error message.
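The primary-then-fallback idea can be isolated from AssemblyAI entirely. The sketch below is a generic version under stated assumptions: first_successful is a hypothetical helper, not a function from the repo, and it borrows the convention from the snippets above that an error result is a string starting with "❌". Unlike the original method, it always moves on to the next strategy after a failure.

```python
from typing import Callable, Iterable, Optional

ERROR_PREFIX = "❌"  # convention used by the snippets in this post


def first_successful(
    strategies: Iterable[Callable[[], Optional[str]]],
    default: str,
) -> str:
    """Try each strategy in order and return the first usable result."""
    for strategy in strategies:
        name = getattr(strategy, "__name__", repr(strategy))
        try:
            result = strategy()
        except Exception as exc:
            print(f"{name} raised {exc!r}, trying the next strategy")
            continue
        if result and not result.startswith(ERROR_PREFIX):
            return result
        print(f"{name} returned no usable result, trying the next strategy")
    return default


def sdk_path() -> str:
    raise RuntimeError("SDK not installed")  # simulated failure


def direct_api_path() -> str:
    return "a stack is a LIFO structure"  # simulated success


print(first_successful([sdk_path, direct_api_path], default="❌ Transcription failed"))
# prints the direct API result after logging the SDK failure
```

Passing the strategies as zero-argument callables keeps the ordering explicit and makes each path easy to fake in a test.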

🔧 3. Direct API Implementation with Enhanced Features

python
# utils/voice_manager.py - Direct API Implementation
import time

import requests

def _transcribe_with_api(self, audio_file_path: str) -> str:
    """
    Direct AssemblyAI API implementation with robust error handling
    """
    try:
        headers = {'authorization': self.api_keys['ASSEMBLYAI_API_KEY']}

        # Step 1: upload the audio file
        print("📤 Uploading audio file...")
        with open(audio_file_path, 'rb') as f:
            response = requests.post(
                'https://api.assemblyai.com/v2/upload',
                headers=headers,
                files={'file': f},
                timeout=60
            )

        if response.status_code != 200:
            return f"❌ Upload failed: {response.status_code} - {response.text}"

        upload_url = response.json()['upload_url']
        print(f"✅ File uploaded: {upload_url}")

        # Step 2: request a transcript with the same options as the SDK path
        print("🔄 Requesting transcription...")
        data = {
            'audio_url': upload_url,
            'language_detection': True,
            'punctuate': True,
            'format_text': True,
            'speaker_labels': False,
            'auto_highlights': False
        }

        response = requests.post(
            'https://api.assemblyai.com/v2/transcript',
            headers=headers,
            json=data,
            timeout=30
        )

        if response.status_code != 200:
            return f"❌ Transcription request failed: {response.status_code}"

        transcript_id = response.json()['id']
        print(f"🔄 Transcription ID: {transcript_id}")

        # Step 3: poll until the transcript is ready
        print("⏳ Waiting for transcription to complete...")
        max_attempts = 60  # 60 polls x 2 s sleep = roughly 2 minutes
        attempt = 0

        while attempt < max_attempts:
            response = requests.get(
                f'https://api.assemblyai.com/v2/transcript/{transcript_id}',
                headers=headers,
                timeout=30
            )

            if response.status_code != 200:
                return f"❌ Status check failed: {response.status_code}"

            result = response.json()
            status = result['status']

            if status == 'completed':
                print("✅ Transcription completed")
                return result['text'] or "❌ No text in transcription result"
            elif status == 'error':
                error_msg = result.get('error', 'Unknown error')
                return f"❌ Transcription error: {error_msg}"
            elif status in ['queued', 'processing']:
                print(f"⏳ Status: {status} (attempt {attempt + 1}/{max_attempts})")
                time.sleep(2)  # 2-second polling interval
                attempt += 1
            else:
                return f"❌ Unknown status: {status}"

        return "❌ Transcription timeout - took too long to process"

    except requests.exceptions.Timeout:
        return "❌ Request timeout - please try again"
    except Exception as e:
        return f"❌ Unexpected error: {str(e)}"

Implements direct AssemblyAI API calls as the fallback method: upload the file, request a transcript with the same options as the SDK path, then poll every 2 seconds for up to 60 attempts (about 2 minutes). Upload failures, non-200 responses, request timeouts, and unknown statuses each return a descriptive error string.
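The polling step is a pattern worth seeing on its own. Here is a minimal sketch that uses a deadline instead of an attempt counter; poll_until_done and its parameters are hypothetical names rather than repo code, and the sleep and clock functions are injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch: Callable[[], Dict],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Call fetch() until it reports a terminal status or the deadline passes."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result.get("status") in ("completed", "error"):
            return result
        # Stop if another full interval would overshoot the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"no terminal status within {timeout} seconds")
        sleep(interval)


# Simulated status responses: queued -> processing -> completed
responses = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "text": "hello"},
])
final = poll_until_done(lambda: next(responses), sleep=lambda seconds: None)
print(final["text"])  # hello
```

A deadline measured with a monotonic clock also accounts for time spent inside each request, which an attempt counter does not.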

🧹 4. Advanced Text Processing & Cleaning

python
# utils/voice_manager.py - Text Processing
import os
import re
from typing import Any, Dict

def _clean_transcription(self, text: str) -> str:
    if not text:
        return "❌ Empty transcription result"

    text = text.strip()

    # Collapse runs of whitespace into single spaces
    text = re.sub(r'\s+', ' ', text)

    # Capitalize the first letter after sentence-ending punctuation
    text = re.sub(r'([.!?])\s*([a-z])',
                  lambda m: m.group(1) + ' ' + m.group(2).upper(), text)

    # Capitalize the very first character
    if text and not text[0].isupper():
        text = text[0].upper() + text[1:]

    # Make sure the text ends with punctuation
    if text and text[-1] not in '.!?':
        text += '.'

    return text

def validate_audio_file(self, file_path: str) -> Dict[str, Any]:
    if not os.path.exists(file_path):
        return {
            'valid': False,
            'error': 'File does not exist',
            'file_size': 0
        }

    file_size = os.path.getsize(file_path)
    max_size = 100 * 1024 * 1024  # 100MB limit

    if file_size > max_size:
        return {
            'valid': False,
            'error': f'File too large: {file_size / (1024*1024):.1f}MB (max 100MB)',
            'file_size': file_size
        }

    if file_size < 1000:  # Minimum of about 1KB
        return {
            'valid': False,
            'error': 'File too small - may be empty or corrupted',
            'file_size': file_size
        }

    return {
        'valid': True,
        'error': None,
        'file_size': file_size,
        'file_size_mb': file_size / (1024 * 1024)
    }

Post-processes the transcript: trims and collapses whitespace, capitalizes the first letter of each sentence, and adds a closing period if one is missing. The validator rejects missing files, files over 100MB, and files under about 1KB (likely empty or corrupted) before anything is sent for transcription.
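To see what the cleaning steps do to real input, here is a standalone version of the same regex pipeline (the method above, minus the class and the empty-input message) with two sample inputs. The second one shows a limitation worth knowing about: the sentence-capitalization pattern also fires on dots inside tokens such as domain names.

```python
import re


def clean_transcription(text: str) -> str:
    """Standalone copy of the cleaning steps, for experimenting outside the class."""
    text = re.sub(r"\s+", " ", text.strip())
    text = re.sub(
        r"([.!?])\s*([a-z])",
        lambda m: m.group(1) + " " + m.group(2).upper(),
        text,
    )
    if text and not text[0].isupper():
        text = text[0].upper() + text[1:]
    if text and text[-1] not in ".!?":
        text += "."
    return text


print(clean_transcription("  a stack is  lifo. it uses push and pop "))
# A stack is lifo. It uses push and pop.

print(clean_transcription("see example.com"))
# See example. Com.   <- the dot in the domain is treated as a sentence end
```

For spoken explanations this rarely matters, but a stricter pattern (for example, requiring whitespace after the punctuation) would be one way to avoid splitting tokens like this.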

5. Flask Integration & API Endpoints

python
# app.py - Flask Integration
# app (the Flask instance) and voice_manager (a VoiceManager) are created elsewhere in app.py
import os

from flask import jsonify, request, session

@app.route('/transcribe_audio', methods=['POST'])
def transcribe_audio():
    try:
        audio_file = request.files.get('audio')
        if not audio_file:
            return jsonify({'error': 'No audio file provided'}), 400

        # Save the upload to a per-session temporary file
        session_id = session.get('session_id', 'unknown')
        temp_path = f"temp/audio_{session_id}.wav"
        os.makedirs('temp', exist_ok=True)
        audio_file.save(temp_path)

        print(f"🔄 Starting transcription of {temp_path}")

        # Reject missing, oversized, or near-empty files before calling the API
        validation = voice_manager.validate_audio_file(temp_path)
        if not validation['valid']:
            if os.path.exists(temp_path):
                os.remove(temp_path)
            return jsonify({'error': validation['error']}), 400

        transcription = voice_manager.transcribe_audio(temp_path)

        # Clean up the temporary file once transcription returns
        if os.path.exists(temp_path):
            os.remove(temp_path)

        print(f"✅ Transcription result: {transcription[:100]}...")

        return jsonify({
            'success': True,
            'transcription': transcription,
            'file_size_mb': validation.get('file_size_mb', 0)
        })

    except Exception as e:
        print(f"❌ Transcription error: {e}")
        return jsonify({'error': f'Transcription failed: {str(e)}'}), 500

@app.route('/voice_status')
def voice_status():
    return jsonify(voice_manager.get_voice_status())

Flask endpoint for audio transcription: the upload is saved to a temporary file named after the session, validated, transcribed, and deleted afterwards. A second endpoint, /voice_status, reports which voice features are currently available.
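One possible hardening of the temporary-file handling, sketched with the standard library only: give every upload a unique file name and delete it in a finally block, so the file is removed even when transcription raises. The helper with_temp_audio is a hypothetical name and not part of the repo; handler stands in for a call such as voice_manager.transcribe_audio.

```python
import os
import tempfile
from typing import Callable


def with_temp_audio(data: bytes, handler: Callable[[str], str], suffix: str = ".wav") -> str:
    """Write data to a unique temp file, pass its path to handler, always clean up."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return handler(path)
    finally:
        # Runs on success and on failure, so no stray audio files pile up.
        if os.path.exists(path):
            os.remove(path)


# Stand-in handler that just reports the file size
result = with_temp_audio(b"fake audio bytes", lambda p: f"{os.path.getsize(p)} bytes")
print(result)  # 16 bytes
```

Unique names also avoid two requests overwriting each other when both fall back to the same 'unknown' session id.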

📊 6. Status Monitoring & Diagnostics

python
# utils/voice_manager.py - Status Monitoring
from typing import Dict

def get_voice_status(self) -> Dict[str, bool]:
    return {
        'assemblyai_available': self.assemblyai_available,
        'voice_recording_available': self.assemblyai_available,
        'transcription_available': self.assemblyai_available,
        'api_key_configured': bool(self.api_keys.get('ASSEMBLYAI_API_KEY')),
        'sdk_available': ASSEMBLYAI_AVAILABLE,
        'direct_api_available': bool(self.api_keys.get('ASSEMBLYAI_API_KEY'))
    }

def _print_status(self):
    print("\n🎤 VOICE FEATURES STATUS:")
    print(f"   AssemblyAI Available: {'✅' if self.assemblyai_available else '❌'}")
    print(f"   Voice Recording: {'✅' if self.assemblyai_available else '❌'}")
    print(f"   Audio Transcription: {'✅' if self.assemblyai_available else '❌'}")
    print(f"   API Key Configured: {'✅' if self.api_keys.get('ASSEMBLYAI_API_KEY') else '❌'}")
    print()

Reports AssemblyAI feature availability: SDK status, API key configuration, and whether transcription can run. The same information can be printed to the console for troubleshooting.
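A status dictionary is easier to act on when it is turned into a single sentence. The sketch below shows one way to do that; diagnose is a hypothetical helper, and the messages are my reading of the snippets above (SDK first, direct API whenever a key is configured), not output from the real app.

```python
from typing import Mapping


def diagnose(status: Mapping[str, bool]) -> str:
    """Summarize a get_voice_status()-style dictionary in one sentence."""
    if not status.get("api_key_configured"):
        return "Set ASSEMBLYAI_API_KEY: neither the SDK nor the direct API can run without it."
    if status.get("assemblyai_available"):
        return "SDK path is ready; the direct API remains available as a fallback."
    if status.get("direct_api_available"):
        return "SDK unavailable; transcription will use the direct API fallback."
    return "Transcription is unavailable."


print(diagnose({
    "api_key_configured": True,
    "assemblyai_available": False,
    "direct_api_available": True,
}))
# SDK unavailable; transcription will use the direct API fallback.
```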

🔒 7. HTTPS Configuration for Microphone Access

python
def create_self_signed_cert():
    try:
        from cryptography import x509
        from cryptography.x509.oid import NameOID
        from cryptography.hazmat.primitives import hashes
        from cryptography.hazmat.primitives.asymmetric import rsa
        from cryptography.hazmat.primitives import serialization
        import datetime
        import ipaddress

        # Generate the private key
        private_key = rsa.generate_private_key(
            public_exponent=65537,
            key_size=2048,
        )

        # Self-signed: subject and issuer are the same
        subject = issuer = x509.Name([
            x509.NameAttribute(NameOID.COUNTRY_NAME, "US"),
            x509.NameAttribute(NameOID.STATE_OR_PROVINCE_NAME, "Local"),
            x509.NameAttribute(NameOID.LOCALITY_NAME, "Local"),
            x509.NameAttribute(NameOID.ORGANIZATION_NAME, "AI Learning Platform"),
            x509.NameAttribute(NameOID.COMMON_NAME, "localhost"),
        ])

        cert = x509.CertificateBuilder().subject_name(
            subject
        ).issuer_name(
            issuer
        ).public_key(
            private_key.public_key()
        ).serial_number(
            x509.random_serial_number()
        ).not_valid_before(
            datetime.datetime.utcnow()
        ).not_valid_after(
            datetime.datetime.utcnow() + datetime.timedelta(days=365)
        ).add_extension(
            x509.SubjectAlternativeName([
                x509.DNSName("localhost"),
                x509.DNSName("127.0.0.1"),
                x509.IPAddress(ipaddress.IPv4Address("127.0.0.1")),
            ]),
            critical=False,  # add_extension requires the critical flag
        ).sign(private_key, hashes.SHA256())

        # Write certificate and key
        with open("cert.pem", "wb") as f:
            f.write(cert.public_bytes(serialization.Encoding.PEM))

        with open("key.pem", "wb") as f:
            f.write(private_key.private_bytes(
                encoding=serialization.Encoding.PEM,
                format=serialization.PrivateFormat.PKCS8,
                encryption_algorithm=serialization.NoEncryption()
            ))

        print("✅ Self-signed certificate created")
        return True

    except Exception as e:
        print(f"❌ Failed to create certificate: {e}")
        return False

Creates a self-signed certificate for localhost, valid for 365 days, and writes it to cert.pem and key.pem. Serving the app over HTTPS matters because browsers only grant microphone access to pages loaded in a secure context.
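Once cert.pem and key.pem exist, they still have to be handed to the server. A minimal sketch of that wiring follows; ssl_context_if_available is a hypothetical helper, and the host and port in the commented app.run call are assumptions. It relies on Flask's app.run accepting ssl_context as a (certificate, key) tuple, with None meaning plain HTTP.

```python
from pathlib import Path
from typing import Optional, Tuple


def ssl_context_if_available(
    cert: str = "cert.pem",
    key: str = "key.pem",
) -> Optional[Tuple[str, str]]:
    """Return (cert, key) when both files exist, otherwise None for plain HTTP."""
    if Path(cert).is_file() and Path(key).is_file():
        return (cert, key)
    return None


# Possible use in app.py (host and port are assumptions):
# app.run(host="0.0.0.0", port=5000, ssl_context=ssl_context_if_available())

print(ssl_context_if_available("missing-cert.pem", "missing-key.pem"))  # None
```

Falling back to None keeps local development working even when certificate generation fails, at the cost of the microphone only being usable from origins the browser already treats as secure.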


Tech Stack Used

Backend

  • Python + Flask - For handling sessions and inference
  • AssemblyAI - Real-time transcription (Streaming API)
  • Groq (LLaMA3-8B) - For instant feedback

Frontend

  • JavaScript - Audio streaming + Web Audio API
  • HTML/CSS - Minimal, responsive, focused on clarity

💭 Final Thoughts

This wasn't just a submission.
This was a love letter 💌💌 to every shy, nerdy student who ever wished their thoughts could come out clearer.

It's funny.
We spend years learning things, but no one ever bothered teaching us how to say them well.
This project is my way of fixing that - with code, care, and a mic.

Would I build more on top of this? Absolutely.
Would I cry if I win? Probably.
Would I still keep improving it if I lose? Without question. 🥹


🫶 Thank You for Listening (Literally)

To the judges, mentors, and every dev reading this:

Let's speak better. Let's build louder.
And maybe… let's stutter a little less along the way.

🎤💙
Divya Singh

Thank you for reading till the end

bow gif

Comments 21 total

  • Anmol Baranwal, Jul 28, 2025

    this is really cool 🔥

    • Divya, Jul 28, 2025

      Thank you for checking it out 😊😊

  • Custom Patches By Fineyst, Jul 28, 2025

    amazing🔥

  • Fayaz, Jul 28, 2025

    Nice!

    You don't miss any dev Challenge, do you! 😇

    All the best 🥳

    • Divya, Jul 28, 2025

      Not 100% of them, i just create multiple submissions for a single challenge mostly 😅

      Thank you 🥹

      • Fayaz, Jul 28, 2025

        That's a great strategy!

        I barely ever get time to submit one project, and that too only on some challenges. 😒

        You may like my last submission though.

        • Divya, Jul 28, 2025

          It seems useful, but is the github repo completely updated?

          • Fayaz, Jul 28, 2025

            I'll add some more instructions and polishing, but it already does what is advertised on the post. YES!

            • Divya, Jul 28, 2025

              I will check it out then

  • Rohan Sharma, Jul 28, 2025

    I hope you win this one!

    • Divya, Jul 28, 2025

      I hope so as well 😅

      Thank you

  • Meenakshi Agarwal, Jul 28, 2025

    Nice work! Is there a way to run the above app/demonstrate the functionality in a non-metered environment?

    • Divya, Jul 28, 2025

      Non- metered as in without the api?

      • Meenakshi Agarwal, Jul 28, 2025

        Smart minds always get the right meaning, yes but without the api-key to be precise or with some public demo key....

        • Divya, Jul 28, 2025

          The project's main feature is listening, understanding and then analysis of it, and feedback for the learner.

          It needs these 2 apis , or any 2 ig, for the audio part and the analysis + feedback part.

  • dummy, Jul 30, 2025

    you are a rockstar, completed 3 challenges and all are awesome.
    Great work, liked all of your three challenges. ❤️
    Wish you all the very best for these challenges✨✨✨

    • Divya, Jul 31, 2025

      Thank you Mr Ninja

      Not that awesome, plus it was a last minute rush, but yup, ultimately submitted it all before the deadline.
