Speech Recognition with JavaScript
JoelBonetR 🥇 @joelbonetr

About: Tech Lead/Team Lead. Senior WebDev. Intermediate Grade in Computer Systems, High Grade in Web Application Development, MBA (+Marketing+HHRR). Studied a bit of law, economics and design.

Location: Spain
Joined: Apr 19, 2019

Publish Date: Aug 22 '22

Cover image credits: dribbble

Some time ago, the Speech Recognition API was added to the specs, and we got partial support in Chrome, Safari, Baidu, Android WebView, iOS Safari, Samsung Internet and KaiOS browsers (see browser support in detail).

Disclaimer: This implementation won't work in Opera (as it doesn't support the constructor) or in Firefox (which doesn't support any of it), so if you're using one of those, I suggest switching to Chrome, or any other compatible browser, if you want to give it a try.
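
Since support varies that much between browsers, it can help to check for the constructor before using it. The following is a minimal sketch of such a check, not the code from the PoC; `getRecognitionCtor` is a hypothetical helper name, and it takes the global object as a parameter so it can be tried outside a browser too.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor exposed by
// the given global object (standard or webkit-prefixed), or null if none exists.
function getRecognitionCtor(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// In a browser you would usually pass `window`; `globalThis` works there as well.
const RecognitionCtor = getRecognitionCtor(globalThis);

if (RecognitionCtor === null) {
  console.log('Speech recognition is not supported in this environment.');
} else {
  console.log('Speech recognition is available.');
}
```

With a check like this, an unsupported browser can show a message instead of throwing when `new` is called on `undefined`.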

Speech recognition code and PoC

Edit: I realised that, for some reason, it won't work when embedded, so here's the link to open it directly.

The implementation I made currently supports English and Spanish, just as a showcase.

Quick instructions and feature overview:

  • Choose one of the languages from the drop down.
  • Hit the mic icon and it will start recording (you'll notice a weird animation).
  • Once you finish a sentence, it will write it down in the box.
  • When you want it to stop recording, simply press the mic again (the animation stops).
  • You can also hit the box to copy the text to your clipboard.
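
The mic behaviour described above (one click starts, the next one stops) could be sketched roughly as follows. This is an illustration under assumptions rather than the PoC's actual code: `createMicToggle` is a hypothetical helper, and the element ids `mic` and `output` in the commented wiring are made up.

```javascript
// Hypothetical helper: wraps a recognition-like object (anything with start()
// and stop()) and returns a function that flips between listening and idle.
function createMicToggle(recognition) {
  let listening = false;
  return function toggle() {
    if (listening) {
      recognition.stop();
    } else {
      recognition.start();
    }
    listening = !listening;
    return listening; // true while recording, false once stopped
  };
}

// Possible browser wiring, assuming elements with the ids "mic" and "output":
// const toggle = createMicToggle(recognition);
// const mic = document.getElementById('mic');
// mic.addEventListener('click', () => {
//   mic.classList.toggle('recording', toggle()); // drives the animation
// });
// document.getElementById('output').addEventListener('click', (e) => {
//   navigator.clipboard.writeText(e.currentTarget.textContent); // copy on click
// });
```

Note that recognition can also end on its own, so a real implementation would probably reset the flag in an `onend` handler as well.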

Speech Recognition in the Browser with JavaScript - key code blocks:



/* Check whether the SpeechRecognition or the webkitSpeechRecognition API is available on window and reference it */
const recognitionSvc = window.SpeechRecognition || window.webkitSpeechRecognition;

// Instantiate it
const recognition = new recognitionSvc();

/* Set the speech recognition to continuous so it keeps listening to whatever you say. This way you can record long texts, conversations and so on. */
recognition.continuous = true;

/* Set the language for speech recognition. It takes a BCP 47 language tag such as en-GB, en-US or es-ES */
recognition.lang = 'en-GB';

// Event triggered when it gets a match; attach the handler before starting
recognition.onresult = (event) => {
  // Iterate through the results added by this event (earlier ones were already handled)
  for (let i = event.resultIndex; i < event.results.length; i++) {
    // Print the transcription of the top alternative to the console
    console.log(event.results[i][0].transcript);
  }
};

// Start the speech recognition
recognition.start();

// Stop the speech recognition (call this later, once the user is done, e.g. on a second mic click)
recognition.stop();


This implementation currently supports the following languages for speech recognition:

  • en-GB
  • en-US
  • es-ES
  • de-DE
  • de-CH
  • fr-FR
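
One way to connect a language drop-down to this list is to validate the selected value before assigning it to `recognition.lang`. The sketch below is only an assumption about how that could look; `SUPPORTED_LANGS` and `pickLanguage` are hypothetical names, and the fallback to `en-GB` is an arbitrary choice.

```javascript
// Language tags listed above; extend this array as more are added.
const SUPPORTED_LANGS = ['en-GB', 'en-US', 'es-ES', 'de-DE', 'de-CH', 'fr-FR'];

// Hypothetical helper: returns the matching supported tag (case-insensitive),
// or the fallback tag when the requested one is not in the list.
function pickLanguage(requested, fallback = 'en-GB') {
  const wanted = String(requested).toLowerCase();
  const match = SUPPORTED_LANGS.find((tag) => tag.toLowerCase() === wanted);
  return match || fallback;
}

console.log(pickLanguage('es-es')); // prints "es-ES"
console.log(pickLanguage('ja-JP')); // prints "en-GB"
```

In the browser this could be wired as `recognition.lang = pickLanguage(select.value)` before calling `recognition.start()`, where `select` stands for the drop-down element.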

If you want me to add support for more languages, let me know in the comments section and I'll update it in a blink so you can test it in your own language 😁

That's all for today. I hope you enjoyed it; I sure did enjoy making it.

Comments 23 total

  • Thomas Hansen, Aug 22, 2022

    Cool. I once created a speech based speech recognition thing based upon MySQL and SoundEx allowing me to create code by speaking through my headphones. It was based upon creating a hierarchical “menu” where I could say “Create button”. Then the machine would respond with “what button”, etc. The thing of course produced Hyperlambda though. I doubt it can be done without meta programming.

    One thing that bothers me is that this was 5 years ago, and speech support has basically stood 100% perfectly still in all browsers since then … 😕

    • JoelBonetR 🥇, Aug 22, 2022
      "One thing that bothers me is that this was 5 years ago, and speech support has basically stood 100% perfectly still in all browsers since then … 😕"

      Not in all of them (e.g. Opera Mini, Firefox mobile). It's a nice-to-have in browsers, especially for accessibility, but screen readers for blind people already do the job. On the other hand, most implementations for any other purpose stream the data to a backend, so they can process the incoming speech and use the user feedback to train an AI, among other things, without hurting performance.

      "...allowing me to create code by speaking through my headphones... ... I doubt it can be done without meta programming."

      I agree on this. "Metaprogramming" is a broad concept that covers different ways in which it can work (or be implemented), and by its own definition it is a building block for this kind of application.

  • JoelBonetR 🥇, Aug 24, 2022

    Thank you! 😁

  • Samuelrivaldo, Aug 24, 2022

    Thank you 🙏. I'd like you to add French too.

  • venkatgadicherla, Aug 25, 2022

    It's cool mate. Very good

  • JoelBonetR 🥇, Aug 25, 2022

    I added support for some extra languages in the meantime 😁

  • Arantis-jr, Aug 26, 2022

    Cool 😎

  • Marcelo Soares, Aug 27, 2022

    Thank you 🙏. I'd like you to add Brazilian Portuguese too.

    • JoelBonetR 🥇, Aug 29, 2022

      Added both European and Brazilian Portuguese 😁

  • Symeon Sideris, Aug 29, 2022

    Thank you very much for your useful article and implementation. Does it support Greek?
    Have a nice (programming) day

    • JoelBonetR 🥇, Aug 29, 2022

      Hi Symeon, added support for Greek el-GR, try it out! 😃

  • İbrahim Yaşar, Aug 31, 2022

    This is really awesome. Could you please add the Turkish language? I would definitely like to try this in my native language and use it in my projects.

  • Aheed Tahir, Jan 15, 2023

    Can you please add the Urdu language?

  • Harshi Acchu, Sep 7, 2024

    Could you please add Tamil to it?
