Offline voice AI in a browser may seem contradictory. Doesn't using a web browser mean you're online? But even when the device is connected to the Internet, running voice recognition offline in the browser eliminates variable network latency and provides intrinsic privacy. Local voice recognition also unlocks always-listening behaviour that is impractical to perform continuously with cloud-based services.
Here is a demonstration application that uses the Porcupine wake word engine and the Rhino Speech-to-Intent engine to control lights in a home. All speech recognition is private, offline, and in-browser.
What's Under the Hood?
Offline web voice AI is challenging. We had to extend our in-house deep learning framework to run on WebAssembly with SIMD support. The models run in the background using Web Workers, so inference does not block the page. Together with the Web Audio API, these technologies provide the foundation for accessing microphone data in the browser and processing it. We have abstracted these cutting-edge web technologies in the Picovoice SDKs for the Web.
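To make the audio path more concrete, here is a minimal sketch of the kind of glue code that sits between the Web Audio API and a speech engine running in a Web Worker. It is not the Picovoice SDK's implementation or API: `floatToInt16` and `FrameBuffer` are hypothetical helpers, and the frame length of 512 samples is an assumption chosen for illustration. The sketch converts the floating-point samples that the Web Audio API delivers into 16-bit PCM and regroups them into fixed-length frames. Resampling from the audio context's sample rate to the engine's rate is omitted.

```typescript
// Hypothetical helper (not a Picovoice API): convert Web Audio float samples,
// nominally in [-1, 1], to 16-bit signed PCM with clamping.
function floatToInt16(input: Float32Array): Int16Array {
  return Int16Array.from(input, (sample) => {
    const s = Math.max(-1, Math.min(1, sample));
    return Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  });
}

// Hypothetical helper (not a Picovoice API): accumulate audio chunks of any
// size and emit frames of exactly `frameLength` samples.
class FrameBuffer {
  private readonly frameLength: number;
  private readonly buffer: Int16Array;
  private filled = 0;

  constructor(frameLength: number) {
    this.frameLength = frameLength;
    this.buffer = new Int16Array(frameLength);
  }

  // Returns every frame completed by this chunk; leftover samples are kept
  // for the next call.
  push(samples: Int16Array): Int16Array[] {
    const frames: Int16Array[] = [];
    let offset = 0;
    while (offset < samples.length) {
      const n = Math.min(this.frameLength - this.filled, samples.length - offset);
      this.buffer.set(samples.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.frameLength) {
        frames.push(this.buffer.slice()); // copy, since the buffer is reused
        this.filled = 0;
      }
    }
    return frames;
  }
}

// Usage: feed 128-sample blocks (the size an AudioWorklet processes at a time)
// into a buffer with an assumed frame length of 512 samples.
const frameBuffer = new FrameBuffer(512);
let completedFrames = 0;
for (let block = 0; block < 8; block++) {
  const pcm = floatToInt16(new Float32Array(128)); // silence as placeholder input
  completedFrames += frameBuffer.push(pcm).length;
}
console.log(`completed frames: ${completedFrames}`);
```

In a browser, each completed frame would typically be sent to a Web Worker with `postMessage`, where the WebAssembly model processes it off the main thread.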
Start building with Leopard speech-to-text for free.
- Porcupine Wake Word: Wake Word Detection, Keyword Spotting, Voice Activation, and Always-Listening Voice Commands
- Rhino Speech-to-Intent: Voice Commands, Domain-Specific Natural Language Understanding, NLU, Spoken Language Understanding, and SLU
- Leopard Speech-to-Text: Speech-to-Text, STT, Automatic Speech Recognition, ASR, Large-Vocabulary Speech Recognition, and Open-Domain Transcription
- Octopus Speech-to-Index: Voice Search and Audio Indexing
- Cobra Voice Activity Detection: Voice Activity Detection and VAD