Offline voice AI in a browser may seem contradictory. Doesn't using a web browser mean you're online? But even when you are connected to the Internet, running voice recognition offline in the browser eliminates variable network latency and makes privacy intrinsic, since audio is processed on the device. Local voice recognition also unlocks always-listening behaviour that is impractical to perform continuously with cloud-based services.
Here is a demonstration application that uses the Porcupine wake word engine and the Rhino Speech-to-Intent engine to control lights in a home. All speech recognition is private, offline, and in-browser.
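To make the demo's flow more concrete, here is a minimal sketch of how an application might turn a speech-to-intent result into a change in light state. The `Inference` shape, the `changeLightState` intent name, and the `location` and `state` slots are all assumptions made for illustration; they are not taken from the demo's source or from the SDK.

```typescript
// Hypothetical shape of a speech-to-intent result (an assumption, not the SDK's type).
interface Inference {
  isUnderstood: boolean;
  intent?: string;
  slots?: Record<string, string>;
}

// Maps a room name to whether its light is on.
type Lights = Record<string, boolean>;

// Returns a new light state for an understood command, or the
// unchanged state when the command was not understood or is incomplete.
function applyInference(lights: Lights, inference: Inference): Lights {
  if (!inference.isUnderstood || inference.intent !== "changeLightState") {
    return lights;
  }
  const location = inference.slots?.location;
  const state = inference.slots?.state;
  if (location === undefined || (state !== "on" && state !== "off")) {
    return lights;
  }
  return { ...lights, [location]: state === "on" };
}
```

Keeping the handler a pure function of the previous state and the inference makes it easy to test without a microphone or a loaded model.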
What's Under the Hood?
Running voice AI offline in the browser is challenging. We had to extend our in-house deep learning framework to run on WebAssembly with SIMD support. The models run in the background using Web Workers, so inference does not block the page. Together with the Web Audio API, these technologies provide the foundation for capturing microphone audio in the browser and processing it. We have abstracted these cutting-edge web technologies in the Picovoice SDKs for the Web.
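As a rough illustration of the processing step that sits between the Web Audio API and a speech engine, the sketch below converts float samples to 16-bit PCM and regroups them into fixed-length frames. The names `floatTo16BitPcm` and `FrameBuffer`, and the idea that the engine consumes fixed-length 16-bit frames, are assumptions made for illustration and are not taken from the Picovoice SDKs. In a real page this logic would typically run inside a Web Worker fed by the microphone stream.

```typescript
// Convert Web Audio float samples (nominally in [-1, 1]) to 16-bit signed PCM.
// Out-of-range samples are clamped before scaling.
function floatTo16BitPcm(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  input.forEach((sample, i) => {
    const s = Math.max(-1, Math.min(1, sample));
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
  return out;
}

// Accumulates PCM chunks of arbitrary size and emits frames of a fixed length.
// Samples that do not fill a whole frame are kept for the next call.
class FrameBuffer {
  private readonly frameLength: number;
  private pending: Int16Array;
  private filled = 0;

  constructor(frameLength: number) {
    this.frameLength = frameLength;
    this.pending = new Int16Array(frameLength);
  }

  push(chunk: Int16Array): Int16Array[] {
    const frames: Int16Array[] = [];
    let offset = 0;
    while (offset < chunk.length) {
      const n = Math.min(this.frameLength - this.filled, chunk.length - offset);
      this.pending.set(chunk.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.frameLength) {
        frames.push(this.pending);
        this.pending = new Int16Array(this.frameLength);
        this.filled = 0;
      }
    }
    return frames;
  }
}
```

A caller would create one `FrameBuffer` per audio stream, for example `new FrameBuffer(512)` (the frame length here is a placeholder), and hand each returned frame to the engine running in the worker.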
Start building with Leopard speech-to-text for free.
- Porcupine Wake Word: Wake Word Detection, Keyword Spotting, Voice Activation, and Always-Listening Voice Commands
- Rhino Speech-to-Intent: Voice Commands, Domain-Specific Natural Language Understanding, NLU, Spoken Language Understanding, and SLU
- Leopard Speech-to-Text: Speech-to-Text, STT, Automatic Speech Recognition, ASR, Large-Vocabulary Speech Recognition, and Open-Domain Transcription
- Octopus Speech-to-Index: Voice Search and Audio Indexing
- Cobra Voice Activity Detection: Voice Activity Detection and VAD