Picovoice WordmarkPicovoice Console
Introduction
Introduction
AndroidC.NETFlutterlink to GoiOSJavaNvidia JetsonLinuxmacOSNodejsPythonRaspberry PiReact NativeRustWebWindows
AndroidC.NETFlutterlink to GoiOSJavaNodejsPythonReact NativeRustWeb
SummaryPicovoice LeopardAmazon TranscribeAzure Speech-to-TextGoogle ASRGoogle ASR (Enhanced)IBM Watson Speech-to-Text
FAQ
Introduction
AndroidC.NETFlutterlink to GoiOSJavaNodejsPythonReact NativeRustWeb
AndroidC.NETFlutterlink to GoiOSJavaNodejsPythonReact NativeRustWeb
FAQ
Introduction
AndroidCiOSLinuxmacOSPythonWebWindows
AndroidCiOSPythonWeb
SummaryOctopus Speech-to-IndexGoogle Speech-to-TextMozilla DeepSpeech
FAQ
Introduction
AndroidAngularArduinoBeagleBoneCChrome.NETEdgeFirefoxFlutterlink to GoiOSJavaNvidia JetsonLinuxmacOSMicrocontrollerNodejsPythonRaspberry PiReactReact NativeRustSafariUnityVueWebWindows
AndroidAngularC.NETFlutterlink to GoiOSJavaMicrocontrollerNodejsPythonReactReact NativeRustUnityVueWeb
SummaryPorcupineSnowboyPocketSphinx
Wake Word TipsFAQ
Introduction
AndroidAngularBeagleBoneCChrome.NETEdgeFirefoxFlutterlink to GoiOSJavaNvidia JetsonlinuxmacOSNodejsPythonRaspberry PiReactReact NativeRustSafariUnityVueWebWindows
AndroidAngularC.NETFlutterlink to GoiOSJavaNodejsPythonReactReact NativeRustUnityVueWeb
SummaryPicovoice RhinoGoogle DialogflowAmazon LexIBM WatsonMicrosoft LUIS
Expression SyntaxFAQ
Introduction
AndroidBeagleboneCiOSNvidia JetsonLinuxmacOSPythonRaspberry PiRustWebWindows
AndroidCiOSPythonRustWeb
SummaryPicovoice CobraWebRTC VAD
FAQ
Introduction
AndroidCiOSPythonWeb
AndroidCiOSPythonWeb
SummaryPicovoice KoalaMozilla RNNoise
Introduction
AndroidAngularArduinoBeagleBoneC.NETFlutterlink to GoiOSJavaNvidia JetsonMicrocontrollerNodejsPythonRaspberry PiReactReact NativeRustUnityVueWeb
AndroidAngularCMicrocontroller.NETFlutterlink to GoiOSJavaNodejsPythonReactReact NativeRustUnityVueWeb
Picovoice SDK - FAQ
IntroductionSTM32F407G-DISC1 (Arm Cortex-M4)STM32F411E-DISCO (Arm Cortex-M4)STM32F769I-DISCO (Arm Cortex-M7)IMXRT1050-EVKB (Arm Cortex-M7)
FAQGlossary

Wake Word Benchmark


A wake word engine is a speech recognition algorithm that detects utterances of a given keyword (keyphrase) within a stream of audio. Recently, the most common use case of a wake word engine has become voice activation (i.e. wake phrase detection). But it can (sometimes should) be used for implementing always-listening voice commands and voice search. Keyword spotting usage is beyond voice assistants and encompasses voice user interfaces (VUIs), interactive voice response (IVR) systems, voice analytics (e.g. call centres), content moderation (e.g. online games), and many more.

The major obstacle in the adoption of wake word engines is their reliance on massive data gathering for training each new model. Picovoice Porcupine solves this problem by removing the need for data gathering for each new model. You can train custom branded wake word models using Picovoice Console by typing the phrase you want. A production-ready model will be ready in a few seconds.

Below is a series of benchmarks to back our claims. They also empower customers to make data-driven decisions using the datasets that matter to their business.

Methodology

Speech Corpus

LibriSpeech (test_clean portion) is used as background dataset. It can be downloaded from OpenSLR.

Furthermore, more than 300 recordings of six keywords (alexa, computer, jarvis, smart mirror, snowboy, and view glass) from more than 50 distinct speakers are used. The recordings are crowd-sourced. The recordings are stored within the repository here.

Resilience to Noise

In order to simulate real-world situations, the data is mixed with noise (at 10 dB SNR). For this purpose, we use DEMAND dataset which has noise recording in 18 different environments (e.g. kitchen, office, traffic, etc.). It can be downloaded from Kaggle .

Metrics

We compare accuracies of engines under test at a fixed false alarm rate of 1 per 10 hours. For runtime, we consider the CPU usage on a Raspberry Pi 3.

Results

Accuracy

The figure below shows the accuracy of engines when the false alarm rate is 1 per 10 hours with noise (10 dB SNR) and background speech.

Wake Word Accuracy ComparisonWake Word Accuracy Comparison

Efficiency

The figure below depicts CPU usage (single-core) of each engine on a Raspberry Pi 3.

Wake Word runtime comparisonWake Word runtime comparison

Was this doc helpful?

Issue with this doc?

Report a GitHub Issue
Wake Word Benchmark
  • Methodology
  • Speech Corpus
  • Resilience to Noise
  • Metrics
  • Results
  • Accuracy
  • Efficiency
Platform
  • Leopard Speech-to-Text
  • Cheetah Streaming Speech-to-Text
  • Octopus Speech-to-Index
  • Porcupine Wake Word
  • Rhino Speech-to-Intent
  • Cobra Voice Activity Detection
Resources
  • Docs
  • Console
  • Blog
  • Demos
Sales
  • Pricing
  • Starter Tier
  • Enterprise
Company
  • Careers
Follow Picovoice
  • LinkedIn
  • GitHub
  • Twitter
  • Medium
  • YouTube
  • AngelList
Subscribe to our newsletter
Terms of Use
Privacy Policy
© 2019-2022 Picovoice Inc.