Porcupine Wake Word Engine - FAQ

How do I evaluate the performance of the Porcupine wake word engine?

We have rigorously benchmarked the performance of the Porcupine software and published the results here. We have also open-sourced the code, wake word models, and audio files used for benchmarking in the same repository, so you can run the benchmark against your own audio files.

What's the accuracy of the Porcupine wake word detection library?

We have extensively benchmarked the performance of the Porcupine software and compared its accuracy against alternatives. The open-source benchmark is published here. Porcupine achieves a detection rate above 94% with fewer than one false alarm per 10 hours in the presence of background speech and ambient noise.

Does the accuracy of Porcupine depend on the choice of wake word?

Yes. We have published a guide here to help you pick a wake word that achieves optimal performance. In short, avoid short phrases and make sure your wake word includes diverse sounds and at least six phonemes. Very long phrases are also not recommended, because they make for a poor user experience.

Is there a guideline for picking a wake word?

We have published a guide here to help you pick a wake word that achieves optimal performance.

What's the CPU and memory usage of the Porcupine wake word library?

We offer several trims of our wake word detection model. The standard model uses about 1 MB of memory and less than 4% of a single core on a Raspberry Pi 3.

What should I set the sensitivity value to?

You should pick a sensitivity parameter that suits your application's requirements. A higher sensitivity value gives a lower miss rate at the expense of a higher false alarm rate.
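
The trade-off above can be made concrete with a small sketch. Assuming you have measured miss rate and false alarm rate at a few sensitivity settings (the numbers below are hypothetical, not published Porcupine figures), you can pick the highest sensitivity whose false alarm rate still fits your application's budget:

```python
# Hypothetical (sensitivity, miss_rate, false_alarms_per_10h) measurements,
# for illustration only -- not published Porcupine figures.
measurements = [
    (0.25, 0.12, 0.2),
    (0.50, 0.06, 0.6),
    (0.75, 0.03, 1.5),
    (0.95, 0.01, 4.0),
]

def pick_sensitivity(measurements, max_false_alarms_per_10h):
    """Return the highest sensitivity (hence lowest miss rate) whose
    false alarm rate stays within the application's budget."""
    viable = [m for m in measurements if m[2] <= max_false_alarms_per_10h]
    if not viable:
        raise ValueError("no sensitivity meets the false alarm budget")
    return max(viable, key=lambda m: m[0])

sensitivity, miss_rate, false_alarm_rate = pick_sensitivity(measurements, 1.0)
print(sensitivity)  # 0.5 under the hypothetical numbers above
```

Note how tightening the false alarm budget forces a lower sensitivity, and with it a higher miss rate.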

What is a ROC curve?

The accuracy of a binary classifier (any decision-making algorithm with a “yes” or “no” output) can be measured by two parameters: false rejection rate (FRR) and false acceptance rate (FAR). A wake word detector is a binary classifier. Hence, we use these metrics to benchmark it.

The detection threshold of a binary classifier can be tuned to balance FRR and FAR. A lower detection threshold yields higher sensitivity. A highly sensitive classifier has a high FAR and a low FRR (i.e., it accepts almost everything). A receiver operating characteristic (ROC) curve plots FRR against the corresponding FAR for a range of sensitivity values.
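
A minimal sketch of how an ROC curve is computed: sweep the detection threshold over raw classifier scores and record the (FAR, FRR) pair at each point. The scores below are synthetic, for illustration only:

```python
# Synthetic classifier scores, for illustration only.
positives = [0.9, 0.8, 0.75, 0.6, 0.4]  # scores when the wake word was said
negatives = [0.7, 0.5, 0.3, 0.2, 0.1]   # scores for background speech/noise

def roc_points(positives, negatives, thresholds):
    """Compute (FAR, FRR) at each detection threshold."""
    points = []
    for t in thresholds:
        frr = sum(s < t for s in positives) / len(positives)   # missed detections
        far = sum(s >= t for s in negatives) / len(negatives)  # false alarms
        points.append((far, frr))
    return points

for far, frr in roc_points(positives, negatives, [0.1, 0.35, 0.65, 0.85]):
    print(f"FAR={far:.1f}  FRR={frr:.1f}")
```

Raising the threshold (lowering sensitivity) moves along the curve from high FAR/low FRR toward low FAR/high FRR; the curve as a whole characterizes the detector independently of any one sensitivity choice.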

To learn more about ROC curves and benchmarking a wake word detector, you may read the blog post here and the Porcupine benchmark published here.

If I use Porcupine wake word detection in my mobile application, does it function when the app is running in the background?

Developers have been able to successfully run Porcupine wake word detection software on iOS and Android in background mode. However, this feature is controlled by the operating system, and we cannot guarantee that this will be possible in future releases of iOS or Android. Please check iOS and Android guidelines, technical documentation, and terms of service.

Which platforms does Porcupine wake word detection support?

  • ARM Cortex-M
  • ARM Cortex-A
  • Raspberry Pi (all variants)
  • BeagleBone
  • NVIDIA Jetson
  • Android
  • iOS
  • Linux (x86_64)
  • macOS (x86_64)
  • Windows (x86_64)
  • Modern Web Browsers

Does Porcupine wake word detection software work with everyone’s voice (universal) or does it only work with my voice (personal)?

Porcupine wake word detection software is universal and trained to work with a variety of accents and people’s voices.

Does the user need to pause and remain silent before saying the wake word?

By default, no. However, if that is a requirement, we can customize the software (as part of our professional services) to require silence either before or after the wake word.

Does Porcupine wake word detection work with accents?

Yes, it works well with accents. However, accent robustness is difficult to quantify. We recommend trying the engine yourself, perhaps evaluating it on an accented dataset of your choice, to see if it meets your requirements.

How many wake words can Porcupine detect simultaneously?

There is no technical limit on the number of wake words the software can listen for simultaneously.
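
In the Porcupine SDKs, processing one audio frame yields the index of the detected keyword, or -1 when nothing was detected. The sketch below simulates that stream of per-frame indices to show how an application might dispatch on multiple wake words; the keyword names and handler are hypothetical:

```python
# Hypothetical keyword list; in a real integration these would be the
# models passed to the engine at initialization.
keywords = ["porcupine", "bumblebee", "terminator"]

def handle_detections(frame_results, on_detect):
    """Invoke on_detect(keyword) for every non-negative keyword index
    and return the list of detected keywords in order."""
    detected = []
    for keyword_index in frame_results:
        if keyword_index >= 0:  # -1 means no wake word in this frame
            keyword = keywords[keyword_index]
            on_detect(keyword)
            detected.append(keyword)
    return detected

# Simulated per-frame results: mostly silence (-1) with two detections.
stream = [-1, -1, 0, -1, -1, -1, 2, -1]
print(handle_detections(stream, lambda kw: None))  # ['porcupine', 'terminator']
```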

How much additional memory and CPU are needed for detecting additional wake words or trigger phrases?

Listening for additional wake words does not increase CPU usage; however, each additional wake word model requires 1 KB of memory.

Is the Picovoice “Alexa” wake word verified by Amazon?

Amazon Alexa certification requirements differ for near-, mid-, and far-field applications (AVS, AMA, etc.). Certification is also typically performed on the end hardware, and the outcome depends on many design choices, such as the microphone, enclosure acoustics, audio front end, and wake word. Picovoice can assist with new product introduction (NPI) and Alexa certification under our technical support package.

Does Picovoice wake word detection software work with Google Assistant?

Yes. However, your product may have to go through a certification procedure with Google. Please check Google’s guidelines and terms of service for related information.

Can you use Picovoice wake word detection software with Cortana, IBM Watson, or Samsung Bixby?

Yes, Picovoice can generate any third-party wake word at your request. However, you are responsible for any necessary integration with such platforms and for meeting any related compliance requirements.

What’s the power consumption of the Picovoice wake word detection engine?

The absolute power consumption (in watts) depends on numerous factors, such as processor architecture, vendor, fabrication technology, and system-level power management design. If your design requires always-listening wake word detection at (sub-)milliwatt power, you will likely need to consider an MCU (ARM Cortex-M) or DSP implementation.

Can Porcupine distinguish words with similar pronunciations?

Strictly rejecting words with similar pronunciations has side effects, such as rejecting accented pronunciations of the wake word and a higher miss rate in noisy conditions. By lowering the detection sensitivity, you can reduce false acceptances of similar-sounding words at the cost of a higher miss rate.

What is your licensing model?

Please refer to the pricing page.


Issue with this doc? Please let us know.