Cobra: Picovoice’s Voice Activity Detection Engine

October 27, 2021

Voice activity detection (VAD) is a crucial component of many speech processing solutions. VAD detects the presence of a human voice within a stream of audio. Although simple to describe, this is a challenging task in the presence of non-stationary noise.
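To make the difficulty concrete, here is a minimal sketch of the simplest possible detector: a fixed threshold on per-frame energy. This is not how Cobra works. The function names, the frame length of 512 samples, and the -30 dB threshold are all illustrative assumptions. The detector flags any sufficiently loud frame as voice, so it cannot tell speech from a loud noise burst.

```python
import math

def frame_energy_db(frame):
    """Mean power of one frame in dB (frame: sequence of floats in [-1, 1])."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)  # small offset avoids log(0)

def naive_energy_vad(samples, frame_length=512, threshold_db=-30.0):
    """Return one boolean per full frame: True when energy exceeds the threshold.

    Hypothetical baseline for illustration only; not Cobra's algorithm.
    """
    decisions = []
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        frame = samples[start:start + frame_length]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions

# Synthetic input: one frame of silence, one frame of a 200 Hz tone standing
# in for speech, and one frame of a loud alternating signal with no voice.
sample_rate = 16000
silence = [0.0] * 512
tone = [0.3 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(512)]
burst = [0.5 if n % 2 == 0 else -0.5 for n in range(512)]

decisions = naive_energy_vad(silence + tone + burst)
print(decisions)
```

The last frame contains no voice, yet the baseline marks it as voiced because it is loud. Raising the threshold would suppress that false alarm but would also start missing quiet speech, which is the trade-off that makes changing noise conditions hard for simple detectors.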

Picovoice’s VAD, Cobra, was initially developed as an internal tool and then offered to a select group of alpha customers. Today, we are excited to make it publicly available. Cobra is:

  1. Highly accurate
  2. Compact and computationally efficient
  3. Cross-platform: runs on Raspberry Pi, BeagleBone, NVIDIA Jetson Nano, Linux (x86_64), macOS (arm64 and x86_64), Windows (x86_64), Android, iOS, and modern web browsers (using WebAssembly). Support for various Cortex-M microcontrollers and Cortex-A microprocessors is available for enterprise customers.

Start Building!

Go to GitHub and start building with Cobra, for free!

Benchmark

An open-source comparison between Cobra and WebRTC's VAD (developed by Google) is available on GitHub. The figure below summarizes the comparison as a ROC curve, which plots the detection rate against the false-alarm rate as the decision threshold varies. A larger area under the curve is better.

Comparison with WebRTC's VAD
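
For readers unfamiliar with the metric, the sketch below shows how a ROC curve and its area can be computed from per-frame scores and ground-truth labels. It is a generic illustration, not the code used in the benchmark. The function names are hypothetical, the scores and labels are made up, and the sketch assumes the detector emits one score per frame, with higher values meaning more voice-like.

```python
def roc_points(scores, labels):
    """Sweep a threshold over the scores and return (fpr, tpr) pairs.

    scores: per-frame voice scores (higher means more voice-like).
    labels: ground-truth flags, True where the frame contains voice.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one voiced and one unvoiced frame")

    points = [(0.0, 0.0)]  # threshold above every score: nothing detected
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up scores and labels, purely for illustration.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [True, True, False, True, False, False]

points = roc_points(scores, labels)
auc = area_under_curve(points)
print(points, auc)
```

An area of 1.0 corresponds to a detector that ranks every voiced frame above every unvoiced one, while 0.5 is what random scoring would achieve on average. Comparing two engines on the same labeled audio therefore reduces to comparing their curves or their areas.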