Voice activity detection (VAD) is a crucial component of many speech processing solutions. VAD detects the presence of a human voice within a stream of audio. Although simple to describe, this is a challenging task in the presence of non-stationary noise.

Picovoice’s VAD, Cobra, was initially developed as an internal tool and then offered to a select number of alpha customers. Today, we are excited to make it publicly available. Cobra is:

  1. Highly accurate
  2. Compact and computationally efficient
  3. Cross-platform: runs on Raspberry Pi, BeagleBone, NVIDIA Jetson Nano, Linux (x86_64), macOS (arm64 and x86_64), Windows (x86_64), Android, iOS, and modern web browsers (using WebAssembly). Support for various Cortex-M microcontrollers and Cortex-A microprocessors is available for enterprise customers.


Start Building!

Go to GitHub and start building with Cobra, for free!

Benchmark

An open-source benchmark comparing Cobra with WebRTC's VAD (developed by Google) is publicly available. The figure below summarizes the comparison as a receiver operating characteristic (ROC) curve; a larger area under the curve is better.

Comparison with WebRTC's VAD
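
For readers who want to see how a curve like this is produced, here is a minimal sketch in plain Python. It assumes you already have one voice probability per audio frame (from any VAD) and a ground-truth label per frame. The function names `roc_points` and `area_under_curve` and the sample values are illustrative only; they are not part of Cobra or of the benchmark code.

```python
from typing import List, Sequence, Tuple


def roc_points(probs: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (false-positive rate, true-positive rate) pairs for a threshold sweep.

    probs  : per-frame voice probabilities in [0, 1] (hypothetical VAD output)
    labels : per-frame ground truth, 1 = voice, 0 = no voice
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one voiced and one unvoiced frame")

    # Sort frames from most to least confident, then lower the threshold step by step.
    ranked = sorted(zip(probs, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Frames sharing the same probability cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Made-up example data: six frames, three of which contain voice.
    example_probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
    example_labels = [1, 1, 0, 1, 0, 0]
    curve = roc_points(example_probs, example_labels)
    print(curve)
    print(area_under_curve(curve))
```

Each point on the curve corresponds to one detection threshold: lowering the threshold catches more voiced frames (higher true-positive rate) at the cost of more false alarms (higher false-positive rate). An engine whose curve sits closer to the top-left corner encloses a larger area.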