Voice activity detection (VAD) is a crucial component of many speech processing solutions. VAD detects the presence of a human voice within a stream of audio. Although simple to describe, this is a challenging task in the presence of non-stationary noise.
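To see why noise makes this hard, consider the simplest possible detector: flag a frame as voice whenever its energy exceeds a fixed threshold. The sketch below is a naive baseline for illustration only, not how Cobra works. The function names, the 512-sample frame length, and the -30 dB threshold are all assumptions, and samples are assumed to be floats in the range [-1, 1].

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in decibels (0 dB = full-scale square wave)."""
    mean_square = sum(sample * sample for sample in frame) / len(frame)
    # The small offset avoids log10(0) on digital silence.
    return 10.0 * math.log10(mean_square + 1e-12)


def energy_vad(samples, frame_length=512, threshold_db=-30.0):
    """Return one boolean per complete frame: True if the frame is louder than the threshold.

    frame_length and threshold_db are illustrative values, not tuned ones.
    """
    decisions = []
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        frame = samples[start:start + frame_length]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions
```

A detector like this cannot tell speech from any other loud sound: a door slam or passing traffic crosses the threshold just as a voice does, and a threshold tuned for a quiet room fails as soon as the background level changes.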
Picovoice’s VAD, Cobra, was initially developed as an internal tool and then offered to a select number of alpha customers. Today, we are excited to make it publicly available. Cobra is:
- Highly accurate
- Compact and computationally efficient
- Cross-platform: runs on Raspberry Pi, BeagleBone, NVIDIA Jetson Nano, Linux (x86_64), macOS (arm64 and x86_64), Windows (x86_64), Android, iOS, and modern web browsers (using WebAssembly). Support for various Cortex-M microcontrollers and Cortex-A microprocessors is available for enterprise customers.
Go to GitHub and start building with Cobra, for free!
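An open-source comparison between Cobra and WebRTC's VAD (developed by Google) is available in the Picovoice docs. The figure below summarizes the comparison as a receiver operating characteristic (ROC) curve, which plots the true positive rate against the false positive rate as the detection threshold varies; a larger area under the curve is better.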
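For readers unfamiliar with ROC curves, here is a minimal sketch of how one can be computed from per-frame voice scores and ground-truth labels. It is a generic illustration, not the code used in the Picovoice comparison; the function names and the toy data in the usage note are made up for the example.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) points for a score-based detector.

    scores: one detector score per frame (higher means "more likely voice").
    labels: ground truth per frame, 1 for voice and 0 for non-voice.
    Both classes must be present in labels.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sweep the threshold from the highest score down to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Frames with identical scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With the toy inputs `scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]` and `labels = [1, 1, 0, 1, 0, 0]`, the area comes out to 8/9: of the nine possible voice/non-voice frame pairs, eight are ranked in the right order. A detector that ranks every voice frame above every non-voice frame reaches an area of 1.0.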