Detecting the presence of human speech in audio can be an essential component of a Speech Recognition pipeline. But whereas we humans are able to naturally distinguish speech from other sounds, machines need some help to be able to make the same distinction. Engines that do this are usually called Voice Activity Detectors or VADs for short. Given some audio input, a VAD makes a binary decision and determines whether the input contains speech or not.

Picovoice makes it simple to detect voice activity in audio data using Cobra Voice Activity Detection. It is lightweight and runs on-device on any platform, including mobile phones. Because Cobra VAD processes audio locally, your voice data never leaves the device, which makes it GDPR and HIPAA-compliant by design.

Importantly, Picovoice reports that the Cobra Voice Activity Detection engine is more accurate than other VAD engines across all platforms, including Google's widely used WebRTC VAD.

In just a dozen lines of code, you can start detecting voice activity in real time from a microphone using the Cobra Voice Activity Detection iOS SDK. Let’s get started!

Install VAD SDK

The Cobra Voice Activity Detection (VAD) iOS SDK is available through CocoaPods. To add CocoaPods to a project, run `pod init` in your Xcode project directory.

Import the Cobra-iOS binding and Picovoice's ios-voice-processor by adding `pod 'Cobra-iOS'` and `pod 'ios-voice-processor'` to your app's target in the project's Podfile.

Then, install the pods by running `pod install`.

Make sure to always open the Xcode workspace instead of the project file when building your project!

App Microphone Permissions

To enable recording with an iOS device's microphone, add the `NSMicrophoneUsageDescription` key to the app's Info.plist file, with a short message explaining why the app needs microphone access.

Sign up for Picovoice Console

Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.

Initialize Cobra in Swift

Import Cobra Voice Activity Detection and create an instance of the VAD engine with your Picovoice AccessKey:
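A minimal sketch of this step, assuming the `Cobra(accessKey:)` initializer from the Cobra iOS SDK. The `${ACCESS_KEY}` string is a placeholder for the key copied from Picovoice Console, and `makeCobra()` is a hypothetical helper used here only to keep the error handling in one place.

```swift
import Cobra

// Placeholder: substitute the AccessKey from your Picovoice Console account.
let accessKey = "${ACCESS_KEY}"

/// Hypothetical helper (not part of the SDK): creates the engine,
/// or returns nil if initialization fails.
func makeCobra() -> Cobra? {
    do {
        // The initializer throws, for example when the AccessKey is rejected.
        return try Cobra(accessKey: accessKey)
    } catch {
        print("Failed to initialize Cobra: \(error)")
        return nil
    }
}
```

When the engine is no longer needed, calling `cobra.delete()` releases the resources it holds.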

Add Audio Recording

Cobra Voice Activity Detection operates on frames of audio. Create a callback that passes each frame of audio to the `.process()` function:
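The callback might look like the following sketch. It assumes `process(pcm:)` takes an array of 16-bit samples and returns a voice probability between 0 and 1. `makeAudioCallback(for:)` is a hypothetical factory, not an SDK function. It is used here so the callback can capture an already initialized engine.

```swift
import Cobra

/// Hypothetical factory (not part of the SDK): builds a frame callback
/// bound to an existing Cobra instance.
func makeAudioCallback(for cobra: Cobra) -> ([Int16]) -> Void {
    return { frame in
        do {
            // Each frame is expected to hold Cobra.frameLength samples
            // recorded at Cobra.sampleRate.
            let voiceProbability = try cobra.process(pcm: frame)
            print("Voice probability: \(voiceProbability)")
        } catch {
            print("Cobra failed to process the frame: \(error)")
        }
    }
}
```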

Process Audio with Cobra

Finally, register the callback with ios-voice-processor and call `.start()` to begin receiving a voice probability for each audio frame:
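A sketch of the registration step, assuming the listener-based interface of recent ios-voice-processor releases (`VoiceProcessor.instance`, `VoiceProcessorFrameListener`). Older releases may expose a different interface, so check the version you installed. `startListening(audioCallback:)` is a hypothetical wrapper.

```swift
import Cobra
import ios_voice_processor

/// Hypothetical helper: registers `audioCallback` as a frame listener
/// and starts recording from the microphone.
func startListening(audioCallback: @escaping ([Int16]) -> Void) throws {
    let listener = VoiceProcessorFrameListener(audioCallback)
    VoiceProcessor.instance.addFrameListener(listener)

    // Request frames in the size and sample rate that Cobra expects.
    try VoiceProcessor.instance.start(
        frameLength: Cobra.frameLength,
        sampleRate: Cobra.sampleRate
    )
}
```

Passing `Cobra.frameLength` and `Cobra.sampleRate` to the recorder means every delivered frame can go to Cobra without resampling or re-chunking.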

Putting it Together

All together, the example looks like this:
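One way to assemble the steps is sketched below, under the same assumptions about the Cobra and ios-voice-processor APIs as in the individual steps. `VoiceActivityMonitor` and the 0.5 threshold are illustrative choices and do not come from either SDK. The sketch also assumes microphone permission has already been granted.

```swift
import Cobra
import ios_voice_processor

/// Hypothetical wrapper class (not part of either SDK) that ties the steps together.
final class VoiceActivityMonitor {
    private let cobra: Cobra
    private var listener: VoiceProcessorFrameListener?

    /// Assumed cut-off for treating a frame as speech; tune it for your application.
    private let threshold: Float32 = 0.5

    init(accessKey: String) throws {
        cobra = try Cobra(accessKey: accessKey)
    }

    /// Registers the frame callback and starts recording.
    func start() throws {
        let listener = VoiceProcessorFrameListener { [weak self] frame in
            self?.handle(frame: frame)
        }
        self.listener = listener
        VoiceProcessor.instance.addFrameListener(listener)
        try VoiceProcessor.instance.start(
            frameLength: Cobra.frameLength,
            sampleRate: Cobra.sampleRate
        )
    }

    /// Stops recording and unregisters the callback.
    func stop() {
        try? VoiceProcessor.instance.stop()
        if let listener = listener {
            VoiceProcessor.instance.removeFrameListener(listener)
        }
        listener = nil
    }

    /// Passes one frame to Cobra and reports frames that look like speech.
    private func handle(frame: [Int16]) {
        do {
            let voiceProbability = try cobra.process(pcm: frame)
            if voiceProbability >= threshold {
                print("Voice detected (probability \(voiceProbability))")
            }
        } catch {
            print("Cobra failed to process the frame: \(error)")
        }
    }

    deinit {
        // Release the engine's resources.
        cobra.delete()
    }
}
```

With this wrapper, an app would create the monitor once with `try VoiceActivityMonitor(accessKey: "${ACCESS_KEY}")`, call `start()` when it should begin listening, and call `stop()` when it is done.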

Done! It really is that easy.

For further details, visit the Cobra Voice Activity Detection product page or refer to the Cobra iOS SDK quick start guide.