Detecting the presence of human speech in audio can be an essential component of a Speech Recognition pipeline. While humans naturally distinguish speech from other sounds, machines need some help to make the same distinction. Engines that do this are usually called Voice Activity Detectors, or VADs for short. Given some audio input, a VAD makes a binary decision: whether the input contains speech or not.
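In practice, many VAD engines report a voice probability for each frame of audio, and the binary decision comes from comparing that value to a threshold. A minimal sketch, assuming a probability in the range 0 to 1; the function name `containsSpeech` and the 0.5 default threshold are illustrative choices, not part of any SDK:

```swift
/// Turns a per-frame voice probability into a speech / no-speech decision.
/// `threshold` is an illustrative default; tune it for your noise conditions.
func containsSpeech(voiceProbability: Float, threshold: Float = 0.5) -> Bool {
    return voiceProbability >= threshold
}

// Hypothetical probabilities for five consecutive frames.
let probabilities: [Float] = [0.02, 0.41, 0.87, 0.93, 0.10]
let decisions = probabilities.map { containsSpeech(voiceProbability: $0) }
print(decisions) // [false, false, true, true, false]
```

Raising the threshold reduces false triggers on background noise at the cost of missing quiet speech; lowering it does the opposite.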
Picovoice makes it simple to detect voice activity in audio data using Cobra Voice Activity Detection. It is lightweight and runs on-device on any platform, including mobile phones.
Cobra VAD performs voice activity detection locally, keeping your voice data private (i.e., it is HIPAA-compliant by design).
In just a dozen lines of code, you can start detecting voice activity in real time from a microphone using the Cobra Voice Activity Detection iOS SDK. Let’s get started!
Install VAD SDK
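The SDK is distributed through CocoaPods. A Podfile along these lines pulls in Cobra and the ios-voice-processor audio recorder used later in this walkthrough. The pod names are given as I understand them from Picovoice's published packages, and the target name `MyApp` is a placeholder; check the Cobra iOS documentation for the current names:

```ruby
# Podfile (sketch): 'MyApp' is a placeholder for your app target
target 'MyApp' do
  use_frameworks!
  pod 'Cobra-iOS'            # Cobra VAD engine (assumed pod name)
  pod 'ios-voice-processor'  # microphone frame capture (assumed pod name)
end
```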
Then, install by running:
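This is the standard CocoaPods install command, run from the directory that contains your Podfile:

```sh
pod install
```

CocoaPods generates an `.xcworkspace` file alongside your project as part of this step.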
Make sure to always open the Xcode workspace instead of the project file when building your project!
App Microphone Permissions
Add the following to the app's Info.plist file to enable recording with an iOS device's microphone:
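The key iOS checks before granting microphone access is `NSMicrophoneUsageDescription`. A minimal entry might look like this; the description string is placeholder text and should explain your app's actual use of the microphone:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app uses the microphone to detect voice activity.</string>
```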
Sign up for Picovoice Console
Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
Initialize Cobra in Swift
Import Cobra Voice Activity Detection and create an instance of the VAD engine with your Picovoice AccessKey:
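A sketch of this step, assuming the throwing `Cobra(accessKey:)` initializer and the `delete()` method as I understand them from the Cobra iOS binding; `${ACCESS_KEY}` is a placeholder for the key copied from Picovoice Console:

```swift
import Cobra

// Placeholder: replace with the AccessKey from Picovoice Console.
let accessKey = "${ACCESS_KEY}"

do {
    // Initialization can throw, for example when the AccessKey is rejected.
    let cobra = try Cobra(accessKey: accessKey)
    // ... use `cobra` here ...
    cobra.delete() // release native resources when finished
} catch {
    print("Failed to initialize Cobra: \(error)")
}
```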
Add Audio Recording
Cobra Voice Activity Detection operates on frames of audio. Create a callback that passes frames of audio to the .process() function:
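One way such a callback could look, assuming `process(pcm:)` accepts an `[Int16]` frame and returns a voice probability, as I understand the Cobra iOS binding. The name `audioCallback` is an illustrative choice, and the top-level `try` assumes script-style code; inside an app you would create the engine within a `do`/`catch` block:

```swift
import Cobra

// Assumes a valid AccessKey; "${ACCESS_KEY}" is a placeholder.
let cobra = try Cobra(accessKey: "${ACCESS_KEY}")

/// Receives one frame of 16-bit PCM audio and asks Cobra
/// how likely it is to contain voice.
func audioCallback(frame: [Int16]) {
    do {
        let voiceProbability = try cobra.process(pcm: frame)
        print("Voice probability: \(voiceProbability)")
    } catch {
        print("Cobra failed to process frame: \(error)")
    }
}
```

Each frame is expected to hold `Cobra.frameLength` samples recorded at `Cobra.sampleRate`, so the recorder should be configured with those two values.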
Process Audio with Cobra
Finally, register the callback with ios-voice-processor and call .start() to begin detecting voice probability:
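A sketch of the registration step, assuming the `VoiceProcessor.instance` singleton, the `VoiceProcessorFrameListener` wrapper, and the `start(frameLength:sampleRate:)` method as I understand them from ios-voice-processor. The `audioCallback` stub stands in for the callback that forwards frames to Cobra:

```swift
import Cobra
import ios_voice_processor

// Stand-in for the callback that passes frames to cobra.process(pcm:).
func audioCallback(frame: [Int16]) {
    // forward `frame` to Cobra here
}

let voiceProcessor = VoiceProcessor.instance

// Deliver every recorded frame to the callback.
voiceProcessor.addFrameListener(
    VoiceProcessorFrameListener { frame in audioCallback(frame: frame) })

do {
    // Record using the frame size and sample rate Cobra expects.
    try voiceProcessor.start(
        frameLength: Cobra.frameLength,
        sampleRate: Cobra.sampleRate)
} catch {
    print("Failed to start recording: \(error)")
}
```

Recording only succeeds once the user has granted the app microphone permission.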
Putting it Together
All together, the example looks like this:
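A combined sketch under the same assumptions as the individual steps: the Cobra and ios-voice-processor calls are written as I understand those SDKs, and `${ACCESS_KEY}` is a placeholder. Check signatures against the current SDK documentation before relying on it:

```swift
import Cobra
import ios_voice_processor

// Placeholder: replace with the AccessKey from Picovoice Console.
let accessKey = "${ACCESS_KEY}"

do {
    let cobra = try Cobra(accessKey: accessKey)

    // Forward every recorded frame to Cobra and report the result.
    let frameListener = VoiceProcessorFrameListener { frame in
        do {
            let voiceProbability = try cobra.process(pcm: frame)
            print("Voice probability: \(voiceProbability)")
        } catch {
            print("Cobra failed to process frame: \(error)")
        }
    }
    VoiceProcessor.instance.addFrameListener(frameListener)

    // Record using the frame size and sample rate Cobra expects.
    try VoiceProcessor.instance.start(
        frameLength: Cobra.frameLength,
        sampleRate: Cobra.sampleRate)
} catch {
    print("Setup failed: \(error)")
}
```

The listener closure captures `cobra`, which keeps the engine alive for as long as the listener is registered.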
Done! It really is that easy.