Detecting the presence of human speech in audio is often an essential component of a speech recognition pipeline. While humans naturally distinguish speech from other sounds, machines need some help to make the same distinction. Engines that do this are usually called Voice Activity Detectors, or VADs for short. Given some audio input, a VAD makes a binary decision: whether the input contains speech or not.
Picovoice makes it simple to detect voice activity in audio data using Cobra Voice Activity Detection. It is lightweight and runs on-device on any platform, including mobile phones. Cobra VAD performs voice activity detection locally, keeping your voice data private (i.e. it is GDPR- and HIPAA-compliant by design).
Importantly, the Cobra Voice Activity Detection engine is the most accurate VAD engine across all platforms, even in comparison to Google's widely used WebRTC VAD.
In just a dozen lines of code, you can start detecting voice activity in real time from a microphone using the Cobra Voice Activity Detection iOS SDK. Let’s get started!
Install VAD SDK
The Cobra Voice Activity Detection (VAD) iOS SDK is available through CocoaPods. To add CocoaPods to a project, execute the following in your Xcode project directory:
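Assuming CocoaPods is already installed on your machine, the standard command for creating a Podfile in the current directory is:

```shell
# Run from the directory that contains your .xcodeproj file.
# Generates a Podfile with one target per app target.
pod init
```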
Import the Cobra-iOS binding and Picovoice's ios-voice-processor by adding the following lines to your project's Podfile:
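A minimal sketch of the relevant Podfile entries is shown here. The target name `MyApp` is a placeholder for your own app target, and no versions are pinned, so CocoaPods will resolve the latest releases; pin versions if you need reproducible builds.

```ruby
# 'MyApp' is a placeholder: use the target name that `pod init` generated.
target 'MyApp' do
  # Keep any lines that `pod init` already placed in this block.
  pod 'Cobra-iOS'            # Cobra Voice Activity Detection binding
  pod 'ios-voice-processor'  # Picovoice microphone frame capture
end
```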
Then, install by running:
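With the Podfile saved, the usual CocoaPods install command is:

```shell
# Downloads the pods and generates the .xcworkspace file.
pod install
```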
Make sure to always open the Xcode workspace instead of the project file when building your project!
App Microphone Permissions
Add the following to the app's Info.plist file to enable recording with an iOS device's microphone:
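The key iOS checks for microphone access is `NSMicrophoneUsageDescription`. The description string in this sketch is only an example; replace it with wording that fits your app, since iOS shows it to the user in the permission prompt.

```xml
<!-- Place inside the top-level <dict> of Info.plist. -->
<key>NSMicrophoneUsageDescription</key>
<!-- Example wording only: shown to the user in the permission dialog. -->
<string>This app uses the microphone to detect voice activity.</string>
```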
Sign up for Picovoice Console
Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
Initialize Cobra in Swift
Import Cobra Voice Activity Detection and create an instance of the VAD engine with your Picovoice AccessKey:
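A minimal sketch of the initialization step, assuming the `Cobra(accessKey:)` initializer from the Cobra-iOS pod. The `makeCobra` helper and the `${YOUR_ACCESS_KEY}` value are placeholders introduced for illustration, not part of the SDK.

```swift
import Cobra

// Placeholder: paste the AccessKey from Picovoice Console here.
let accessKey = "${YOUR_ACCESS_KEY}"

/// Hypothetical helper: creates the engine, or logs and returns nil if
/// initialization fails (for example, when the AccessKey is invalid).
func makeCobra() -> Cobra? {
    do {
        return try Cobra(accessKey: accessKey)
    } catch {
        print("Failed to initialize Cobra: \(error)")
        return nil
    }
}
```

The initializer throws, so wrapping it in `do`/`catch` keeps an invalid key from crashing the app at launch.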
Add Audio Recording
Cobra Voice Activity Detection operates on frames of audio. Create a callback that passes each frame of audio to the .process() function.
Process Audio with Cobra
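The frame callback described above could be sketched as follows. This assumes a recent ios-voice-processor release that exposes `VoiceProcessorFrameListener` (older releases used a different callback mechanism) and that `cobra.process(pcm:)` returns a voice probability between 0 and 1. The `makeFrameListener` helper is a name chosen for this example.

```swift
import Cobra
import ios_voice_processor

/// Hypothetical helper: builds a listener that forwards every captured
/// frame of 16-bit PCM samples to Cobra and prints the result.
func makeFrameListener(cobra: Cobra) -> VoiceProcessorFrameListener {
    return VoiceProcessorFrameListener { frame in
        do {
            // `frame` is an [Int16] buffer of one frame of audio.
            let voiceProbability = try cobra.process(pcm: frame)
            print("Voice probability: \(voiceProbability)")
        } catch {
            print("Cobra failed to process a frame: \(error)")
        }
    }
}
```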
Finally, register the callback with ios-voice-processor and call .start() to begin detecting voice probability:
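One way to wire this up, again assuming the listener-based ios-voice-processor API, is to request audio at the frame length and sample rate that Cobra expects, which the SDK exposes as `Cobra.frameLength` and `Cobra.sampleRate`. The `startListening` function is a hypothetical wrapper for this example.

```swift
import Cobra
import ios_voice_processor

/// Hypothetical wrapper: registers the listener and starts microphone capture.
func startListening(with listener: VoiceProcessorFrameListener) {
    VoiceProcessor.instance.addFrameListener(listener)
    do {
        // Capture audio in exactly the format Cobra requires.
        try VoiceProcessor.instance.start(
            frameLength: Cobra.frameLength,
            sampleRate: Cobra.sampleRate)
    } catch {
        print("Failed to start audio capture: \(error)")
    }
}
```

Passing Cobra's own constants avoids hard-coding audio parameters that might differ between SDK versions.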
Putting it Together
All together, the example looks like this:
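The following is a sketch of how the pieces might fit together, not the article's original listing. The `VoiceActivityMonitor` class, the 0.5 threshold, and the `close()` method are assumptions made for illustration; the Cobra and VoiceProcessor calls assume current releases of the two pods. It also assumes microphone permission has already been granted.

```swift
import Cobra
import ios_voice_processor

/// Hypothetical wrapper class that owns the engine and the audio listener.
final class VoiceActivityMonitor {
    private let cobra: Cobra
    private var listener: VoiceProcessorFrameListener?

    init(accessKey: String) throws {
        cobra = try Cobra(accessKey: accessKey)
    }

    /// Starts microphone capture and reports frames that likely contain voice.
    func start() throws {
        let listener = VoiceProcessorFrameListener { [weak self] frame in
            guard let self = self else { return }
            do {
                let voiceProbability = try self.cobra.process(pcm: frame)
                // 0.5 is an arbitrary example threshold; tune it for your app.
                if voiceProbability > 0.5 {
                    print("Voice detected (\(voiceProbability))")
                }
            } catch {
                print("Cobra failed to process a frame: \(error)")
            }
        }
        self.listener = listener
        VoiceProcessor.instance.addFrameListener(listener)
        try VoiceProcessor.instance.start(
            frameLength: Cobra.frameLength,
            sampleRate: Cobra.sampleRate)
    }

    /// Stops capture and releases the engine; the monitor cannot be reused afterwards.
    func close() throws {
        try VoiceProcessor.instance.stop()
        if let listener = listener {
            VoiceProcessor.instance.removeFrameListener(listener)
        }
        listener = nil
        cobra.delete()
    }
}
```

A view controller could create the monitor with `try VoiceActivityMonitor(accessKey: "${YOUR_ACCESS_KEY}")`, call `start()` when the screen appears, and call `close()` when it is dismissed.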
Done! It really is that easy.
For further details, visit the Cobra Voice Activity Detection product page or refer to the Cobra iOS SDK quick start guide.