Cobra, Picovoice’s Voice Activity Detection engine, scans audio streams and detects the presence of human voices. This article shows how to detect voice activity in audio data using the Picovoice Cobra Voice Activity Detection (VAD) Python SDK. The SDK runs on Linux, macOS, Windows, and Raspberry Pi.
Install Voice Activity Detection SDK
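The SDK is published on PyPI as pvcobra and is typically installed with pip3 install pvcobra. As a quick sanity check after installing, a small sketch like the following reports whether the package can be imported in the current environment. The helper name is_installed is illustrative only and is not part of the SDK.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("pvcobra"):
        print("pvcobra is available")
    else:
        # Assumption: pip3 points at the interpreter you intend to use.
        print("pvcobra is missing; install it with: pip3 install pvcobra")
```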
Sign up for Picovoice Console
Log in to (or sign up for) Picovoice Console. It is free, and no credit card is required!
Copy your AccessKey to the clipboard.
Implement in Python
Import the Cobra Voice Activity Detection Python package and create an instance of the Voice Activity Detection engine with your AccessKey:
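A minimal sketch of this step, assuming the pvcobra package is installed. The wrapper create_cobra and the environment variable name PICOVOICE_ACCESS_KEY are hypothetical conveniences for this example, not part of the SDK; the only SDK call used is pvcobra.create(access_key=...).

```python
import os


def create_cobra(access_key=None):
    """Create a Cobra engine from an explicit key or an environment variable.

    PICOVOICE_ACCESS_KEY is a hypothetical variable name chosen for this
    sketch; the SDK itself only needs the key passed to pvcobra.create.
    """
    key = access_key or os.environ.get("PICOVOICE_ACCESS_KEY")
    if not key:
        raise ValueError("No AccessKey provided; copy it from Picovoice Console.")
    # Imported lazily so a missing key is reported before the SDK is loaded.
    import pvcobra
    return pvcobra.create(access_key=key)


# Usage (needs a valid AccessKey from Picovoice Console):
#   handle = create_cobra("${ACCESS_KEY}")
#   ...
#   handle.delete()  # release the engine's resources when done
```

Reading the key from the environment keeps it out of source control, which is one reasonable choice among several.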
Once the engine is initialized, the sample rate it requires is given by handle.sample_rate. The expected frame length (the number of audio samples in an input array) is handle.frame_length. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
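The requirements above can be sketched as follows for a WAV file: check that the file is mono, 16-bit, and at the engine's sample rate, split the samples into frames of frame_length, and pass each frame to the engine's process method, which in the Cobra Python SDK returns a voice probability between 0 and 1. The helper names read_pcm16_mono, split_frames, and voice_probabilities are illustrative, and dropping a trailing partial frame is an assumption made here for simplicity.

```python
import struct
import wave


def read_pcm16_mono(path, expected_sample_rate):
    """Read a WAV file and return its samples as a list of 16-bit ints."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit PCM")
        if wav.getframerate() != expected_sample_rate:
            raise ValueError("audio must be sampled at %d Hz" % expected_sample_rate)
        raw = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def split_frames(pcm, frame_length):
    """Split samples into full frames; a trailing partial frame is dropped."""
    usable = len(pcm) - len(pcm) % frame_length
    return [pcm[i:i + frame_length] for i in range(0, usable, frame_length)]


def voice_probabilities(engine, pcm):
    """Run every full frame through the engine and collect its outputs.

    `engine` is any object exposing `frame_length` and `process(frame)`,
    such as the handle returned by pvcobra.create.
    """
    return [engine.process(frame) for frame in split_frames(pcm, engine.frame_length)]


# Usage with a real engine (handle created via pvcobra.create):
#   pcm = read_pcm16_mono("test.wav", handle.sample_rate)
#   probabilities = voice_probabilities(handle, pcm)
```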
Below is an example output of Cobra Voice Activity Detection on a test audio file:
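To turn per-frame voice probabilities into time ranges, one common approach is simple thresholding. The sketch below is not part of the SDK: the function name speech_segments and the default threshold of 0.5 are assumptions for illustration, and the right threshold depends on the audio and the application.

```python
def speech_segments(probabilities, frame_length, sample_rate, threshold=0.5):
    """Return (start_seconds, end_seconds) ranges where probability >= threshold."""
    frame_seconds = frame_length / sample_rate
    segments = []
    start = None  # index of the first frame in the current voiced run
    for index, probability in enumerate(probabilities):
        if probability >= threshold and start is None:
            start = index
        elif probability < threshold and start is not None:
            segments.append((start * frame_seconds, index * frame_seconds))
            start = None
    if start is not None:
        # The audio ended while still inside a voiced run.
        segments.append((start * frame_seconds, len(probabilities) * frame_seconds))
    return segments


# Usage, given per-frame probabilities from the engine:
#   segments = speech_segments(probabilities, handle.frame_length, handle.sample_rate)
```

A higher threshold reduces false detections at the cost of missing quieter speech; smoothing across neighboring frames is another option if the raw output flickers.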
It takes less than 90 seconds to start detecting human speech in audio files in real time!
Don't forget to check out other Python Tutorials such as LLM-powered Voice Assistant in Python, Real-time Speaker Recognition with Python, and Speaker Diarization with Python.