Voice Activity Detection in Python

🚀 Best-in-class Voice AI!

Build compliant and low-latency AI apps using Python without sending user data to 3rd party servers.

Cobra, Picovoice’s Voice Activity Detection engine, scans audio streams and detects the presence of human voices. This article shows how to detect voice activity in audio data using the Picovoice Cobra Voice Activity Detection (VAD) Python SDK. The SDK runs on Linux, macOS, Windows, and Raspberry Pi.

Install Voice Activity Detection SDK

pip3 install pvcobra

Sign up for Picovoice Console

Log in to Picovoice Console, or sign up if you do not have an account yet. It is free, and no credit card is required! Copy your AccessKey to the clipboard.

Implement in Python

Import the Cobra Voice Activity Detection Python package and create an instance of the Voice Activity Detection engine with your AccessKey:

import pvcobra

# AccessKey copied from Picovoice Console
access_key = "${ACCESS_KEY}"

handle = pvcobra.create(access_key=access_key)

Once initialized, the engine exposes the sample rate it requires as handle.sample_rate and the expected frame length (number of audio samples in an input array) as handle.frame_length. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
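Before feeding a recording to the engine, it can help to confirm that the file matches these requirements. The following is a minimal sketch using only Python's standard wave module; the helper name check_wav_format and the file name in the usage note are hypothetical and not part of the Cobra SDK.

```python
import wave


def check_wav_format(path, expected_sample_rate):
    """Return a list of reasons why the WAV file does not match the expected input format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected 1 channel, found {wav.getnchannels()}")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append(f"expected 16-bit samples, found {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != expected_sample_rate:
            problems.append(
                f"expected {expected_sample_rate} Hz, found {wav.getframerate()} Hz"
            )
    return problems
```

With an engine instance at hand, check_wav_format("test.wav", handle.sample_rate) returns an empty list when the file can be passed to the engine without conversion.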

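def get_next_audio_frame():
    # Placeholder: return the next handle.frame_length samples of 16-bit, single-channel PCM
    pass

while True:
    # Probability, between 0 and 1, that the frame contains voice
    voice_probability = handle.process(get_next_audio_frame())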
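The loop above leaves the audio source and the handling of the results open. As one possible way to fill both in, the sketch below reads frames from a 16-bit mono WAV file and merges per-frame probabilities into voiced time spans. The helper names read_frames and voiced_segments are hypothetical, and the 0.5 threshold is an arbitrary starting point rather than a recommended value.

```python
import array
import wave


def read_frames(path, frame_length):
    """Yield successive frames of `frame_length` samples from a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        while True:
            data = wav.readframes(frame_length)
            if len(data) < 2 * frame_length:
                break  # end of file; a trailing partial frame is dropped
            # Interpret the raw bytes as signed 16-bit integers
            yield array.array("h", data).tolist()


def voiced_segments(probabilities, frame_length, sample_rate, threshold=0.5):
    """Merge consecutive frames at or above `threshold` into (start_sec, end_sec) pairs."""
    probabilities = list(probabilities)
    frame_sec = frame_length / sample_rate  # duration of one frame in seconds
    segments = []
    start = None  # index of the first frame of the current voiced run
    for i, p in enumerate(probabilities):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start * frame_sec, i * frame_sec))
            start = None
    if start is not None:  # audio ended while still inside a voiced run
        segments.append((start * frame_sec, len(probabilities) * frame_sec))
    return segments
```

With the engine created earlier, the two helpers could be combined as voiced_segments([handle.process(f) for f in read_frames("test.wav", handle.frame_length)], handle.frame_length, handle.sample_rate). Raising the threshold trades missed speech for fewer false detections, so the value is worth tuning on your own recordings.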

[Figure: example output of Cobra Voice Activity Detection on a test audio file]

It takes less than 90 seconds to start detecting human speech in audio files in real time!

Don't forget to check out other Python Tutorials such as LLM-powered Voice Assistant in Python, Real-time Speaker Recognition with Python, and Speaker Diarization with Python.
