Voice Activity Detection (VAD) is a core building block for speech and audio systems, used to determine when human speech is present in an audio stream. In low-level and embedded environments, developers often need a real-time, offline, on-device voice detection solution implemented in C that works reliably across operating systems (Linux, Windows, macOS, and Raspberry Pi) and hardware targets.
Many developers evaluate existing solutions such as WebRTC VAD, Silero VAD, or cloud-based speech detection services, each with trade-offs in accuracy, performance, and deployment complexity. This tutorial focuses on Cobra Voice Activity Detection, an on-device VAD engine with published benchmark results for accuracy and real-time performance, and a C API designed for cross-platform deployment.
You will learn how to capture microphone audio, process fixed-size PCM frames, and compute per-frame voice probability in real time with Cobra VAD. The result is a portable voice detection C application suitable for embedded systems, desktop applications, and edge deployments.
Important: This guide builds on How to Record Audio in C. If you haven't completed that setup yet, start with that tutorial to get your recording environment in place.
Prerequisites
- C99-compatible compiler (on Windows: MinGW)
Supported Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64, arm64)
- Raspberry Pi (Zero, 3, 4, 5)
Part 1. Set Up Your C VAD Project Structure
This is the folder structure used in this tutorial. You can organize your files differently if you like, but make sure to update the paths in the examples accordingly:
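One possible layout, shown below for reference only (the folder and file names are illustrative, not required):

```
cobra-vad-tutorial/
├── cobra_tutorial.c     # the code written in this tutorial
├── cobra/               # Cobra header, model file, and platform library
└── pvrecorder/          # PvRecorder header and platform library (from the recording tutorial)
```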
To set up audio capture, refer to: How to Record Audio in C.
Add Cobra library files
- Create a folder named `cobra/`.
- Download the Cobra header files from GitHub and place them in the `cobra/` folder.
- Download a Cobra model file and the correct library file for your platform and place them in the `cobra/` folder.
Part 2. Cross-Platform Dynamic Library Loading for VAD in C
Cobra VAD is distributed as pre-built platform libraries, which means:
- the shared library (`.so`, `.dylib`, `.dll`) is not linked at compile time
- the program loads it at runtime
- functions must be retrieved by name
So, we need to write small helper functions to:
- open the shared library
- look up function pointers
- close the shared library
Include platform-specific headers
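A minimal set of includes covering both platform families might look like this (a sketch; trim it to what you actually use):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>   // Ctrl-C (SIGINT) handling later in the example

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>  // LoadLibrary, GetProcAddress, FreeLibrary
#else
#include <dlfcn.h>    // dlopen, dlsym, dlclose
#endif
```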
Understanding the headers:
- On Windows systems, `windows.h` provides the `LoadLibrary` function to load a shared library and `GetProcAddress` to retrieve individual function pointers.
- On Unix-based systems, `dlopen` and `dlsym` from the `dlfcn.h` header provide the same functionality.
- `signal.h` allows us to handle `Ctrl-C` later in this example.
Define dynamic loading helper functions
Open the shared library
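One way to wrap the platform difference in a small helper (the name `open_dl` is ours, not part of the SDK):

```c
// Opens a shared library at the given path; returns NULL on failure.
static void *open_dl(const char *dl_path) {
#if defined(_WIN32) || defined(_WIN64)
    return LoadLibrary(dl_path);
#else
    return dlopen(dl_path, RTLD_NOW);
#endif
}
```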
Load function symbols
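A matching lookup helper (again, `load_symbol` is just an illustrative name):

```c
// Retrieves a function pointer by name from an already-opened shared library.
static void *load_symbol(void *handle, const char *symbol) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) GetProcAddress((HMODULE) handle, symbol);
#else
    return dlsym(handle, symbol);
#endif
}
```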
Close the dynamic library
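The corresponding teardown helper:

```c
// Releases a shared library handle obtained from open_dl.
static void close_dl(void *handle) {
#if defined(_WIN32) || defined(_WIN64)
    FreeLibrary((HMODULE) handle);
#else
    dlclose(handle);
#endif
}
```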
Print platform-correct errors
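Finally, a small diagnostic helper can report the platform-specific error state when loading fails:

```c
// Prints the most recent dynamic-loading error for the current platform.
static void print_dl_error(const char *message) {
#if defined(_WIN32) || defined(_WIN64)
    fprintf(stderr, "%s with code '%lu'.\n", message, GetLastError());
#else
    fprintf(stderr, "%s with '%s'.\n", message, dlerror());
#endif
}
```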
Load the library file
Download the correct library file for your platform and point `library_path` to the file.
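For example (the path below is a placeholder; point it at the library you actually downloaded):

```c
// Placeholder path: use the Cobra library for your platform
// (.so on Linux / Raspberry Pi, .dylib on macOS, .dll on Windows).
const char *library_path = "cobra/libpv_cobra.so";

void *cobra_library = open_dl(library_path);
if (!cobra_library) {
    print_dl_error("Failed to open the Cobra library");
    exit(1);
}
```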
Load dynamic library functions
Load all required function symbols for Cobra Voice Activity Detection:
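A sketch of the symbol loading, using the helpers from Part 2. The function-pointer types below are assumptions based on the standard Picovoice C API; confirm them against the `pv_cobra.h` header you downloaded:

```c
// Requires the pv_cobra.h header downloaded earlier (for pv_status_t / pv_cobra_t).
// Assumed signatures -- verify against pv_cobra.h for your Cobra version.
typedef const char *(*pv_status_to_string_func)(pv_status_t);
typedef int32_t (*pv_cobra_frame_length_func)(void);
typedef pv_status_t (*pv_cobra_process_func)(pv_cobra_t *, const int16_t *, float *);
typedef void (*pv_cobra_delete_func)(pv_cobra_t *);

pv_status_to_string_func pv_status_to_string_fn =
        (pv_status_to_string_func) load_symbol(cobra_library, "pv_status_to_string");
pv_cobra_frame_length_func pv_cobra_frame_length_fn =
        (pv_cobra_frame_length_func) load_symbol(cobra_library, "pv_cobra_frame_length");
pv_cobra_process_func pv_cobra_process_fn =
        (pv_cobra_process_func) load_symbol(cobra_library, "pv_cobra_process");
pv_cobra_delete_func pv_cobra_delete_fn =
        (pv_cobra_delete_func) load_symbol(cobra_library, "pv_cobra_delete");

if (!pv_status_to_string_fn || !pv_cobra_frame_length_fn ||
        !pv_cobra_process_fn || !pv_cobra_delete_fn) {
    print_dl_error("Failed to load a Cobra symbol");
    exit(1);
}
// pv_cobra_init is loaded the same way in Part 3, where its parameters are discussed.
```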
Part 3. Implement Real-Time Voice Activity Detection (VAD) in C
Now that dynamic loading is in place, we can use the Cobra Voice Activity Detection API.
Initialize Cobra VAD
- Sign up for an account on Picovoice Console for free and obtain your `AccessKey`.
- Replace `${ACCESS_KEY}` with your `AccessKey`.
Call pv_cobra_init to create a Cobra VAD instance:
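A sketch of the initialization. The parameter list shown here (access key, model path, device) is an assumption based on this tutorial's description; check `pv_cobra.h` for the exact signature in your SDK version. The model path and device value below are placeholders:

```c
// Assumed signature -- confirm against pv_cobra.h.
typedef pv_status_t (*pv_cobra_init_func)(
        const char *access_key,
        const char *model_path,
        const char *device,
        pv_cobra_t **object);

pv_cobra_init_func pv_cobra_init_fn =
        (pv_cobra_init_func) load_symbol(cobra_library, "pv_cobra_init");
if (!pv_cobra_init_fn) {
    print_dl_error("Failed to load pv_cobra_init");
    exit(1);
}

pv_cobra_t *cobra = NULL;
pv_status_t init_status = pv_cobra_init_fn(
        "${ACCESS_KEY}",           // replace with your AccessKey from Picovoice Console
        "cobra/cobra_model.pv",    // placeholder: path to the model file you downloaded
        "best",                    // placeholder device string; see pv_cobra.h for options
        &cobra);
if (init_status != PV_STATUS_SUCCESS) {
    fprintf(stderr, "Cobra init failed: %s\n", pv_status_to_string_fn(init_status));
    exit(1);
}
```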
The device parameter lets you choose what hardware the engine runs on.
Detect voice activity
Capture audio with PvRecorder and pass the recorded audio frames to Cobra VAD for voice detection:
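A sketch of the capture-and-process loop. It assumes a `pv_recorder_t *recorder` has already been created and started with a frame length of `pv_cobra_frame_length_fn()` (as in the recording tutorial), that the PvRecorder functions were loaded into `_fn` pointers the same way as the Cobra symbols, and that `is_running` is a flag cleared by a `SIGINT` (Ctrl-C) handler:

```c
// Requires the pv_recorder.h header for pv_recorder_status_t and its constants.
int32_t frame_length = pv_cobra_frame_length_fn();
int16_t *pcm = malloc(frame_length * sizeof(int16_t));
if (!pcm) {
    fprintf(stderr, "Failed to allocate the audio frame buffer.\n");
    exit(1);
}

while (is_running) {
    // Read one frame of 16 kHz, 16-bit PCM from the microphone.
    pv_recorder_status_t recorder_status = pv_recorder_read_fn(recorder, pcm);
    if (recorder_status != PV_RECORDER_STATUS_SUCCESS) {
        fprintf(stderr, "Failed to read audio: %s\n",
                pv_recorder_status_to_string_fn(recorder_status));
        break;
    }

    // Ask Cobra for the probability that this frame contains voice.
    float voice_probability = 0.f;
    pv_status_t cobra_status = pv_cobra_process_fn(cobra, pcm, &voice_probability);
    if (cobra_status != PV_STATUS_SUCCESS) {
        fprintf(stderr, "Cobra process failed: %s\n", pv_status_to_string_fn(cobra_status));
        break;
    }

    // Print the score in place; values near 1.0 mean speech is present.
    printf("\rVoice probability: %.2f", voice_probability);
    fflush(stdout);
}
```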
Explanation:
- `pv_cobra_frame_length`: Required number of samples per frame.
- `pv_cobra_process`: Analyzes an audio frame and returns a voice probability score between `0.0` and `1.0`.
- `voice_probability`: A value closer to `1.0` indicates high confidence that voice is present; closer to `0.0` indicates silence or non-speech audio.
Cleanup Resources
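Release everything in reverse order of creation (a sketch using the handles and helpers defined above; `recorder_library` stands for the handle returned when you opened the PvRecorder library):

```c
pv_recorder_stop_fn(recorder);     // stop capturing audio
pv_recorder_delete_fn(recorder);   // free the recorder
pv_cobra_delete_fn(cobra);         // free the Cobra VAD instance
free(pcm);                         // free the frame buffer
close_dl(recorder_library);        // unload the PvRecorder shared library
close_dl(cobra_library);           // unload the Cobra shared library
```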
Full Working Example: C Voice Detection Application Code
Here is the complete cobra_tutorial.c file you can copy, build, and run (complete with error handling and PvRecorder implementation):
cobra_tutorial.c
Before running, update:
- `COBRA_LIB_PATH` to point to the correct Cobra VAD library for your platform
- `PVRECORDER_LIB_PATH` to point to the correct PvRecorder library for your platform
- `PICOVOICE_ACCESS_KEY` with your `AccessKey` from Picovoice Console
This is a simplified example but includes all the necessary components to get started. Check out the Cobra C demo on GitHub for a complete demo application.
Compile and Run Your C Voice Activity Detection Application
Build and run the application:
Linux (gcc) and Raspberry Pi (gcc)
macOS (clang)
Windows (MinGW)
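Typical commands, assuming `cobra_tutorial.c` sits next to the `cobra/` and `pvrecorder/` folders from the layout above (adjust the `-I` include paths to your own structure):

```sh
# Linux and Raspberry Pi (gcc): dlopen/dlsym live in libdl, hence -ldl
gcc -std=c99 -O2 -o cobra_tutorial cobra_tutorial.c -I cobra/ -I pvrecorder/ -ldl

# macOS (clang): dlopen is part of the system library, no extra flag needed
clang -std=c99 -O2 -o cobra_tutorial cobra_tutorial.c -I cobra/ -I pvrecorder/

# Windows (MinGW)
gcc -std=c99 -O2 -o cobra_tutorial.exe cobra_tutorial.c -I cobra/ -I pvrecorder/
```

Run the resulting binary and speak into your microphone; the printed voice probability should rise toward 1.0 while you are talking and fall back toward 0.0 during silence.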
Conclusion
You now have a complete cross-platform voice activity detection implementation in C that works on Linux, Windows, macOS, and Raspberry Pi. The dynamic loading approach gives you flexibility to deploy the same codebase across all supported platforms without modification.
This tutorial covered the fundamentals: setting up the Cobra VAD library, implementing platform-specific dynamic loading, processing audio frames for voice detection, and interpreting probability scores. You can build on this foundation by integrating Cobra into:
- Voice-controlled applications that respond only when someone is speaking
- Real-time speech-to-text systems (or use Cheetah Streaming Speech-to-Text, which already includes built-in voice activity detection)
- Noise suppression tools that clean only speech segments
- Other efficient speech-processing pipelines that activate exclusively when voice is detected
Troubleshooting Common VAD Issues
VAD consistently reports low voice probability
Make sure the input audio matches the format required by the VAD engine. Cobra VAD expects 16 kHz, 16-bit linear PCM, with frames exactly the length returned by pv_cobra_frame_length. Using a different sample rate or incorrect frame size often results in poor detection performance.
Compilation fails due to missing headers
Check that all include paths in your compiler command correspond to your folder structure. The -I flags should reference directories containing header files—not the files themselves. Also, confirm that all required headers have been downloaded and placed correctly.
Access key authentication errors
If initialization fails with an authentication error, ensure that you've replaced the placeholder AccessKey with your actual key from Picovoice Console.
Frequently Asked Questions
Can I run voice activity detection on pre-recorded audio files instead of a live microphone?
Yes. Read the audio file in fixed-size frames and process each frame sequentially, just as you would with live microphone input.
How do I turn the voice probability into a yes/no speech decision?
A common approach is to treat the VAD output as a probability and apply a threshold (for example, 0.5) based on your tolerance for false positives versus missed speech.
How can I combine VAD with speech-to-text?
Use Cobra to detect voice activity, then only send audio segments with detected speech to your STT engine. This reduces processing costs and improves efficiency. You can combine Cobra with Cheetah Streaming Speech-to-Text for a complete voice processing pipeline.