At a high level, speaker recognition determines who is speaking. It includes speaker verification (authentication of a known user—also called voice biometrics, Voice ID, or speaker authentication) and speaker identification (selecting one speaker from many). These capabilities are used for secure access control, passive identity checks, and personalized multi-user voice interfaces.

Building speaker recognition into C applications requires real-time audio processing and consistent performance across major platforms such as Linux, Windows, macOS, and Raspberry Pi. Developers often compare open-source options like pyannote, SpeechBrain, and WeSpeaker with cloud-based voice biometric services, and each comes with trade-offs. Cloud services add latency (one of the most important criteria when evaluating real-time speaker recognition), require constant connectivity, and send sensitive audio off-device. Open-source frameworks often need heavy tuning to reach production-quality accuracy and resource usage.

For applications that need low-latency, private, on-device verification or identification—such as smart home systems, secure authentication flows, or multi-user AI assistants—local processing can deliver the most reliable results.

This tutorial shows how to build cross-platform speaker recognition in C using Picovoice Eagle, an on-device speaker recognition engine. Eagle processes voice data locally and identifies speakers in real time, with higher accuracy and lower resource utilization than alternatives, as demonstrated by independent open-source benchmarks.

What You'll Learn

This tutorial covers a complete implementation of speaker recognition in C, including:

  1. Capturing live microphone audio across all major platforms
  2. Dynamic library loading for cross-platform compatibility
  3. Enrolling speaker profiles with quality feedback
  4. Exporting and importing speaker profiles for persistent storage
  5. Performing real-time speaker recognition with confidence scores
  6. Error handling and cleanup best practices

By the end of this guide, you'll have two complete C applications:

  1. enrollment.c - Enrolls new speakers and saves their voice profiles
  2. recognition.c - Uses saved voice profiles to identify speakers in real time

Important: This tutorial builds on How to Record Audio in C. Ensure your audio capture environment is properly configured before continuing.


Prerequisites

Before starting this tutorial, ensure you have:

  • C99-compatible compiler
  • Windows users: MinGW
  • A Picovoice AccessKey (Get your free AccessKey)
  • Working microphone for audio capture

Supported Platforms

Eagle Speaker Recognition supports the following platforms:

  • Linux: x86_64
  • macOS: x86_64 and arm64
  • Windows: x86_64 and arm64
  • Raspberry Pi: Models 3, 4, and 5

Project Setup

This is the folder structure used in this tutorial. You can organize your files differently if you like, but make sure to update the paths in the examples accordingly:

For instructions on audio capture using pvrecorder, refer to the prerequisite tutorial: How to Record Audio in C.

Step 1: Add Eagle Library Files

Set up the Eagle Speaker Recognition library:

  1. Create a folder named eagle/ in your project root.
  2. Download the Eagle header files from GitHub (pv_eagle.h and picovoice.h) and place them in an include folder inside eagle/.
  3. Download the appropriate Eagle library (libpv_eagle.so, libpv_eagle.dylib, or libpv_eagle.dll) for your platform and place it in eagle/.
  4. Download the Eagle model file (eagle_params.pv) and place it in eagle/.

Your Eagle directory should now contain the library file, model file, and include folder with both header files.

Enrollment & Recognition Setup

We'll create two separate C applications: enrollment.c for enrolling speakers, and recognition.c for identifying them. The setup process is the same for both files. Start by creating both files, then add the required headers and cross-platform dynamic loading functions to each.

These helpers work identically whether you're using PvRecorder, Cheetah Streaming Speech-to-Text, Rhino Speech-to-Intent, Eagle Speaker Recognition, or other Picovoice engines.

Step 2: Include Required Headers

Add the necessary headers to your C file:

Understanding the Headers

  • Windows-specific: windows.h provides LoadLibrary to load a shared library and GetProcAddress to retrieve individual function pointers
  • Unix-specific: dlfcn.h provides dlopen and dlsym, the Unix counterparts of LoadLibrary and GetProcAddress
  • UTF8_COMPOSITION_FLAG and NULL_TERMINATED: Constants for writing and reading speaker profiles on Windows
  • signal.h: Enables Ctrl+C interrupt handling

Step 3: Define Dynamic Loading Helper Functions

Eagle Speaker Recognition ships as a shared library (.so, .dylib, or .dll depending on platform). Rather than linking against it at build time, we'll load the library dynamically at runtime and look up each function by name.

We'll build a set of helper functions that handle:

  1. Opening shared libraries across different operating systems
  2. Fetching function pointers from loaded libraries
  3. Graceful cleanup and error handling

3a. Open the Shared Library

This function opens a shared library and returns a handle.

3b. Load Function Symbols

Retrieves a function pointer from the loaded library by symbol name.

3c. Close the Library

Properly closes the library handle and frees associated resources. Call this in your cleanup code.

3d. Print Platform-Specific Errors

Provides detailed error messages specific to each platform's dynamic loading mechanism.
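The four helpers described in 3a through 3d could be sketched as follows. This is a minimal sketch rather than the tutorial's original listing: the helper names (open_dl, load_symbol, close_dl, print_dl_error) are illustrative assumptions, and the Windows-only constants mentioned in Step 2 are left out because this snippet does not touch files.

```c
#include <stdio.h>

#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

/* 3a. Open a shared library. Returns NULL on failure. */
void *open_dl(const char *dl_path) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) LoadLibraryA(dl_path);
#else
    return dlopen(dl_path, RTLD_NOW);
#endif
}

/* 3b. Look up a symbol by name. Returns NULL if it is not found. */
void *load_symbol(void *handle, const char *symbol) {
#if defined(_WIN32) || defined(_WIN64)
    return (void *) GetProcAddress((HMODULE) handle, symbol);
#else
    return dlsym(handle, symbol);
#endif
}

/* 3c. Close the library handle. A NULL handle is ignored. */
void close_dl(void *handle) {
    if (handle == NULL) {
        return;
    }
#if defined(_WIN32) || defined(_WIN64)
    FreeLibrary((HMODULE) handle);
#else
    dlclose(handle);
#endif
}

/* 3d. Print the platform's own description of the last loading error. */
void print_dl_error(const char *message) {
#if defined(_WIN32) || defined(_WIN64)
    fprintf(stderr, "%s (error code %lu)\n", message, (unsigned long) GetLastError());
#else
    const char *detail = dlerror();
    fprintf(stderr, "%s: %s\n", message, detail != NULL ? detail : "unknown error");
#endif
}
```

A caller would typically check the result of open_dl, call print_dl_error and exit on NULL, and cast each pointer returned by load_symbol to the function type declared in the corresponding header.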

Step 4: Add Interrupt Handling

For clean program termination on Ctrl+C, add interrupt handling:

This allows users to gracefully stop speaker enrollment or recognition.
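A common way to do this in portable C is a flag that the signal handler sets and the audio loop polls. The sketch below assumes that approach; the names is_interrupted, interrupt_handler, and install_interrupt_handler are illustrative, not taken from the Eagle headers.

```c
#include <signal.h>

/* Set by the handler, polled by the enrollment or recognition loop. */
volatile sig_atomic_t is_interrupted = 0;

/* Only sets a flag, which is safe to do inside a signal handler. */
void interrupt_handler(int signum) {
    (void) signum;
    is_interrupted = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_interrupt_handler(void) {
    return signal(SIGINT, interrupt_handler) == SIG_ERR ? -1 : 0;
}
```

The main loop can then be written as `while (!is_interrupted) { ... }`, so that Ctrl+C lets the current iteration finish and falls through to the cleanup code instead of killing the process mid-frame.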

Step 5: Load Required Libraries

Load the Eagle Speaker Recognition library:
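Because the library file name differs per platform, one option is to select it at compile time and build the path from the folder it lives in. The sketch below assumes the file names listed in Step 1 and a caller-supplied directory; eagle_library_filename and eagle_library_path are hypothetical helper names.

```c
#include <stddef.h>
#include <stdio.h>

/* Pick the library file name that matches the platform being compiled for. */
const char *eagle_library_filename(void) {
#if defined(_WIN32) || defined(_WIN64)
    return "libpv_eagle.dll";
#elif defined(__APPLE__)
    return "libpv_eagle.dylib";
#else
    return "libpv_eagle.so";
#endif
}

/* Build "<dir>/<file name>" into out.
   Returns 0 on success, -1 if the result did not fit in out_size bytes. */
int eagle_library_path(const char *dir, char *out, size_t out_size) {
    int written = snprintf(out, out_size, "%s/%s", dir, eagle_library_filename());
    return (written < 0 || (size_t) written >= out_size) ? -1 : 0;
}
```

The resulting string is what you would pass to your library-opening helper. If you keep the library somewhere other than eagle/, only the dir argument changes.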


Part 1: Implementing Speaker Enrollment

Eagle Speaker Recognition requires two distinct phases: enrollment and recognition. During enrollment, users speak into the microphone to create a unique voice profile. This section builds the enrollment application (enrollment.c).

Understanding Speaker Enrollment

The enrollment process:

  1. Captures audio samples from the user's voice
  2. Analyzes acoustic features to build a speaker profile
  3. Provides real-time feedback on audio quality
  4. Requires a minimum duration of speech for reliability
  5. Exports a binary profile file for later recognition

Step 6: Load Eagle Profiler Functions

Load all necessary Eagle functions for enrollment:

Step 7: Initialize Eagle Profiler

The Eagle Profiler is the enrollment component that builds speaker profiles:

The device parameter lets you choose what hardware the engine runs on.

You can set it to 'best' to automatically pick the most suitable option, or specify a device yourself. For example, 'gpu' uses the first available GPU, while 'gpu:0' or 'gpu:1' targets a specific GPU. If you want to run on the CPU, use 'cpu', or control the number of CPU threads with something like 'cpu:4'.

Important: Replace ${ACCESS_KEY} with your actual AccessKey from Picovoice Console.

Step 8: Determine Minimum Enrollment Samples

Eagle requires a minimum amount of audio for reliable enrollment:
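Whatever minimum the library reports is a sample count, so it helps to translate it into seconds of speech and into a number of recorder frames. The arithmetic can be sketched as below, assuming the 16 kHz sample rate Eagle expects; the function names and the frame length passed in are illustrative.

```c
#include <stdint.h>

#define EAGLE_SAMPLE_RATE_HZ 16000 /* 16 kHz mono, the format Eagle expects */

/* Convert a sample count into seconds of audio. */
double samples_to_seconds(int32_t num_samples) {
    return (double) num_samples / EAGLE_SAMPLE_RATE_HZ;
}

/* Number of recorder frames needed to cover at least min_samples
   (ceiling division). Returns 0 for non-positive inputs. */
int32_t frames_needed(int32_t min_samples, int32_t frame_length) {
    if (min_samples <= 0 || frame_length <= 0) {
        return 0;
    }
    return (min_samples + frame_length - 1) / frame_length;
}
```

For example, with a hypothetical minimum of 1025 samples and 512-sample frames, frames_needed returns 3, because two frames would fall one sample short.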

Step 9: Perform Speaker Enrollment

Capture audio with PvRecorder and enroll the speaker with real-time progress feedback:

Enrollment tips for users:

  • Speak naturally in a quiet environment
  • Avoid background noise, music, or other voices
  • Speak different phrases or sentences for variety
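The recorder delivers short frames, while enrollment needs at least the minimum number of samples per call, so the loop has to accumulate frames first. The buffering part might look like the sketch below. It deliberately leaves out the PvRecorder read and the Eagle enroll call, whose exact signatures should be taken from the headers you downloaded; enroll_buffer_t and its functions are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    int16_t *samples; /* caller-owned storage holding `capacity` samples */
    int32_t capacity; /* minimum number of samples required per enroll call */
    int32_t filled;   /* number of samples currently stored */
} enroll_buffer_t;

/* Append one recorder frame. Returns 1 once the buffer holds `capacity`
   samples and is ready to be passed to enrollment, 0 otherwise.
   Samples that do not fit are dropped. */
int enroll_buffer_push(enroll_buffer_t *buf, const int16_t *frame, int32_t frame_length) {
    int32_t space = buf->capacity - buf->filled;
    int32_t to_copy = frame_length < space ? frame_length : space;
    if (to_copy > 0) {
        memcpy(buf->samples + buf->filled, frame, (size_t) to_copy * sizeof(int16_t));
        buf->filled += to_copy;
    }
    return buf->filled == buf->capacity;
}

/* Empty the buffer so the next batch of frames can be collected. */
void enroll_buffer_clear(enroll_buffer_t *buf) {
    buf->filled = 0;
}
```

In the enrollment loop you would push each frame, and whenever the function returns 1, hand the buffer to the profiler, print the reported percentage and feedback, and clear the buffer before continuing until enrollment reaches 100% or the user presses Ctrl+C.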

Step 10: Export Speaker Profile

Once enrollment reaches 100%, export the speaker profile:

Step 11: Write Speaker Profile to File

Save the profile to a file for later use in recognition:
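Since the exported profile is an opaque block of bytes, saving it is a plain binary write. The sketch below assumes you already hold the profile bytes and their size from the export step; write_speaker_profile is a hypothetical helper. It uses fopen directly, so it does not perform the wide-character path conversion that the Windows constants from Step 2 are meant for.

```c
#include <stdint.h>
#include <stdio.h>

/* Write size_bytes bytes of profile data to path.
   Returns 0 on success, -1 on any failure. */
int write_speaker_profile(const char *path, const void *profile, int32_t size_bytes) {
    if (path == NULL || profile == NULL || size_bytes <= 0) {
        return -1;
    }
    FILE *f = fopen(path, "wb"); /* "b" matters on Windows: no newline translation */
    if (f == NULL) {
        return -1;
    }
    size_t written = fwrite(profile, 1, (size_t) size_bytes, f);
    int close_failed = (fclose(f) != 0);
    return (written == (size_t) size_bytes && !close_failed) ? 0 : -1;
}
```

Checking both the byte count and the result of fclose catches the case where the data was buffered but could not be flushed to disk.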

Step 12: Cleanup Enrollment Resources

Properly release all allocated resources:


Part 2: Implementing Real-Time Speaker Recognition

Now that you have enrolled a speaker profile, let's build the recognition application (recognition.c).

Understanding Speaker Recognition

The recognition process works as follows:

  1. Loads one or more pre-enrolled speaker profiles from the enrollment stage
  2. Initializes Eagle Speaker Recognition with these profiles
  3. Captures live audio frames from the microphone
  4. Processes each audio frame with Eagle Speaker Recognition
  5. Returns confidence scores ranging from 0.0 (no match) to 1.0 (perfect match) for each enrolled speaker

Higher scores indicate stronger voice matches.

Step 13: Load Eagle Recognizer Functions

Load all required function symbols for recognition:

Step 14: Load Speaker Profile from File

Begin your recognition application by loading the previously enrolled speaker profile:
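Loading is the mirror image of saving: read the whole file into memory and keep the byte count. A minimal sketch, assuming the file was written as raw bytes by the enrollment application, is shown below; read_speaker_profile is a hypothetical helper and, like a plain fopen call, it does not handle wide-character paths on Windows.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read an entire profile file into a malloc'd buffer.
   On success returns the buffer (the caller must free it) and stores its
   size in *size_bytes. Returns NULL on any failure or if the file is empty. */
void *read_speaker_profile(const char *path, int32_t *size_bytes) {
    if (path == NULL || size_bytes == NULL) {
        return NULL;
    }
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return NULL;
    }

    void *data = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long length = ftell(f);
        if (length > 0 && length <= INT32_MAX && fseek(f, 0, SEEK_SET) == 0) {
            data = malloc((size_t) length);
            if (data != NULL && fread(data, 1, (size_t) length, f) != (size_t) length) {
                free(data); /* short read: treat as failure */
                data = NULL;
            }
            if (data != NULL) {
                *size_bytes = (int32_t) length;
            }
        }
    }
    fclose(f);
    return data;
}
```

The returned pointer is what you would place in the array of profiles handed to the recognizer, and it should stay allocated until the recognizer no longer needs it.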

This example loads a single speaker profile, but Eagle Speaker Recognition supports multiple profiles for multi-user identification.

Step 15: Initialize Eagle Recognizer

Initialize the Eagle recognizer with your enrolled speaker profile(s):

Step 16: Perform Real-Time Speaker Recognition

Start the recognition loop to continuously identify the speaker:

Understanding recognition scores:

  • 0.0 - 0.3: Very low confidence, likely not the enrolled speaker
  • 0.3 - 0.5: Low to moderate confidence, uncertain match
  • 0.5 - 0.7: Good confidence, probable match
  • 0.7 - 1.0: High confidence, strong match
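To act on these scores, an application usually picks the highest-scoring enrolled speaker and accepts it only above some threshold. The sketch below shows one way to do that. The bands come from the list above, with each lower bound treated as inclusive (an assumption, since the listed ranges share their endpoints), and the function names are illustrative.

```c
#include <stdint.h>

/* Map a score to one of the confidence bands listed above. */
const char *describe_score(float score) {
    if (score >= 0.7f) {
        return "high";
    }
    if (score >= 0.5f) {
        return "good";
    }
    if (score >= 0.3f) {
        return "low to moderate";
    }
    return "very low";
}

/* Index of the highest-scoring speaker, or -1 if no score reaches threshold.
   On a tie, the lower index wins. */
int32_t best_speaker(const float *scores, int32_t num_speakers, float threshold) {
    int32_t best = -1;
    for (int32_t i = 0; i < num_speakers; i++) {
        if (scores[i] >= threshold && (best < 0 || scores[i] > scores[best])) {
            best = i;
        }
    }
    return best;
}
```

The right threshold depends on the application: an authentication flow would typically demand a higher score than a personalization feature, where a wrong guess is cheap.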

Step 17: Cleanup Recognition Resources

Properly release all allocated resources and close libraries:

Best practice: Always ensure cleanup code runs even if errors occur during recognition.
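One common C idiom for this is a single cleanup label that every error path jumps to. The sketch below illustrates the pattern with stand-in resources (a heap buffer and a file) instead of the Eagle and PvRecorder handles; run_session and its simulate_failure parameter are hypothetical and exist only to show the control flow.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Acquire two resources, optionally fail midway, and always release
   whatever was acquired. Returns 0 on success, -1 on failure. */
int run_session(const char *scratch_path, int simulate_failure) {
    int status = -1;
    int16_t *pcm = NULL;
    FILE *scratch = NULL;

    pcm = calloc(512, sizeof(int16_t));
    if (pcm == NULL) {
        goto cleanup;
    }

    scratch = fopen(scratch_path, "wb");
    if (scratch == NULL) {
        goto cleanup;
    }

    if (simulate_failure) {
        goto cleanup; /* stands in for an engine call returning an error */
    }

    status = 0;

cleanup:
    /* Release in reverse order of acquisition; each step tolerates NULL. */
    if (scratch != NULL) {
        fclose(scratch);
    }
    free(pcm);
    return status;
}
```

Initializing every handle to NULL up front is what makes the shared cleanup block safe, because it can tell which resources were actually acquired before the failure.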


Complete Example: Speaker Enrollment & Recognition in C

Here are the full enrollment.c and recognition.c files you can copy, build, and run. They are complete with error handling and PvRecorder implementation for audio capture. Before building and running, replace ${ACCESS_KEY} with your Picovoice AccessKey and update file paths according to your project setup.

Enrollment Complete Code

Here is enrollment.c:

Recognition Complete Code

Here is recognition.c:

This is a simplified example but includes all the necessary components to get started. Check out the Eagle Speaker Recognition C demo on GitHub for a complete demo application.

Compiling and Running Your Applications

Compile Your Applications

Run Enrollment

Follow the on-screen prompts to enroll a speaker.

Run Recognition

Speak into the microphone and observe the real-time confidence scores. Press Ctrl+C to stop recognition.

Troubleshooting Common Issues

Failed to open Eagle library

  • Verify the library path matches your platform (.so, .dylib, or .dll) and architecture (x86_64 vs ARM)
  • Ensure the library file exists in the specified location

Failed to initialize Eagle

  • Invalid AccessKey - verify your key at console.picovoice.ai
  • Invalid model path - ensure eagle_params.pv exists and path is correct

Low or inconsistent recognition scores

  • Minimize background noise during both enrollment and recognition
  • Verify microphone volume levels are adequate
  • Re-enroll if audio conditions have significantly changed

Frequently Asked Questions

Can Eagle Speaker Recognition recognize multiple speakers simultaneously?
Yes. Eagle can handle multiple enrolled speakers and return scores for each.
Do I need to re-enroll speakers periodically?
No. Speaker profiles are stable over time. Re-enrollment is only needed if the person's voice changes significantly or if audio conditions change dramatically.
Can Eagle work offline?
Yes. Eagle performs all audio processing on-device, so no audio is sent over the network. An internet connection is needed only for licensing and usage tracking.
What audio format does Eagle expect?
Eagle requires 16-bit linear PCM audio at 16 kHz sample rate, single channel (mono). PvRecorder automatically provides audio in this format.