Eagle Speaker Recognition

Speaker recognition for personalization and access control

Enroll users in seconds and identify who's speaking in a word or two. Verify users from a single command and detect speaker changes in conversations in real time. Text-independent. Language-agnostic. On-device.

Equal Error Rate: 0.18% (vs. SpeechBrain 0.49%)
Model Size: 4.5 MB (vs. SpeechBrain 117.5 MB)
Languages: any (no language or passphrase restriction)
What is Eagle Speaker Recognition?

Only production-ready, real-time optimized speaker recognition

Eagle Speaker Recognition identifies who is speaking by comparing voice characteristics against enrolled speaker profiles. It answers the question every voice interface needs to answer before it can personalise the experience or secure access: "Is this the right person?"

Two cloud alternatives have exited the market: Azure AI Speaker Recognition in September 2025 and Amazon Connect Voice ID in May 2026. The open-source alternatives, SpeechBrain and pyannote, are Python research frameworks with higher error rates, higher compute requirements, and no cross-platform support.

Eagle Speaker Recognition is the only viable production-ready option for real-time applications that can be deployed at scale. It enrolls speakers in seconds from any natural speech, with no passphrase and no scripted phrases. It identifies enrolled speakers in real time from a single short utterance, fast enough to verify a speaker from a wake word or a single spoken command and to detect speaker changes in conversations. Eagle Speaker Recognition runs entirely on-device: no audio is transmitted to any server, and there is no cloud round-trip or network dependency.

Developer Experience

Add real-time voice authentication to any pipeline in minutes

Eagle Speaker Recognition is split into two components. The profiler handles enrollment: it takes short audio segments, provides real-time feedback on audio quality and enrollment progress, and creates a compact speaker profile. The recogniser handles identification: it processes audio frame by frame and returns a similarity score per enrolled speaker. No pipeline rearchitecture. No cloud service to authenticate. Use Eagle Speaker Recognition with its native SDKs for Python, C, NodeJS, iOS, Android, and Web.
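
The enrollment flow described above can be pictured as a simple loop: feed audio chunks to a profiler until it reports 100% progress, then export the profile. The sketch below is an illustration only. The `Profiler` interface, the `StubProfiler` class, the 512-sample chunk size, and the 25%-per-chunk progress are all hypothetical stand-ins so the example runs offline. They are not the Eagle SDK, whose actual class and method names should be taken from the Picovoice docs.

```python
from typing import Iterable, Protocol


class Profiler(Protocol):
    """Hypothetical interface, NOT the Eagle SDK: enroll() returns progress in percent."""

    def enroll(self, pcm: list[int]) -> float: ...

    def export(self) -> bytes: ...


class StubProfiler:
    """Stand-in that reports 25% more progress per chunk so the sketch runs offline."""

    def __init__(self) -> None:
        self._progress = 0.0

    def enroll(self, pcm: list[int]) -> float:
        self._progress = min(100.0, self._progress + 25.0)
        return self._progress

    def export(self) -> bytes:
        return b"stub-profile"


def enroll_speaker(profiler: Profiler, chunks: Iterable[list[int]]) -> bytes:
    """Feed audio chunks until enrollment reaches 100%, then export the profile."""
    for pcm in chunks:
        progress = profiler.enroll(pcm)
        print(f"enrollment progress: {progress:.0f}%")
        if progress >= 100.0:
            return profiler.export()
    raise RuntimeError("ran out of audio before enrollment completed")


silence = [0] * 512  # placeholder audio chunk (assumed size)
profile = enroll_speaker(StubProfiler(), [silence] * 10)
```

In a real application the progress value would drive the user-facing prompt ("keep talking"), and the exported profile would be stored locally for the recogniser to load later.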

OPEN-SOURCE EAGLE SPEAKER RECOGNITION BENCHMARK

Lowest Error Rate, Highest Efficiency

Eagle Speaker Recognition is benchmarked against SpeechBrain and pyannote and achieves the lowest error rate: its EER is 2.7× lower than SpeechBrain's and 3.9× lower than pyannote's. It also has the smallest model, requiring just 4.5 MB, which is 10× smaller than pyannote and 26× smaller than SpeechBrain.

Equal Error Rate (EER), lower is better
Eagle Speaker Recognition: 0.18%
SpeechBrain Speaker Recognition: 0.49%
pyannote Speaker Recognition: 0.70%
Model Size (MB to initialize), lower is better
Eagle Speaker Recognition: 4.5 MB
SpeechBrain Speaker Recognition: 117.5 MB
pyannote Speaker Recognition: 46.5 MB
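
The multipliers quoted in the benchmark summary (2.7×, 3.9×, 10×, 26×) follow from dividing each competitor's figure by Eagle's. A quick check in Python, using only the EER and model-size figures quoted on this page (the helper name `ratio_to_eagle` is just for illustration):

```python
# Benchmark figures as quoted in the benchmark summary on this page.
eer_percent = {"Eagle": 0.18, "SpeechBrain": 0.49, "pyannote": 0.70}
model_size_mb = {"Eagle": 4.5, "SpeechBrain": 117.5, "pyannote": 46.5}


def ratio_to_eagle(values: dict[str, float]) -> dict[str, float]:
    """Return each competitor's value divided by Eagle's, rounded to one decimal."""
    base = values["Eagle"]
    return {name: round(v / base, 1) for name, v in values.items() if name != "Eagle"}


eer_ratios = ratio_to_eagle(eer_percent)
size_ratios = ratio_to_eagle(model_size_mb)
print(eer_ratios)   # EER ratios relative to Eagle
print(size_ratios)  # model-size ratios relative to Eagle
```

The page rounds the size ratios down to whole numbers (10× and 26×).
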
Ready to integrate? Check our docs to start building or talk to the sales team about enterprise deployment.
Capabilities

Why enterprises choose Eagle Speaker Recognition

Eagle is an enterprise-ready on-device speaker recognition engine built for real-time identification and verification at scale. It runs entirely offline across platforms, requires no GPU, and is private by architecture.

01 Production Ready: Both major cloud speaker recognition APIs have been retired. Azure AI Speaker Recognition was retired in September 2025 and Amazon Connect Voice ID in May 2026. The open-source alternatives, SpeechBrain and pyannote, are Python-only research frameworks with no mobile or embedded support and no production SDK. Eagle Speaker Recognition is the only production-ready, cross-platform, on-device speaker recognition engine available to enterprises today.
02 Enrollment in seconds: Eagle Speaker Recognition enrollment completes in seconds from any natural speech, with no passphrase, no scripted phrases, and no separate session. The profiler returns real-time audio quality feedback so the product can guide users through enrollment naturally without a dedicated flow.
03 Accurate (0.18% EER): The open-source speaker recognition benchmark shows Eagle Speaker Recognition achieves the lowest error rate with 0.18% EER, 2.7× lower than SpeechBrain (0.49%) and 3.9× lower than pyannote (0.70%). EER measures the point where false acceptance and false rejection rates are equal. A lower EER means fewer impostors get through and fewer genuine users get blocked.
04 Efficient (4.5 MB): Eagle Speaker Recognition requires 4.5 MB of storage to initialize, 10× smaller than pyannote (46.5 MB) and 26× smaller than SpeechBrain (117.5 MB). It runs on mobile devices, embedded platforms, Raspberry Pi, and web browsers, and is well-suited for OTA updates. No GPU or dedicated AI accelerator required.
05 Short-utterance precision for personalized wake words: Most speaker recognition engines require 2–3 seconds of continuous speech for reliable identification, with accuracy degrading significantly below one second. Eagle Speaker Recognition integrates with Porcupine Wake Word Detection for a two-stage personalised activation pipeline where Porcupine detects the wake word, Eagle verifies the speaker, and the device activates only for the enrolled user. Used in XR glasses, smart earbuds, smartwatches, and laptops, where shared-environment false activations are a security and user experience concern.
06 Short-utterance precision for meetings: Eagle Speaker Recognition processes audio continuously and returns a similarity score per enrolled speaker per frame, without waiting for an utterance to end. Applications can detect speaker changes mid-conversation, route interactions based on who is speaking, and personalise responses in real time. Suitable for meeting speaker tracking, access control, and live conversation analytics.
07 Cross-Platform: Eagle Speaker Recognition runs on every platform your product ships on (Android, Chrome, Edge, Firefox, iOS, Linux, macOS, Raspberry Pi, Safari, and Windows) across AMD, Intel, NVIDIA, and Qualcomm hardware.
08 Text-independent and language-agnostic: Eagle Speaker Recognition identifies speakers based on voice characteristics, not on what is said or in what language. No passphrase is required. No particular language is required. Deployable in any market without per-language setup.
09 Private by design: Eagle Speaker Recognition processes audio entirely on-device without transmitting audio data to any server, making it GDPR, HIPAA, CCPA, and CJIS compliant by architecture, not policy.
10 Enterprise Ready: Eagle Speaker Recognition is production-grade and enterprise-ready. Picovoice offers flexible licensing, dedicated engineering support, NDA-protected custom model training, and SLA-backed response times for teams shipping at scale.

Ship it.
On device.

FAQ

Common questions about speaker recognition

What is speaker recognition?

Speaker Recognition deals with speaker identification and verification using distinguishable voice characteristics. It focuses on "who" rather than "what".

What's speaker identification?

Speaker Identification, also known as Speaker Search or Speaker Spotting, is a specialized application of speaker recognition that determines the identity of an unknown speaker by comparing their voice characteristics with those of known speakers.

What's speaker verification?

Speaker Verification, also known as Voice Biometrics, Voice Authentication, and Voiceprinting, is a subset of speaker recognition that focuses on verifying individuals' identities using unique voice patterns.

What's the difference between speaker verification and speaker identification?

Speaker Identification and Speaker Verification are both subsets of Speaker Recognition. If a Speaker Recognition engine does a one-to-one match to verify the claimed identity, it's called Speaker Verification. If it does a one-to-many match, i.e., determines the speaker's identity within a group of enrolled speakers, it's called Speaker Identification.
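
The one-to-one versus one-to-many distinction can be made concrete with similarity scores. The following is a minimal sketch, assuming scores lie between 0 and 1 and using an arbitrary threshold of 0.5. The function names and the threshold are illustrative assumptions, not part of any SDK.

```python
from typing import Optional


def verify(score: float, threshold: float = 0.5) -> bool:
    """One-to-one: accept the claimed identity if its similarity score clears the threshold."""
    return score >= threshold


def identify(scores: dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """One-to-many: return the best-matching enrolled speaker, or None if nobody clears the threshold."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


print(verify(0.82))                              # claimed speaker accepted
print(identify({"alice": 0.12, "bob": 0.77}))    # best match among enrolled speakers
print(identify({"alice": 0.12, "bob": 0.31}))    # nobody clears the threshold
```

In practice the threshold is tuned to trade off false acceptances against false rejections for the application at hand.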

What are the use cases and applications of Speaker Recognition?

Eagle Speaker Recognition is used wherever knowing who is speaking enables a better or more secure experience. Smart devices and wearables use it to personalise voice interfaces and restrict activation to enrolled users only — ensuring a smartwatch, XR glasses, or smart earbuds respond only to their owner. Contact centres use it to authenticate callers by voice, eliminating security questions and reducing handle time. Legal and healthcare teams use it to attribute speech in recorded conversations for documentation, compliance, and evidence discovery. Meeting intelligence platforms use it to identify participants by voice for CRM integration, action item assignment, and conversation analytics. IoT and smart home devices use it to deliver personalised responses depending on which household member is speaking. Learn more about speaker recognition use cases.

How can I select the best speaker recognition engine?

The best speaker recognition engine varies among enterprises, depending on their priorities and needs. Performance, platform support, scalability, compliance, ease of use, developer-friendliness, availability of support, and total cost of ownership are the most important factors to consider before making a decision.

Equal Error Rate (EER) is the standard accuracy metric measuring where false acceptance and false rejection rates are equal. For example, Eagle Speaker Recognition achieves the lowest error rate with 0.18% EER, 2.7× lower than SpeechBrain (0.49%) and 3.9× lower than pyannote (0.70%).
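
To make the EER definition concrete, here is a rough sketch of how it can be estimated from two lists of similarity scores: genuine trials (same speaker) and impostor trials (different speaker). This is a simple threshold sweep for illustration, not the method used by the published benchmark, and the sample scores are made up.

```python
def equal_error_rate(genuine: list[float], impostor: list[float]) -> float:
    """Approximate EER by trying every observed score as a threshold.

    FAR = share of impostor scores accepted (score >= threshold).
    FRR = share of genuine scores rejected (score < threshold).
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)
        frr = sum(s < t for s in genuine) / len(genuine)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Made-up scores for illustration only.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.3]
impostor_scores = [0.1, 0.2, 0.35, 0.4, 0.7]
eer = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER: {eer:.0%}")
```

With these toy scores, one of five impostors is accepted and one of five genuine speakers is rejected at a threshold of 0.6, so both error rates meet at 20%.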

What is the difference between speaker recognition and speaker diarization?

Speaker diarization segments audio by speaker and returns anonymous labels — Speaker 1, Speaker 2 — without knowing who the speakers are. Speaker recognition identifies known, enrolled speakers by profile. Diarization is used for meeting transcripts and multi-speaker audio analytics. Speaker recognition is used for authentication, access control, and personalisation. For batch speaker diarization, see Falcon Speaker Diarization. For real-time streaming diarization, see Bluebird Streaming Speaker Diarization.

Is Eagle Speaker Recognition a good alternative to Azure AI Speaker Recognition?

Yes, Eagle Speaker Recognition is a great alternative to Azure AI Speaker Recognition. Azure AI Speaker Recognition was retired in September 2025 and is no longer available. Before retirement, it was accessible only to enterprise customers approved through a limited access programme. Eagle Speaker Recognition is available via the Picovoice Console and runs entirely on-device.

Is Eagle Speaker Recognition a good alternative to Amazon Connect Voice ID?

Yes. Amazon Connect Voice ID was retired in May 2026. Before retirement, it required 30 seconds of customer speech for enrollment, was only available as part of the Amazon Connect contact centre platform, and could not be used as a standalone SDK in a custom application. Eagle Speaker Recognition completes enrollment in seconds from any natural speech, works as a standalone cross-platform engine in any application, and processes all audio on-device.

How does Eagle Speaker Recognition compare to SpeechBrain Speaker Recognition?

SpeechBrain Speaker Recognition is an open-source speech toolkit with speaker recognition capabilities. It achieves 0.49% EER versus Eagle's 0.18% — 2.7× higher error rate — with a model size of 117.5 MB versus Eagle's 4.5 MB, making it 26× larger. It is a research framework with community-only support and no production SDK.

How does Eagle Speaker Recognition compare to pyannote Speaker Recognition?

pyannote achieves 0.70% EER versus Eagle's 0.18% — 3.9× higher error rate — with a model size of 46.5 MB versus Eagle's 4.5 MB, 10× larger. Open-source pyannote is Python-only, supports Linux and macOS, requires HuggingFace account setup and manual model condition acceptance, and often needs retraining for real-world deployment. Eagle Speaker Recognition runs entirely on-device with a production-grade cross-platform engine.

Do you have a benchmark comparing Eagle Speaker Recognition to alternatives?

Yes. Picovoice publishes an open-source speaker recognition benchmark comparing Eagle Speaker Recognition against SpeechBrain Speaker Recognition and pyannote Speaker Recognition on the VoxConverse dataset. Eagle achieves 0.18% EER, the lowest of all benchmarked engines: 2.7× lower than SpeechBrain (0.49%) and 3.9× lower than pyannote (0.70%). Eagle's model requires 4.5 MB to initialize, also the lowest of all benchmarked engines, compared with 117.5 MB for SpeechBrain and 46.5 MB for pyannote. Azure AI Speaker Recognition and Amazon Connect Voice ID are not included because they have been retired.

Can I combine Eagle Speaker Recognition with Porcupine Wake Word Detection?

Yes. Eagle and Porcupine integrate directly for a two-stage personalised wake word pipeline where Porcupine detects the wake word, Eagle verifies the speaker, and the device activates only for the enrolled user. This enables personalised and secure voice activation for XR glasses, smart earbuds, smartwatches, laptops, and any shared device where unauthorised activation is a security or user experience concern.
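
The two-stage decision reduces to a simple gate: activate only when the wake word has fired and the speaker matches the enrolled user. The sketch below models just that logic with plain values. The `Frame` type, the field names, and the 0.6 threshold are hypothetical and stand in for the outputs of the wake word and speaker recognition engines. It does not call either SDK.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    wake_word_detected: bool  # stage 1 result (stand-in for the wake word engine output)
    owner_score: float        # stage 2 result (stand-in for similarity to the enrolled user)


def should_activate(frame: Frame, threshold: float = 0.6) -> bool:
    """Activate only when the wake word fired AND the speaker matches the enrolled user."""
    return frame.wake_word_detected and frame.owner_score >= threshold


frames = [
    Frame(wake_word_detected=False, owner_score=0.9),  # owner talking, no wake word
    Frame(wake_word_detected=True, owner_score=0.2),   # someone else said the wake word
    Frame(wake_word_detected=True, owner_score=0.8),   # owner said the wake word
]
decisions = [should_activate(f) for f in frames]
print(decisions)  # [False, False, True]
```

Only the third frame activates the device, which is the behaviour that prevents false activations in shared environments.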

Does Eagle Speaker Recognition require a passphrase?

No. Eagle Speaker Recognition is text-independent. It identifies speakers based on voice characteristics alone, regardless of what is said. No passphrase, no scripted phrases, no language restriction. Speakers enroll and are identified using any natural speech.

Is Eagle Speaker Recognition language-independent?

Yes. Eagle Speaker Recognition is language-agnostic, trained on diverse speech corpora spanning multiple languages and dialects, removing language dependency entirely. Language-dependent speaker recognition engines degrade when the speaker's language differs from the training data. For example, an engine trained on English may perform significantly worse on German or Hindi speakers. Eagle's model is trained to capture speaker identity from voice characteristics that are consistent across languages, meaning performance does not degrade based on what language the speaker uses, whether they code-switch mid-conversation, or whether their language was represented in the training data. This makes Eagle Speaker Recognition suitable for global deployments without per-language configuration or per-market retraining.

How long does enrollment take with Eagle Speaker Recognition?

Enrollment completes in seconds from any natural speech. The Eagle profiler provides real-time feedback on enrollment progress, based on audio quality and the diversity of sounds, so the application can guide users through enrollment without a dedicated scripted session.

Does Eagle Speaker Recognition support real-time speaker identification?

Yes. Eagle Speaker Recognition processes audio continuously and returns a similarity score per enrolled speaker per frame in real time, without waiting for an utterance to end. This enables real-time speaker change detection, live personalization, and access control during ongoing conversations.
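
One way an application might turn per-frame scores into speaker change events is to smooth the scores over a short window and report whenever the top speaker changes. The sketch below assumes each frame yields a dictionary of speaker name to similarity score in the 0 to 1 range. The function name, the window of 3 frames, and the 0.5 threshold are illustrative choices, not SDK behaviour.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def speaker_changes(
    frame_scores: Iterable[dict[str, float]],
    threshold: float = 0.5,
    window: int = 3,
) -> Iterator[tuple[int, Optional[str]]]:
    """Yield (frame_index, speaker) whenever the smoothed top speaker changes.

    Scores are averaged over the last `window` frames to avoid flapping on a
    single noisy frame. None means no enrolled speaker clears the threshold.
    """
    recent: deque = deque(maxlen=window)
    current: Optional[str] = None
    for i, scores in enumerate(frame_scores):
        recent.append(scores)
        avg = {n: sum(f.get(n, 0.0) for f in recent) / len(recent) for n in scores}
        best = max(avg, key=avg.get, default=None)
        speaker = best if best is not None and avg[best] >= threshold else None
        if speaker != current:
            current = speaker
            yield i, speaker


# Toy stream: three frames dominated by "alice", then three by "bob".
stream = [{"alice": 0.9, "bob": 0.1}] * 3 + [{"alice": 0.1, "bob": 0.9}] * 3
events = list(speaker_changes(stream))
print(events)  # [(0, 'alice'), (4, 'bob')]
```

The change to "bob" is reported at frame 4 rather than frame 3 because of the smoothing window. A shorter window reacts faster but is more sensitive to noisy frames.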

How many speakers can Eagle Speaker Recognition enroll?

There is no limit on enrolled speaker profiles. Profiles are stored locally, scaling without cloud infrastructure, per-seat fees, or service limits.

Does Eagle Speaker Recognition require a GPU?

No. Eagle Speaker Recognition runs on standard CPU hardware — laptops, desktops, mobile devices, and embedded platforms, including Raspberry Pi. No GPU, no dedicated AI accelerator, and no special runtime required.

Is Eagle Speaker Recognition GDPR, HIPAA, and CJIS compliant?

Yes. Audio and enrolled voice profiles are processed entirely on-device and never transmitted to any server. Eagle Speaker Recognition is compliant with GDPR, HIPAA, CCPA, and CJIS by architecture — not policy. Picovoice cannot access end-user audio or enrolled voice profiles.

Which platforms does Eagle Speaker Recognition support?

Eagle Speaker Recognition runs on Android, Chrome, Edge, Firefox, iOS, Linux, macOS, Raspberry Pi, Safari, and Windows, with native SDKs for Python, C, NodeJS, iOS, Android, and Web.

How do I get technical support for Eagle Speaker Recognition?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to detect who is speaking. Enterprise customers get dedicated support specific to their applications from Picovoice Product & Engineering teams. Reach out to your Picovoice contact or contact sales to discuss support options.

How can I get informed about updates and upgrades?

Version changes appear in the and LinkedIn. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Eagle Speaker Recognition, show it by giving a GitHub star!