Every person has unique and distinctive voice characteristics, similar to fingerprints. Voice Biometrics, a subset of Speaker Recognition, is the technology that uses these characteristics to verify individuals. In other words, Voice Biometrics uses voice data to check whether an individual is who they claim to be. Speaker Verification, Voice Authentication, and Voiceprinting are other terms that refer to Voice Biometrics.
Authentication traditionally relies on passwords, which users frequently forget. Voice Biometrics, i.e., Voice Authentication, is a convenient alternative because users do not need to memorize lengthy passwords or carry identification cards. A user can simply speak to authenticate. Hence, call centers, mobile and online applications, and IoT devices offer it as a standalone solution or as part of Multi-Factor Authentication (MFA):
- Customer Care: Citibank uses Voiceprints to verify customers' identities when they call the bank.
- Payment: Amazon Alexa allows users to make purchases with their voice, authenticating them using Voice Biometrics.
- Automation: Google Nest Hub Max uses Voice Match to recognize the voices of different users and provide personalized content and access across Nest devices such as thermostats or smart locks.
- Access: Monument Health in the Mayo Clinic network uses Voice Biometrics to authenticate healthcare providers when they access electronic health records.
How does Voice Biometrics work?
Voice Biometrics verifies a person by comparing their Voice Samples against an original Voice Template. Hence, the first step in Voice Biometrics is Enrollment, creating the original Voice Template. Enrollment and Voiceprint Extraction are interchangeable terms that refer to the same process.
- Voice Biometrics engines capture users’ speech data.
- Voice Biometrics engines process and analyze the data.
- Voice Biometrics engines create a unique Voiceprint, also known as VoiceID and SpeakerID.
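The enrollment steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular engine works: it assumes each captured speech sample has already been turned into a fixed-length list of numbers (a feature vector) by some upstream model, and it builds the voiceprint by averaging those vectors. The function names `l2_normalize` and `enroll` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so loud and quiet samples weigh the same."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def enroll(sample_vectors):
    """Build a voiceprint by averaging unit-length sample vectors.

    Assumption: each item in sample_vectors is a feature vector that a
    (hypothetical) upstream model extracted from one speech sample.
    """
    if not sample_vectors:
        raise ValueError("enrollment needs at least one sample")
    dim = len(sample_vectors[0])
    if any(len(v) != dim for v in sample_vectors):
        raise ValueError("all sample vectors must have the same length")
    normalized = [l2_normalize(v) for v in sample_vectors]
    # Average each dimension across samples, then renormalize the result.
    mean = [sum(column) / len(normalized) for column in zip(*normalized)]
    return l2_normalize(mean)


# Toy usage: three short made-up vectors standing in for three recordings.
voiceprint = enroll([[0.9, 0.1, 0.3], [1.0, 0.2, 0.2], [0.8, 0.1, 0.4]])
```

Averaging several samples, rather than storing a single one, is a simple way to make the template less sensitive to how the user happened to sound in any one recording.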
The second step, Comparison, determines whether the new voice input belongs to the original speaker.
- Voice Biometrics engines capture users’ speech data.
- Voice Biometrics engines process the data and compare it against the existing voiceprint (i.e., VoiceID).
- Voice Biometrics engines return a score. The higher the score, the more likely it is that the sample belongs to the person the speaker claims to be.
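As a rough illustration of the comparison step, the sketch below scores a new sample against a stored voiceprint using cosine similarity. This is one common similarity measure, chosen here as an assumption; the scoring method a given engine uses may differ. Both inputs are assumed to be fixed-length feature vectors, and the name `cosine_score` is hypothetical.

```python
import math


def cosine_score(voiceprint, sample):
    """Return the cosine similarity between two vectors, in [-1.0, 1.0].

    Higher values mean the sample points in a direction closer to the
    enrolled voiceprint, i.e. it is more likely the same speaker.
    """
    if len(voiceprint) != len(sample):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(voiceprint, sample))
    norm_product = (
        math.sqrt(sum(a * a for a in voiceprint))
        * math.sqrt(sum(b * b for b in sample))
    )
    if norm_product == 0.0:
        raise ValueError("cannot score a zero vector")
    return dot / norm_product


# Toy usage with made-up vectors: one close to the voiceprint, one far.
stored = [0.9, 0.1, 0.3]
similar_score = cosine_score(stored, [0.8, 0.2, 0.3])
different_score = cosine_score(stored, [0.1, 0.9, 0.2])
```

In this toy example `similar_score` comes out higher than `different_score`, which matches the rule of thumb above: the higher the score, the more likely the match.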
Please note that some legacy Voice Biometrics engines may return a binary answer instead of a score. They let developers choose one of several pre-defined threshold (sensitivity) levels and then return a positive response, such as “pass,” or a negative one, such as “fail.” The result for end users may be the same. However, this legacy approach limits developers’ visibility and makes them dependent on vendors to adjust threshold levels.
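The difference between the two approaches can be shown with a small sketch. The preset level names and their numeric values below are invented for illustration, as are the function names; no vendor's actual settings are implied.

```python
# Hypothetical preset sensitivity levels, as a legacy engine might expose them.
PRESET_LEVELS = {"low": 0.50, "medium": 0.70, "high": 0.85}


def legacy_verify(score, level):
    """Legacy style: the caller only picks a preset and sees pass/fail."""
    return "pass" if score >= PRESET_LEVELS[level] else "fail"


def decide(score, threshold):
    """Score-based style: the caller sees the score and owns the threshold."""
    return "pass" if score >= threshold else "fail"


# With a raw score, a developer can pick any cut-off, not just a preset.
score = 0.78
legacy_result = legacy_verify(score, "high")   # limited to vendor presets
custom_result = decide(score, threshold=0.75)  # developer-chosen value
```

Here the same score fails the hypothetical "high" preset but passes a developer-chosen threshold of 0.75, a choice that is only possible when the engine exposes the score itself.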
Interested in adding Voice Biometrics to your application? Start building for free!