Eagle Speaker Recognition
Web API
API Reference for the Eagle Web SDK (npm).
Eagle
Class for using the recognizer component of the Eagle Speaker Recognition engine on the main application thread. The recognizer processes incoming audio in consecutive frames and emits a similarity score for each enrolled speaker.
Eagle.create()
Creates an instance of the recognizer component of the Eagle Speaker Recognition engine.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
model
EagleModel : Eagle model options.
speakerProfiles
EagleProfile[] | EagleProfile : One or more Eagle speaker profiles. These can be constructed using EagleProfiler.
Returns
- Eagle : An instance of Eagle.
Eagle.process()
Processes a frame of audio and returns a list of similarity scores for each speaker profile.
Parameters
pcm
Int16Array : A frame of audio samples. The number of samples per frame can be obtained from .frameLength. The incoming audio needs to have a sample rate equal to .sampleRate and be 16-bit linearly-encoded. Eagle operates on single-channel audio.
Returns
- number[] : A list of similarity scores for each speaker profile. A higher score indicates that the voice belongs to the corresponding speaker. The range is [0, 1] with 1.0 representing a perfect match.
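As a minimal sketch of how the returned scores might be consumed, the helper below picks the index of the highest-scoring speaker profile. The function name `bestMatch` and the `threshold` parameter are hypothetical: Eagle does not define a cut-off, so the value is an application-level choice.

```typescript
// Hypothetical helper, not part of the Eagle SDK.
// Returns the index (into speakerProfiles) of the highest score that is
// at least `threshold`, or -1 when no score reaches it.
function bestMatch(scores: number[], threshold: number): number {
  let best = -1;
  let bestScore = -Infinity;
  scores.forEach((score, index) => {
    // Strict ">" keeps the first profile when two scores tie.
    if (score >= threshold && score > bestScore) {
      best = index;
      bestScore = score;
    }
  });
  return best;
}
```

The threshold trades false accepts against false rejects, so it is usually tuned on recordings from the target environment.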
Eagle.reset()
Resets the internal state of the engine. It is best to call this before processing a new sequence of audio (e.g. a new voice interaction). This ensures that the accuracy of the engine is not affected by a change in audio context.
Eagle.release()
Releases resources acquired by Eagle.
Eagle.frameLength
Number of audio samples per frame expected by Eagle (i.e. the length of the array passed into .process()).
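Because .process() expects exactly .frameLength samples per call, a longer recording has to be cut into frames first. The sketch below shows one way to do that; `splitIntoFrames` is a hypothetical helper, and dropping the trailing partial frame is an assumption (an application could buffer it for the next call instead).

```typescript
// Hypothetical helper, not part of the Eagle SDK.
// Cuts a PCM buffer into consecutive frames of `frameLength` samples.
// Samples left over at the end (fewer than one frame) are dropped.
function splitIntoFrames(pcm: Int16Array, frameLength: number): Int16Array[] {
  if (frameLength <= 0 || Math.floor(frameLength) !== frameLength) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  for (let start = 0; start + frameLength <= pcm.length; start += frameLength) {
    // slice() copies, so each frame owns its own buffer.
    frames.push(pcm.slice(start, start + frameLength));
  }
  return frames;
}
```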
Eagle.sampleRate
Audio sample rate accepted by Eagle.
Eagle.version
Version of Eagle.
EagleModel
Eagle model type.
base64
string : The model file (.pv) as a base64 string, used to initialize Eagle.
publicPath
string : The model file (.pv) path relative to the public directory.
customWritePath
string : Custom path to save the model in storage. Set to a different name to use multiple models across Eagle instances.
forceWrite
boolean : Flag to overwrite the model in storage even if it exists.
version
number : Version of the model file. Increment to update the model file in storage.
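For illustration, a model options object using the fields above might look like the following. The local `EagleModelOptions` type only mirrors the documented field names (in an application the type would come from the SDK package), treating every field as optional is an assumption, and the path and storage name are placeholders.

```typescript
// Local stand-in for the documented EagleModel fields, for illustration only.
type EagleModelOptions = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
};

const model: EagleModelOptions = {
  publicPath: "/models/eagle_params.pv", // placeholder path under the public directory
  customWritePath: "eagle_model_main",   // placeholder storage name
  forceWrite: false,                      // keep the stored copy if one already exists
  version: 2,                             // bump to refresh the stored model file
};
```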
EagleProfile
Eagle speaker profile. Can be created by calling .export() after a successful speaker enrollment.
bytes
Uint8Array: Binary array containing the Eagle speaker profile.
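Since a profile is just a byte array, it can be serialized for storage and restored later. The sketch below uses a hex string as the storage format; that choice and both helper names are assumptions for illustration, not something the SDK prescribes.

```typescript
// Hypothetical helpers, not part of the Eagle SDK.
// Encodes profile bytes as a lowercase hex string.
function profileBytesToHex(bytes: Uint8Array): string {
  let hex = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    hex += (b < 16 ? "0" : "") + b.toString(16);
  }
  return hex;
}

// Decodes a hex string produced by profileBytesToHex back into bytes.
function hexToProfileBytes(hex: string): Uint8Array {
  if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
    throw new Error("input is not a valid hex string");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.substring(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}
```

The restored array can then be wrapped as `{ bytes }` to form an EagleProfile-shaped object.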
EagleProfiler
Class for using the profiler component of the Eagle Speaker Recognition engine on the main thread of your application. The profiler is responsible for enrolling a speaker given a set of utterances and exporting a speaker profile.
EagleProfiler.create()
Creates an instance of the profiler component of the Eagle Speaker Recognition engine.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
model
EagleModel : Eagle model options.
Returns
- EagleProfiler : An instance of EagleProfiler.
EagleProfiler.enroll()
Enrolls a speaker. This function should be called multiple times with different utterances of the same speaker until the enrollment percentage reaches 100.0, at which point a speaker voice profile can be exported using .export(). Any further enrollment can be used to improve the speaker voice profile.
The minimum number of required samples can be obtained from .minEnrollSamples.
The audio data used for enrollment should satisfy the following requirements:
- only one speaker should be present in the audio
- the speaker should be speaking in a normal voice
- the audio should contain no speech from other speakers and no other sounds (e.g. music)
- it should be captured in a quiet environment with no background noise
Parameters
pcm
Int16Array : Audio data. The audio needs to have a sample rate equal to .sampleRate and be 16-bit linearly-encoded. EagleProfiler operates on single-channel audio.
Returns
- EagleProfilerEnrollResult : The percentage of completeness of the speaker enrollment process along with the feedback code corresponding to the last enrollment attempt.
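The enrollment flow described above (call .enroll() repeatedly until the percentage reaches 100) can be sketched as a small loop. The `ProfilerLike` and `EnrollResultLike` types are local stand-ins that mirror only the documented members; whether .enroll() returns its result directly or through a promise is treated as unknown here, which is why the result is awaited either way.

```typescript
// Local stand-ins for the documented shapes, for illustration only.
type EnrollResultLike = { feedback: unknown; percentage: number };
interface ProfilerLike {
  enroll(pcm: Int16Array): Promise<EnrollResultLike> | EnrollResultLike;
}

// Hypothetical helper: feeds utterances to the profiler until enrollment
// reaches 100% or the utterances run out. Returns the last reported percentage.
async function enrollUtterances(
  profiler: ProfilerLike,
  utterances: Int16Array[]
): Promise<number> {
  let percentage = 0;
  for (let i = 0; i < utterances.length && percentage < 100; i++) {
    const result = await profiler.enroll(utterances[i]);
    percentage = result.percentage;
  }
  return percentage;
}
```

Once the returned value is at least 100, the profile is ready for .export(); a lower value means more utterances are needed.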
EagleProfiler.export()
Exports the speaker profile of the current session. Throws an error if the profile is not ready.
Returns
- EagleProfile : The Eagle speaker profile.
EagleProfiler.reset()
Resets the internal state of Eagle Profiler. It should be called before starting a new enrollment session.
EagleProfiler.release()
Releases resources acquired by Eagle Profiler.
EagleProfiler.minEnrollSamples
The minimum length of the input pcm required by .enroll().
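When audio arrives in small chunks (for example from a microphone callback), the chunks can be joined until the buffer is long enough for .enroll(). This is a sketch under that assumption; `concatPcm` and `hasEnoughSamples` are hypothetical helpers, not SDK functions.

```typescript
// Hypothetical helpers, not part of the Eagle SDK.
// Joins PCM chunks into one contiguous buffer, preserving order.
function concatPcm(chunks: Int16Array[]): Int16Array {
  let total = 0;
  chunks.forEach((chunk) => { total += chunk.length; });
  const out = new Int16Array(total);
  let offset = 0;
  chunks.forEach((chunk) => {
    out.set(chunk, offset);
    offset += chunk.length;
  });
  return out;
}

// True when the buffer meets the minimum length reported by .minEnrollSamples.
function hasEnoughSamples(pcm: Int16Array, minEnrollSamples: number): boolean {
  return pcm.length >= minEnrollSamples;
}
```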
EagleProfiler.sampleRate
Audio sample rate accepted by Eagle.
EagleProfiler.version
Version of Eagle.
EagleProfilerEnrollFeedback
AUDIO_OK : The audio is good for enrollment.
AUDIO_TOO_SHORT : Audio length is insufficient for enrollment, i.e. it is shorter than .minEnrollSamples.
UNKNOWN_SPEAKER : There is another speaker in the audio that is different from the speaker being enrolled. Too much background noise may cause this error as well.
NO_VOICE_FOUND : The audio does not contain any voice, i.e. it is silent or has a low signal-to-noise ratio.
QUALITY_ISSUE : The audio quality is too low for enrollment due to a bad microphone or recording environment.
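One way to surface these codes to a user is a lookup table from feedback name to a short hint. In the sketch below the codes are represented by their names as strings, which is a stand-in: how the SDK's EagleProfilerEnrollFeedback values map to these names is not covered here. The hint wording is likewise only an example.

```typescript
// String stand-ins for the documented feedback codes, for illustration only.
type FeedbackName =
  | "AUDIO_OK"
  | "AUDIO_TOO_SHORT"
  | "UNKNOWN_SPEAKER"
  | "NO_VOICE_FOUND"
  | "QUALITY_ISSUE";

// Example user-facing hints; the wording is an application choice.
const feedbackHints: Record<FeedbackName, string> = {
  AUDIO_OK: "Audio accepted.",
  AUDIO_TOO_SHORT: "Please speak for longer.",
  UNKNOWN_SPEAKER: "Another voice or too much noise was detected.",
  NO_VOICE_FOUND: "No voice was detected. Please speak closer to the microphone.",
  QUALITY_ISSUE: "Audio quality is too low. Try a quieter place or a better microphone.",
};

// Hypothetical helper: returns the hint for a feedback name.
function hintFor(name: FeedbackName): string {
  return feedbackHints[name];
}
```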
EagleProfilerEnrollResult
Type for storing the results of .enroll().
feedback
EagleProfilerEnrollFeedback : A feedback code corresponding to the last enrollment attempt.
percentage
number : The percentage of completeness of the speaker enrollment process.
EagleProfilerWorker
Class for using the profiler component of the Eagle Speaker Recognition engine on a worker thread. The profiler is responsible for enrolling a speaker given a set of utterances and exporting a speaker profile.
EagleProfilerWorker.create()
Creates an instance of the profiler component of the Eagle Speaker Recognition engine.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
model
EagleModel : Eagle model options.
Returns
- EagleProfilerWorker : An instance of EagleProfilerWorker.
EagleProfilerWorker.enroll()
Enrolls a speaker. This function should be called multiple times with different utterances of the same speaker until the enrollment percentage reaches 100.0, at which point a speaker voice profile can be exported using .export(). Any further enrollment can be used to improve the speaker voice profile.
The minimum number of required samples can be obtained from .minEnrollSamples.
The audio data used for enrollment should satisfy the following requirements:
- only one speaker should be present in the audio
- the speaker should be speaking in a normal voice
- the audio should contain no speech from other speakers and no other sounds (e.g. music)
- it should be captured in a quiet environment with no background noise
Parameters
pcm
Int16Array : Audio data. The audio needs to have a sample rate equal to .sampleRate and be 16-bit linearly-encoded. EagleProfilerWorker operates on single-channel audio.
Returns
- EagleProfilerEnrollResult : The percentage of completeness of the speaker enrollment process along with the feedback code corresponding to the last enrollment attempt.
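Browser audio sources such as the Web Audio API deliver samples as 32-bit floats in the range [-1, 1], so they need converting to 16-bit linear PCM before enrollment or recognition. The sketch below shows one common conversion; `floatToInt16` is a hypothetical helper, and it assumes the input is already single-channel and at the required sample rate.

```typescript
// Hypothetical helper, not part of the Eagle SDK.
// Converts float samples in [-1, 1] to 16-bit signed PCM.
// Out-of-range input is clamped rather than allowed to wrap around.
function floatToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    out[i] = clamped < 0 ? Math.round(clamped * 32768) : Math.round(clamped * 32767);
  }
  return out;
}
```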
EagleProfilerWorker.export()
Exports the speaker profile of the current session. Throws an error if the profile is not ready.
Returns
- EagleProfile : The Eagle speaker profile.
EagleProfilerWorker.reset()
Resets the internal state of Eagle Profiler. It should be called before starting a new enrollment session.
EagleProfilerWorker.release()
Releases resources acquired by Eagle Profiler.
EagleProfilerWorker.terminate()
Force terminates the instance of EagleProfilerWorker.
EagleProfilerWorker.minEnrollSamples
The minimum length of the input pcm required by .enroll().
EagleProfilerWorker.sampleRate
Audio sample rate accepted by Eagle.
EagleProfilerWorker.version
Version of Eagle.
EagleWorker
Class for using the recognizer component of the Eagle Speaker Recognition engine on a worker thread. The recognizer processes incoming audio in consecutive frames and emits a similarity score for each enrolled speaker.
EagleWorker.create()
Creates an instance of the recognizer component of the Eagle Speaker Recognition engine.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
model
EagleModel : Eagle model options.
speakerProfiles
EagleProfile[] | EagleProfile : One or more Eagle speaker profiles. These can be constructed using EagleProfiler.
Returns
- EagleWorker : An instance of EagleWorker.
EagleWorker.process()
Processes a frame of audio and returns a list of similarity scores for each speaker profile.
Parameters
pcm
Int16Array : A frame of audio samples. The number of samples per frame can be obtained from .frameLength. The incoming audio needs to have a sample rate equal to .sampleRate and be 16-bit linearly-encoded. Eagle operates on single-channel audio.
Returns
- number[] : A list of similarity scores for each speaker profile. A higher score indicates that the voice belongs to the corresponding speaker. The range is [0, 1] with 1.0 representing a perfect match.
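Because scores are produced per frame, an application may want a single score per speaker for a whole utterance. Averaging the per-frame scores is one simple way to do that; it is an illustrative choice rather than something the SDK prescribes, and `averageScores` is a hypothetical helper.

```typescript
// Hypothetical helper, not part of the Eagle SDK.
// Averages per-frame score arrays element-wise, giving one score per speaker.
// Every inner array must have the same length (one entry per speaker profile).
function averageScores(perFrameScores: number[][]): number[] {
  if (perFrameScores.length === 0) {
    return [];
  }
  const sums: number[] = perFrameScores[0].map(() => 0);
  perFrameScores.forEach((scores) => {
    if (scores.length !== sums.length) {
      throw new Error("all frames must report the same number of speakers");
    }
    scores.forEach((score, speaker) => {
      sums[speaker] += score;
    });
  });
  return sums.map((sum) => sum / perFrameScores.length);
}
```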
EagleWorker.reset()
Resets the internal state of the engine. It is best to call this before processing a new sequence of audio (e.g. a new voice interaction). This ensures that the accuracy of the engine is not affected by a change in audio context.
EagleWorker.release()
Releases resources acquired by EagleWorker.
EagleWorker.terminate()
Force terminates the instance of EagleWorker.
EagleWorker.frameLength
Number of audio samples per frame expected by Eagle (i.e. the length of the array passed into .process()).
EagleWorker.sampleRate
Audio sample rate accepted by Eagle.
EagleWorker.version
Version of Eagle.