Speaker Identification with Node.js

Speaker Recognition, or Speaker Identification, analyzes distinctive voice characteristics to identify and verify speakers, enabling voice authentication and personalized services. However, many Speaker Recognition applications rely on cloud-based services, so every request pays a network round trip, and the added latency can hurt the user experience. Picovoice's Eagle Speaker Recognition runs on-device instead, maintaining accuracy while bypassing the limitations inherent in cloud-based services.

Speaker Recognition typically involves two steps: Enrollment, where a speaker's voice is registered using a short clip of audio to produce a Speaker Profile, and Recognition, where the Speaker Profile is used to detect when that speaker is speaking given an audio stream.

Let's see how the Eagle Speaker Recognition Node.js SDK can be used to implement a speaker recognition app!

Setup

Install @picovoice/eagle-node using npm. We will also be using @picovoice/pvrecorder-node to get cross-platform audio, so install that as well:

npm install @picovoice/eagle-node @picovoice/pvrecorder-node

@picovoice/eagle-node will perform the speaker enrollment and recognition
@picovoice/pvrecorder-node will be used to record microphone audio

You will also need a Picovoice AccessKey, which can be obtained with a free Picovoice Console account.

Enroll a speaker

Import @picovoice/eagle-node and create an instance of the EagleProfiler class:

const { EagleProfiler } = require("@picovoice/eagle-node")

const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console
const eagleProfiler = new EagleProfiler(accessKey)
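Hard-coding the AccessKey in source makes it easy to leak into version control. One option is a small helper that reads it from an environment variable instead. This is only a sketch: the helper name loadAccessKey and the variable name PICOVOICE_ACCESS_KEY are assumptions made for illustration and are not part of the SDK.

```javascript
// Hypothetical helper: read the AccessKey from an environment variable.
// PICOVOICE_ACCESS_KEY is an assumed name, not an SDK convention.
function loadAccessKey(env = process.env) {
  const key = (env.PICOVOICE_ACCESS_KEY || "").trim()
  if (key.length === 0) {
    throw new Error("Set PICOVOICE_ACCESS_KEY to your Picovoice AccessKey")
  }
  return key
}
```

The returned string can then be used wherever accessKey appears in this tutorial, for example new EagleProfiler(loadAccessKey()).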

Now, import @picovoice/pvrecorder-node and create an instance of the recorder as well. Use the EagleProfiler's .minEnrollSamples as the frameLength, and call .start():

const { PvRecorder } = require("@picovoice/pvrecorder-node")

const enrollRecorder = new PvRecorder(eagleProfiler.minEnrollSamples)
enrollRecorder.start()

When Enrollment is finished and you no longer need audio, stop recording by calling enrollRecorder.stop():

enrollRecorder.stop()

Each call to .read() returns a Promise that resolves to a single audioFrame, which you can then pass to .enroll() to enroll a speaker. The return value provides feedback on the audio quality and the Enrollment percentage. Use the percentage value to know when Enrollment is done and another speaker can be enrolled.

const { EnrollProgress } = require("@picovoice/eagle-node")

// Run this loop inside an async function: CommonJS scripts do not support top-level await
let percentage = 0
while (percentage < 100) {
  const audioFrame = await enrollRecorder.read()
  const result = await eagleProfiler.enroll(audioFrame)
  console.log(result.feedback)
  percentage = result.percentage
}

Once Enrollment reaches 100%, export the speaker profile to use in the next step, Speaker Recognition:

const speakerProfile = eagleProfiler.export()

The returned speakerProfile is an instance of a Uint8Array, so it can also be saved and reused if needed.
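Because the profile is just bytes, persisting it needs nothing beyond Node's built-in fs module. A minimal sketch follows; the helper names saveProfile and loadProfile and the file path are hypothetical, chosen only for this example.

```javascript
const fs = require("fs")

// Write the exported profile bytes to disk.
function saveProfile(filePath, speakerProfile) {
  fs.writeFileSync(filePath, Buffer.from(speakerProfile))
}

// Read the bytes back and copy them into a plain Uint8Array.
function loadProfile(filePath) {
  return new Uint8Array(fs.readFileSync(filePath))
}
```

A later run can then call loadProfile() and hand the result to the Recognition step without enrolling the speaker again.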

Profiles can be made for additional users by calling the .reset() function on the EagleProfiler, and repeating the .enroll() step.
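The enroll, export, reset cycle for several speakers could be sketched as a single loop. The function below is an illustration under stated assumptions rather than SDK code: enrollSpeakers is a made-up name, profiler is expected to offer the .enroll(), .export() and .reset() methods described above, and readFrame is a hypothetical async callback that returns the next audio frame (for example, a wrapper around enrollRecorder.read()).

```javascript
// Sketch: enroll several speakers one after another with a single profiler.
// `readFrame` is a hypothetical async callback that returns the next audio frame.
async function enrollSpeakers(profiler, readFrame, speakerNames) {
  const profiles = new Map()
  for (const name of speakerNames) {
    let percentage = 0
    while (percentage < 100) {
      const result = await profiler.enroll(await readFrame(name))
      percentage = result.percentage
    }
    profiles.set(name, profiler.export())
    profiler.reset() // clear enrollment state before the next speaker
  }
  return profiles
}
```

Keeping the profiles in a Map preserves insertion order, which makes it straightforward to keep speaker names and profiles aligned later on.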

Once profiles have been created for all speakers, don't forget to clean up used resources:

enrollRecorder.release()
eagleProfiler.release()

Perform recognition

Import @picovoice/eagle-node and create an instance of the Eagle class, using the speaker profile(s) created by the Enrollment step:

const { Eagle } = require("@picovoice/eagle-node")

const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console
const eagle = new Eagle(accessKey, speakerProfile)

Now set up PvRecorder to use with Eagle:

const { PvRecorder } = require("@picovoice/pvrecorder-node")

const recognizerRecorder = new PvRecorder(eagle.frameLength)
recognizerRecorder.start()

Pass audio frames into the eagle.process() function to get the speaker scores:

// Inside an async function, since read() returns a Promise
while (true) {
  const audioFrame = await recognizerRecorder.read()
  const scores = eagle.process(audioFrame) // array of numbers: the speaker scores
}
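To turn an array of scores into a decision, one simple approach is to take the highest score that clears a threshold. The sketch below assumes that scores[i] corresponds to the i-th enrolled profile and that a higher score means a more likely match. The function name identifySpeaker and the default threshold of 0.5 are arbitrary choices for illustration, not values taken from the SDK.

```javascript
// Sketch: pick the most likely speaker from one array of scores.
// Assumes scores[i] belongs to speakerNames[i] and that higher means more likely.
// The 0.5 default threshold is an arbitrary starting point, not an SDK value.
function identifySpeaker(scores, speakerNames, threshold = 0.5) {
  let best = -1
  for (let i = 0; i < scores.length; i++) {
    if (scores[i] >= threshold && (best === -1 || scores[i] > scores[best])) {
      best = i
    }
  }
  // null means no enrolled speaker scored high enough in this frame
  return best === -1 ? null : speakerNames[best]
}
```

In practice the threshold is worth tuning on real recordings, since the right value depends on how costly a false match is for the application.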

When finished, don't forget to clean up used resources:

recognizerRecorder.release()
eagle.release()
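A while (true) loop never reaches the cleanup calls, so in a real program the loop needs an exit condition. One way to arrange this, sketched below, is to poll a stop flag and put the cleanup in a finally block. The function name recognizeUntilStopped and the isStopped and onScores callbacks are hypothetical; recorder and eagle stand for the PvRecorder and Eagle instances created above.

```javascript
// Sketch: run recognition until isStopped() returns true, then always clean up.
// `isStopped` and `onScores` are hypothetical callbacks supplied by the caller.
async function recognizeUntilStopped(recorder, eagle, isStopped, onScores) {
  recorder.start()
  try {
    while (!isStopped()) {
      const audioFrame = await recorder.read()
      onScores(eagle.process(audioFrame))
    }
  } finally {
    // Runs on a normal exit and when read() or process() throws
    recorder.stop()
    recorder.release()
    eagle.release()
  }
}
```

The stop flag could, for instance, be a variable that a process.on("SIGINT", ...) handler sets to true when the user presses Ctrl+C.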

Putting It All Together

Here is an example program bringing together everything that has been shown so far:

const { EagleProfiler, Eagle } = require("@picovoice/eagle-node")
const { PvRecorder } = require("@picovoice/pvrecorder-node")

const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console (https://console.picovoice.ai/)

// await is only valid inside an async function in a CommonJS script
async function main() {
  // Step 1: Enrollment
  const eagleProfiler = new EagleProfiler(accessKey)
  const enrollRecorder = new PvRecorder(eagleProfiler.minEnrollSamples)

  enrollRecorder.start()

  let percentage = 0
  while (percentage < 100) {
    const audioFrame = await enrollRecorder.read()
    const result = await eagleProfiler.enroll(audioFrame)
    percentage = result.percentage
  }

  enrollRecorder.stop()

  const speakerProfile = eagleProfiler.export()

  enrollRecorder.release()
  eagleProfiler.release()

  // Step 2: Recognition
  const eagle = new Eagle(accessKey, speakerProfile)
  const recognizerRecorder = new PvRecorder(eagle.frameLength)

  // Leave the loop on Ctrl+C so that the cleanup below actually runs
  let isInterrupted = false
  process.on("SIGINT", () => {
    isInterrupted = true
  })

  recognizerRecorder.start()

  while (!isInterrupted) {
    const audioFrame = await recognizerRecorder.read()
    const scores = eagle.process(audioFrame)
    console.log(scores)
  }

  recognizerRecorder.stop()

  recognizerRecorder.release()
  eagle.release()
}

// Report construction or runtime errors instead of silently swallowing them
main().catch((e) => {
  console.error(e)
  process.exit(1)
})

Next Steps

See the GitHub Node.js Demo for a more complete example. You can also view the Node.js API docs for details on the package.

If Node.js is not your language of choice, Eagle Speaker Recognition has SDKs for a number of other languages and platforms. For example, the same two steps in Python, where access_key, get_next_enroll_audio_data() and get_next_audio_frame() are placeholders you supply:

import pveagle

# Speaker Enrollment
profiler = pveagle.create_profiler(access_key)
percentage = 0.0
while percentage < 100.0:
  percentage, feedback = profiler.enroll(
    get_next_enroll_audio_data())
speaker_profile = profiler.export()

# Speaker Recognition
eagle = pveagle.create_recognizer(
  access_key,
  speaker_profile)
while True:
  scores = eagle.process(
    get_next_audio_frame())

Have you seen our other Node.js tutorials? Don’t forget to check out Real-time Transcription with Node.js, Batch Transcription with Node.js, and Voice Activity Detection with Node.js.
