Speaker Diarization technology is a process that automatically segments and labels an audio recording based on different speakers' voices. It is often used in applications that involve transcription and analysis, in settings such as call centers, meetings, and broadcast media.
Picovoice's Falcon Speaker Diarization provides a fast and easy method for performing diarization on device.
The Falcon Speaker Diarization engine is available for Android versions 5.0 (SDK 21) and later.
Falcon Speaker Diarization Android SDK
To integrate the Falcon Speaker Diarization Android SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Falcon Android dependency (ai.picovoice:falcon-android) to your app's build.gradle file.
This example uses AndroidVoiceProcessor to record audio.
Sign up for Picovoice Console
Sign up for Picovoice Console for free and copy your AccessKey. The AccessKey handles authentication and authorization.
Usage
Permissions
To enable AccessKey validation and recording with your Android device's microphone, add the INTERNET and RECORD_AUDIO permissions to the app's AndroidManifest.xml file.
Initialization
Create an instance of the engine with the Falcon Builder class by passing in the AccessKey from the previous step and the Android app context, and get the singleton instance of VoiceProcessor.
Recording Audio Frames
Falcon Speaker Diarization processes audio in chunks, also known as audio frames. The .frameLength property gives the number of audio samples per frame that are required by Falcon, while the .sampleRate property gives the audio sample rate that is required. Audio samples must be 16-bit integers.
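To make the frame requirement concrete, the following plain-Java sketch splits a buffer of 16-bit samples into fixed-length frames. The names FrameSplitter and splitIntoFrames are hypothetical and not part of the Falcon SDK; in an app, the frame length would come from the engine's .frameLength property. Dropping the trailing partial frame is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Falcon SDK.
final class FrameSplitter {
    private FrameSplitter() {}

    // Splits 16-bit PCM samples into consecutive frames of exactly frameLength samples.
    // Trailing samples that do not fill a whole frame are dropped (an assumption of
    // this sketch; an app could instead buffer them until more audio arrives).
    static List<short[]> splitIntoFrames(short[] pcm, int frameLength) {
        if (frameLength <= 0) {
            throw new IllegalArgumentException("frameLength must be positive");
        }
        List<short[]> frames = new ArrayList<>();
        for (int start = 0; start + frameLength <= pcm.length; start += frameLength) {
            frames.add(Arrays.copyOfRange(pcm, start, start + frameLength));
        }
        return frames;
    }
}
```

For example, with an assumed frame length of 512, a buffer of 1,300 samples yields two full frames, and the remaining 276 samples are left over.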
Use VoiceProcessor.addFrameListener to add a listener to VoiceProcessor that receives audio frames and passes them along to Falcon for processing.
Start processing audio using VoiceProcessor.start by passing in the desired frame length and Falcon's audio sample rate as arguments.
This starts VoiceProcessor, which then passes audio frames to the registered listeners as described above.
To stop processing audio, call VoiceProcessor.stop.
Processing Audio Frames
Falcon's .process() method takes in a short[], so convert the ArrayList holding the recorded audio samples to the required format and pass it to the Falcon instance for processing.
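The conversion step can be sketched in plain Java as follows. This is a minimal illustration that assumes the recorded samples were accumulated in a List of Short values; PcmConverter and toShortArray are hypothetical names and not part of the Falcon SDK.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of the Falcon SDK.
final class PcmConverter {
    private PcmConverter() {}

    // Copies boxed 16-bit samples into a primitive short[].
    // Assumes the list contains no null elements (unboxing a null would throw).
    static short[] toShortArray(List<Short> samples) {
        short[] pcm = new short[samples.size()];
        int i = 0;
        for (Short sample : samples) {
            pcm[i++] = sample;
        }
        return pcm;
    }
}
```

The resulting array is in the short[] form that the .process() method expects.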
The returned segments variable is an array of segments, each of which includes the segment's timing and speaker information.
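As one way to use this output, the sketch below totals the speaking time per speaker. SpeakerSegment is a hypothetical stand-in with assumed fields (start time, end time, and a speaker tag); the SDK's own segment type and accessor names may differ, so check the API documentation before adapting it.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a diarization segment; the field names are
// assumptions and may not match the SDK's own segment type.
final class SpeakerSegment {
    final float startSec;
    final float endSec;
    final int speakerTag;

    SpeakerSegment(float startSec, float endSec, int speakerTag) {
        this.startSec = startSec;
        this.endSec = endSec;
        this.speakerTag = speakerTag;
    }
}

// Hypothetical helper for illustration; not part of the Falcon SDK.
final class SegmentSummary {
    private SegmentSummary() {}

    // Sums segment durations (in seconds) for each speaker tag,
    // returning the totals ordered by tag.
    static Map<Integer, Float> talkTimeBySpeaker(List<SpeakerSegment> segments) {
        Map<Integer, Float> totals = new TreeMap<>();
        for (SpeakerSegment segment : segments) {
            totals.merge(segment.speakerTag, segment.endSec - segment.startSec, Float::sum);
        }
        return totals;
    }
}
```

A summary like this is a common first step for meeting or call-center analytics, where the share of talk time per participant is of interest.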
Clean up
Call VoiceProcessor.clearFrameListeners and Falcon.delete to clear any allocated resources.
Working Example
For a complete working project, take a look at the Falcon Speaker Diarization Android Demo.
For more information, check out the Falcon Speaker Diarization product page or refer to the Falcon Speaker Diarization Android SDK quick start guide.