Speaker Diarization in Android Tutorial

Speaker diarization is the process of automatically segmenting an audio recording and labeling each segment by speaker. It is often used in transcription and analysis applications, in settings such as call centers, meetings, and broadcast media.

Picovoice's Falcon Speaker Diarization provides a fast and easy method for performing diarization on device.

The Falcon Speaker Diarization engine is available for Android versions 5.0 (SDK 21) and later.

Falcon Speaker Diarization Android SDK

To integrate the Falcon Speaker Diarization Android SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the following dependency to your app’s build.gradle file:

dependencies {
    implementation 'ai.picovoice:falcon-android:${LATEST_VERSION}'
    implementation 'ai.picovoice:android-voice-processor:${LATEST_VERSION}'
}

This example uses AndroidVoiceProcessor to record audio.

Check out falcon-android on Maven Central Repository and android-voice-processor on Maven Central Repository.

Usage

Permissions

To enable AccessKey validation and recording with your Android device's microphone, add the following to the app's AndroidManifest.xml file:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />

RECORD_AUDIO is a dangerous permission, so on Android 6.0 (API level 23) and later the app must also request it from the user at runtime before recording.

Initialization

Create an instance of the engine with the Falcon.Builder class by passing in your AccessKey (obtained from Picovoice Console) and the Android app context, then get the singleton instance of VoiceProcessor:

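import ai.picovoice.android.voiceprocessor.*;
import ai.picovoice.falcon.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console

// Declared outside the try block so later steps can use it
Falcon falcon = null;
try {
    falcon = new Falcon.Builder()
        .setAccessKey(accessKey)
        .build(appContext);
} catch (FalconException ex) {
    // Handle the initialization failure (e.g. an invalid AccessKey) instead of ignoring it
}

VoiceProcessor voiceProcessor = VoiceProcessor.getInstance();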

Recording Audio Frames

VoiceProcessor delivers recorded audio in chunks, also known as audio frames. The number of samples per frame is chosen when recording starts, while Falcon's getSampleRate() method gives the audio sample rate that Falcon requires. Audio samples must be 16-bit integers.

Use VoiceProcessor.addFrameListener to add a listener to VoiceProcessor that receives audio frames and passes them along to Falcon for processing:

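import java.util.ArrayList;

import ai.picovoice.android.voiceprocessor.*;

// Accumulates every recorded sample until the audio is handed to Falcon
private final ArrayList<Short> pcmData = new ArrayList<>();

voiceProcessor.addFrameListener(frame -> {
    // Append the samples of each incoming frame to the buffer
    for (short sample : frame) {
        pcmData.add(sample);
    }
});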
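
The listener above boxes every sample into a Short object, which can get expensive for long recordings. As a rough sketch of an alternative, the samples could be collected in a growable primitive array instead. PcmBuffer is a hypothetical helper written for this tutorial, not part of the Falcon or VoiceProcessor SDKs, and its methods are synchronized on the assumption that frames may arrive on a thread other than the one that later reads them:

```java
import java.util.Arrays;

/** Growable buffer of 16-bit PCM samples (hypothetical helper, not an SDK class). */
final class PcmBuffer {
    private short[] data = new short[16_000]; // initial capacity is an arbitrary choice
    private int size = 0;

    /** Appends one frame of samples to the end of the buffer. */
    synchronized void append(short[] frame) {
        if (size + frame.length > data.length) {
            // Grow geometrically so that repeated appends stay cheap on average
            data = Arrays.copyOf(data, Math.max(data.length * 2, size + frame.length));
        }
        System.arraycopy(frame, 0, data, size, frame.length);
        size += frame.length;
    }

    /** Returns a trimmed copy of everything appended so far. */
    synchronized short[] toArray() {
        return Arrays.copyOf(data, size);
    }

    /** Number of samples appended so far. */
    synchronized int size() {
        return size;
    }
}
```

With a helper like this, the listener body would reduce to a single buffer.append(frame) call, and buffer.toArray() would yield the short[] that Falcon expects without a separate conversion loop.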

Start processing audio using VoiceProcessor.start by passing in the desired frame length and Falcon's audio sample rate as arguments:

private static final int FRAME_LENGTH = 512;
voiceProcessor.start(FRAME_LENGTH, falcon.getSampleRate());

This starts VoiceProcessor, which then passes audio frames to the listeners registered above.
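
To get a feel for what these numbers mean, the duration of one frame and the memory needed for a recording follow directly from the frame length and the sample rate. The sketch below is illustrative only: FrameMath is a hypothetical helper, and the 16000 Hz sample rate is an assumed value, so a real app should use whatever falcon.getSampleRate() returns.

```java
/** Back-of-the-envelope audio arithmetic (hypothetical helper, not an SDK class). */
final class FrameMath {
    /** Duration of one frame in milliseconds. */
    static double frameMillis(int frameLength, int sampleRate) {
        return 1000.0 * frameLength / sampleRate;
    }

    /** Bytes needed to hold the given number of seconds of 16-bit mono PCM. */
    static long pcmBytes(double seconds, int sampleRate) {
        return Math.round(seconds * sampleRate) * Short.BYTES;
    }

    public static void main(String[] args) {
        int sampleRate = 16_000; // assumed value; query falcon.getSampleRate() in a real app
        System.out.println(frameMillis(512, sampleRate)); // prints 32.0
        System.out.println(pcmBytes(60, sampleRate));     // prints 1920000
    }
}
```

Under that assumption, a 512-sample frame covers 32 ms of audio, and one minute of raw samples takes roughly 1.9 MB when stored as primitive shorts.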

To stop processing audio, call VoiceProcessor.stop:

voiceProcessor.stop();

Processing Audio Frames

Falcon's .process() method takes a short[], so convert the ArrayList to an array and pass it to Falcon for processing:

short[] pcmDataArray = new short[pcmData.size()];
for (int i = 0; i < pcmData.size(); i++) {
    pcmDataArray[i] = pcmData.get(i);
}

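try {
    FalconSegment[] segments = falcon.process(pcmDataArray);
    // Print the speaker tag, start time and end time of each segment
    for (FalconSegment segment : segments) {
        System.out.format(
            "%5d - %5.2f - %5.2f\n",
            segment.getSpeakerTag(),
            segment.getStartSec(),
            segment.getEndSec());
    }
} catch (FalconException ex) {
    // Handle the processing failure
}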

The returned segments array contains one entry per segment, each with its start time, end time, and a speaker tag.
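
Segment timing and speaker tags can be combined into simple statistics, such as how long each speaker talked. The following is a minimal sketch under stated assumptions: Segment is a hypothetical stand-in that mirrors the three getters used above (getSpeakerTag, getStartSec, getEndSec) so the example runs without the Android SDK, and TalkTime is an illustrative helper rather than anything Falcon provides.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sums speaking time per speaker (hypothetical helper, not an SDK class). */
final class TalkTime {
    /** Stand-in for FalconSegment exposing the same three getters. */
    static final class Segment {
        private final int speakerTag;
        private final float startSec;
        private final float endSec;

        Segment(int speakerTag, float startSec, float endSec) {
            this.speakerTag = speakerTag;
            this.startSec = startSec;
            this.endSec = endSec;
        }

        int getSpeakerTag() { return speakerTag; }
        float getStartSec() { return startSec; }
        float getEndSec() { return endSec; }
    }

    /** Returns total seconds spoken, keyed by speaker tag in ascending order. */
    static Map<Integer, Float> totalSecondsBySpeaker(Segment[] segments) {
        Map<Integer, Float> totals = new TreeMap<>();
        for (Segment s : segments) {
            // Add this segment's duration to the speaker's running total
            totals.merge(s.getSpeakerTag(), s.getEndSec() - s.getStartSec(), Float::sum);
        }
        return totals;
    }
}
```

In an app, the same loop could run directly over the FalconSegment[] returned by falcon.process, replacing Segment with FalconSegment.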

Clean up

Call VoiceProcessor.clearFrameListeners and Falcon.delete to remove the frame listeners and release the resources held by Falcon:

voiceProcessor.clearFrameListeners();
falcon.delete();

Working Example

For a complete working project, take a look at the Falcon Speaker Diarization Android Demo.

For more information, check out the Falcon Speaker Diarization product page or refer to the Falcon Speaker Diarization Android SDK quick start guide.
