Leopard Speech-to-Text
Android Quick Start

Platforms

Android (5.0+, API 21+)

Requirements

Picovoice Account and AccessKey
Android Studio
Android device with USB debugging enabled or an Android emulator

Picovoice Account & AccessKey

Sign up or log in to Picovoice Console to get your AccessKey. Make sure to keep your AccessKey secret.

Quick Start

Setup

Install Android Studio.
Include the mavenCentral() repository in the top-level build.gradle. Then add the following to the app's build.gradle:

dependencies {
    // ...
    implementation 'ai.picovoice:leopard-android:${LATEST_VERSION}' // replace with latest version
}

Add the following permissions to the app's AndroidManifest.xml file. INTERNET is needed for AccessKey validation, and RECORD_AUDIO enables recording with an Android device's microphone:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />

Model File

Add the Leopard Speech-to-Text model file to your Android application:

Create a custom model using the Picovoice Console or use a default language model.
Add the model as a bundled resource by placing it under the ${ANDROID_APP}/src/main/assets directory of your Android project.

Usage

Create an instance of the engine with the Leopard Speech-to-Text Builder by passing in your AccessKey, model file and the Android app context:

import ai.picovoice.leopard.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
final String modelPath = "${MODEL_PATH}"; // path relative to the assets folder or absolute path to file on device

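// Declared outside the try block so the instance stays in scope for later calls
Leopard leopard = null;
try {
    leopard = new Leopard.Builder()
        .setAccessKey(accessKey)
        .setModelPath(modelPath)
        .build(appContext);
} catch (LeopardException ex) {
    // Handle the initialization failure here instead of silently ignoring it
}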

Transcribe an audio file by providing the absolute path to the file on device:

File audioFile = new File("${AUDIO_FILE_PATH}");
LeopardTranscript transcript = leopard.processFile(audioFile.getAbsolutePath());

Transcribe raw audio data (16 kHz sample rate, 16-bit linearly encoded, single channel):

short[] getAudioData() {
    // ...
}
LeopardTranscript transcript = leopard.process(getAudioData());
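
When the audio arrives as raw bytes (for example, read from a headerless PCM file), it has to be converted to the short[] that process expects. The following is a minimal sketch, assuming the bytes are 16-bit little-endian mono PCM already sampled at 16 kHz. The class name PcmConverter and the method toSamples are illustrative helpers and are not part of the Leopard SDK:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative helper (not part of the Leopard SDK).
final class PcmConverter {
    private PcmConverter() { }

    /**
     * Converts raw 16-bit little-endian PCM bytes into 16-bit samples.
     * Assumes the data is mono and already at a 16 kHz sample rate;
     * no resampling or channel mixing is performed.
     */
    static short[] toSamples(byte[] pcmBytes) {
        if (pcmBytes.length % 2 != 0) {
            throw new IllegalArgumentException(
                "16-bit PCM data must contain an even number of bytes");
        }
        short[] samples = new short[pcmBytes.length / 2];
        // Each pair of bytes (low byte first) becomes one signed sample
        ByteBuffer.wrap(pcmBytes)
            .order(ByteOrder.LITTLE_ENDIAN)
            .asShortBuffer()
            .get(samples);
        return samples;
    }
}
```

The resulting array could then be returned from a method such as getAudioData() above and passed to leopard.process.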

When done, release resources explicitly:

leopard.delete();

Word Metadata

Along with the transcript, Leopard Speech-to-Text returns metadata for each transcribed word. Available metadata items are:

Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
Confidence: Leopard Speech-to-Text's confidence that the transcribed word is accurate. It is a number within [0, 1].
Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
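
The metadata fields above can be interpreted as in the following sketch. To keep it self-contained, it uses a hypothetical stand-in class, WordInfo, rather than the SDK's own word type; the helper names (describeSpeaker, format, lowConfidence) are likewise assumptions made for illustration. Check the API docs for the exact accessors on the transcript object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: WordInfo is a hypothetical stand-in, not the SDK's class.
final class WordMetadataDemo {
    static final class WordInfo {
        final String word;
        final float startSec;    // start time in seconds
        final float endSec;      // end time in seconds
        final float confidence;  // within [0, 1]
        final int speakerTag;    // -1: diarization off, 0: unknown, >0: speaker id

        WordInfo(String word, float startSec, float endSec,
                 float confidence, int speakerTag) {
            this.word = word;
            this.startSec = startSec;
            this.endSec = endSec;
            this.confidence = confidence;
            this.speakerTag = speakerTag;
        }
    }

    /** Maps a speaker tag to a readable label, following the rules above. */
    static String describeSpeaker(int speakerTag) {
        if (speakerTag == -1) {
            return "diarization off";
        }
        if (speakerTag == 0) {
            return "unknown speaker";
        }
        return "speaker " + speakerTag;
    }

    /** Formats one word with its timing, confidence and speaker label. */
    static String format(WordInfo w) {
        return String.format(Locale.US, "[%.2fs - %.2fs] %s (confidence %.2f, %s)",
            w.startSec, w.endSec, w.word, w.confidence, describeSpeaker(w.speakerTag));
    }

    /** Returns the words whose confidence is below the given threshold. */
    static List<WordInfo> lowConfidence(List<WordInfo> words, float threshold) {
        List<WordInfo> result = new ArrayList<>();
        for (WordInfo w : words) {
            if (w.confidence < threshold) {
                result.add(w);
            }
        }
        return result;
    }
}
```

Filtering on confidence like this is one way to flag words for manual review; the threshold is application-specific.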

Demo

For the Leopard Speech-to-Text Android SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio recordings.

Setup

Clone the Leopard Speech-to-Text repository from GitHub using HTTPS:

git clone --recurse-submodules https://github.com/Picovoice/leopard.git

Usage

Open the Android demo using Android Studio.
Copy your AccessKey from Picovoice Console into the ACCESS_KEY variable in MainActivity.java.
Go to Build > Select Build Variant... and select the language you would like to run the demo in (e.g. enDebug for English, itRelease for Italian).
Run the application on a connected Android device or an Android emulator.

Resources

Package

leopard-android on Maven Central

API

leopard-android API Docs

GitHub

Benchmark

Speech-to-Text Benchmark
