Leopard Speech-to-Text
Android Quick Start
Platforms
- Android (5.0+, API 21+)
Requirements
- Picovoice Account and AccessKey
- Android Studio
- Android device with USB debugging enabled, or an Android emulator
Picovoice Account & AccessKey
Sign up or log in to Picovoice Console to get your AccessKey. Make sure to keep your AccessKey secret.
Quick Start
Setup
- Install Android Studio.
- Include the mavenCentral() repository in the top-level build.gradle. Then add the Leopard Speech-to-Text Android SDK as a dependency in the app's build.gradle.
- Declare the microphone permission (android.permission.RECORD_AUDIO) in the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
Model File
Add the Leopard Speech-to-Text model file to your Android application:
- Create a custom model using the Picovoice Console or use a default language model.
- Add the model as a bundled resource by placing it under the ${ANDROID_APP}/src/main/assets directory of your Android project.
Usage
Create an instance of the engine with the Leopard Speech-to-Text Builder by passing in your AccessKey, model file and the Android app context:
Transcribe an audio file by providing the absolute path to the file on device:
Transcribe raw audio data (sample rate of 16 kHz, 16-bit linearly encoded and 1 channel):
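Audio read from a file or a stream usually arrives as bytes rather than as 16-bit samples. The sketch below shows one way to turn such bytes into a sample buffer in the format described above. PcmConverter and its methods are hypothetical helpers, not part of the Leopard SDK, and little-endian byte order is an assumption about the audio source.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not part of the Leopard SDK) for preparing raw audio. */
final class PcmConverter {
    /** Sample rate required for raw audio, in Hz. */
    static final int REQUIRED_SAMPLE_RATE = 16_000;

    private PcmConverter() {}

    /**
     * Converts 16-bit mono PCM bytes into samples.
     * Assumption: the bytes are little-endian (the order used by WAV files).
     */
    static short[] toSamples(byte[] pcmBytes) {
        if (pcmBytes.length % 2 != 0) {
            throw new IllegalArgumentException("16-bit PCM needs an even number of bytes");
        }
        short[] samples = new short[pcmBytes.length / 2];
        ByteBuffer.wrap(pcmBytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }

    /** Duration in seconds of a mono buffer at the required sample rate. */
    static double durationSeconds(short[] samples) {
        return (double) samples.length / REQUIRED_SAMPLE_RATE;
    }
}
```

Each element of the resulting array is one 16-bit sample, so 16,000 elements correspond to one second of single-channel audio.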
When done, release resources explicitly:
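One way to make sure the release always happens, even when transcription fails, is a try/finally block. The sketch below uses a hypothetical Engine interface as a stand-in for the real engine object. The method names processFile and delete are assumptions made for illustration, so check the SDK reference for the actual calls.

```java
/** Illustrates releasing an engine in a finally block. Not the Leopard SDK. */
final class ReleaseDemo {
    /** Hypothetical stand-in for an engine that holds native resources. */
    interface Engine {
        String processFile(String absolutePath);

        void delete();
    }

    private ReleaseDemo() {}

    /** Transcribes one file and always releases the engine afterwards. */
    static String transcribeOnce(Engine engine, String absolutePath) {
        try {
            return engine.processFile(absolutePath);
        } finally {
            // Runs on both the normal and the exceptional path.
            engine.delete();
        }
    }
}
```

For an engine that is reused across many recordings, the same idea applies at a coarser level: release it once, for example when the owning Activity or Service is destroyed.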
Word Metadata
Along with the transcript, Leopard Speech-to-Text returns metadata for each transcribed word. Available metadata items are:
- Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
- End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
- Confidence: Leopard Speech-to-Text's confidence that the transcribed word is accurate. It is a number within [0, 1].
- Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
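The metadata items above can be combined to post-process a transcript, for example to group words into speaker turns or to flag words with low confidence. The sketch below is a minimal illustration under stated assumptions: WordInfo is a hypothetical value type that mirrors the listed items, not the SDK's own word class, and the label strings are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

/** Post-processing sketch over per-word metadata. Not the Leopard SDK. */
final class WordMetadataDemo {
    /** Hypothetical value type mirroring the metadata items listed above. */
    static final class WordInfo {
        final String word;
        final float startSec;
        final float endSec;
        final float confidence; // within [0, 1]
        final int speakerTag;   // -1: diarization off, 0: unknown speaker

        WordInfo(String word, float startSec, float endSec, float confidence, int speakerTag) {
            this.word = word;
            this.startSec = startSec;
            this.endSec = endSec;
            this.confidence = confidence;
            this.speakerTag = speakerTag;
        }
    }

    private WordMetadataDemo() {}

    /** Maps a speaker tag to a display label. */
    static String speakerLabel(int speakerTag) {
        if (speakerTag < 0) {
            return "Diarization off";
        }
        return speakerTag == 0 ? "Unknown speaker" : "Speaker " + speakerTag;
    }

    /** Groups consecutive words that share a speaker tag into one line each. */
    static List<String> toSpeakerTurns(List<WordInfo> words) {
        List<String> turns = new ArrayList<>();
        StringBuilder current = null;
        int currentTag = Integer.MIN_VALUE;
        for (WordInfo w : words) {
            if (current == null || w.speakerTag != currentTag) {
                if (current != null) {
                    turns.add(current.toString());
                }
                current = new StringBuilder(speakerLabel(w.speakerTag)).append(':');
                currentTag = w.speakerTag;
            }
            current.append(' ').append(w.word);
        }
        if (current != null) {
            turns.add(current.toString());
        }
        return turns;
    }

    /** Returns the words whose confidence is below the given threshold. */
    static List<String> lowConfidenceWords(List<WordInfo> words, float threshold) {
        List<String> result = new ArrayList<>();
        for (WordInfo w : words) {
            if (w.confidence < threshold) {
                result.add(w.word);
            }
        }
        return result;
    }
}
```

The start and end times are carried along unused here. They could be used in the same loop to timestamp each speaker turn.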
Demo
For the Leopard Speech-to-Text Android SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio recordings.
Setup
Clone the Leopard Speech-to-Text repository from GitHub using HTTPS:
Usage
- Open the Android demo using Android Studio.
- Copy your AccessKey from Picovoice Console into the ACCESS_KEY variable in MainActivity.java.
- Go to Build > Select Build Variant... and select the language you would like to run the demo in (e.g. enDebug -> English, itRelease -> Italian).
- Run the application using a connected Android device or an Android emulator.