This article serves as a comprehensive guide for adding on-device Speech Recognition to an Android app.
In the world of software, there is often confusion about the exact meaning of Speech Recognition. Most assume it refers solely to Speech-to-Text features. However, Speech-to-Text is only one facet of Speech Recognition. Other technologies under the umbrella of Speech Recognition include Wake Word Detection, Voice Command Recognition, and Voice Activity Detection (VAD).
Here's a handy guide for selecting an appropriate Speech Recognition approach for your Android application:
- Identify if a person is speaking and when -> Cobra VAD
- Recognize specific phrases or words -> Porcupine Wake Word
- Understand voice commands and extract intent with details (i.e., slot values) -> Rhino Speech-to-Intent
- Transcribe speech to text in real time -> Cheetah Streaming Speech-to-Text
- Transcribe large volumes of recorded audio to text in batches -> Leopard Speech-to-Text
There are also SDKs available for iOS, as well as for the cross-platform mobile frameworks Flutter and React Native.
Cobra VAD
- To integrate the Cobra VAD SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Cobra VAD Android SDK as a dependency in your app’s build.gradle file.
- Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
- Add the RECORD_AUDIO permission to the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
- Create an instance of the VAD engine.
- Find the probability of voice by passing audio frames to the .process function.
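As a rough illustration of the frame loop in the last step, the sketch below splits 16-bit PCM into fixed-size frames and thresholds a per-frame voice probability. The `process` argument is only a stand-in for the engine's `.process` call, and `VoiceGate`, `frameLength`, and `threshold` are hypothetical names and values chosen for illustration, not part of the Cobra SDK.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

/** Sketch: frame a PCM buffer and mark frames whose voice probability passes a threshold. */
final class VoiceGate {
    /**
     * @param pcm         mono 16-bit samples
     * @param frameLength samples per frame (assumed to be reported by the engine)
     * @param threshold   hypothetical tuning value in [0, 1]
     * @param process     stand-in for the engine call; returns a voice probability in [0, 1]
     */
    static boolean[] detect(short[] pcm, int frameLength, double threshold,
                            ToDoubleFunction<short[]> process) {
        int frameCount = pcm.length / frameLength; // a trailing partial frame is dropped
        boolean[] voiced = new boolean[frameCount];
        for (int i = 0; i < frameCount; i++) {
            short[] frame = Arrays.copyOfRange(pcm, i * frameLength, (i + 1) * frameLength);
            voiced[i] = process.applyAsDouble(frame) >= threshold;
        }
        return voiced;
    }
}
```

In an app, the frames would typically come from the microphone rather than a preloaded buffer, and the threshold would be tuned to trade missed speech against false triggers.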
For further details, visit the Cobra VAD product page or refer to Cobra's Android SDK quick start guide.
Porcupine Wake Word
- To integrate the Porcupine Wake Word SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Porcupine Android SDK as a dependency in your app’s build.gradle file.
- Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
- Add the RECORD_AUDIO permission to the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
- Create a custom wake word model using Picovoice Console.
- Download the .ppn model file and copy it into your Android assets folder (${ANDROID_APP}/src/main/assets).
- Initialize the Porcupine Wake Word engine with the .ppn file name (or path relative to the assets folder).
- Detect the keyword by passing audio frames to the .process function.
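The detection step can be sketched as a loop that feeds frames to the engine and reacts when a keyword index comes back. In this minimal sketch, `process` is a stand-in for the engine's `.process` call and is assumed to return the index of the detected keyword, or a negative value when nothing was detected; `WakeWordLoop` and the `keywords` array are hypothetical and not part of the Porcupine SDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Sketch: run frames through a wake word engine stand-in and collect detections. */
final class WakeWordLoop {
    /**
     * @param frames   audio frames of the length the engine expects
     * @param keywords labels in the same order as the loaded keyword models (assumption)
     * @param process  stand-in for the engine call; returns a keyword index or a negative value
     */
    static List<String> run(List<short[]> frames, String[] keywords,
                            ToIntFunction<short[]> process) {
        List<String> detections = new ArrayList<>();
        for (short[] frame : frames) {
            int keywordIndex = process.applyAsInt(frame);
            if (keywordIndex >= 0 && keywordIndex < keywords.length) {
                detections.add(keywords[keywordIndex]); // a real app would trigger its action here
            }
        }
        return detections;
    }
}
```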
For further details, visit the Porcupine Wake Word product page or refer to Porcupine's Android SDK quick start guide.
Rhino Speech-to-Intent
- To integrate the Rhino Speech-to-Intent SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Rhino Android SDK as a dependency in your app’s build.gradle file.
- Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
- Add the RECORD_AUDIO permission to the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
- Create a custom context model using Picovoice Console.
- Download the .rhn model file and copy it into your Android assets folder (${ANDROID_APP}/src/main/assets).
- Initialize the Rhino Speech-to-Intent engine with the .rhn file name (or path relative to the assets folder).
- Infer the user's intent by passing audio frames to the .process function.
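Intent inference differs from wake word detection in that the engine needs several frames before it can decide. The sketch below assumes the engine call signals when it has finalized a result, after which the inference (intent plus slot values) can be read. `IntentLoop`, `Inference`, `process`, and `getInference` are hypothetical stand-ins for illustration, not the Rhino SDK's own types.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Sketch: feed frames until the engine stand-in reports a finalized inference. */
final class IntentLoop {
    /** Hypothetical result holder: an intent name plus its slot values. */
    static final class Inference {
        final boolean understood;
        final String intent;
        final Map<String, String> slots;

        Inference(boolean understood, String intent, Map<String, String> slots) {
            this.understood = understood;
            this.intent = intent;
            this.slots = slots;
        }
    }

    /**
     * @param process      stand-in for the engine call; returns true once inference is finalized
     * @param getInference stand-in for reading the finalized result
     * @return the inference, or empty if the audio ended before the engine finalized
     */
    static Optional<Inference> run(List<short[]> frames, Predicate<short[]> process,
                                   Supplier<Inference> getInference) {
        for (short[] frame : frames) {
            if (process.test(frame)) {
                return Optional.ofNullable(getInference.get());
            }
        }
        return Optional.empty();
    }
}
```

A caller would usually check `understood` before acting on `intent` and `slots`, since a finalized inference can still mean the command was not recognized.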
For further details, visit the Rhino Speech-to-Intent product page or refer to Rhino's Android SDK quick start guide.
Cheetah Streaming Speech-to-Text
- To integrate the Cheetah Streaming Speech-to-Text SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Cheetah Android SDK as a dependency in your app’s build.gradle file.
- Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
- Add the RECORD_AUDIO permission to the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
- Download the .pv language model file from the Cheetah GitHub repository and copy it into your Android assets folder (${ANDROID_APP}/src/main/assets).
- Initialize the Cheetah Streaming Speech-to-Text engine with the .pv file name (or path relative to the assets folder).
- Transcribe speech to text in real time by passing audio frames to the .process function.
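Streaming transcription produces text incrementally, so the caller has to stitch partial results together. The following sketch assumes each engine call returns whatever new text is available for that frame (possibly an empty string) and that a final flush returns any text still buffered. `StreamingTranscriber`, `process`, and `flush` are hypothetical stand-ins here, not the Cheetah SDK's own API.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Sketch: accumulate partial transcripts frame by frame, then flush the remainder. */
final class StreamingTranscriber {
    /**
     * @param process stand-in for the per-frame engine call; returns newly available text
     * @param flush   stand-in for the end-of-stream call; returns any buffered text
     */
    static String transcribe(List<short[]> frames, Function<short[], String> process,
                             Supplier<String> flush) {
        StringBuilder transcript = new StringBuilder();
        for (short[] frame : frames) {
            transcript.append(process.apply(frame)); // a UI could display this partial text live
        }
        transcript.append(flush.get());
        return transcript.toString();
    }
}
```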
For further details, visit the Cheetah Streaming Speech-to-Text product page or refer to Cheetah's Android SDK quick start guide.
Leopard Speech-to-Text
- To integrate the Leopard Speech-to-Text SDK into your Android project, ensure you have included mavenCentral() in your top-level build.gradle file, then add the Leopard Android SDK as a dependency in your app’s build.gradle file.
- Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.
- Add the RECORD_AUDIO permission to the app's AndroidManifest.xml file to enable recording with an Android device's microphone.
- Download the .pv language model file from the Leopard GitHub repository and copy it into your Android assets folder (${ANDROID_APP}/src/main/assets).
- Create an instance of Leopard for speech-to-text transcription.
- Transcribe speech to text by passing an audio file to the .processFile function.
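Because batch transcription works on whole files, a typical pattern is to walk a list of recordings and transcribe each one in turn. The sketch below illustrates that pattern only: `processFile` is a stand-in for the engine's `.processFile` call and is assumed to return the transcript as a string, while `BatchTranscriber` and the `extensions` filter are hypothetical (check the SDK documentation for the audio formats actually supported).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Sketch: transcribe every file whose extension is in an allowed set. */
final class BatchTranscriber {
    /**
     * @param paths       audio file paths to consider
     * @param extensions  lower-case extensions to accept (hypothetical filter)
     * @param processFile stand-in for the engine call; maps a file path to its transcript
     * @return transcripts keyed by path, in input order
     */
    static Map<String, String> transcribeAll(List<String> paths, Set<String> extensions,
                                             Function<String, String> processFile) {
        Map<String, String> results = new LinkedHashMap<>();
        for (String path : paths) {
            int dot = path.lastIndexOf('.');
            String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase(Locale.ROOT);
            if (extensions.contains(ext)) {
                results.put(path, processFile.apply(path));
            }
        }
        return results;
    }
}
```

On Android, long files are best processed off the main thread so the UI stays responsive.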
For further details, visit the Leopard Speech-to-Text product page or refer to Leopard's Android SDK quick start guide.