Leopard Speech-to-Text
Java API
API Reference for the Java Leopard SDK (leopard-java)
package: ai.picovoice.leopard
Leopard
Class for the Leopard Speech-to-Text engine.
Leopard must be initialized using the Leopard.Builder class. Call delete() to release
resources when you are done with the instance.
Leopard.delete()
Releases resources acquired by Leopard.
Leopard.getSampleRate()
Getter for required audio sample rate for PCM data.
Returns
int: Required audio sample rate for PCM data.
Leopard.getVersion()
Getter for version.
Returns
String: Current Leopard version.
Leopard.process()
Processes the given audio data and returns its transcription. The incoming audio must have a sample rate equal
to Leopard.getSampleRate() and be 16-bit linearly encoded. Leopard also
operates on single-channel audio. To process audio in a different sample rate or format, consider
using Leopard.processFile().
Parameters
pcm short[] : A frame of audio samples.
Returns
LeopardTranscript: Inferred transcription and word metadata.
Throws
LeopardException: if there is an error while processing the audio frame.
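Since process() takes 16-bit samples as a short[], audio that arrives as raw bytes has to be repacked first. The following is a minimal sketch of that step, assuming the bytes are little-endian, single-channel, and already at the sample rate reported by getSampleRate(). The class PcmBytes and its method toShorts are hypothetical helpers and are not part of leopard-java.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Leopard SDK.
final class PcmBytes {
    private PcmBytes() {}

    // Repacks raw 16-bit PCM bytes into a short[].
    // Assumption: the input is little-endian and single channel.
    static short[] toShorts(byte[] raw) {
        if (raw.length % 2 != 0) {
            throw new IllegalArgumentException("16-bit PCM needs an even number of bytes");
        }
        short[] pcm = new short[raw.length / 2];
        ByteBuffer.wrap(raw)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(pcm);
        return pcm;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF, 0x00, (byte) 0x80};
        // prints [1, -1, -32768]
        System.out.println(Arrays.toString(toShorts(raw)));
    }
}
```

The resulting array has the shape expected by the pcm parameter. Resampling and channel downmixing are outside the scope of this sketch.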
Leopard.processFile()
Processes a given audio file and returns its transcription.
Parameters
path String : Absolute path to the audio file. The supported audio file formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.
Returns
LeopardTranscript: Inferred transcription and word metadata.
Throws
LeopardException: if there is an error while processing the audio file.
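Because processFile() expects an absolute path and a supported format, a caller may want a quick pre-check before handing a path to the engine. The sketch below resolves a path to absolute form and compares its extension against a set of extensions. That set is an assumption derived from the format names listed above, not an official list, and the engine itself remains the authority on what it can decode. The class AudioPathCheck and its methods are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Leopard SDK.
final class AudioPathCheck {
    // Assumed file extensions for the documented formats (not an official list).
    private static final Set<String> ASSUMED_EXTENSIONS = new HashSet<>(Arrays.asList(
            "3gp", "flac", "mp3", "mp4", "m4a", "ogg", "wav", "webm"));

    private AudioPathCheck() {}

    // Resolves a possibly relative path against the current working directory.
    static String toAbsolute(String path) {
        return Paths.get(path).toAbsolutePath().normalize().toString();
    }

    // Returns true if the file name ends in one of the assumed extensions.
    static boolean hasAssumedExtension(String path) {
        Path fileName = Paths.get(path).getFileName();
        if (fileName == null) {
            return false;
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return false;
        }
        String ext = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ASSUMED_EXTENSIONS.contains(ext);
    }

    public static void main(String[] args) {
        String path = "recordings/call.WAV";
        System.out.println(toAbsolute(path));
        System.out.println(hasAssumedExtension(path));
    }
}
```

An extension check only catches obvious mistakes. A file with a matching extension can still fail to decode, in which case processFile() reports the problem through LeopardException.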
Leopard.getAvailableDevices()
Retrieves a list of available hardware devices that Leopard can use to run inference.
Parameters
libraryPath String : Path to a native Leopard library. Set to null to use the default library.
Returns
String[]: List of available hardware devices that Leopard can use to run inference.
Throws
LeopardException: If the library file cannot be loaded.
Leopard.getAvailableDevices()
Retrieves a list of available hardware devices that Leopard can use to run inference.
Returns
String[]: List of available hardware devices that Leopard can use to run inference.
Throws
LeopardException: If the default library file cannot be loaded.
Leopard.Builder
Builder for creating an instance of Leopard with a mixture of default arguments.
Parameters
accessKey String : AccessKey obtained from Picovoice Console.
Leopard.Builder.build()
Creates an instance of the Leopard Speech-to-Text engine.
Returns
Leopard: An instance of the Leopard Speech-to-Text engine.
Throws
LeopardException: If an error occurs while creating an instance of the Leopard Speech-to-Text engine.
Leopard.Builder.setAccessKey()
Sets the AccessKey of the builder.
Parameters
accessKey String : AccessKey obtained from Picovoice Console.
Returns
Leopard.Builder: Modified Leopard.Builder object.
Leopard.Builder.setModelPath()
Sets the model path of the builder.
Parameters
modelPath String : Absolute path to the file containing model parameters (.pv).
Returns
Leopard.Builder: Modified Leopard.Builder object.
Leopard.Builder.setDevice()
Sets the device of the builder. If not set, the default device is used.
Parameters
device String : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
Returns
Leopard.Builder: Modified Leopard.Builder object.
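The device argument follows a small string grammar (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}), so it can be convenient to build the value programmatically instead of concatenating strings at each call site. The following is a minimal sketch of that idea. The class DeviceStrings and its members are hypothetical and only produce strings in the documented shapes; they do not check that the device actually exists.

```java
// Hypothetical helper for illustration; not part of the Leopard SDK.
final class DeviceStrings {
    static final String BEST = "best"; // most suitable device, selected automatically
    static final String GPU = "gpu";   // first available GPU
    static final String CPU = "cpu";   // CPU with the default number of threads

    private DeviceStrings() {}

    // Builds "gpu:<index>" for a specific GPU.
    static String gpu(int gpuIndex) {
        if (gpuIndex < 0) {
            throw new IllegalArgumentException("GPU index must be non-negative");
        }
        return "gpu:" + gpuIndex;
    }

    // Builds "cpu:<threads>" for an explicit thread count.
    static String cpu(int numThreads) {
        if (numThreads < 1) {
            throw new IllegalArgumentException("Thread count must be at least 1");
        }
        return "cpu:" + numThreads;
    }

    public static void main(String[] args) {
        System.out.println(gpu(1)); // prints gpu:1
        System.out.println(cpu(4)); // prints cpu:4
    }
}
```

A value such as DeviceStrings.cpu(4) would then be passed as the device argument of setDevice().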
Leopard.Builder.setLibraryPath()
Sets the library path of the builder.
Parameters
libraryPath String : Absolute path to the native Leopard library.
Returns
Leopard.Builder: Modified Leopard.Builder object.
Leopard.Builder.setEnableAutomaticPunctuation()
Setter for enabling automatic punctuation insertion.
Parameters
enableAutomaticPunctuation boolean : Set to true to enable automatic punctuation insertion.
Returns
Leopard.Builder: Modified Leopard.Builder object.
Leopard.Builder.setEnableDiarization()
Setter for enabling speaker diarization.
Parameters
enableDiarization boolean : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speakerTag to identify unique speakers.
Returns
Leopard.Builder: Modified Leopard.Builder object.
LeopardTranscript
Class that contains transcription results returned from Leopard.process()
and Leopard.processFile().
Parameters
transcriptString String : Inferred transcription.
wordArray LeopardTranscript.Word[] : Transcribed words and their associated metadata.
LeopardTranscript.getTranscriptString()
Getter for the inferred transcription.
Returns
String: Inferred transcription.
LeopardTranscript.getWordArray()
Getter for transcribed words and their associated metadata.
Returns
LeopardTranscript.Word[]: Transcribed words and their associated metadata.
LeopardTranscript.Word
Class for storing word metadata from a LeopardTranscript.
Parameters
word String : Transcribed word.
confidence float : Transcription confidence. It is a number within [0, 1].
startSec float : Start of word in seconds.
endSec float : End of word in seconds.
speakerTag int : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
LeopardTranscript.Word.getWord()
Getter for the transcribed word.
Returns
String: Transcribed word.
LeopardTranscript.Word.getConfidence()
Getter for the transcription confidence.
Returns
float: Transcription confidence. It is a number within [0, 1].
LeopardTranscript.Word.getStartSec()
Getter for the start of word in seconds.
Returns
float: Start of word in seconds.
LeopardTranscript.Word.getEndSec()
Getter for the end of word in seconds.
Returns
float: End of word in seconds.
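The start and end times are plain float values in seconds, so displaying them usually involves a formatting step. As a sketch, the helper below turns such values into SubRip-style timestamps (HH:MM:SS,mmm), rounding to the nearest millisecond. The class WordTimestamps and its methods are hypothetical, and the choice of output format is an assumption made for illustration.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the Leopard SDK.
final class WordTimestamps {
    private WordTimestamps() {}

    // Formats a time in seconds as HH:MM:SS,mmm, rounded to the nearest millisecond.
    static String toSrtTime(float seconds) {
        if (Float.isNaN(seconds) || seconds < 0) {
            throw new IllegalArgumentException("Time must be a non-negative number");
        }
        long totalMillis = Math.round(seconds * 1000.0);
        long hours = totalMillis / 3_600_000;
        long minutes = (totalMillis / 60_000) % 60;
        long secs = (totalMillis / 1000) % 60;
        long millis = totalMillis % 1000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, secs, millis);
    }

    // Formats a start/end pair such as the values returned by getStartSec() and getEndSec().
    static String range(float startSec, float endSec) {
        return toSrtTime(startSec) + " --> " + toSrtTime(endSec);
    }

    public static void main(String[] args) {
        // prints 00:00:00,500 --> 00:00:01,250
        System.out.println(range(0.5f, 1.25f));
    }
}
```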
LeopardTranscript.Word.getSpeakerTag()
Getter for the speaker tag.
Returns
int: Speaker tag identifying the speaker of the word.
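With diarization enabled, consecutive words that share a speaker tag can be merged into speaker turns. The following is a minimal sketch of that grouping. To keep it self-contained it uses WordInfo, a hypothetical stand-in that mirrors only the getWord() and getSpeakerTag() getters of LeopardTranscript.Word; the class SpeakerTurns and the label texts are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Leopard SDK.
final class SpeakerTurns {
    // Stand-in for LeopardTranscript.Word with only the two getters used here.
    static final class WordInfo {
        private final String word;
        private final int speakerTag;

        WordInfo(String word, int speakerTag) {
            this.word = word;
            this.speakerTag = speakerTag;
        }

        String getWord() { return word; }
        int getSpeakerTag() { return speakerTag; }
    }

    private SpeakerTurns() {}

    // -1 means diarization was not enabled; 0 is reserved for unknown speakers.
    static String label(int speakerTag) {
        if (speakerTag < 0) {
            return "Speaker n/a";
        }
        if (speakerTag == 0) {
            return "Unknown speaker";
        }
        return "Speaker " + speakerTag;
    }

    // Merges consecutive words with the same speaker tag into one line per turn.
    static List<String> group(List<WordInfo> words) {
        List<String> turns = new ArrayList<>();
        StringBuilder current = null;
        int currentTag = Integer.MIN_VALUE;
        for (WordInfo w : words) {
            if (current == null || w.getSpeakerTag() != currentTag) {
                if (current != null) {
                    turns.add(current.toString());
                }
                currentTag = w.getSpeakerTag();
                current = new StringBuilder(label(currentTag)).append(": ").append(w.getWord());
            } else {
                current.append(' ').append(w.getWord());
            }
        }
        if (current != null) {
            turns.add(current.toString());
        }
        return turns;
    }

    public static void main(String[] args) {
        List<WordInfo> words = Arrays.asList(
                new WordInfo("hello", 1), new WordInfo("there", 1), new WordInfo("hi", 2));
        // prints "Speaker 1: hello there" then "Speaker 2: hi"
        group(words).forEach(System.out::println);
    }
}
```

With real results, the same loop would iterate over the array returned by LeopardTranscript.getWordArray().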
LeopardException
Exception thrown if an error occurs within the Leopard Speech-to-Text engine.
Exceptions: