Leopard Speech-to-Text
Node.js API
API Reference for the Node.js Leopard SDK (npm)
Leopard
Class for the Leopard Speech-to-Text engine.
Leopard can be initialized using the class constructor().
When you are done with an instance, free its resources by calling the release() method.
Leopard.constructor()
Leopard constructor.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
options LeopardOptions : Optional init configuration arguments.
Returns
Leopard: An instance of the Leopard engine.
Leopard.release()
Releases resources acquired by Leopard.
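Because release() has to be called even when transcription throws, it can help to wrap the create/use/release lifecycle in one helper. The following is a minimal sketch, not part of the SDK: withEngine and the Releasable interface are hypothetical names, and the factory passed in is assumed to be something like () => new Leopard(accessKey, options).

```typescript
// Structural stand-in for a Leopard instance; only release() matters here.
interface Releasable {
  release(): void;
}

// Creates an engine, runs `use` on it, and always releases it afterwards,
// whether `use` returns normally or throws.
function withEngine<E extends Releasable, R>(
  create: () => E,
  use: (engine: E) => R
): R {
  const engine = create();
  try {
    return use(engine);
  } finally {
    engine.release();
  }
}
```

The helper is synchronous, which fits the process() and processFile() signatures documented here. If the work inside `use` were asynchronous, the helper would need to await it before releasing.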
Leopard.sampleRate
Getter for audio sample rate accepted by Leopard.
Returns
number: Audio sample rate accepted by Leopard.
Leopard.version
Getter for version.
Returns
string: Current Leopard version.
Leopard.process()
Processes the given audio data with the speech-to-text engine. The incoming audio must have a sample rate equal
to Leopard.sampleRate and be 16-bit linearly-encoded. Leopard operates on single-channel audio.
Parameters
pcm Array<number> : Audio data.
Returns
LeopardTranscript: Inferred transcription.
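Audio that arrives as floating-point or stereo samples has to be brought into the 16-bit, single-channel form described above before it is passed to process(). The sketch below shows one common way to do that. Both function names are hypothetical, the input is assumed to be floats in [-1, 1], and resampling to Leopard.sampleRate is not covered.

```typescript
// Converts floating-point samples in [-1, 1] to 16-bit linear PCM values.
// Out-of-range input is clamped rather than wrapped.
function floatTo16BitPcm(samples: number[]): number[] {
  return samples.map((s) => {
    const clamped = Math.max(-1, Math.min(1, s));
    // The negative range reaches -32768; the positive range tops out at 32767.
    return Math.round(clamped < 0 ? clamped * 32768 : clamped * 32767);
  });
}

// Averages interleaved stereo frames (L, R, L, R, ...) into a single channel.
// A trailing unpaired sample is ignored.
function stereoToMono(interleaved: number[]): number[] {
  const mono: number[] = [];
  for (let i = 0; i + 1 < interleaved.length; i += 2) {
    mono.push((interleaved[i] + interleaved[i + 1]) / 2);
  }
  return mono;
}
```

The result is a plain array to match the Array<number> type listed for pcm.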
Leopard.processFile()
Processes an audio file with the speech-to-text engine.
Parameters
audioPath string : Absolute path to the audio file. The supported formats are: FLAC, MP3, Ogg, WAV, WebM, MP4/m4a (AAC), and 3gp (AMR).
Returns
LeopardTranscript: Inferred transcription.
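A cheap pre-check on the file name can reject obviously unsupported inputs before processFile() is called. This is only a sketch: hasSupportedExtension is a hypothetical helper, and the extension list is an assumption derived from the format names above, not an official list. The engine itself remains the authority on what it can decode.

```typescript
// Extensions assumed to correspond to the documented formats:
// FLAC, MP3, Ogg, WAV, WebM, MP4/m4a (AAC), and 3gp (AMR).
const SUPPORTED_EXTENSIONS = ["flac", "mp3", "ogg", "wav", "webm", "mp4", "m4a", "3gp"];

// Returns true when the file name ends in one of the assumed extensions.
// Handles both "/" and "\" separators and ignores letter case.
function hasSupportedExtension(audioPath: string): boolean {
  const fileName = audioPath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return false; // no extension, or a dotfile such as ".mp3"
  }
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot + 1).toLowerCase());
}
```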
Leopard.listAvailableDevices()
Lists all available devices that Leopard can use for inference. Each entry in the list can be passed as the device option of the constructor.
Parameters
options LeopardInputOptions : Optional input configuration arguments.
Returns
string[]: List of all available devices that Leopard can use.
LeopardOptions
Leopard init options type.
modelPath string : Path to the file containing model parameters (.pv).
device string? : String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, Leopard picks the most suitable device. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
libraryPath string : Path to the Leopard dynamic library (.node).
enableAutomaticPunctuation boolean : Whether to enable automatic punctuation. Default is false.
enableDiarization boolean : Whether to enable speaker diarization, which lets Leopard differentiate speakers as part of the transcription process. When set to true, word metadata includes a speakerTag that identifies unique speakers.
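The device option is a small string grammar (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}), so building it from typed values can catch mistakes early. The sketch below is illustrative only: DeviceChoice, deviceString, and checkInt are hypothetical names, and the lower bounds (GPU index at least 0, thread count at least 1) are assumptions rather than documented limits.

```typescript
// Typed view of the device selectors described above.
type DeviceChoice =
  | { kind: "best" }
  | { kind: "gpu"; index?: number }
  | { kind: "cpu"; threads?: number };

// Validates that `value` is an integer no smaller than `min` (assumed bounds).
function checkInt(value: number, min: number, name: string): number {
  if (!Number.isInteger(value) || value < min) {
    throw new RangeError(`${name} must be an integer >= ${min}, got ${value}`);
  }
  return value;
}

// Builds the string to pass as the `device` field of LeopardOptions.
function deviceString(choice: DeviceChoice): string {
  switch (choice.kind) {
    case "best":
      return "best";
    case "gpu":
      return choice.index === undefined
        ? "gpu"
        : `gpu:${checkInt(choice.index, 0, "GPU index")}`;
    case "cpu":
      return choice.threads === undefined
        ? "cpu"
        : `cpu:${checkInt(choice.threads, 1, "thread count")}`;
  }
}
```

For example, deviceString({ kind: "cpu", threads: 4 }) yields cpu:4, which could then be placed in the options object given to the constructor.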
LeopardInputOptions
Leopard input options type.
libraryPath string : Path to the Leopard dynamic library (.node).
LeopardWord
Object which contains a transcribed word and its associated metadata.
word string : Transcribed word.
startSec number : Start of word in seconds.
endSec number : End of word in seconds.
confidence number : Transcription confidence. It is a number within [0, 1].
speakerTag number : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
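When diarization is enabled, per-word speaker tags are often easier to read once consecutive words from the same speaker are merged into turns. The sketch below shows one way to do that over a list of LeopardWord objects. SpeakerTurn and toSpeakerTurns are hypothetical names, and the LeopardWord interface is re-declared locally from the fields above so the example needs no SDK import.

```typescript
// Local copy of the documented LeopardWord shape.
interface LeopardWord {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
}

// Hypothetical result type: a run of consecutive words from one speaker.
interface SpeakerTurn {
  speakerTag: number;
  startSec: number;
  endSec: number;
  text: string;
}

// Merges consecutive words that share a speakerTag into a single turn.
function toSpeakerTurns(words: LeopardWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speakerTag === w.speakerTag) {
      // Same speaker keeps talking: extend the current turn.
      last.endSec = w.endSec;
      last.text += " " + w.word;
    } else {
      turns.push({
        speakerTag: w.speakerTag,
        startSec: w.startSec,
        endSec: w.endSec,
        text: w.word,
      });
    }
  }
  return turns;
}
```

If diarization was not enabled, every word carries the tag -1 and the function returns a single turn covering the whole input.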
LeopardTranscript
Object which contains the transcription results of the engine:
transcript string : Inferred transcription.
words LeopardWord[] : Transcribed words and their associated metadata.
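The per-word confidence values can be used to flag parts of a result that may need review. The following sketch rebuilds the text from the words array and brackets any word below a chosen threshold. markUncertain and the threshold value are hypothetical, the interfaces are local copies of the shapes documented above, and joining words with single spaces is an assumption about how the text should be reassembled.

```typescript
// Local copies of the documented result shapes.
interface LeopardWord {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
}

interface LeopardTranscript {
  transcript: string;
  words: LeopardWord[];
}

// Rebuilds the text from `words`, wrapping low-confidence words as "[word?]".
function markUncertain(result: LeopardTranscript, threshold: number): string {
  return result.words
    .map((w) => (w.confidence < threshold ? `[${w.word}?]` : w.word))
    .join(" ");
}
```

A threshold of 0 leaves the text unmarked, since confidence is never below 0.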
Errors
Exceptions thrown if an error occurs within the Leopard Speech-to-Text engine.
Exceptions: