Leopard Speech-to-Text
Web API
API Reference for the Leopard Web SDK (leopard-web)
Leopard
Class for the Leopard Speech-to-Text engine.
Leopard.create()
Creates an instance of the Leopard Speech-to-Text engine using a '.pv' model file in the public directory. Because the model file is large, Leopard reuses the copy already saved in storage if one exists; otherwise it saves the model to storage.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
model LeopardModel : Leopard model options.
options LeopardOptions : Optional configuration arguments.
Returns
Leopard: An instance of the Leopard engine.
Leopard.process()
Processes audio. The required sample rate can be retrieved from '.sampleRate'. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
Parameters
pcm Int16Array : Audio data.
Returns
LeopardTranscript: The inferred transcript with metadata.
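Browser audio is often delivered as floating-point samples and may have more than one channel, while process() expects 16-bit, single-channel PCM. The following is a minimal sketch of that conversion. The helper names floatTo16BitPcm and downmixToMono are hypothetical and not part of the Leopard Web SDK, and the sketch assumes the samples are already at the rate reported by '.sampleRate' (no resampling is done here).

```typescript
// Hypothetical helpers, not part of leopard-web.

// Convert float samples in the range -1..1 to 16-bit linear PCM.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale by 32768 and positive by 32767
    // so that both ends of the int16 range are reachable.
    pcm[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return pcm;
}

// Average equally long channels into a single mono channel.
function downmixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) {
    return new Float32Array(0);
  }
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}
```

The resulting Int16Array can then be passed as the pcm argument.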
Leopard.release()
Releases resources acquired by the Leopard Web SDK.
Leopard.sampleRate
Audio sample rate accepted by Leopard.
Leopard.version
Leopard version string.
LeopardModel
Leopard model type.
base64 string : The model file (.pv) in base64 string to initialize Leopard.
publicPath string : The model file (.pv) path relative to the public directory.
customWritePath string : Custom path to save the model in storage. Set to a different name to use multiple models across leopard instances.
forceWrite boolean : Flag to overwrite the model in storage even if it exists.
version number : Version of the model file. Increment to update the model file in storage.
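As an illustration of the customWritePath note, the sketch below declares two model option objects that are stored under different names so they do not overwrite each other. The interface is a local mirror of the fields listed above rather than an import from the SDK; treating every field as optional is an assumption, and the file paths and storage names are hypothetical.

```typescript
// Local mirror of the LeopardModel fields (assumption: all optional here).
interface LeopardModel {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
}

// Hypothetical paths and names, for illustration only.
const firstModel: LeopardModel = {
  publicPath: "models/first_model.pv",
  customWritePath: "leopard_model_first",
  version: 1,
};

const secondModel: LeopardModel = {
  publicPath: "models/second_model.pv",
  // A different storage name keeps this model separate from firstModel.
  customWritePath: "leopard_model_second",
  // Incrementing version would refresh the stored copy after the file changes.
  version: 1,
};
```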
LeopardOptions
Leopard options type.
enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
enableDiarization boolean : Flag to enable speaker diarization. Set to true to have Leopard differentiate speakers as part of the transcription process. Word metadata will include a speakerTag to identify unique speakers.
LeopardTranscript
Leopard transcript type.
transcript string : Inferred transcript of process.
words LeopardWord[] : Metadata of the transcript.
LeopardWord
Leopard metadata type.
word string : A word in the transcript.
startSec number : Position in seconds where the word starts.
endSec number : Position in seconds where the word ends.
confidence number : Number between 0 and 1, indicating the confidence level of the word.
speakerTag number : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
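When diarization is enabled, the per-word speakerTag can be used to fold the word list into speaker turns. The following is a minimal sketch of one way to do that. The LeopardWord interface is a local mirror of the fields above, and SpeakerTurn and groupBySpeaker are hypothetical names that are not part of the SDK.

```typescript
// Local mirror of the LeopardWord fields listed above.
interface LeopardWord {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
}

// Hypothetical shape for a run of consecutive words from one speaker.
interface SpeakerTurn {
  speakerTag: number;
  startSec: number;
  endSec: number;
  text: string;
}

// Merge consecutive words that share a speakerTag into a single turn.
function groupBySpeaker(words: LeopardWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speakerTag === w.speakerTag) {
      // Same speaker as the previous word: extend the current turn.
      last.endSec = w.endSec;
      last.text += " " + w.word;
    } else {
      // Speaker changed (or first word): start a new turn.
      turns.push({
        speakerTag: w.speakerTag,
        startSec: w.startSec,
        endSec: w.endSec,
        text: w.word,
      });
    }
  }
  return turns;
}
```

If diarization is disabled, every word carries speakerTag -1, so the whole transcript collapses into one turn.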
LeopardWorker
Class for creating instances of LeopardWorker, which processes audio in a worker.
LeopardWorker.create()
Creates an instance of LeopardWorker using a '.pv' model file in the public directory. Because the model file is large, LeopardWorker reuses the copy already saved in storage if one exists; otherwise it saves the model to storage.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
model LeopardModel : Leopard model options.
options LeopardOptions : Optional configuration arguments.
Returns
LeopardWorker: An instance of LeopardWorker.
LeopardWorker.process()
Processes audio. The required sample rate can be retrieved from '.sampleRate'. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
Parameters
pcm Int16Array : Audio data.
options Object : Optional process arguments.
options.transfer boolean : Optional flag to indicate if the buffer should be transferred or not. If set to true, input buffer array will be transferred to the worker.
options.transferCallback (pcm: Int16Array) => void : Optional callback containing a new Int16Array with contents from pcm. Use this callback to get the input pcm when using transfer.
Returns
LeopardTranscript: The inferred transcript with metadata.
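Because a transferred buffer is no longer usable by the caller, transferCallback is the way to get the audio back after processing. The sketch below wraps that pattern in a helper. WorkerLike is a local structural stand-in for LeopardWorker, processWithTransfer is a hypothetical name, and the sketch assumes process() may return a promise (awaiting a plain value works as well).

```typescript
// Local mirrors of the documented shapes; not imported from the SDK.
interface LeopardTranscript {
  transcript: string;
  words: unknown[];
}

interface ProcessOptions {
  transfer?: boolean;
  transferCallback?: (pcm: Int16Array) => void;
}

// Structural stand-in for LeopardWorker (assumption: process may be async).
interface WorkerLike {
  process(
    pcm: Int16Array,
    options?: ProcessOptions
  ): Promise<LeopardTranscript> | LeopardTranscript;
}

// Hypothetical helper: process with transfer and hand the audio back.
async function processWithTransfer(
  worker: WorkerLike,
  pcm: Int16Array
): Promise<{ result: LeopardTranscript; pcm: Int16Array }> {
  const returned: { pcm?: Int16Array } = {};
  const result = await worker.process(pcm, {
    transfer: true,
    // Receives a new Int16Array holding the contents of the input.
    transferCallback: (data) => {
      returned.pcm = data;
    },
  });
  // If the callback never fired, fall back to the original reference,
  // which may be empty by then if its buffer was transferred.
  return { result, pcm: returned.pcm ?? pcm };
}
```

Callers should continue with the returned pcm rather than the array they passed in.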
LeopardWorker.release()
Releases resources acquired by the Leopard Web SDK.
LeopardWorker.terminate()
Force terminates the instance of LeopardWorker.
LeopardWorker.sampleRate
Audio sample rate accepted by Leopard.
LeopardWorker.version
Leopard version string.