Leopard Speech-to-Text
Node.js API

API Reference for the Node.js Leopard SDK (npm)

Leopard

class Leopard {}

Class for the Leopard Speech-to-Text engine.

Leopard can be initialized using the class constructor(). When you are done with the instance, free its resources by calling the release() method.

Leopard.`constructor()`

Leopard.constructor(
  accessKey: string,
  options: LeopardOptions = {}
)

Leopard constructor.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
options LeopardOptions: Optional configuration arguments:
- modelPath string : Path to the file containing model parameters (.pv).
- libraryPath string : Path to the Leopard dynamic library (.node).
- enableAutomaticPunctuation boolean : Whether to enable automatic punctuation. Default is false.
- enableDiarization boolean : Whether to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. When enabled, word metadata includes a speakerTag that identifies unique speakers.

Returns

Leopard: An instance of the Leopard engine.
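
As a rough illustration of the constructor call, the sketch below builds a LeopardOptions object and passes it together with an AccessKey. The `createLeopard` helper and the `LEOPARD_ACCESS_KEY` environment variable are hypothetical names, not part of the SDK. The Leopard class is passed in as an argument so the snippet runs without the SDK installed; in a real project it would presumably be imported from the npm package (assumed here to be `@picovoice/leopard-node`).

```javascript
// Hypothetical helper: constructs a Leopard instance from an injected class.
// `LeopardClass` is expected to behave like the Leopard class documented above.
function createLeopard(LeopardClass, accessKey, overrides = {}) {
  if (typeof accessKey !== "string" || accessKey.length === 0) {
    throw new TypeError("accessKey must be a non-empty string");
  }
  // Only options named in this reference are set; modelPath and libraryPath
  // are omitted because the options argument is optional.
  const options = {
    enableAutomaticPunctuation: true,
    enableDiarization: false,
    ...overrides,
  };
  return new LeopardClass(accessKey, options);
}

// Usage (package name and export are assumptions; not executed here):
// const { Leopard } = require("@picovoice/leopard-node");
// const leopard = createLeopard(Leopard, process.env.LEOPARD_ACCESS_KEY);
```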

Leopard.`release()`

Leopard.release()

Releases resources acquired by Leopard.
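
One way to make sure release() is always called is a try/finally wrapper. This is a minimal sketch, not an SDK feature: `withLeopard` and `createEngine` are hypothetical names, and the only assumption about the engine is that it exposes the release() method described above.

```javascript
// Hypothetical helper: runs `work` with an engine and always releases it.
// `createEngine` is any function that returns an object with a release()
// method, for example () => new Leopard(accessKey).
function withLeopard(createEngine, work) {
  const leopard = createEngine();
  try {
    return work(leopard);
  } finally {
    // Runs whether `work` returned normally or threw.
    leopard.release();
  }
}
```

The wrapper suits synchronous work only; if `work` returned a promise, release() would run before the promise settled.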

Leopard.`sampleRate`

Leopard.sampleRate

Getter for audio sample rate accepted by Leopard.

Returns

number: Audio sample rate accepted by Leopard.

Leopard.`version`

Leopard.version

Getter for version.

Returns

string: Current Leopard version.

Leopard.`process()`

Leopard.process(pcm)

Processes the given audio data with the speech-to-text engine. The incoming audio must have a sample rate equal to Leopard.sampleRate and be 16-bit linearly encoded. Leopard operates on single-channel audio.

Parameters

pcm Array<number> : Audio data.

Returns

LeopardTranscript: Inferred transcription.
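
The input requirements above can be checked before calling process(). The sketch below is illustrative only: `floatTo16BitPcm` and `transcribePcm` are hypothetical helpers, and sampleRate is read as a property because it is described as a getter. The conversion returns an Int16Array; if a plain array is required, Array.from() converts it.

```javascript
// Converts float samples in [-1, 1] to 16-bit linear PCM values.
// Out-of-range samples are clamped.
function floatTo16BitPcm(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return pcm;
}

// Hypothetical wrapper: rejects audio whose sample rate differs from the
// rate the engine reports, then forwards the samples to process().
function transcribePcm(leopard, pcm, inputSampleRate) {
  if (inputSampleRate !== leopard.sampleRate) {
    throw new RangeError(
      `Expected ${leopard.sampleRate} Hz audio, got ${inputSampleRate} Hz`
    );
  }
  return leopard.process(pcm);
}
```

Multi-channel audio would still need to be downmixed to a single channel before this step; that is not shown here.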

Leopard.`processFile()`

Leopard.processFile(audioPath)

Processes an audio file with the speech-to-text engine.

Parameters

audioPath string : Absolute path to the audio file. The supported formats are FLAC, MP3, Ogg, WAV, WebM, MP4/m4a (AAC), and 3gp (AMR).

Returns

LeopardTranscript: Inferred transcription.
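
Because processFile() expects an absolute path, a caller may want to resolve relative paths first. The following sketch does that and also rejects unexpected file extensions. `transcribeFile` and the extension list are assumptions: the extensions are inferred from the format names above and are not an official list.

```javascript
const path = require("node:path");

// Extensions inferred from the formats listed above (an assumption; the
// engine may accept other extensions for the same formats).
const SUPPORTED_EXTENSIONS = new Set([
  ".flac", ".mp3", ".ogg", ".wav", ".webm", ".mp4", ".m4a", ".3gp",
]);

// Hypothetical wrapper: resolves the path, checks the extension, and
// forwards the absolute path to processFile().
function transcribeFile(leopard, audioPath) {
  const absolutePath = path.resolve(audioPath);
  const ext = path.extname(absolutePath).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.has(ext)) {
    throw new Error(`Unsupported audio format: ${ext || "(none)"}`);
  }
  return leopard.processFile(absolutePath);
}
```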

LeopardWord

LeopardWord

Object that contains a transcribed word and its associated metadata.

word string : Transcribed word.
startSec number : Start of word in seconds.
endSec number : End of word in seconds.
confidence number : Transcription confidence. It is a number within [0, 1].
speakerTag number : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
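
When diarization is enabled, the speakerTag field can be used to rebuild speaker turns. A minimal sketch, assuming words arrive in chronological order; `groupBySpeaker` and the shape of the returned turn objects are hypothetical.

```javascript
// Groups consecutive words that share a speakerTag into turns.
// Each turn records the speaker, its time span, and the joined text.
function groupBySpeaker(words) {
  const turns = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last && last.speakerTag === w.speakerTag) {
      // Same speaker as the previous word: extend the current turn.
      last.text += " " + w.word;
      last.endSec = w.endSec;
    } else {
      turns.push({
        speakerTag: w.speakerTag,
        startSec: w.startSec,
        endSec: w.endSec,
        text: w.word,
      });
    }
  }
  return turns;
}
```

If diarization is disabled, every word carries a speakerTag of -1, so the result is a single turn.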

LeopardTranscript

LeopardTranscript

Object which contains the transcription results of the engine:

transcript string : Inferred transcription.
words LeopardWord[] : Transcribed words and their associated metadata.
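
To show how the two fields fit together, the sketch below turns a LeopardTranscript into timestamped lines and lists words whose confidence falls below a threshold. `summarizeTranscript`, the 0.5 default threshold, and the output shape are illustrative choices, not part of the SDK.

```javascript
// Hypothetical helper: summarizes a LeopardTranscript-shaped object.
function summarizeTranscript(result, minConfidence = 0.5) {
  // One line per word, with start and end times to two decimals.
  const lines = result.words.map(
    (w) => `[${w.startSec.toFixed(2)}-${w.endSec.toFixed(2)}] ${w.word}`
  );
  // Words the engine was less sure about, for manual review.
  const uncertain = result.words
    .filter((w) => w.confidence < minConfidence)
    .map((w) => w.word);
  return { text: result.transcript, lines, uncertain };
}
```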

Errors

Errors thrown when a failure occurs within the Leopard Speech-to-Text engine.

Exceptions:

class PvStatusOutOfMemoryError        extends Error {}
class PvStatusIoError                 extends Error {}
class PvStatusInvalidArgumentError    extends Error {}
class PvStatusStopIterationError      extends Error {}
class PvStatusKeyError                extends Error {}
class PvStatusInvalidStateError       extends Error {}
class PvStatusRuntimeError            extends Error {}
class PvStatusActivationError         extends Error {}
class PvStatusActivationLimitReached  extends Error {}
class PvStatusActivationThrottled     extends Error {}
class PvStatusActivationRefused       extends Error {}
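
A caller may want to react differently to activation problems than to bad input. The sketch below classifies a caught error by its class name, using the names listed above. `classifyLeopardError` and its category labels are hypothetical, and matching on the class name is an assumption made so the snippet does not depend on how the SDK exports these classes.

```javascript
// Error class names from the list above that relate to AccessKey activation.
const ACTIVATION_ERRORS = new Set([
  "PvStatusActivationError",
  "PvStatusActivationLimitReached",
  "PvStatusActivationThrottled",
  "PvStatusActivationRefused",
]);

// Hypothetical helper: maps a caught error to a coarse category.
function classifyLeopardError(err) {
  const name = err && err.constructor ? err.constructor.name : "";
  if (ACTIVATION_ERRORS.has(name)) return "activation";
  if (name === "PvStatusInvalidArgumentError") return "invalid-argument";
  if (name === "PvStatusIoError") return "io";
  if (name.startsWith("PvStatus")) return "engine";
  return "unknown";
}

// Usage sketch (`leopard` and `pcm` are assumed to exist):
// try {
//   leopard.process(pcm);
// } catch (err) {
//   if (classifyLeopardError(err) === "activation") { /* check AccessKey */ }
// }
```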
