Leopard Speech-to-Text
Web API

API Reference for the Leopard Web SDK (leopard-web)

Leopard

class Leopard {}

Class for the Leopard Speech-to-Text engine.

Leopard.`create()`

static async function create(
  accessKey: string,
  model: LeopardModel,
  options: LeopardOptions = {}
): Promise<Leopard>

Creates an instance of the Leopard Speech-to-Text engine using a '.pv' model file in the public directory. Because the model file is large, Leopard reuses the copy already in storage if one exists; otherwise it saves the model to storage.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
model LeopardModel : Leopard model options.
options LeopardOptions : Optional configuration arguments.

Returns

Leopard : An instance of the Leopard engine.

Leopard.`process()`

async function process(pcm: Int16Array): Promise<LeopardTranscript>

Processes audio. The required sample rate can be retrieved from '.sampleRate'. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

Parameters

pcm Int16Array : Audio data.

Returns

LeopardTranscript : The inferred transcript with metadata.
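Browser audio APIs commonly produce floating-point samples, while process() expects 16-bit linearly-encoded PCM. The following is a minimal sketch of that conversion, assuming the input is already single-channel and already at the rate reported by '.sampleRate'. The helper name floatTo16BitPcm is hypothetical and not part of the SDK.

```typescript
// Hypothetical helper (not part of leopard-web): converts float samples in
// [-1, 1] to 16-bit linear PCM suitable for passing to process().
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  return Int16Array.from(samples, (value) => {
    // Clamp out-of-range samples so they saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, value));
    // Negative values scale to -32768, positive values to 32767.
    return s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  });
}
```

The result can then be handed to process(). Resampling and channel mixing are outside the scope of this sketch.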

Leopard.`release()`

async function release(): Promise<void>

Releases resources acquired by the Leopard Web SDK.

Leopard.`sampleRate`

get sampleRate(): number

Audio sample rate accepted by Leopard.

Leopard.`version`

get version(): string

Leopard version string.

Leopard.`listAvailableDevices()`

public static async listAvailableDevices(): Promise<string[]>

Lists all available devices that Leopard can use for inference. Each entry in the list can be used as the device argument for the .create() method.

Returns

string[] : List of all available devices that Leopard can use for inference.

LeopardModel

type LeopardModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
}

Leopard model type.

base64 string : The model file (.pv) as a base64 string used to initialize Leopard.
publicPath string : The model file (.pv) path relative to the public directory.
customWritePath string : Custom path to save the model in storage. Set to a different name to use multiple models across Leopard instances.
forceWrite boolean : Flag to overwrite the model in storage even if it exists.
version number : Version of the model file. Increment to update the model file in storage.
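As an illustration of the fields above, the sketch below declares two model objects that use distinct customWritePath values so that both can be kept in storage side by side. The type is copied locally to keep the example self-contained, and the file paths and storage names are hypothetical.

```typescript
// Local copy of the documented type so the example is self-contained.
type LeopardModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
};

// Hypothetical file locations and storage names, chosen for illustration only.
const englishModel: LeopardModel = {
  publicPath: "/models/leopard_params_en.pv",
  customWritePath: "leopard_model_en",
  version: 1,
};

const frenchModel: LeopardModel = {
  publicPath: "/models/leopard_params_fr.pv",
  customWritePath: "leopard_model_fr", // different name, so it does not overwrite the English model
  version: 1,
};
```

When a model file changes, incrementing version (or setting forceWrite to true) signals that the stored copy should be replaced.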

LeopardOptions

type LeopardOptions = {
  device?: string;
  enableAutomaticPunctuation?: boolean;
  enableDiarization?: boolean;
}

Leopard options type.

device string : Optional. String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
enableDiarization boolean : Flag to enable speaker diarization. Set to true to allow Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speakerTag to identify unique speakers.
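The device string formats described above (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}) can be assembled with a small helper. This is a sketch only: the function name deviceString and its validation rules are assumptions, not part of the SDK.

```typescript
// Local copy of the documented options type so the example is self-contained.
type LeopardOptions = {
  device?: string;
  enableAutomaticPunctuation?: boolean;
  enableDiarization?: boolean;
};

// Hypothetical helper: builds a device string in the documented formats.
// For "gpu", `n` is the GPU index; for "cpu", `n` is the number of threads.
function deviceString(kind: "best" | "cpu" | "gpu", n?: number): string {
  if (kind === "best" || n === undefined) {
    return kind; // "best", "cpu", or "gpu" with default behavior
  }
  const min = kind === "cpu" ? 1 : 0; // at least one thread; GPU indices start at 0
  if (!Number.isInteger(n) || n < min) {
    throw new RangeError(`invalid value for ${kind}: ${n}`);
  }
  return `${kind}:${n}`;
}

// Example options: run on the CPU with four threads and enable diarization.
const options: LeopardOptions = {
  device: deviceString("cpu", 4),
  enableDiarization: true,
};
```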

LeopardTranscript

type LeopardTranscript = {
  transcript: string;
  words: LeopardWord[];
}

Leopard transcript type.

transcript string : Inferred transcript of process.
words LeopardWord[] : Metadata of the transcript.

LeopardWord

type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
}

Leopard metadata type.

word string : A word in the transcript.
startSec number : Position in seconds where the word starts.
endSec number : Position in seconds where the word ends.
confidence number : Number between 0 and 1, indicating the confidence level of the word.
speakerTag number : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
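When diarization is enabled, the per-word speakerTag can be used to assemble speaker turns. The sketch below merges consecutive words with the same tag into one segment. The SpeakerSegment type and groupBySpeaker function are hypothetical, and LeopardWord is copied locally so the example is self-contained.

```typescript
// Local copy of the documented type.
type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
};

// Hypothetical output type: one contiguous turn by a single speaker.
type SpeakerSegment = {
  speakerTag: number;
  startSec: number;
  endSec: number;
  text: string;
};

// Merges consecutive words that share a speakerTag into a single segment.
function groupBySpeaker(words: LeopardWord[]): SpeakerSegment[] {
  const segments: SpeakerSegment[] = [];
  for (const w of words) {
    const last = segments[segments.length - 1];
    if (last !== undefined && last.speakerTag === w.speakerTag) {
      // Same speaker continues: extend the current segment.
      last.text += " " + w.word;
      last.endSec = w.endSec;
    } else {
      // Speaker changed (or first word): start a new segment.
      segments.push({
        speakerTag: w.speakerTag,
        startSec: w.startSec,
        endSec: w.endSec,
        text: w.word,
      });
    }
  }
  return segments;
}
```

If diarization is disabled, every word carries speakerTag -1, so the function returns a single segment spanning the whole transcript.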

LeopardWorker

class LeopardWorker {}

Class for running the Leopard Speech-to-Text engine in a web worker.

LeopardWorker.`create()`

static async create(
  accessKey: string,
  model: LeopardModel,
  options: LeopardOptions = {},
): Promise<LeopardWorker>

Creates an instance of LeopardWorker using a '.pv' model file in the public directory. Because the model file is large, LeopardWorker reuses the copy already in storage if one exists; otherwise it saves the model to storage.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
model LeopardModel : Leopard model options.
options LeopardOptions : Optional configuration arguments.

Returns

LeopardWorker : An instance of LeopardWorker.

LeopardWorker.`process()`

async function process(
  pcm: Int16Array,
  options?: { transfer?: boolean, transferCallback?: (data: Int16Array) => void }
): Promise<LeopardTranscript>

Processes audio. The required sample rate can be retrieved from '.sampleRate'. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

Parameters

pcm Int16Array : Audio data.
options Object : Optional process arguments.
options.transfer boolean : Optional flag to indicate if the buffer should be transferred or not. If set to true, input buffer array will be transferred to the worker.
options.transferCallback (data: Int16Array) => void : Optional callback invoked with a new Int16Array containing the contents of pcm. Use this callback to get the input pcm back when using transfer.

Returns

LeopardTranscript : The inferred transcript with metadata.
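The transfer and transferCallback options can be combined to recover the audio after it has been handed to the worker. The sketch below is written against a hypothetical structural interface, PcmProcessor, that mirrors the process() signature documented above, so it does not import the SDK. It also assumes the callback has been invoked by the time process() resolves, which this reference does not state.

```typescript
// Local copies of the documented types so the sketch is self-contained.
type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
};
type LeopardTranscript = { transcript: string; words: LeopardWord[] };

// Hypothetical interface: any object exposing the documented process() signature.
interface PcmProcessor {
  process(
    pcm: Int16Array,
    options?: { transfer?: boolean; transferCallback?: (data: Int16Array) => void }
  ): Promise<LeopardTranscript>;
}

// Processes audio with transfer enabled and returns the transcript together
// with the buffer handed back through transferCallback (undefined if the
// callback was not called before process() resolved).
async function processWithTransfer(
  worker: PcmProcessor,
  pcm: Int16Array
): Promise<{ transcript: LeopardTranscript; pcm: Int16Array | undefined }> {
  let returned: Int16Array | undefined;
  const transcript = await worker.process(pcm, {
    transfer: true,
    transferCallback: (data) => {
      returned = data; // keep the returned copy; the original may be unusable after transfer
    },
  });
  return { transcript, pcm: returned };
}
```

After a transfer, callers should use the returned array rather than the original pcm reference.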

LeopardWorker.`release()`

async function release(): Promise<void>

Releases resources acquired by the Leopard Web SDK.

LeopardWorker.`terminate()`

async function terminate(): Promise<void>

Force terminates the instance of LeopardWorker.

LeopardWorker.`sampleRate`

get sampleRate(): number

Audio sample rate accepted by Leopard.

LeopardWorker.`version`

get version(): string

Leopard version string.
