Leopard Speech-to-Text
React Native API

API Reference for the React Native Leopard SDK (npm)

Leopard

class Leopard { }

Class for the Leopard Speech-to-Text engine.

Leopard.`getAvailableDevices()`

public static async getAvailableDevices(): Promise<string[]>

Gets all available devices that Leopard can use for inference. Each entry in the returned array can be passed as the device argument of Leopard.create().

Returns

Promise<string[]>: Array of all available devices that Leopard can use for inference.
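As a minimal sketch of how the returned array might feed into the device argument, the helper below picks a GPU entry when one is present and otherwise falls back to best. The name pickDevice is hypothetical, and the check that GPU entries start with "gpu" is an assumption based on the device strings described for Leopard.create(), not a documented guarantee.

```typescript
// Chooses a value for the `device` argument of Leopard.create().
// `devices` is the array resolved by Leopard.getAvailableDevices().
// Assumption: GPU entries start with "gpu" (for example "gpu:0").
function pickDevice(devices: string[], preferGpu: boolean): string {
  if (preferGpu) {
    const gpu = devices.find((d) => d.toLowerCase().startsWith("gpu"));
    if (gpu !== undefined) {
      return gpu;
    }
  }
  // "best" asks the engine to select the most suitable device itself.
  return "best";
}
```

In an app, this would typically be called as pickDevice(await Leopard.getAvailableDevices(), true) before creating the engine.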

Leopard.`create()`

public static async create(
  accessKey: string,
  modelPath: string,
  device?: string,
  options: LeopardOptions = {}
): Promise<Leopard>

Static factory method that creates an instance of the Leopard engine.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
modelPath string : Path to the file containing model parameters (.pv). Can be either a path that is relative to the assets/resource folder or an absolute path to the file on device.
device string : Optional. String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
options LeopardOptions : Optional configuration arguments:
- enableAutomaticPunctuation boolean : Whether to enable automatic punctuation.
- enableDiarization boolean : Whether to enable speaker diarization, which lets Leopard differentiate speakers as part of the transcription process. When enabled, word metadata includes a speakerTag that identifies unique speakers.
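The device string formats described above (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}) can be assembled with a small helper. This is a sketch only: DeviceChoice, toDeviceString, and checkedInt are hypothetical names that are not part of the SDK, and the integer range checks are an assumption about what values are sensible.

```typescript
// Hypothetical description of a device choice; not an SDK type.
type DeviceChoice =
  | { kind: "best" }
  | { kind: "gpu"; index?: number }
  | { kind: "cpu"; threads?: number };

// Rejects non-integers and values below `min` (assumed sanity check).
function checkedInt(value: number, min: number): number {
  if (!Number.isInteger(value) || value < min) {
    throw new RangeError(`expected an integer >= ${min}, got ${value}`);
  }
  return value;
}

// Builds the string passed as the `device` argument of Leopard.create().
function toDeviceString(choice: DeviceChoice): string {
  switch (choice.kind) {
    case "best":
      return "best";
    case "gpu":
      // "gpu" alone means the first available GPU.
      return choice.index === undefined ? "gpu" : `gpu:${checkedInt(choice.index, 0)}`;
    case "cpu":
      // "cpu" alone means the default number of threads.
      return choice.threads === undefined ? "cpu" : `cpu:${checkedInt(choice.threads, 1)}`;
  }
}
```

The result can then be passed as the third argument, for example Leopard.create(accessKey, modelPath, toDeviceString({ kind: "cpu", threads: 4 }), { enableDiarization: true }), where accessKey and modelPath are placeholders.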

Returns

Promise<Leopard>: An instance of the Leopard engine.

Leopard.`delete()`

async delete()

Releases resources acquired by Leopard.
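Because resources are only released when delete() is called, one way to avoid leaks is to wrap engine use in try/finally. The sketch below is written against a minimal structural type rather than the SDK class so that it stands alone; Deletable and withEngine are hypothetical names.

```typescript
// Anything with an async delete(), such as a Leopard instance.
interface Deletable {
  delete(): Promise<unknown>;
}

// Creates an engine, runs `use`, and always calls delete() afterwards,
// even when `use` throws.
async function withEngine<E extends Deletable, R>(
  create: () => Promise<E>,
  use: (engine: E) => Promise<R>
): Promise<R> {
  const engine = await create();
  try {
    return await use(engine);
  } finally {
    await engine.delete();
  }
}
```

With the SDK, a call might look like withEngine(() => Leopard.create(accessKey, modelPath), (leopard) => leopard.processFile(audioPath)), where accessKey, modelPath, and audioPath are placeholders.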

Leopard.`sampleRate`

get sampleRate()

Getter for audio sample rate accepted by Leopard.

Returns

number: Audio sample rate accepted by Leopard.

Leopard.`version`

get version()

Getter for version.

Returns

string: Current Leopard version.

Leopard.`process()`

async process(frame: number[]): Promise<LeopardTranscript>

Processes the given audio data with the speech-to-text engine. The audio must have a sample rate equal to sampleRate and be 16-bit linearly encoded. Leopard operates on single-channel audio.

Parameters

frame number[] : A frame of audio samples.

Returns

Promise<LeopardTranscript>: LeopardTranscript object which contains the transcription results of the engine.
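When the audio source delivers floating-point samples, they need to be converted to 16-bit values before being passed as frame. The sketch below assumes mono input in the range [-1, 1] that is already at the engine's sample rate; floatToPcm16 is a hypothetical helper, not an SDK function.

```typescript
// Converts mono float samples in [-1, 1] to 16-bit linear PCM values.
// Out-of-range input is clamped before scaling.
function floatToPcm16(samples: ArrayLike<number>): number[] {
  const frame: number[] = new Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, non-negative values to 32767.
    frame[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return frame;
}
```

The resulting array can be passed to process(), for example await leopard.process(floatToPcm16(samples)). Resampling and stereo downmixing are not handled here.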

Leopard.`processFile()`

async processFile(audioPath: string): Promise<LeopardTranscript>

Processes an audio file with the speech-to-text engine.

Parameters

audioPath string : Absolute path to the audio file. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.

Returns

Promise<LeopardTranscript>: LeopardTranscript object which contains the transcription results of the engine.
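A cheap pre-check on the file name can catch obviously unsupported inputs before calling processFile(). The sketch below is illustrative only: hasSupportedExtension is a hypothetical helper, and the mapping from the formats listed above to file extensions is an assumption. The engine decides based on the file contents, not the name.

```typescript
// Assumed extensions for the formats listed for processFile().
const SUPPORTED_EXTENSIONS = ["3gp", "flac", "mp3", "mp4", "m4a", "ogg", "wav", "webm"];

// Returns true when the file name ends in one of the assumed extensions.
function hasSupportedExtension(audioPath: string): boolean {
  const fileName = audioPath.slice(audioPath.lastIndexOf("/") + 1);
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return false; // no extension, or a dot-file such as ".mp3"
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(extension) !== -1;
}
```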

LeopardError

class LeopardError extends Error { }

Exception thrown if an error occurs within the Leopard Speech-to-Text engine.

Exceptions:

class LeopardActivationError           extends LeopardError { }
class LeopardActivationLimitError      extends LeopardError { }
class LeopardActivationRefusedError    extends LeopardError { }
class LeopardActivationThrottledError  extends LeopardError { }
class LeopardIOError                   extends LeopardError { }
class LeopardInvalidArgumentError      extends LeopardError { }
class LeopardInvalidStateError         extends LeopardError { }
class LeopardKeyError                  extends LeopardError { }
class LeopardMemoryError               extends LeopardError { }
class LeopardRuntimeError              extends LeopardError { }
class LeopardStopIterationError        extends LeopardError { }
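Since every exception listed above extends LeopardError directly, handling a family of errors (for example all activation errors) means checking each class with instanceof. The sketch below declares local stand-ins for part of the hierarchy so it runs on its own; in an app the real classes would come from the SDK, assuming the package exports them. The names Failure and classify are hypothetical.

```typescript
// Local stand-ins that mirror part of the hierarchy above (not the SDK classes).
class LeopardError extends Error {
  constructor(message?: string) {
    super(message);
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class LeopardActivationError extends LeopardError {}
class LeopardActivationLimitError extends LeopardError {}
class LeopardActivationRefusedError extends LeopardError {}
class LeopardActivationThrottledError extends LeopardError {}
class LeopardIOError extends LeopardError {}
class LeopardInvalidArgumentError extends LeopardError {}

type Failure = "activation" | "io" | "argument" | "engine" | "unknown";

// Maps a caught value to a coarse category an app could act on.
function classify(e: unknown): Failure {
  if (
    e instanceof LeopardActivationError ||
    e instanceof LeopardActivationLimitError ||
    e instanceof LeopardActivationRefusedError ||
    e instanceof LeopardActivationThrottledError
  ) {
    return "activation";
  }
  if (e instanceof LeopardIOError) return "io";
  if (e instanceof LeopardInvalidArgumentError) return "argument";
  if (e instanceof LeopardError) return "engine"; // any other Leopard error
  return "unknown";
}
```

A catch block around Leopard.create() or processFile() could call classify(e) and, for instance, prompt for a new AccessKey on "activation".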

LeopardOptions

type LeopardOptions = {
  enableAutomaticPunctuation?: boolean;
  enableDiarization?: boolean;
}

Type containing optional configuration parameters for Leopard.

enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
enableDiarization boolean : Flag to enable diarization.

LeopardTranscript

type LeopardTranscript = {
  transcript: string;
  words: LeopardWord[];
}

Type containing the results returned by Leopard.process() and Leopard.processFile().

transcript string : Inferred transcription.
words LeopardWord[] : Transcribed words and their associated metadata.
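The per-word metadata makes it possible to post-process a result, for example to flag words the engine was unsure about. The sketch below redeclares the documented types locally so it stands alone; markUncertainWords and the bracket notation are hypothetical choices, not SDK behavior.

```typescript
// Local copies of the documented result types.
type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
};
type LeopardTranscript = {
  transcript: string;
  words: LeopardWord[];
};

// Rebuilds the text from `words`, wrapping low-confidence words as "[word?]".
function markUncertainWords(result: LeopardTranscript, threshold: number): string {
  return result.words
    .map((w) => (w.confidence < threshold ? `[${w.word}?]` : w.word))
    .join(" ");
}
```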

LeopardWord

type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
}

Type containing a word transcribed by Leopard and its associated metadata.

word string : Transcribed word.
startSec number : Start of word in seconds.
endSec number : End of word in seconds.
confidence number : Transcription confidence. It is a number within [0, 1].
speakerTag number : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
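When diarization is enabled, consecutive words with the same speakerTag can be merged into speaker turns. The following is a minimal sketch under that assumption; SpeakerTurn and toSpeakerTurns are hypothetical names, and LeopardWord is redeclared locally so the block stands alone.

```typescript
// Local copy of the documented word type.
type LeopardWord = {
  word: string;
  startSec: number;
  endSec: number;
  confidence: number;
  speakerTag: number;
};

// Hypothetical shape for a run of consecutive words by one speaker.
type SpeakerTurn = {
  speakerTag: number;
  startSec: number;
  endSec: number;
  text: string;
};

// Merges adjacent words that share a speakerTag into a single turn.
function toSpeakerTurns(words: LeopardWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speakerTag === w.speakerTag) {
      last.endSec = w.endSec;
      last.text += " " + w.word;
    } else {
      turns.push({ speakerTag: w.speakerTag, startSec: w.startSec, endSec: w.endSec, text: w.word });
    }
  }
  return turns;
}
```

If diarization is disabled, every word carries speakerTag -1, so the whole transcript collapses into a single turn.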
