Cheetah Speech-to-Text
Web API

API Reference for the Cheetah Web SDK (cheetah-web)

Cheetah

class Cheetah {}

Class for the Cheetah Speech-to-Text engine.

Cheetah.`create()`

static async function create(
  accessKey: string,
  transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
  model: CheetahModel,
  options: CheetahOptions = {}
): Promise<Cheetah>

Creates an instance of the Cheetah Speech-to-Text engine from a '.pv' model file, supplied either as a base64 string or as a path in the public directory. Because the model file is large, the SDK reuses the copy already saved in storage if one exists; otherwise it saves the model to storage.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
model CheetahModel : Cheetah model options.
options CheetahOptions : Optional configuration arguments.

Returns

Cheetah : An instance of the Cheetah engine.

Cheetah.`process()`

async function process(pcm: Int16Array): Promise<void>

Processes a frame of audio. The required sample rate is available from .sampleRate, and the frame length (number of audio samples per frame) from .frameLength. The audio must be single-channel, 16-bit linearly-encoded PCM.

Parameters

pcm Int16Array : A frame of audio.

Returns

Promise<void> : Resolves once the frame has been processed. Any newly transcribed speech (an empty string if none is available) and a flag indicating whether an endpoint has been detected are delivered as a CheetahTranscript to transcriptCallback.
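
As a minimal sketch of the framing requirement above, the helper below splits a longer 16-bit PCM buffer into frames of exactly frameLength samples and passes each one to process(). The FrameConsumer interface and both function names are hypothetical; they mirror only the frameLength and process() members documented here so that the snippet runs without the SDK.

```typescript
// Hypothetical structural type: mirrors only the members of Cheetah used below.
interface FrameConsumer {
  readonly frameLength: number;
  process(pcm: Int16Array): Promise<void>;
}

// Split `pcm` into consecutive full frames and return them with the leftover samples.
function splitIntoFrames(
  pcm: Int16Array,
  frameLength: number
): { frames: Int16Array[]; remainder: Int16Array } {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  let offset = 0;
  while (offset + frameLength <= pcm.length) {
    // slice() copies, so each frame owns its samples.
    frames.push(pcm.slice(offset, offset + frameLength));
    offset += frameLength;
  }
  return { frames, remainder: pcm.slice(offset) };
}

// Feed every full frame to the engine in order and return the samples left over.
async function feedFrames(
  engine: FrameConsumer,
  pcm: Int16Array
): Promise<Int16Array> {
  const { frames, remainder } = splitIntoFrames(pcm, engine.frameLength);
  for (const frame of frames) {
    await engine.process(frame);
  }
  return remainder;
}
```

Samples that do not fill a whole frame are returned rather than dropped, so the caller can prepend them to the next buffer.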

Cheetah.`flush()`

async function flush(): Promise<CheetahTranscript>

Processes any remaining audio data and returns its transcription.

Returns

CheetahTranscript : Inferred transcription object.

Cheetah.`release()`

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.
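
Because release() frees resources held by the SDK, it is commonly paired with try/finally so that it still runs when processing throws. The sketch below assumes that pattern; Releasable and runAndRelease are hypothetical names that mirror only flush() and release() as documented on this page.

```typescript
// Hypothetical structural type: mirrors only flush() and release() from this page.
interface Releasable {
  flush(): Promise<unknown>;
  release(): Promise<void>;
}

// Run `work`, flush any buffered audio, and always release the engine afterwards.
async function runAndRelease(
  engine: Releasable,
  work: () => Promise<void>
): Promise<void> {
  try {
    await work();
    await engine.flush(); // transcribe whatever audio is still buffered
  } finally {
    await engine.release(); // runs on success and on error
  }
}
```

If work() rejects, the flush is skipped, the engine is still released, and the original error propagates to the caller.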

Cheetah.`frameLength`

get frameLength(): number

Number of audio samples per frame.

Cheetah.`sampleRate`

get sampleRate(): number

Audio sample rate accepted by Cheetah.

Cheetah.`version`

get version(): string

Cheetah version string.

CheetahModel

type CheetahModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
}

Cheetah model type.

base64 string : The model file (.pv) as a base64-encoded string, used to initialize Cheetah.
publicPath string : The model file (.pv) path relative to the public directory.
customWritePath string : Custom path to save the model in storage. Set to a different name for each model to use multiple models across Cheetah instances.
forceWrite boolean : Flag to overwrite the model in storage even if it exists.
version number : Version of the model file. Increment to update the model file in storage.
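
To illustrate customWritePath and version, here is a sketch of two model configurations intended for two separate Cheetah instances. The type is re-declared locally so the snippet stands alone, and every path, storage name, and version number is a made-up placeholder rather than a value shipped with the SDK.

```typescript
// Local copy of the CheetahModel type documented above, so the snippet is self-contained.
type CheetahModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
};

// Placeholder paths: replace with wherever the .pv files live in your public directory.
const modelA: CheetahModel = {
  publicPath: "/models/cheetah_params_a.pv",
  customWritePath: "cheetah_model_a", // distinct storage name per model
  version: 1,
};

const modelB: CheetahModel = {
  publicPath: "/models/cheetah_params_b.pv",
  customWritePath: "cheetah_model_b",
  version: 2, // incremented so a previously stored copy is updated
};
```

Giving each model its own customWritePath keeps one stored model from overwriting the other.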

CheetahOptions

type CheetahOptions = {
  endpointDurationSec?: number;
  enableAutomaticPunctuation?: boolean;
  processErrorCallback?: (error: string) => void;
}

Cheetah options type.

endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to 0 to disable endpoint detection.
enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
processErrorCallback (error: string) => void : User-defined callback invoked if an error occurs while processing the audio stream. Its only input argument is the error message. NOTE: This is only used by CheetahWorker.
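
A small sketch of an options object follows, with the type re-declared locally so it runs on its own. The values are illustrative rather than recommended defaults, and collecting errors into an array is just one hypothetical way to use processErrorCallback.

```typescript
// Local copy of the CheetahOptions type documented above.
type CheetahOptions = {
  endpointDurationSec?: number;
  enableAutomaticPunctuation?: boolean;
  processErrorCallback?: (error: string) => void;
};

// Hypothetical error sink for messages reported while processing audio.
const processErrors: string[] = [];

const options: CheetahOptions = {
  endpointDurationSec: 1.5, // illustrative value: 1.5 s without speech marks an endpoint
  enableAutomaticPunctuation: true,
  processErrorCallback: (error) => {
    processErrors.push(error); // per the note above, only CheetahWorker invokes this
  },
};

// Setting the duration to 0 disables endpoint detection.
const noEndpointOptions: CheetahOptions = { ...options, endpointDurationSec: 0 };
```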

CheetahTranscript

type CheetahTranscript = {
  transcript: string;
  isEndpoint: boolean;
  isFlushed: boolean;
}

Cheetah transcript type.

transcript string : Transcript returned from Cheetah.
isEndpoint boolean : Whether an endpoint was detected.
isFlushed boolean : Whether the transcript was produced by a call to Cheetah.flush().
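
One way to consume these objects is to append each partial transcript to a running line and close the line when isEndpoint or isFlushed is set. The sketch below assumes that usage; TranscriptCollector is a hypothetical helper, and the type is re-declared locally so the snippet runs without the SDK.

```typescript
// Local copy of the CheetahTranscript type documented above.
type CheetahTranscript = {
  transcript: string;
  isEndpoint: boolean;
  isFlushed: boolean;
};

// Hypothetical helper that groups partial transcripts into finished lines.
class TranscriptCollector {
  private current = "";
  readonly lines: string[] = [];

  // Arrow function keeps `this` bound, so it can be passed directly as a callback.
  readonly onTranscript = (result: CheetahTranscript): void => {
    this.current += result.transcript;
    const finished = result.isEndpoint || result.isFlushed;
    if (finished && this.current.length > 0) {
      this.lines.push(this.current);
      this.current = "";
    }
  };

  // Text received since the last endpoint or flush.
  get pending(): string {
    return this.current;
  }
}
```

An instance's onTranscript member has the same shape as the transcriptCallback parameter of create().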

CheetahWorker

class CheetahWorker {}

Class for running the Cheetah Speech-to-Text engine in a web worker.

CheetahWorker.`create()`

static async function create(
  accessKey: string,
  transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
  model: CheetahModel,
  options: CheetahOptions = {}
): Promise<CheetahWorker>

Creates an instance of CheetahWorker from a '.pv' model file, supplied either as a base64 string or as a path in the public directory. Because the model file is large, the SDK reuses the copy already saved in storage if one exists; otherwise it saves the model to storage.

Parameters

accessKey string : AccessKey obtained from Picovoice Console.
transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
model CheetahModel : Cheetah model options.
options CheetahOptions : Optional configuration arguments.

Returns

CheetahWorker : An instance of CheetahWorker.

CheetahWorker.`process()`

function process(pcm: Int16Array): void

Processes a frame of audio in the worker. The transcript result is delivered through the transcriptCallback passed to CheetahWorker.create().

Parameters

pcm Int16Array : A frame of audio.
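
Browser audio APIs such as the Web Audio API usually deliver samples as Float32Array values in the range [-1, 1], whereas process() takes an Int16Array. A minimal conversion sketch follows; floatToInt16 is a hypothetical helper, not part of cheetah-web, and it does not resample, so the input is assumed to already be at the engine's sampleRate.

```typescript
// Convert floating-point samples in [-1, 1] to 16-bit signed PCM.
function floatToInt16(input: Float32Array): Int16Array {
  return Int16Array.from(input, (sample) => {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, sample));
    // Negative values scale by 32768 and positive values by 32767,
    // which maps -1 to -32768 and 1 to 32767.
    return s < 0 ? s * 0x8000 : s * 0x7fff;
  });
}
```

The converted buffer can then be cut into frames of frameLength samples before it is passed to process().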

CheetahWorker.`flush()`

function flush(): void

Processes any remaining audio data. The resulting transcription is delivered through the transcriptCallback passed to CheetahWorker.create().

CheetahWorker.`release()`

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.

CheetahWorker.`terminate()`

async function terminate(): Promise<void>

Forcibly terminates the CheetahWorker instance.

CheetahWorker.`frameLength`

get frameLength(): number

Number of audio samples per frame.

CheetahWorker.`sampleRate`

get sampleRate(): number

Audio sample rate accepted by Cheetah.

CheetahWorker.`version`

get version(): string

Cheetah version string.
