
Cheetah Speech-to-Text
Web API

API Reference for the Cheetah Web SDK (cheetah-web)


Cheetah

class Cheetah {}

Class for the Cheetah Speech-to-Text engine.


Cheetah.create()

static async function create(
accessKey: string,
transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
model: CheetahModel,
options: CheetahOptions = {}
): Promise<Cheetah>

Creates an instance of the Cheetah Speech-to-Text engine from a model file ('.pv'), supplied either as a path relative to the public directory or as a base64 string. Because the model file is large, it is saved to storage on first use, and the stored copy is reused if it already exists.

Parameters

  • accessKey string : AccessKey obtained from Picovoice Console.
  • transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
  • model CheetahModel : Cheetah model options.
  • options CheetahOptions : Optional configuration arguments.

Returns

  • Cheetah : An instance of the Cheetah engine.

Cheetah.process()

async function process(pcm: Int16Array): Promise<void>

Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

Parameters

  • pcm Int16Array : A frame of audio.

Returns

  • Promise<void> : Resolves once the frame has been processed. Any newly transcribed speech (an empty string if none is available) and a flag indicating whether an endpoint has been detected are delivered to the transcriptCallback passed to Cheetah.create().
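
As a rough illustration of the input format described above, the sketch below converts floating-point samples (such as those produced by the Web Audio API) into 16-bit PCM and slices them into frames. The helper names floatToInt16 and splitIntoFrames are hypothetical and not part of cheetah-web; in practice the frame length would come from .frameLength.

```typescript
// Hypothetical helpers, not part of cheetah-web.

// Convert floating-point samples in [-1, 1] to 16-bit linear PCM.
function floatToInt16(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    pcm[i] = s < 0 ? Math.round(s * 32768) : Math.round(s * 32767);
  }
  return pcm;
}

// Split PCM into complete frames of `frameLength` samples.
// Samples that do not fill a whole frame are returned as `remainder`
// so the caller can prepend them to the next buffer.
function splitIntoFrames(
  pcm: Int16Array,
  frameLength: number
): { frames: Int16Array[]; remainder: Int16Array } {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  let offset = 0;
  while (offset + frameLength <= pcm.length) {
    frames.push(pcm.slice(offset, offset + frameLength));
    offset += frameLength;
  }
  return { frames, remainder: pcm.slice(offset) };
}
```

Each element of frames could then be passed to process() in order, with remainder carried over to the next audio buffer. Resampling to the rate reported by .sampleRate and downmixing to a single channel are left out of this sketch.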

Cheetah.flush()

async function flush(): Promise<CheetahTranscript>

Processes any remaining audio data and returns its transcription.

Returns

  • CheetahTranscript : Inferred transcription object.

Cheetah.release()

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.
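
The methods above suggest a typical lifecycle: feed frames with process(), call flush() when the audio ends, and always call release(). The following is a minimal sketch of that pattern, written against a structural interface rather than the real class so that it stands alone. EngineLike and transcribeFrames are hypothetical names, and the sketch assumes transcripts arrive through the callback passed to create().

```typescript
// Hypothetical structural type covering only members documented above.
interface EngineLike {
  readonly frameLength: number;
  process(pcm: Int16Array): Promise<void>;
  flush(): Promise<unknown>;
  release(): Promise<void>;
}

// Feed every frame to the engine, flush the tail, and release resources
// even if processing fails part-way through.
async function transcribeFrames(
  engine: EngineLike,
  frames: Int16Array[]
): Promise<void> {
  try {
    for (const frame of frames) {
      if (frame.length !== engine.frameLength) {
        throw new RangeError(
          `expected ${engine.frameLength} samples, got ${frame.length}`
        );
      }
      await engine.process(frame);
    }
    await engine.flush();
  } finally {
    await engine.release();
  }
}
```

Placing release() in a finally block is a design choice of this sketch: it keeps resources from leaking when a frame is rejected or processing throws.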


Cheetah.frameLength

get frameLength(): number

Number of audio samples per frame.


Cheetah.sampleRate

get sampleRate(): number

Audio sample rate accepted by Cheetah.


Cheetah.version

get version(): string

Cheetah version string.


CheetahModel

type CheetahModel = {
base64?: string;
publicPath?: string;
customWritePath?: string;
forceWrite?: boolean;
version?: number;
}

Cheetah model type.

  • base64 string : The model file (.pv) as a base64 string to initialize Cheetah.
  • publicPath string : The model file (.pv) path relative to the public directory.
  • customWritePath string : Custom path to save the model in storage. Set to a different name to use multiple models across Cheetah instances.
  • forceWrite boolean : Flag to overwrite the model in storage even if it exists.
  • version number : Version of the model file. Increment to update the model file in storage.
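
To illustrate the fields above, here is a sketch of two model descriptors that use different storage paths so that they can coexist. The type is redeclared locally to keep the snippet self-contained; the file paths and storage names are placeholders, not files shipped with the SDK.

```typescript
// Local copy of the type documented above, so the snippet stands alone.
type CheetahModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
};

// Placeholder paths: substitute the location of your own .pv files.
const firstModel: CheetahModel = {
  publicPath: "models/first_model.pv",
  customWritePath: "cheetah_model_first",
  version: 1,
};

// A second model stored under a different name so that it does not
// overwrite the first one in storage.
const secondModel: CheetahModel = {
  publicPath: "models/second_model.pv",
  customWritePath: "cheetah_model_second",
  // Incremented after replacing the .pv file, to refresh the stored copy.
  version: 2,
};
```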

CheetahOptions

type CheetahOptions = {
endpointDurationSec?: number;
enableAutomaticPunctuation?: boolean;
processErrorCallback?: (error: string) => void;
}

Cheetah options type.

  • endpointDurationSec number : Endpoint duration in seconds. An endpoint is detected when an utterance is followed by a stretch of audio of this duration that contains no speech. Set to 0 to disable endpoint detection.
  • enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
  • processErrorCallback (error: string) => void : User-defined callback invoked if an error occurs while processing the audio stream. Its only argument is the error message. NOTE: This is only used by CheetahWorker.
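
As a worked example of what endpointDurationSec means in terms of frames, the number of consecutive frames needed to span a given duration can be estimated from the sample rate and frame length. The function name and the values 16000 and 512 below are illustrative assumptions; real values come from .sampleRate and .frameLength, and this arithmetic says nothing about how the engine detects endpoints internally.

```typescript
// Hypothetical helper: how many whole frames are needed to cover
// `durationSec` seconds of audio.
function framesForDuration(
  durationSec: number,
  sampleRate: number,
  frameLength: number
): number {
  if (durationSec < 0 || sampleRate <= 0 || frameLength <= 0) {
    throw new RangeError("duration must be >= 0; rate and length must be > 0");
  }
  return Math.ceil((durationSec * sampleRate) / frameLength);
}

// Assumed values for illustration only: 1 s at 16000 Hz with 512-sample
// frames is 31.25 frames, rounded up to 32.
const silentFrames = framesForDuration(1.0, 16000, 512);
```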

CheetahTranscript

type CheetahTranscript = {
transcript: string;
isEndpoint: boolean;
isFlushed: boolean;
}

Cheetah transcript type.

  • transcript string : Transcript returned from Cheetah.
  • isEndpoint boolean : Whether an endpoint was detected.
  • isFlushed boolean : Whether the transcript was produced by a call to flush().
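
One way to consume these objects is to append partial transcripts to the current line and start a new line at each endpoint. The sketch below is an assumption about usage rather than SDK code: TranscriptCollector is a hypothetical helper, and the type is redeclared locally so that the snippet stands alone.

```typescript
// Local copy of the type documented above.
type CheetahTranscript = {
  transcript: string;
  isEndpoint: boolean;
  isFlushed: boolean;
};

// Hypothetical helper that groups partial transcripts into lines.
class TranscriptCollector {
  private current = "";
  readonly lines: string[] = [];

  // Arrow function so it can be passed directly as transcriptCallback.
  readonly onTranscript = (result: CheetahTranscript): void => {
    this.current += result.transcript;
    // Close the line at an endpoint or after a flush, skipping empty lines.
    if ((result.isEndpoint || result.isFlushed) && this.current.length > 0) {
      this.lines.push(this.current);
      this.current = "";
    }
  };
}
```

An instance's onTranscript member could be supplied as the transcriptCallback argument of create(), after which lines holds one entry per completed utterance.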

CheetahWorker

class CheetahWorker {}

Class for running the Cheetah Speech-to-Text engine in a Web Worker.


CheetahWorker.create()

static async create(
accessKey: string,
transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
model: CheetahModel,
options: CheetahOptions = {},
): Promise<CheetahWorker>

Creates an instance of CheetahWorker from a model file ('.pv'), supplied either as a path relative to the public directory or as a base64 string. Because the model file is large, it is saved to storage on first use, and the stored copy is reused if it already exists.

Parameters

  • accessKey string : AccessKey obtained from Picovoice Console.
  • transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
  • model CheetahModel : Cheetah model options.
  • options CheetahOptions : Optional configuration arguments.

Returns

  • CheetahWorker : An instance of CheetahWorker.

CheetahWorker.process()

function process(pcm: Int16Array): void

Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

The transcript result is delivered through the transcriptCallback provided to CheetahWorker.create().

Parameters

  • pcm Int16Array : A frame of audio.

CheetahWorker.flush()

function flush(): void

Processes any remaining audio data. The resulting transcription is delivered through the transcriptCallback provided to CheetahWorker.create().
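
Because the worker's process() and flush() return nothing, results have to be collected in the callback. One possible pattern, sketched below without a real CheetahWorker, resolves a promise once a transcript with isFlushed set arrives. createFlushWaiter is a hypothetical helper, and the sketch assumes the callback is invoked with isFlushed set to true after flush().

```typescript
// Local copy of the transcript type documented on this page.
type CheetahTranscript = {
  transcript: string;
  isEndpoint: boolean;
  isFlushed: boolean;
};

// Hypothetical helper: returns a callback to hand to create() and a
// promise that resolves with the accumulated text after a flush.
function createFlushWaiter(): {
  callback: (result: CheetahTranscript) => void;
  done: Promise<string>;
} {
  let text = "";
  let resolveDone: (value: string) => void = () => {};
  // The executor runs synchronously, so resolveDone is set before returning.
  const done = new Promise<string>((resolve) => {
    resolveDone = resolve;
  });
  const callback = (result: CheetahTranscript): void => {
    text += result.transcript;
    if (result.isFlushed) {
      resolveDone(text);
    }
  };
  return { callback, done };
}
```

With this helper, callback would be passed as transcriptCallback, and the caller would await done after calling flush() on the worker.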


CheetahWorker.release()

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.


CheetahWorker.terminate()

async function terminate(): Promise<void>

Force terminates the instance of CheetahWorker.


CheetahWorker.frameLength

get frameLength(): number

Number of audio samples per frame.


CheetahWorker.sampleRate

get sampleRate(): number

Audio sample rate accepted by Cheetah.


CheetahWorker.version

get version(): string

Cheetah version string.
