
Cheetah Speech-to-Text
Web API


API Reference for the Cheetah Web SDK (cheetah-web).


Cheetah

class Cheetah {}

Class for the Cheetah Speech-to-Text engine.


Cheetah.create()

static async function create(
accessKey: string,
transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
model: CheetahModel,
options: CheetahOptions = {}
): Promise<Cheetah>

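Creates an instance of the Cheetah Speech-to-Text engine from the model file (.pv) described by the model argument, supplied either as a base64 string or as a path relative to the public directory. Because the model file is large, an existing copy in storage is reused if there is one; otherwise the model is saved to storage.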

Parameters

  • accessKey string : AccessKey obtained from Picovoice Console.
  • transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
  • model CheetahModel : Cheetah model options.
  • options CheetahOptions : Optional configuration arguments.

Returns

  • Cheetah : An instance of the Cheetah engine.

Cheetah.process()

async function process(pcm: Int16Array): Promise<void>

Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

Parameters

  • pcm Int16Array : A frame of audio.

Returns

  • Promise<void> : Resolves once the frame has been processed. Any newly transcribed speech (an empty string if none is available) and a flag indicating whether an endpoint has been detected are passed to transcriptCallback as a CheetahTranscript.
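
Since process() expects exactly one frame per call, a longer recording has to be cut into frames first. The following is a minimal sketch of that step, not part of the SDK: splitIntoFrames is a hypothetical helper, and it assumes the input is already 16-bit, single-channel PCM at the sample rate reported by .sampleRate.

```typescript
/**
 * Hypothetical helper (not part of cheetah-web): splits 16-bit mono PCM
 * into frames of exactly `frameLength` samples. Samples left over at the
 * end are returned separately so they can be prepended to the next buffer.
 */
function splitIntoFrames(
  pcm: Int16Array,
  frameLength: number
): { frames: Int16Array[]; remainder: Int16Array } {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  let offset = 0;
  while (offset + frameLength <= pcm.length) {
    // slice() copies the samples, so every frame owns its own buffer.
    frames.push(pcm.slice(offset, offset + frameLength));
    offset += frameLength;
  }
  return { frames, remainder: pcm.slice(offset) };
}
```

With an engine instance in hand, frameLength would come from its .frameLength getter, and each element of frames would be passed to process() in order.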

Cheetah.flush()

async function flush(): Promise<CheetahTranscript>

Processes any remaining audio data and returns its transcription.

Returns

  • CheetahTranscript : Inferred transcription object.

Cheetah.release()

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.


Cheetah.frameLength

get frameLength(): number

Number of audio samples per frame.


Cheetah.sampleRate

get sampleRate(): number

Audio sample rate accepted by Cheetah.


Cheetah.version

get version(): string

Cheetah version string.


CheetahModel

type CheetahModel = {
base64?: string;
publicPath?: string;
customWritePath?: string;
forceWrite?: boolean;
version?: number;
}

Cheetah model type.

  • base64 string : The model file (.pv) as a base64 string used to initialize Cheetah.
  • publicPath string : The model file (.pv) path relative to the public directory.
  • customWritePath string : Custom path to save the model in storage. Set to a different name to use multiple models across Cheetah instances.
  • forceWrite boolean : Flag to overwrite the model in storage even if it exists.
  • version number : Version of the model file. Increment to update the model file in storage.
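
The two ways of supplying a model can be sketched as follows. The type is repeated locally so the snippet stands on its own; the path, storage name, and base64 placeholder are hypothetical values. The hasModelSource check assumes that at least one of base64 or publicPath must be set, which the field descriptions suggest but this reference does not state outright.

```typescript
// Local copy of the CheetahModel type documented above.
type CheetahModel = {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
};

// Model served from the public directory (hypothetical path and names).
const modelFromPublicDir: CheetahModel = {
  publicPath: "models/cheetah_params.pv",
  customWritePath: "cheetah_model_en", // distinct name per model kept in storage
  version: 2, // incremented after the .pv file was replaced
};

// Model embedded as a base64 string (placeholder, not real model data).
const modelFromBase64: CheetahModel = {
  base64: "<base64-encoded .pv contents>",
  forceWrite: true, // overwrite any copy already in storage
};

// Assumption: a usable model object names at least one source.
function hasModelSource(model: CheetahModel): boolean {
  return Boolean(model.base64) || Boolean(model.publicPath);
}
```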

CheetahOptions

type CheetahOptions = {
endpointDurationSec?: number;
enableAutomaticPunctuation?: boolean;
processErrorCallback?: (error: string) => void;
}

Cheetah options type.

  • endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to 0 to disable endpoint detection.
  • enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
  • processErrorCallback (error: string) => void : User-defined callback invoked if an error occurs while processing the audio stream. Its only argument is the error message. NOTE: This is only used by CheetahWorker.
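
As an illustration, an options object might be put together as in the sketch below. The type is repeated locally so the snippet compiles on its own, and the 1.5 second endpoint duration is an arbitrary example value rather than a recommended or default setting.

```typescript
// Local copy of the CheetahOptions type documented above.
type CheetahOptions = {
  endpointDurationSec?: number;
  enableAutomaticPunctuation?: boolean;
  processErrorCallback?: (error: string) => void;
};

// Errors reported while processing are collected here for later inspection.
const processingErrors: string[] = [];

const cheetahOptions: CheetahOptions = {
  endpointDurationSec: 1.5, // illustrative value only
  enableAutomaticPunctuation: true,
  processErrorCallback: (error: string): void => {
    processingErrors.push(error);
  },
};

// Same settings, but with endpoint detection disabled (0 turns it off).
const noEndpointOptions: CheetahOptions = {
  ...cheetahOptions,
  endpointDurationSec: 0,
};
```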

CheetahTranscript

type CheetahTranscript = {
transcript: string;
isEndpoint: boolean;
}

Cheetah transcript type.

  • transcript string : Transcript returned from Cheetah.
  • isEndpoint boolean : Whether an endpoint was detected.
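
One way to consume these objects is to append each transcript to a running string and close the utterance when isEndpoint is set. The sketch below assumes that successive callbacks carry only newly transcribed text, so that concatenation yields the full utterance; TranscriptCollector is a hypothetical helper, not part of the SDK.

```typescript
// Local copy of the CheetahTranscript type documented above.
type CheetahTranscript = { transcript: string; isEndpoint: boolean };

// Hypothetical helper: accumulates partial transcripts into utterances.
class TranscriptCollector {
  private current = "";
  readonly utterances: string[] = [];

  // Arrow-function property, so it can be handed over directly as a callback.
  readonly onTranscript = (result: CheetahTranscript): void => {
    this.current += result.transcript;
    if (result.isEndpoint) {
      this.finishUtterance();
    }
  };

  /** Text received since the last endpoint. */
  get partial(): string {
    return this.current;
  }

  private finishUtterance(): void {
    const text = this.current.trim();
    if (text.length > 0) {
      this.utterances.push(text);
    }
    this.current = "";
  }
}
```

An instance's onTranscript property has the same shape as the transcriptCallback parameter of create(), so it could be passed there as-is.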

CheetahWorker

class CheetahWorker {}

Class for running the Cheetah Speech-to-Text engine in a web worker.


CheetahWorker.create()

static async create(
accessKey: string,
transcriptCallback: (cheetahTranscript: CheetahTranscript) => void,
model: CheetahModel,
options: CheetahOptions = {},
): Promise<CheetahWorker>

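Creates an instance of CheetahWorker from the model file (.pv) described by the model argument, supplied either as a base64 string or as a path relative to the public directory. Because the model file is large, an existing copy in storage is reused if there is one; otherwise the model is saved to storage.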

Parameters

  • accessKey string : AccessKey obtained from Picovoice Console.
  • transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
  • model CheetahModel : Cheetah model options.
  • options CheetahOptions : Optional configuration arguments.

Returns

  • CheetahWorker : An instance of CheetahWorker.

CheetahWorker.process()

function process(pcm: Int16Array): void

Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.

The transcript result is delivered through the transcriptCallback passed to CheetahWorker.create().

Parameters

  • pcm Int16Array : A frame of audio.
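
Browser audio sources such as the Web Audio API deliver samples as 32-bit floats in the range [-1, 1], so they need converting to the 16-bit linear PCM described above before being framed. The following is a minimal sketch of that conversion; floatToPcm16 is a hypothetical helper, and it assumes the input is already single-channel and at the rate reported by .sampleRate (resampling is not covered).

```typescript
/**
 * Hypothetical helper: converts float samples in [-1, 1] to 16-bit
 * linear PCM. Out-of-range input is clamped rather than wrapped.
 */
function floatToPcm16(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so that loud input saturates instead of overflowing.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Int16 spans -32768..32767, so each sign uses its own scale factor.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}
```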

CheetahWorker.flush()

function flush(): void

Processes any remaining audio data. The resulting transcription is delivered through the transcriptCallback passed to CheetahWorker.create().


CheetahWorker.release()

async function release(): Promise<void>

Releases resources acquired by the Cheetah Web SDK.


CheetahWorker.terminate()

async function terminate(): Promise<void>

Force terminates the instance of CheetahWorker.


CheetahWorker.frameLength

get frameLength(): number

Number of audio samples per frame.


CheetahWorker.sampleRate

get sampleRate(): number

Audio sample rate accepted by Cheetah.


CheetahWorker.version

get version(): string

Cheetah version string.

© 2019-2022 Picovoice Inc.