
Cheetah Speech-to-Text
Node.js API

API Reference for the Node.js Cheetah SDK (npm)


Cheetah

class Cheetah { }

Class for the Cheetah Speech-to-Text engine.

Create an instance with the class constructor(). When you are done with it, call release() to free the resources it holds.


Cheetah.constructor()

Cheetah.constructor(
  accessKey: string,
  options: CheetahOptions = {}
)

Cheetah constructor.

Parameters

  • accessKey string : AccessKey obtained from Picovoice Console.
  • options CheetahOptions: Optional configuration arguments:
    • modelPath string : Path to the file containing model parameters (.pv).
    • device string? : String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, Cheetah picks the most suitable device. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
    • libraryPath string : Path to the Cheetah dynamic library (.node).
    • endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when an utterance is followed by a chunk of audio of this duration that contains no speech. Set to 0 to disable endpoint detection. Default is 1 second.
    • enableAutomaticPunctuation boolean : Whether to enable automatic punctuation. Default is false.

Returns

  • Cheetah: An instance of the Cheetah engine.
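
The device forms described above can be assembled programmatically. The sketch below is a minimal illustration and not part of the Cheetah SDK: deviceString and DeviceKind are hypothetical names introduced here, and the validation rules (non-negative GPU index, at least one CPU thread) are assumptions.

```typescript
// Hypothetical helper (not part of the Cheetah SDK): builds a value for the
// `device` option from the forms described in the parameter list.
type DeviceKind = "best" | "cpu" | "gpu";

function deviceString(kind: DeviceKind, index?: number): string {
  if (index === undefined) {
    // "best", "cpu" (default thread count) or "gpu" (first available GPU)
    return kind;
  }
  if (kind === "best") {
    throw new RangeError('"best" does not take an index');
  }
  // Assumed validation: GPU index >= 0, CPU thread count >= 1.
  if (!Number.isInteger(index) || index < 0 || (kind === "cpu" && index === 0)) {
    throw new RangeError(`invalid ${kind} value: ${index}`);
  }
  // "gpu:${GPU_INDEX}" or "cpu:${NUM_THREADS}"
  return `${kind}:${index}`;
}

// An options object using documented fields only.
const options = {
  device: deviceString("cpu", 4),
  enableAutomaticPunctuation: true,
};
```

An object like options could then be passed as the second argument of the constructor.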

Cheetah.release()

Cheetah.release()

Releases resources acquired by Cheetah.
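
Because release() should be called once the engine is no longer needed, a try/finally block is a natural fit. The following is a minimal sketch: withEngine and the Releasable interface are hypothetical names used for illustration, and the only assumption about the engine is that it exposes release() as documented above.

```typescript
// Minimal structural view of the engine: only what this sketch needs.
interface Releasable {
  release(): void;
}

// Hypothetical helper (not part of the SDK): runs `use` with the engine and
// always calls release() afterwards, even if `use` throws.
function withEngine<E extends Releasable, R>(engine: E, use: (engine: E) => R): R {
  try {
    return use(engine);
  } finally {
    engine.release();
  }
}
```

With a real instance this could be called as withEngine(cheetah, (c) => c.version), which frees the engine even when the callback throws.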


Cheetah.frameLength

Cheetah.frameLength

Getter for number of audio samples per frame.

Returns

  • number: Number of audio samples per frame.

Cheetah.sampleRate

Cheetah.sampleRate

Getter for audio sample rate accepted by Cheetah.

Returns

  • number: Audio sample rate accepted by Cheetah.
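
Together, frameLength and sampleRate determine how audio has to be chunked before it is handed to the engine. The sketch below shows one way to do that. frameDurationMs and splitIntoFrames are hypothetical helpers, and the use of Int16Array to hold 16-bit samples is an assumption made for illustration.

```typescript
// The two getters this sketch relies on.
interface FrameInfo {
  frameLength: number; // samples per frame
  sampleRate: number;  // samples per second
}

// Duration of one frame in milliseconds.
function frameDurationMs(info: FrameInfo): number {
  return (info.frameLength * 1000) / info.sampleRate;
}

// Splits a PCM buffer into complete frames of `frameLength` samples.
// Trailing samples that do not fill a whole frame are dropped.
function splitIntoFrames(pcm: Int16Array, frameLength: number): Int16Array[] {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  for (let start = 0; start + frameLength <= pcm.length; start += frameLength) {
    // subarray() creates a view on the same memory, so no samples are copied.
    frames.push(pcm.subarray(start, start + frameLength));
  }
  return frames;
}
```

In practice the leftover samples would usually be kept and prepended to the next chunk of audio rather than discarded. They are dropped here only to keep the sketch short.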

Cheetah.version

Cheetah.version

Getter for version.

Returns

  • string: Current Cheetah version.

Cheetah.listAvailableDevices()

Cheetah.listAvailableDevices(options: CheetahInputOptions = {}): string[]

Lists all available devices that Cheetah can use for inference. Each entry in the list can be passed as the device option of the constructor.

Parameters

  • options CheetahInputOptions : Optional input configuration arguments.

Returns

  • string[] : List of all available devices that Cheetah can use for inference.
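
Since every returned entry is a valid device value, the list can be used to choose a device at startup. The sketch below is one possible selection policy. pickDevice is a hypothetical helper, and it assumes that entries take the same forms as the device option (for example cpu or gpu:0), which the reference implies but does not spell out.

```typescript
// Hypothetical helper (not part of the SDK): returns the first available
// device whose kind matches the preference order, or "best" if none matches.
function pickDevice(
  available: string[],
  preferredKinds: string[] = ["gpu", "cpu"]
): string {
  for (const kind of preferredKinds) {
    // Match either the bare kind ("gpu") or an indexed form ("gpu:0").
    const match = available.find(
      (device) => device === kind || device.startsWith(`${kind}:`)
    );
    if (match !== undefined) {
      return match;
    }
  }
  // Fall back to letting the engine choose.
  return "best";
}
```

The result could be passed as the device field of the constructor options.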

CheetahOptions

type CheetahInitOptions = {
  modelPath?: string;
  device?: string;
  endpointDurationSec?: number;
  enableAutomaticPunctuation?: boolean;
};

Cheetah init options type.

  • modelPath string : The path to the Cheetah model (.pv).
  • device string? : String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, Cheetah picks the most suitable device. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
  • endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set duration to 0 to disable this. Default is 1 second.
  • enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion. Default is false.

CheetahInputOptions

type CheetahInputOptions = {
  libraryPath?: string;
};

Cheetah input options type.

  • libraryPath string : The path to the Cheetah dynamic library.

Cheetah.process()

Cheetah.process(pcm)

Processes a frame of the incoming audio stream with the speech-to-text engine. The number of samples per frame is given by .frameLength. The incoming audio must have a sample rate equal to .sampleRate and be 16-bit linearly-encoded. Cheetah operates on single-channel audio.

Parameters

  • pcm Array<number> : A frame of audio samples.

Returns

  • [string, boolean]: Transcription of any newly-transcribed speech (if none is available then an empty string is returned) and a flag indicating if an endpoint has been detected.
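
A typical streaming loop feeds frames one at a time, appends each partial transcript, and calls flush() (documented below) when an endpoint is reported or the stream ends. The sketch below illustrates that flow against a minimal structural interface. transcribeFrames and StreamingEngine are hypothetical names, and typing frames as Int16Array is an assumption, since the reference lists pcm as Array<number>.

```typescript
// Minimal structural view of the engine: only the two documented methods
// this sketch relies on.
interface StreamingEngine {
  process(pcm: Int16Array): [string, boolean];
  flush(): string;
}

// Hypothetical helper (not part of the SDK): feeds frames in order and
// returns one string per utterance.
function transcribeFrames(engine: StreamingEngine, frames: Int16Array[]): string[] {
  const utterances: string[] = [];
  let current = "";

  for (const frame of frames) {
    const [partial, isEndpoint] = engine.process(frame);
    current += partial; // an empty string when nothing new was transcribed

    if (isEndpoint) {
      // The speaker paused: collect any remaining text and close the utterance.
      current += engine.flush();
      if (current.length > 0) {
        utterances.push(current);
      }
      current = "";
    }
  }

  // End of stream: flush whatever the engine is still holding.
  current += engine.flush();
  if (current.length > 0) {
    utterances.push(current);
  }
  return utterances;
}
```

Each frame passed in is expected to hold exactly frameLength samples at the engine's sample rate.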

Cheetah.flush()

Cheetah.flush()

Marks the end of the audio stream, flushes internal state of the object, and returns any remaining transcribed text.

Returns

  • string: Any remaining transcribed text. If none is available then an empty string is returned.

Errors

Exceptions thrown when an error occurs within the Cheetah Speech-to-Text engine.

Exceptions:

class PvStatusOutOfMemoryError extends Error {}
class PvStatusIoError extends Error {}
class PvStatusInvalidArgumentError extends Error {}
class PvStatusStopIterationError extends Error {}
class PvStatusKeyError extends Error {}
class PvStatusInvalidStateError extends Error {}
class PvStatusRuntimeError extends Error {}
class PvStatusActivationError extends Error {}
class PvStatusActivationLimitReached extends Error {}
class PvStatusActivationThrottled extends Error {}
class PvStatusActivationRefused extends Error {}
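
Callers often want to treat licensing problems differently from other engine failures. The sketch below groups the exceptions listed above into coarse categories. classifyError and ErrorKind are hypothetical names, and matching on the constructor name is an assumption made so the example does not depend on how the SDK exports these classes.

```typescript
// Class names taken from the exception list above.
const ACTIVATION_ERRORS: string[] = [
  "PvStatusActivationError",
  "PvStatusActivationLimitReached",
  "PvStatusActivationThrottled",
  "PvStatusActivationRefused",
];

type ErrorKind = "activation" | "engine" | "other";

// Hypothetical helper (not part of the SDK): classifies a caught value by
// the name of its class.
function classifyError(err: unknown): ErrorKind {
  if (!(err instanceof Error)) {
    return "other";
  }
  const name = err.constructor.name;
  if (ACTIVATION_ERRORS.indexOf(name) !== -1) {
    return "activation"; // AccessKey or licensing related
  }
  // Any other PvStatus* class is treated as an engine failure.
  return name.indexOf("PvStatus") === 0 ? "engine" : "other";
}
```

A catch block around the constructor or process() could call classifyError(err) and, for example, prompt for a new AccessKey only in the activation case.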

© 2019-2025 Picovoice Inc.