Cheetah Speech-to-Text
Web API
API Reference for the Cheetah Web SDK (cheetah-web)
Cheetah
Class for the Cheetah Speech-to-Text engine.
Cheetah.create()
Creates an instance of the Cheetah Speech-to-Text engine using a '.pv' model file in the public directory. Because the model file is large, Cheetah reuses a copy already in storage if one exists; otherwise it saves the model to storage.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving a transcript result.
model CheetahModel : Cheetah model options.
options CheetahOptions : Optional configuration arguments.
Returns
Cheetah: An instance of the Cheetah engine.
Cheetah.process()
Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
Parameters
pcm Int16Array : A frame of audio.
Returns
CheetahTranscript: Any newly-transcribed speech (if none is available then an empty string is returned) and a flag indicating if an endpoint has been detected.
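The input requirements above (16-bit linear PCM, single channel, exactly .frameLength samples per call) can be met with two small helpers. This is a minimal sketch, not part of the SDK: `floatTo16BitPcm` and `splitIntoFrames` are hypothetical names, and the sketch assumes the audio has already been resampled to .sampleRate and mixed down to one channel.

```typescript
// Hypothetical helper: convert float samples in [-1, 1] (as produced by the
// Web Audio API) to 16-bit linear PCM.
function floatTo16BitPcm(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping.
    const s = Math.max(-1, Math.min(1, input[i]));
    output[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return output;
}

// Hypothetical helper: cut a PCM buffer into frames of exactly `frameLength`
// samples. Leftover samples are returned so the caller can prepend them to
// the next chunk of audio.
function splitIntoFrames(
  pcm: Int16Array,
  frameLength: number
): { frames: Int16Array[]; remainder: Int16Array } {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  let offset = 0;
  while (offset + frameLength <= pcm.length) {
    frames.push(pcm.slice(offset, offset + frameLength));
    offset += frameLength;
  }
  return { frames, remainder: pcm.slice(offset) };
}
```

Each element of `frames` can then be passed to process(); `frameLength` would come from the engine's .frameLength property.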
Cheetah.flush()
Processes any remaining audio data and returns its transcription.
Returns
CheetahTranscript: Inferred transcription object.
Cheetah.release()
Releases resources acquired by the Cheetah Web SDK.
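The methods above are typically called in a fixed order: process() for each frame, flush() once at the end of the audio, and release() when the engine is no longer needed. The sketch below illustrates that order against a hypothetical `CheetahLike` interface that mirrors only the members documented here; the real instance would come from Cheetah.create(). It assumes the methods may return promises, which is why every call is awaited.

```typescript
// Hypothetical structural type covering only the documented members used
// below. Return types are left open because this sketch only awaits them.
interface CheetahLike {
  readonly frameLength: number;
  process(pcm: Int16Array): unknown;
  flush(): unknown;
  release(): unknown;
}

// Feed every frame to the engine, flush the remaining audio, and always
// release resources, even if a frame is rejected along the way.
async function transcribeFrames(
  engine: CheetahLike,
  frames: Int16Array[]
): Promise<void> {
  try {
    for (const frame of frames) {
      if (frame.length !== engine.frameLength) {
        throw new RangeError(
          `expected ${engine.frameLength} samples, got ${frame.length}`
        );
      }
      await engine.process(frame);
    }
    await engine.flush();
  } finally {
    await engine.release();
  }
}
```

Placing release() in a `finally` block is a design choice of this sketch so that resources are freed on both the success and the error path.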
Cheetah.frameLength
Number of audio samples per frame.
Cheetah.sampleRate
Audio sample rate accepted by Cheetah.
Cheetah.version
Cheetah version string.
CheetahModel
Cheetah model type.
base64 string : The model file (.pv) as a base64 string to initialize Cheetah.
publicPath string : The model file (.pv) path relative to the public directory.
customWritePath string : Custom path to save the model in storage. Set to a different name to use multiple models across cheetah instances.
forceWrite boolean : Flag to overwrite the model in storage even if it exists.
version number : Version of the model file. Increment to update the model file in storage.
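As an illustration of the fields above, the following sketch declares two model configurations that can coexist in storage because each uses its own customWritePath. The local `CheetahModel` interface only mirrors the documented field names; which fields are optional is an assumption, and the file names and the `hasUniqueWritePaths` helper are hypothetical.

```typescript
// Local mirror of the documented fields. Optionality is an assumption.
interface CheetahModel {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
}

// Hypothetical file names, relative to the public directory.
const englishModel: CheetahModel = {
  publicPath: "models/cheetah_params_en.pv",
  customWritePath: "cheetah_model_en",
  version: 1,
};

const frenchModel: CheetahModel = {
  publicPath: "models/cheetah_params_fr.pv",
  customWritePath: "cheetah_model_fr",
  // Bump after replacing the .pv file so the stored copy is updated.
  version: 2,
};

// Hypothetical check: models that leave customWritePath unset are treated
// as sharing one default location, so they count as a collision.
function hasUniqueWritePaths(models: CheetahModel[]): boolean {
  const seen = new Set<string>();
  for (const m of models) {
    const key = m.customWritePath ?? "(default)";
    if (seen.has(key)) return false;
    seen.add(key);
  }
  return true;
}
```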
CheetahOptions
Cheetah options type.
endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when an utterance is followed by a chunk of audio of this duration that contains no speech. Set to 0 to disable endpoint detection.
enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
processErrorCallback (error: string) => void : User-defined callback invoked if any error happens while processing the audio stream. Its only input argument is the error message. NOTE: This is only used by CheetahWorker.
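A short sketch of how these options might be filled in follows. The local `CheetahOptions` interface mirrors the documented field names; treating every field as optional, and typing the error callback as receiving the message as a string, are assumptions based on the descriptions above. The values themselves are illustrative only.

```typescript
// Local mirror of the documented fields. Optionality and the callback
// signature are assumptions drawn from the field descriptions.
interface CheetahOptions {
  endpointDurationSec?: number;
  enableAutomaticPunctuation?: boolean;
  processErrorCallback?: (error: string) => void;
}

// Collected error messages, so the application can surface them later.
const processingErrors: string[] = [];

const dictationOptions: CheetahOptions = {
  // Report an endpoint after one second without speech.
  endpointDurationSec: 1.0,
  enableAutomaticPunctuation: true,
  // Per the note above, only CheetahWorker invokes this callback.
  processErrorCallback: (error) => {
    processingErrors.push(error);
  },
};

// Same settings, but with endpoint detection disabled.
const continuousOptions: CheetahOptions = {
  ...dictationOptions,
  endpointDurationSec: 0,
};
```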
CheetahTranscript
Cheetah transcript type.
transcript string : Transcript returned from Cheetah.
isEndpoint boolean : Whether the transcript has an endpoint.
isFlushed boolean : Whether the transcript was produced by a call to Cheetah.flush().
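Because transcripts arrive in pieces, a callback usually has to stitch them together. The sketch below shows one way to do that: it appends each partial transcript and closes the current utterance when isEndpoint or isFlushed is set. `TranscriptCollector` is a hypothetical helper, and the local `CheetahTranscript` interface mirrors the fields listed above, with the two flags assumed to be optional.

```typescript
// Local mirror of the documented fields; flag optionality is an assumption.
interface CheetahTranscript {
  transcript: string;
  isEndpoint?: boolean;
  isFlushed?: boolean;
}

// Hypothetical helper that groups partial transcripts into utterances.
class TranscriptCollector {
  private partial = "";
  readonly utterances: string[] = [];

  // Arrow function so it can be passed directly as transcriptCallback.
  readonly onTranscript = (t: CheetahTranscript): void => {
    this.partial += t.transcript;
    if (t.isEndpoint || t.isFlushed) {
      const text = this.partial.trim();
      if (text.length > 0) {
        this.utterances.push(text);
      }
      this.partial = "";
    }
  };

  // Text received since the last endpoint or flush.
  get pending(): string {
    return this.partial;
  }
}
```

A `TranscriptCollector` instance's `onTranscript` property could be supplied as the transcriptCallback argument of create().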
CheetahWorker
Class for creating instances of CheetahWorker, which runs the Cheetah Speech-to-Text engine in a web worker.
CheetahWorker.create()
Creates an instance of CheetahWorker using a '.pv' model file in the public directory. Because the model file is large, Cheetah reuses a copy already in storage if one exists; otherwise it saves the model to storage.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
transcriptCallback (cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving a transcript result.
model CheetahModel : Cheetah model options.
options CheetahOptions : Optional configuration arguments.
Returns
CheetahWorker: An instance of CheetahWorker.
CheetahWorker.process()
Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the length of frame (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
The transcript result is delivered through the transcriptCallback provided to CheetahWorker.create() when the worker was initialized.
Parameters
pcm Int16Array : A frame of audio.
CheetahWorker.flush()
Processes any remaining audio data. As with CheetahWorker.process(), the resulting transcription is delivered through the transcript callback.
CheetahWorker.release()
Releases resources acquired by the Cheetah Web SDK.
CheetahWorker.terminate()
Force terminates the instance of CheetahWorker.
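Since the worker reports results through a callback, shutting it down cleanly means waiting for the final transcript before releasing it. The sketch below wraps that wait in a promise. It assumes the worker invokes the transcript callback with isFlushed set after flush(); if that never happens, the promise does not resolve. `FlushWaiter`, `finishAndDispose`, and the `CheetahWorkerLike` interface are hypothetical and mirror only the members documented here.

```typescript
// Local mirror of the documented transcript fields.
interface CheetahTranscript {
  transcript: string;
  isEndpoint?: boolean;
  isFlushed?: boolean;
}

// Hypothetical structural type for the worker methods used below.
interface CheetahWorkerLike {
  flush(): unknown;
  release(): unknown;
  terminate(): unknown;
}

// Hypothetical helper: buffers transcript text and resolves a promise once
// a flushed transcript arrives.
class FlushWaiter {
  private resolveNext: ((text: string) => void) | null = null;
  private buffered = "";

  // Pass this as the transcriptCallback when creating the worker.
  readonly onTranscript = (t: CheetahTranscript): void => {
    this.buffered += t.transcript;
    if (t.isFlushed && this.resolveNext) {
      this.resolveNext(this.buffered);
      this.resolveNext = null;
      this.buffered = "";
    }
  };

  // Must be called before flush() so the flushed result is not missed.
  waitForFlush(): Promise<string> {
    return new Promise<string>((resolve) => {
      this.resolveNext = resolve;
    });
  }
}

// Flush the worker, wait for the final text, then release and terminate.
async function finishAndDispose(
  worker: CheetahWorkerLike,
  waiter: FlushWaiter
): Promise<string> {
  const flushed = waiter.waitForFlush();
  try {
    await worker.flush();
    return await flushed;
  } finally {
    await worker.release();
    await worker.terminate();
  }
}
```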
CheetahWorker.frameLength
Number of audio samples per frame.
CheetahWorker.sampleRate
Audio sample rate accepted by Cheetah.
CheetahWorker.version
Cheetah version string.