Cheetah Speech-to-Text
Web API
API Reference for the Cheetah Web SDK (cheetah-web)
Cheetah
Class for the Cheetah Speech-to-Text engine.
Cheetah.create()
Creates an instance of the Cheetah Speech-to-Text engine from a '.pv' model file in the public directory. Because the model file is large, an existing copy in storage is reused if one exists; otherwise the model is saved to storage.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
transcriptCallback
(cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
model
CheetahModel : Cheetah model options.
options
CheetahOptions : Optional configuration arguments.
Returns
Cheetah : An instance of the Cheetah engine.
Cheetah.process()
Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the frame length (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
Parameters
pcm
Int16Array : A frame of audio.
Returns
CheetahTranscript : Any newly-transcribed speech (an empty string if none is available) and a flag indicating whether an endpoint has been detected.
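Because process() takes exactly one frame at a time, a longer recording has to be cut into frameLength-sized pieces first. A minimal sketch of that step follows; splitIntoFrames and FrameSplit are hypothetical names used for illustration and are not part of cheetah-web.

```typescript
// Result of cutting a PCM buffer into engine-sized frames.
interface FrameSplit {
  frames: Int16Array[]; // complete frames, each exactly frameLength samples long
  remainder: Int16Array; // trailing samples that do not fill a whole frame
}

// Hypothetical helper (not part of cheetah-web): cuts 16-bit mono PCM into
// frames of the length reported by the engine's .frameLength property.
function splitIntoFrames(pcm: Int16Array, frameLength: number): FrameSplit {
  if (!Number.isInteger(frameLength) || frameLength <= 0) {
    throw new RangeError("frameLength must be a positive integer");
  }
  const frames: Int16Array[] = [];
  let offset = 0;
  while (offset + frameLength <= pcm.length) {
    // subarray() returns a view over the same memory, so no samples are copied.
    frames.push(pcm.subarray(offset, offset + frameLength));
    offset += frameLength;
  }
  return { frames, remainder: pcm.subarray(offset) };
}
```

The remainder can be kept and prepended to the next chunk of incoming audio so that no samples are lost between calls.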
Cheetah.flush()
Processes any remaining audio data and returns its transcription.
Returns
CheetahTranscript : Inferred transcription object.
Cheetah.release()
Releases resources acquired by the Cheetah Web SDK.
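The three methods above are typically used in sequence: process() for each frame, flush() once at the end, and release() when the engine is no longer needed. The sketch below illustrates that order under stated assumptions. EngineLike and TranscriptLike are local stand-ins rather than the SDK's own types, and the code assumes process() and flush() hand back their result as described in this reference (awaiting the value also covers a Promise).

```typescript
// Local stand-ins for illustration only; these are NOT the SDK's exported types.
interface TranscriptLike {
  transcript: string;
}

interface EngineLike {
  process(pcm: Int16Array): TranscriptLike | Promise<TranscriptLike>;
  flush(): TranscriptLike | Promise<TranscriptLike>;
  release(): void | Promise<void>;
}

// Hypothetical helper: feeds every frame to the engine, flushes the remaining
// audio, and always releases the engine, even if processing throws.
async function transcribeFrames(engine: EngineLike, frames: Int16Array[]): Promise<string> {
  let text = "";
  try {
    for (const frame of frames) {
      text += (await engine.process(frame)).transcript;
    }
    // flush() returns whatever speech was still buffered inside the engine.
    text += (await engine.flush()).transcript;
  } finally {
    await engine.release();
  }
  return text;
}
```

Placing release() in a finally block keeps the cleanup step from being skipped when a frame fails to process.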
Cheetah.frameLength
Number of audio samples per frame.
Cheetah.sampleRate
Audio sample rate accepted by Cheetah.
Cheetah.version
Cheetah version string.
CheetahModel
Cheetah model type.
base64
string : The model file (.pv) as a base64 string, used to initialize Cheetah.
publicPath
string : The model file (.pv) path relative to the public directory.
customWritePath
string : Custom path to save the model in storage. Set to a different name to use multiple models across cheetah instances.
forceWrite
boolean : Flag to overwrite the model in storage even if it exists.
version
number : Version of the model file. Increment to update the model file in storage.
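As an illustration of how these fields fit together, here is a sketch of a model object that loads the model from the public directory. CheetahModelLike is a local stand-in declared only so the example is self-contained; treating every field as optional is an assumption, and the path and storage name are hypothetical.

```typescript
// Local stand-in mirroring the fields listed above. Marking every field as
// optional is an assumption made for this sketch, not a statement about the SDK.
interface CheetahModelLike {
  base64?: string;
  publicPath?: string;
  customWritePath?: string;
  forceWrite?: boolean;
  version?: number;
}

const model: CheetahModelLike = {
  publicPath: "/models/cheetah_params.pv", // hypothetical path in the public directory
  customWritePath: "cheetah_model_en", // hypothetical storage name; use one per distinct model
  forceWrite: false, // reuse the copy already in storage if present
  version: 1, // increment after replacing the .pv file to refresh storage
};
```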
CheetahOptions
Cheetah options type.
endpointDurationSec
number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to 0 to disable endpoint detection.
enableAutomaticPunctuation
boolean : Flag to enable automatic punctuation insertion.
processErrorCallback
(error: string) => void : User-defined callback invoked if any error happens while processing the audio stream. Its only input argument is the error message. NOTE: This is only used by CheetahWorker.
CheetahTranscript
Cheetah transcript type.
transcript
string : Transcript returned from Cheetah.
isEndpoint
boolean : Whether the transcript has an endpoint.
isFlushed
boolean : Whether the engine called Cheetah.flush().
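A transcript callback usually receives text in small pieces, so it helps to accumulate them until isEndpoint or isFlushed marks the end of an utterance. The following is a minimal sketch of such a callback. UtteranceCollector and TranscriptResult are hypothetical names, and making the two flags optional is an assumption chosen so the sketch tolerates results that omit them.

```typescript
// Local stand-in for the transcript type described above; the flags are
// declared optional here as an assumption of this sketch.
interface TranscriptResult {
  transcript: string;
  isEndpoint?: boolean;
  isFlushed?: boolean;
}

// Hypothetical helper: gathers partial transcripts into whole utterances.
class UtteranceCollector {
  private current = "";
  readonly utterances: string[] = [];

  // Arrow function so the method can be passed directly as a callback
  // without losing its `this` binding.
  onTranscript = (result: TranscriptResult): void => {
    this.current += result.transcript;
    if (result.isEndpoint || result.isFlushed) {
      const text = this.current.trim();
      if (text.length > 0) {
        this.utterances.push(text);
      }
      this.current = "";
    }
  };

  // Text received since the last endpoint or flush.
  get pending(): string {
    return this.current;
  }
}
```

An instance's onTranscript property has the same shape as the transcriptCallback parameter, so it could be supplied there.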
CheetahWorker
A class for creating new instances of the CheetahWorker.
CheetahWorker.create()
Creates an instance of CheetahWorker from a '.pv' model file in the public directory. Because the model file is large, an existing copy in storage is reused if one exists; otherwise the model is saved to storage.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
transcriptCallback
(cheetahTranscript: CheetahTranscript) => void : User-defined callback to run after receiving transcript result.
model
CheetahModel : Cheetah model options.
options
CheetahOptions : Optional configuration arguments.
Returns
CheetahWorker : An instance of CheetahWorker.
CheetahWorker.process()
Processes a frame of audio. The required sample rate can be retrieved from .sampleRate and the frame length (number of audio samples per frame) can be retrieved from .frameLength. The audio needs to be 16-bit linearly-encoded. Furthermore, the engine operates on single-channel audio.
The transcript result is delivered through the callback provided when the worker was created.
Parameters
pcm
Int16Array : A frame of audio.
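Browser audio APIs commonly deliver samples as 32-bit floats in the range [-1, 1], whereas process() expects 16-bit linearly-encoded PCM. A minimal conversion sketch follows; floatToInt16 is a hypothetical helper, not part of cheetah-web, and it assumes the input is already single-channel and at the required sample rate.

```typescript
// Hypothetical helper: converts float samples in [-1, 1] to 16-bit linear PCM.
// Out-of-range input is clamped rather than allowed to wrap around.
function floatToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768 and positive values to 32767,
    // the asymmetric limits of a signed 16-bit integer.
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```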
CheetahWorker.flush()
Processes any remaining audio data and returns its transcription.
CheetahWorker.release()
Releases resources acquired by the Cheetah Web SDK.
CheetahWorker.terminate()
Forcibly terminates the CheetahWorker instance.
CheetahWorker.frameLength
Number of audio samples per frame.
CheetahWorker.sampleRate
Audio sample rate accepted by Cheetah.
CheetahWorker.version
Cheetah version string.