Cheetah Speech-to-Text
React Native API
API Reference for the React Native Cheetah SDK (npm)
Cheetah
Class for the Cheetah Speech-to-Text engine.
Cheetah.create()
Cheetah constructor.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
modelPath string : Path to the file containing model parameters (.pv). Can be relative to the assets/resource folder or an absolute path to the file on device.
options CheetahOptions : Optional configuration arguments:
  device string : String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
  endpointDuration number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set duration to 0 to disable this. Default is 1 second.
  enableAutomaticPunctuation boolean : Whether to enable automatic punctuation.
Returns
Promise<Cheetah>: An instance of the Cheetah engine.
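The device strings accepted by the device option follow a small set of formats (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}). The following is a minimal sketch of building them in a type-safe way. DeviceSpec, toDeviceString and checkCount are hypothetical helpers written for illustration and are not part of the SDK.

```typescript
// Hypothetical helper type for illustration; not part of the Cheetah SDK.
type DeviceSpec =
  | { kind: "best" }
  | { kind: "gpu"; index?: number }
  | { kind: "cpu"; threads?: number };

// Builds the string expected by the `device` option, following the
// formats listed in the parameter description of Cheetah.create().
function toDeviceString(spec: DeviceSpec): string {
  switch (spec.kind) {
    case "best":
      return "best";
    case "gpu":
      // GPU indices are assumed to start at 0.
      return spec.index === undefined ? "gpu" : `gpu:${checkCount(spec.index, 0)}`;
    case "cpu":
      // At least one thread is required when a count is given.
      return spec.threads === undefined ? "cpu" : `cpu:${checkCount(spec.threads, 1)}`;
  }
}

// Rejects values that are not integers or fall below `min`.
function checkCount(value: number, min: number): number {
  if (!Number.isInteger(value) || value < min) {
    throw new RangeError(`expected an integer >= ${min}, got ${value}`);
  }
  return value;
}
```

The resulting string would then be passed as the device field of the options argument, for example { device: toDeviceString({ kind: "cpu", threads: 4 }) }.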
Cheetah.delete()
Releases resources acquired by Cheetah.
Cheetah.frameLength
Getter for number of audio samples per frame.
Returns
number: Number of audio samples per frame.
Cheetah.sampleRate
Getter for audio sample rate accepted by Cheetah.
Returns
number: Audio sample rate accepted by Cheetah.
Cheetah.version
Getter for version.
Returns
string: Current Cheetah version.
Cheetah.process()
Processes a frame of the incoming audio stream with the speech-to-text engine. The number of samples per frame can be obtained from .frameLength. The incoming audio must have a sample rate equal to .sampleRate and be 16-bit linearly encoded. Cheetah operates on single-channel audio.
Parameters
frame number[] : A frame of audio samples.
Returns
Promise<CheetahTranscript>: A CheetahTranscript object that contains any newly-transcribed speech (if none is available then an empty string is returned) and a flag indicating if an endpoint has been detected.
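Because process() takes exactly one frame of .frameLength samples, a longer buffer has to be cut into frames before it is fed to the engine. The sketch below shows one way to do that. The FrameProcessor interface and the processBuffer function are hypothetical names introduced here; the interface only mirrors the members documented in this reference so that the example is self-contained.

```typescript
// Shapes mirrored from this reference so the sketch is self-contained;
// in an app these would come from the Cheetah SDK itself.
interface CheetahTranscript {
  transcript: string;
  isEndpoint: boolean;
}

// Hypothetical minimal view of a Cheetah instance.
interface FrameProcessor {
  frameLength: number;
  process(frame: number[]): Promise<CheetahTranscript>;
}

// Feeds `pcm` to the engine one full frame at a time. Returns the text
// produced so far, whether any frame reported an endpoint, and the
// trailing samples that did not fill a whole frame.
async function processBuffer(
  engine: FrameProcessor,
  pcm: number[],
): Promise<{ text: string; sawEndpoint: boolean; remainder: number[] }> {
  const n = engine.frameLength;
  if (!Number.isInteger(n) || n <= 0) {
    throw new RangeError(`invalid frame length: ${n}`);
  }
  let text = "";
  let sawEndpoint = false;
  let offset = 0;
  for (; offset + n <= pcm.length; offset += n) {
    const result = await engine.process(pcm.slice(offset, offset + n));
    text += result.transcript; // empty string when nothing new was transcribed
    sawEndpoint = sawEndpoint || result.isEndpoint;
  }
  return { text, sawEndpoint, remainder: pcm.slice(offset) };
}
```

The remainder is returned rather than zero-padded so that the caller can prepend it to the next buffer of audio.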
Cheetah.flush()
Marks the end of the audio stream, flushes internal state of the object, and returns any remaining transcribed text.
Returns
Promise<CheetahTranscript>: Any remaining transcribed text in a CheetahTranscript object. If none is available then an empty string is returned.
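At the end of a stream, the text returned by flush() is typically appended to what process() has already produced, and the engine is then released with delete(). A minimal sketch of that sequence follows. StreamEngine and finishStream are hypothetical names, and typing delete() as possibly asynchronous is an assumption, since this reference does not state its return type.

```typescript
// Shape mirrored from this reference so the sketch is self-contained.
interface CheetahTranscript {
  transcript: string;
  isEndpoint: boolean;
}

// Hypothetical minimal view of a Cheetah instance. The return type of
// delete() is an assumption; awaiting it is safe in either case.
interface StreamEngine {
  flush(): Promise<CheetahTranscript>;
  delete(): void | Promise<void>;
}

// Appends whatever flush() returns to the text collected so far, then
// releases the engine even if flush() throws.
async function finishStream(engine: StreamEngine, textSoFar: string): Promise<string> {
  try {
    const last = await engine.flush();
    return textSoFar + last.transcript;
  } finally {
    await engine.delete();
  }
}
```

Placing delete() in the finally block means the resources are released on both the success and the error path.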
Cheetah.getAvailableDevices()
Gets all available devices that Cheetah can use for inference. Each entry in the list can be the device argument of the constructor.
Returns
Promise<string[]>: Array of all available devices that Cheetah can use for inference.
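Since each returned entry can be passed as the device argument, an application might choose one by preference order. The sketch below assumes the entries look like the device strings described for Cheetah.create() (for example gpu:0 or cpu); the exact format of the returned strings is not specified here, so treat that as an assumption. pickDevice is a hypothetical helper, not part of the SDK.

```typescript
// Picks the first available entry matching one of the preferred device
// kinds, in order of preference. An entry matches a kind if it equals
// the kind or starts with "<kind>:". Returns undefined if none match.
function pickDevice(
  available: string[],
  preferred: string[] = ["gpu", "cpu"],
): string | undefined {
  for (const kind of preferred) {
    const match = available.find((d) => d === kind || d.startsWith(`${kind}:`));
    if (match !== undefined) {
      return match;
    }
  }
  return undefined;
}
```

When the result is undefined, omitting the device option and relying on the engine's own selection is a reasonable fallback.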
CheetahError
Exception thrown if an error occurs within the Cheetah Speech-to-Text engine.
Exceptions:
CheetahOptions
Cheetah options type.
endpointDurationSec number : Duration of endpoint in seconds. A speech endpoint is detected when there is a chunk of audio (with a duration specified herein) after an utterance without any speech in it. Set to 0 to disable endpoint detection.
enableAutomaticPunctuation boolean : Flag to enable automatic punctuation insertion.
CheetahTranscript
Cheetah transcript type.
transcript string : Any newly-transcribed speech. If none is available then an empty string is returned.
isEndpoint boolean : Flag indicating if an endpoint has been detected.