Cheetah Speech-to-Text
.NET API
API Reference for the .NET Cheetah SDK (NuGet)
namespace: Pv
Cheetah
Class for the Cheetah Speech-to-Text engine.
Cheetah.Create()
Cheetah constructor.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
modelPath string : Absolute path to the file containing model parameters (.pv).
device string : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
endpointDurationSec float : Duration of endpoint in seconds. A speech endpoint is detected when an utterance is followed by a chunk of audio of this duration that contains no speech. Set to 0 to disable endpoint detection. Default is 1 second.
enableAutomaticPunctuation bool : Enable automatic punctuation. Default is false.
Returns
Cheetah: An instance of the Cheetah Speech-to-Text engine.
Throws
CheetahException: If an error occurs while creating an instance of the Cheetah Speech-to-Text engine.
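A minimal sketch of constructing the engine with the parameters listed above. It assumes the Pv NuGet package is referenced, that the parameters can be passed as named arguments with exactly these names, and that Cheetah implements IDisposable so it can be wrapped in a using statement. The AccessKey and model path are placeholders, not real values.

```csharp
using System;
using Pv;

class CreateExample
{
    static void Main()
    {
        // Placeholders: substitute your own AccessKey and model file path.
        const string accessKey = "${ACCESS_KEY}";
        const string modelPath = "${MODEL_PATH}";

        try
        {
            // Assumption: Cheetah is IDisposable, so `using` releases native resources.
            using (Cheetah cheetah = Cheetah.Create(
                accessKey: accessKey,
                modelPath: modelPath,
                device: "best",                    // let the engine pick a device
                endpointDurationSec: 1.0f,         // 0 would disable endpoint detection
                enableAutomaticPunctuation: true))
            {
                Console.WriteLine($"Cheetah version: {cheetah.Version}");
                Console.WriteLine($"Frame length: {cheetah.FrameLength} samples");
                Console.WriteLine($"Sample rate: {cheetah.SampleRate} Hz");
            }
        }
        catch (CheetahException e)
        {
            // Thrown when the engine cannot be created.
            Console.Error.WriteLine($"Failed to create Cheetah: {e.Message}");
        }
    }
}
```

To pin the engine to a fixed number of CPU threads instead, the device argument would take the form cpu:${NUM_THREADS} described above, for example "cpu:4".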
Cheetah.Process()
Processes a frame of audio and returns newly-transcribed text and a flag indicating if an endpoint has been detected.
Upon detection of an endpoint, the client may invoke .Flush() to retrieve any remaining
transcription.
The number of samples per frame is available from .FrameLength. The incoming
audio must have a sample rate equal to .SampleRate and be 16-bit linearly-encoded.
Cheetah operates on single-channel audio.
Parameters
pcm short[] : A frame of audio samples.
Returns
CheetahTranscript: Inferred transcription object.
Throws
CheetahException: If there is an error while processing the audio frame.
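The processing flow described above can be sketched as a loop that feeds frames to .Process(), appends each partial transcript, and calls .Flush() when an endpoint is reported. The frames parameter is a hypothetical audio source introduced for illustration; each frame is assumed to hold exactly .FrameLength 16-bit mono samples at .SampleRate. The final .Flush() at the end of the audio is a choice made for this sketch.

```csharp
using System.Collections.Generic;
using System.Text;
using Pv;

static class StreamingExample
{
    // Feeds frames to Cheetah and returns the accumulated transcript.
    // `frames` is a hypothetical source (microphone, file reader, ...).
    public static string Transcribe(Cheetah cheetah, IEnumerable<short[]> frames)
    {
        var text = new StringBuilder();

        foreach (short[] frame in frames)
        {
            // Each call returns only the newly transcribed text.
            CheetahTranscript partial = cheetah.Process(frame);
            text.Append(partial.Transcript);

            if (partial.IsEndpoint)
            {
                // Endpoint detected: retrieve any text still buffered.
                CheetahTranscript remaining = cheetah.Flush();
                text.Append(remaining.Transcript);
                text.AppendLine();
            }
        }

        // End of audio: flush whatever has not been emitted yet.
        text.Append(cheetah.Flush().Transcript);
        return text.ToString();
    }
}
```

CheetahException is deliberately not caught inside the helper, so the caller decides how to handle a processing failure.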
Cheetah.Flush()
Processes any remaining audio data and returns its transcription.
Returns
CheetahTranscript: Inferred transcription object.
Throws
CheetahException: If there is an error while processing the audio frame.
Cheetah.FrameLength
Getter for number of audio samples per frame.
Returns
int: Number of audio samples per frame.
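Because each frame passed to .Process() must contain .FrameLength samples, a longer PCM buffer has to be cut into frames first. The helper below is a hypothetical utility, not part of the SDK; it takes the frame length as a plain int so it can be used with the value read from .FrameLength. It drops a trailing partial frame, which is one possible policy (zero-padding the last frame is another).

```csharp
using System;
using System.Collections.Generic;

static class FrameSplitter
{
    // Splits `pcm` into consecutive frames of exactly `frameLength` samples.
    // Samples left over at the end that do not fill a frame are discarded.
    public static List<short[]> Split(short[] pcm, int frameLength)
    {
        if (pcm == null)
        {
            throw new ArgumentNullException(nameof(pcm));
        }
        if (frameLength <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(frameLength));
        }

        var frames = new List<short[]>();
        for (int offset = 0; offset + frameLength <= pcm.Length; offset += frameLength)
        {
            var frame = new short[frameLength];
            Array.Copy(pcm, offset, frame, 0, frameLength);
            frames.Add(frame);
        }
        return frames;
    }
}
```

With an engine instance at hand, the call would look like FrameSplitter.Split(pcm, cheetah.FrameLength).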
Cheetah.SampleRate
Getter for audio sample rate accepted by Picovoice.
Returns
int: Audio sample rate accepted by Picovoice.
Cheetah.Version
Getter for version.
Returns
string: Current Cheetah version.
Cheetah.GetAvailableDevices()
Retrieves a list of hardware devices that can be specified when constructing Cheetah.
Returns
string[]: An array of available hardware devices.
Throws
CheetahException: If an error occurs while retrieving the hardware devices.
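A short sketch of listing the devices before constructing the engine. It assumes GetAvailableDevices() is a static method (it is meant to be called before an instance exists) and that the returned strings can be passed as the device argument of Cheetah.Create().

```csharp
using System;
using Pv;

class ListDevicesExample
{
    static void Main()
    {
        try
        {
            // Assumption: static call, no engine instance required.
            string[] devices = Cheetah.GetAvailableDevices();

            foreach (string device in devices)
            {
                Console.WriteLine(device);
            }
        }
        catch (CheetahException e)
        {
            // Thrown when the hardware devices cannot be retrieved.
            Console.Error.WriteLine($"Could not list devices: {e.Message}");
        }
    }
}
```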
CheetahTranscript
Class that contains Cheetah transcript data.
CheetahTranscript.Transcript
Getter for transcript data.
Returns
string: Inferred transcription.
CheetahTranscript.IsEndpoint
Getter for IsEndpoint flag.
Returns
bool: If true, Cheetah detected a speech endpoint.
CheetahException
Exception thrown if an error occurs within Cheetah Speech-to-Text engine.
Exceptions: