Cheetah - C API
API Reference for the Cheetah C SDK.
pv_cheetah_t
typedef struct pv_cheetah pv_cheetah_t;
Container representing the Cheetah Speech-to-Text engine.
pv_cheetah_init()
pv_status_t pv_cheetah_init(const char *access_key, const char *model_path, float endpoint_duration_sec, pv_cheetah_t **object);
Creates a Cheetah instance. Release its resources with pv_cheetah_delete() when you are done.
Parameters
access_key
const char * : AccessKey obtained from Picovoice Console.
model_path
const char * : Absolute path to the file containing model parameters (.pv).
endpoint_duration_sec
float : Duration of endpoint in seconds. A speech endpoint is detected when an utterance is followed by a segment of audio of this duration that contains no speech. Set to 0 to disable endpoint detection.
object
pv_cheetah_t * * : Constructed instance of Cheetah.
Returns
- pv_status_t : Status code.
pv_cheetah_delete()
void pv_cheetah_delete(pv_cheetah_t *object);
Releases resources acquired by Cheetah.
Parameters
object
pv_cheetah_t * : Cheetah object.
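The pairing of pv_cheetah_init() and pv_cheetah_delete() can be sketched as follows. This is a minimal illustration, not the SDK's own sample code. The header name pv_cheetah.h, the USE_PICOVOICE_SDK macro, the run_lifecycle() helper, and the 1.0 second endpoint duration are assumptions made for the example. When the macro is not defined, small stand-in stubs replace the SDK so the sketch compiles on its own; the stubs mirror only the signatures listed in this reference, not the real behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_PICOVOICE_SDK
#include "pv_cheetah.h" /* assumed header name; check your SDK's include directory */
#else
/* Stand-ins so the sketch compiles without the SDK. They are NOT the real
 * implementation; only the signatures mirror this reference. */
typedef enum {
    PV_STATUS_SUCCESS = 0, PV_STATUS_OUT_OF_MEMORY, PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT, PV_STATUS_STOP_ITERATION, PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE, PV_STATUS_RUNTIME_ERROR, PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED, PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;

typedef struct pv_cheetah { int placeholder; } pv_cheetah_t;

static pv_status_t pv_cheetah_init(const char *access_key, const char *model_path,
                                   float endpoint_duration_sec, pv_cheetah_t **object) {
    (void) endpoint_duration_sec;
    if (!access_key || !model_path || !object) {
        return PV_STATUS_INVALID_ARGUMENT;
    }
    *object = malloc(sizeof(**object));
    return *object ? PV_STATUS_SUCCESS : PV_STATUS_OUT_OF_MEMORY;
}

static void pv_cheetah_delete(pv_cheetah_t *object) { free(object); }

static const char *pv_status_to_string(pv_status_t status) {
    return status == PV_STATUS_SUCCESS ? "success" : "error"; /* placeholder text */
}
#endif

/* Hypothetical helper: creates an engine, reports a failed init, and releases
 * the engine when done. Returns 0 on success and 1 on failure. */
int run_lifecycle(const char *access_key, const char *model_path) {
    pv_cheetah_t *cheetah = NULL;

    /* 1.0f is an arbitrary endpoint duration chosen for this example. */
    pv_status_t status = pv_cheetah_init(access_key, model_path, 1.0f, &cheetah);
    if (status != PV_STATUS_SUCCESS) {
        fprintf(stderr, "pv_cheetah_init failed: %s\n", pv_status_to_string(status));
        return 1; /* nothing was constructed, so there is nothing to delete */
    }
    assert(cheetah != NULL);

    /* ... feed audio with pv_cheetah_process() and pv_cheetah_flush() here ... */

    pv_cheetah_delete(cheetah);
    return 0;
}
```

Passing the address of a NULL-initialized pointer keeps the failure path simple: the pointer is only handed to pv_cheetah_delete() after a successful init.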
pv_cheetah_process()
pv_status_t pv_cheetah_process(pv_cheetah_t *object, const int16_t *pcm, char **transcript, bool *is_endpoint);
Processes a frame of audio and returns newly transcribed text and a flag indicating whether an endpoint has been detected. Upon detection of an endpoint, the client may invoke pv_cheetah_flush() to retrieve any remaining transcription. The caller is responsible for freeing the transcription buffer.
The number of samples per frame can be obtained by calling pv_cheetah_frame_length(). The incoming audio must have a sample rate equal to pv_sample_rate() and be 16-bit linearly-encoded. Cheetah operates on single-channel audio.
Parameters
object
pv_cheetah_t * : Cheetah object.
pcm
const int16_t * : A frame of audio samples.
transcript
char * * : Inferred transcription.
is_endpoint
bool * : Flag indicating if an endpoint has been detected. If endpoint detection is disabled, set this to NULL.
Returns
- pv_status_t : Status code.
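Because pv_cheetah_process() consumes exactly one frame per call, a capture buffer of arbitrary size has to be cut into frames first. The sketch below shows one way to do that, independent of the SDK. The names frame_fn, feed_frames(), and count_frames() are hypothetical helpers invented for this example. In real use the callback would wrap pv_cheetah_process(), and frame_length would come from pv_cheetah_frame_length().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: invoked once per complete frame.
 * Return 0 to continue, non-zero to stop early. */
typedef int (*frame_fn)(const int16_t *frame, void *ctx);

/* Calls on_frame for each complete frame in pcm and returns the number of
 * samples consumed. Trailing samples that do not fill a frame are not consumed;
 * the caller carries them over into the next buffer. If on_frame returns
 * non-zero, iteration stops and that frame is not counted as consumed. */
size_t feed_frames(const int16_t *pcm, size_t num_samples, int32_t frame_length,
                   frame_fn on_frame, void *ctx) {
    assert(frame_length > 0);
    const size_t step = (size_t) frame_length;
    size_t consumed = 0;
    while (num_samples - consumed >= step) {
        if (on_frame(pcm + consumed, ctx) != 0) {
            break;
        }
        consumed += step;
    }
    return consumed;
}

/* Example callback that only counts frames. A real callback would pass the
 * frame to pv_cheetah_process() and return non-zero on a failed status. */
int count_frames(const int16_t *frame, void *ctx) {
    (void) frame;
    ++*(int *) ctx;
    return 0;
}
```

Returning the consumed sample count lets the caller move the leftover samples to the front of its buffer before appending newly captured audio.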
pv_cheetah_flush()
pv_status_t pv_cheetah_flush(pv_cheetah_t *object, char **transcript);
Marks the end of the audio stream, flushes internal state of the object, and returns any remaining transcript. The caller is responsible for freeing the transcription buffer.
Parameters
object
pv_cheetah_t * : Cheetah object.
transcript
char * * : Inferred transcription.
Returns
- pv_status_t : Status code.
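Since pv_cheetah_process() and pv_cheetah_flush() each return only the newly available text, a caller that wants the full transcript has to concatenate the pieces. The following sketch shows one possible accumulator. The type transcript_buf_t and the functions transcript_append() and transcript_free() are hypothetical and not part of the SDK. The accumulator copies each piece, so the buffer returned by Cheetah can be freed right after appending, in whatever way the SDK version in use prescribes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string that collects partial transcripts. */
typedef struct {
    char *text;    /* NUL-terminated once anything has been appended, else NULL */
    size_t length; /* length in bytes, excluding the terminator */
} transcript_buf_t;

/* Appends a copy of piece to buf. Returns 0 on success, or -1 if memory could
 * not be grown, in which case buf is left unchanged. NULL or empty pieces are
 * ignored. */
int transcript_append(transcript_buf_t *buf, const char *piece) {
    assert(buf != NULL);
    if (piece == NULL || piece[0] == '\0') {
        return 0;
    }
    const size_t piece_length = strlen(piece);
    char *grown = realloc(buf->text, buf->length + piece_length + 1);
    if (grown == NULL) {
        return -1;
    }
    /* Copy the piece together with its terminator onto the end of the text. */
    memcpy(grown + buf->length, piece, piece_length + 1);
    buf->text = grown;
    buf->length += piece_length;
    return 0;
}

/* Releases the accumulated text and resets buf to its empty state. */
void transcript_free(transcript_buf_t *buf) {
    free(buf->text);
    buf->text = NULL;
    buf->length = 0;
}
```

A typical loop would append the transcript from every pv_cheetah_process() call, then append the result of pv_cheetah_flush() once the stream ends or an endpoint is reported.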
pv_cheetah_version()
const char *pv_cheetah_version(void);
Getter for version.
Returns
- const char * : Cheetah version.
pv_cheetah_frame_length()
int32_t pv_cheetah_frame_length(void);
Getter for number of audio samples per frame.
Returns
- int32_t : Frame length.
pv_sample_rate()
int32_t pv_sample_rate(void);
Audio sample rate accepted by Cheetah.
Returns
- int32_t : Sample rate.
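The frame length and sample rate together determine how much audio each pv_cheetah_process() call covers and how large the frame buffer must be. The arithmetic can be sketched with the helpers below, which are hypothetical and take the two values as plain arguments so they work without the SDK. In real use the arguments would come from pv_cheetah_frame_length() and pv_sample_rate(); any concrete numbers plugged in by hand are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Duration of one frame in milliseconds. */
double frame_duration_ms(int32_t frame_length, int32_t sample_rate) {
    assert(frame_length > 0 && sample_rate > 0);
    return 1000.0 * (double) frame_length / (double) sample_rate;
}

/* Bytes needed to hold one frame of 16-bit single-channel samples. */
size_t frame_size_bytes(int32_t frame_length) {
    assert(frame_length > 0);
    return (size_t) frame_length * sizeof(int16_t);
}

/* Number of whole frames needed to cover duration_sec of audio, rounded up. */
int64_t frames_for_duration(double duration_sec, int32_t frame_length,
                            int32_t sample_rate) {
    assert(duration_sec >= 0.0 && frame_length > 0 && sample_rate > 0);
    /* Round to the nearest sample before dividing into frames. */
    const int64_t samples = (int64_t) (duration_sec * sample_rate + 0.5);
    return (samples + frame_length - 1) / frame_length;
}
```

For instance, with a placeholder frame length of 512 samples at a placeholder rate of 16000 Hz, each frame spans 32 ms and occupies 1024 bytes. Query the actual values at runtime instead of hard-coding them.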
pv_status_t
typedef enum {
    PV_STATUS_SUCCESS = 0,
    PV_STATUS_OUT_OF_MEMORY,
    PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT,
    PV_STATUS_STOP_ITERATION,
    PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE,
    PV_STATUS_RUNTIME_ERROR,
    PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED,
    PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;
Status code enum.
pv_status_to_string()
const char *pv_status_to_string(pv_status_t status);
Parameters
status
pv_status_t : Status code.
Returns
- const char * : String representation of status code.