Leopard Speech-to-Text — C API
API Reference for the Leopard C SDK.
pv_leopard_t
Container representing the Leopard Speech-to-Text engine.
pv_leopard_init()
Create a Leopard instance. When you are done with the instance, release its resources with pv_leopard_delete().
Parameters
access_key
const char * : AccessKey obtained from Picovoice Console.
model_path
const char * : Absolute path to the file containing model parameters (.pv).
enable_automatic_punctuation
bool : Set to true to enable automatic punctuation insertion.
object
pv_leopard_t * * : Constructed instance of Leopard.
Returns
- pv_status_t : Status code.
pv_leopard_delete()
Releases resources acquired by Leopard.
Parameters
object
pv_leopard_t * : Leopard object.
pv_leopard_process()
Processes the given audio data and returns its transcription. The caller is responsible for freeing the transcription buffers. The audio must be single-channel, 16-bit linearly-encoded, and have a sample rate equal to pv_sample_rate().
Parameters
object
pv_leopard_t * : Leopard object.
pcm
const int16_t * : A frame of audio samples.
num_samples
int32_t : Number of audio samples to process.
transcript
char * * : Inferred transcription.
num_words
int32_t * : Number of transcribed words.
words
pv_word_t * * : Transcribed words and their associated metadata.
Returns
- pv_status_t : Status code.
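pv_leopard_process() expects single-channel, 16-bit audio, so stereo input has to be reduced to one channel before the call. The following is a minimal sketch of one common approach, averaging the left and right channels. The helper name downmix_stereo_to_mono is hypothetical and not part of the Leopard SDK, and resampling to pv_sample_rate() is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Leopard SDK).
 * Averages interleaved stereo frames [L0, R0, L1, R1, ...] into a mono
 * buffer. `stereo` holds 2 * num_frames samples; `mono` must have room
 * for num_frames samples. */
void downmix_stereo_to_mono(const int16_t *stereo, int32_t num_frames, int16_t *mono) {
    assert(stereo != NULL && mono != NULL && num_frames >= 0);
    for (int32_t i = 0; i < num_frames; i++) {
        /* Widen to 32 bits so the sum of two int16_t values cannot overflow. */
        int32_t sum = (int32_t) stereo[2 * i] + (int32_t) stereo[2 * i + 1];
        mono[i] = (int16_t) (sum / 2);
    }
}
```

The resulting mono buffer and num_frames would then be passed as pcm and num_samples.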
pv_leopard_process_file()
Processes a given audio file and returns its transcription. The caller is responsible for freeing the transcription buffers. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV, and WebM.
Parameters
object
pv_leopard_t * : Leopard object.
audio_path
const char * : Absolute path to the audio file.
transcript
char * * : Inferred transcription.
num_words
int32_t * : Number of transcribed words.
words
pv_word_t * * : Transcribed words and their associated metadata.
Returns
- pv_status_t : Status code.
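The call order for transcribing a file (create, process, free buffers, delete) can be sketched as below. This is an illustration under several assumptions, not the SDK's own sample code: the functions are reached through a function table (as they would be when loading the library dynamically), a status of 0 is assumed to mean success, and the free_transcript and free_words hooks are hypothetical placeholders because this reference does not name the routine used to free the buffers. The type stand-ins and the stub backend exist only so the sketch compiles and runs without the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch compiles without the SDK headers (assumptions). */
typedef struct pv_leopard pv_leopard_t;
typedef struct pv_word pv_word_t;
typedef int pv_status_t; /* assumption: 0 means success */

/* Function table; signatures follow the parameter lists in this reference. */
typedef struct {
    pv_status_t (*init)(const char *access_key, const char *model_path,
                        bool enable_automatic_punctuation, pv_leopard_t **object);
    void (*delete_fn)(pv_leopard_t *object);
    pv_status_t (*process_file)(pv_leopard_t *object, const char *audio_path,
                                char **transcript, int32_t *num_words, pv_word_t **words);
    const char *(*status_to_string)(int32_t status);
    /* Hypothetical hooks: this reference does not say how buffers are freed. */
    void (*free_transcript)(char *transcript);
    void (*free_words)(pv_word_t *words);
} leopard_api_t;

/* Transcribes one file into `out`; returns 0 on success, -1 on failure. */
int transcribe_file(const leopard_api_t *api, const char *access_key,
                    const char *model_path, const char *audio_path,
                    char *out, size_t out_size) {
    assert(api != NULL && out != NULL && out_size > 0);

    pv_leopard_t *leopard = NULL;
    pv_status_t status = api->init(access_key, model_path, true, &leopard);
    if (status != 0) {
        fprintf(stderr, "init failed: %s\n", api->status_to_string(status));
        return -1;
    }

    char *transcript = NULL;
    int32_t num_words = 0;
    pv_word_t *words = NULL;
    int result = -1;
    status = api->process_file(leopard, audio_path, &transcript, &num_words, &words);
    if (status == 0) {
        snprintf(out, out_size, "%s", transcript); /* copy, truncating if needed */
        api->free_transcript(transcript);          /* caller owns both buffers */
        api->free_words(words);
        result = 0;
    } else {
        fprintf(stderr, "process_file failed: %s\n", api->status_to_string(status));
    }

    api->delete_fn(leopard); /* release the instance on every path after init */
    return result;
}

/* Stub backend: exists only so the sketch runs without the SDK. */
struct pv_leopard { int unused; };
static int stub_live_instances = 0;

static pv_status_t stub_init(const char *key, const char *model, bool punct,
                             pv_leopard_t **object) {
    (void) key; (void) model; (void) punct;
    *object = malloc(sizeof(**object));
    if (*object == NULL) { return 1; }
    stub_live_instances++;
    return 0;
}
static void stub_delete(pv_leopard_t *object) { free(object); stub_live_instances--; }
static pv_status_t stub_process_file(pv_leopard_t *object, const char *path,
                                     char **transcript, int32_t *num_words,
                                     pv_word_t **words) {
    (void) object; (void) path;
    static const char text[] = "hello world";
    *transcript = malloc(sizeof(text));
    if (*transcript == NULL) { return 1; }
    memcpy(*transcript, text, sizeof(text));
    *num_words = 2;
    *words = NULL;
    return 0;
}
static const char *stub_status_to_string(int32_t status) {
    return status == 0 ? "SUCCESS" : "ERROR";
}
static void stub_free_transcript(char *transcript) { free(transcript); }
static void stub_free_words(pv_word_t *words) { (void) words; }

static const leopard_api_t STUB_API = {
    stub_init, stub_delete, stub_process_file,
    stub_status_to_string, stub_free_transcript, stub_free_words,
};
```

With the real SDK, the table would be filled with pv_leopard_init, pv_leopard_delete, pv_leopard_process_file, and pv_status_to_string, plus whichever release routines your SDK version documents. The point of the design is that pv_leopard_delete() runs on every path once initialization has succeeded.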
pv_leopard_version()
Getter for version.
Returns
- const char * : Leopard version.
pv_sample_rate()
Audio sample rate accepted by Leopard.
Returns
- int32_t : Sample rate.
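The sample rate ties the num_samples argument of pv_leopard_process() to audio duration. A minimal sketch of that arithmetic follows; both helper names are hypothetical and not part of the SDK, and the sample rate is passed in as a parameter so that the value returned by pv_sample_rate() can be supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: seconds of audio held in num_samples samples. */
double pcm_duration_seconds(int32_t num_samples, int32_t sample_rate) {
    assert(num_samples >= 0 && sample_rate > 0);
    return (double) num_samples / (double) sample_rate;
}

/* Hypothetical helper: number of samples covering duration_ms milliseconds.
 * The product is computed in 64 bits to avoid overflow for long durations. */
int32_t samples_for_duration_ms(int32_t duration_ms, int32_t sample_rate) {
    assert(duration_ms >= 0 && sample_rate > 0);
    return (int32_t) (((int64_t) duration_ms * sample_rate) / 1000);
}
```

For example, with a sample rate of 16000 (an illustrative value only; query pv_sample_rate() for the actual one), samples_for_duration_ms(500, 16000) gives 8000 samples.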
pv_word_t
Struct for a transcribed word and its associated metadata.
pv_status_t
Status code enum.
pv_status_to_string()
Parameters
status
int32_t : Status code.
Returns
- const char * : String representation of status code.