Leopard Speech-to-Text
C API
API Reference for the Leopard C SDK.
pv_leopard_t
Container representing the Leopard Speech-to-Text engine.
pv_leopard_init()
Creates a Leopard instance. When you are done with the instance, release its resources using pv_leopard_delete().
Parameters
- access_key const char * : AccessKey obtained from Picovoice Console.
- model_path const char * : Absolute path to the file containing model parameters (.pv).
- device char * : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
- enable_automatic_punctuation bool : Set to true to enable automatic punctuation insertion.
- enable_diarization bool : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.
- object pv_leopard_t * * : Constructed instance of Leopard.
Returns
- pv_status_t : Status code.
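The device argument follows a small grammar (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}). As a minimal sketch, the string can be built at run time with snprintf. The helper name format_device is hypothetical and not part of the Leopard SDK; it only formats text and does not check that the device exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Leopard SDK).
 * Writes "<kind>" when n is negative, otherwise "<kind>:<n>", e.g. "cpu:4"
 * or "gpu:0". Returns 0 on success and -1 if the buffer is too small. */
static int format_device(char *out, size_t out_size, const char *kind, int n) {
    assert(out != NULL && kind != NULL);

    int written = (n < 0)
        ? snprintf(out, out_size, "%s", kind)
        : snprintf(out, out_size, "%s:%d", kind, n);

    /* snprintf reports the length it needed; treat truncation as an error. */
    return (written >= 0 && (size_t) written < out_size) ? 0 : -1;
}
```

The resulting buffer would then be passed as the device argument of pv_leopard_init().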
pv_leopard_delete()
Releases resources acquired by Leopard.
Parameters
- object pv_leopard_t * : Picovoice object.
pv_leopard_process()
Processes the given audio data and returns its transcription. The caller is responsible for freeing the transcript
and words buffers using pv_leopard_transcript_delete()
and pv_leopard_words_delete(), respectively. The audio must be single-channel, 16-bit linearly-encoded PCM
with a sample rate equal to pv_sample_rate().
Parameters
- object pv_leopard_t * : Leopard object.
- pcm const int16_t * : Buffer of audio samples.
- num_samples int32_t : Number of audio samples to process.
- transcript char * * : Inferred transcription.
- num_words int32_t * : Number of transcribed words.
- words pv_word_t * * : Transcribed words and their associated metadata.
Returns
- pv_status_t : Status code.
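Because pv_leopard_process() expects single-channel 16-bit audio, stereo recordings need to be downmixed first. The sketch below averages interleaved left/right samples; downmix_stereo_to_mono is a hypothetical helper, not an SDK function, and it assumes the input is already at the rate reported by pv_sample_rate() (it does not resample).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Leopard SDK).
 * stereo holds 2 * num_frames interleaved samples (L, R, L, R, ...).
 * mono must have room for num_frames samples. */
static void downmix_stereo_to_mono(const int16_t *stereo, int32_t num_frames, int16_t *mono) {
    assert(stereo != NULL && mono != NULL && num_frames >= 0);

    for (int32_t i = 0; i < num_frames; i++) {
        /* Widen to 32 bits so the sum of two samples cannot overflow. */
        int32_t sum = (int32_t) stereo[2 * i] + (int32_t) stereo[2 * i + 1];
        mono[i] = (int16_t) (sum / 2);
    }
}
```

The mono buffer and num_frames would then serve as the pcm and num_samples arguments.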
pv_leopard_process_file()
Processes a given audio file and returns its transcription. The caller is responsible for freeing the transcript
and words buffers using pv_leopard_transcript_delete()
and pv_leopard_words_delete(), respectively. The supported formats
are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.
Parameters
- object pv_leopard_t * : Leopard object.
- audio_path const char * : Absolute path to the audio file.
- transcript char * * : Inferred transcription.
- num_words int32_t * : Number of transcribed words.
- words pv_word_t * * : Transcribed words and their associated metadata.
Returns
- pv_status_t : Status code.
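An application may want to reject obviously unsupported files before calling pv_leopard_process_file(). The sketch below checks the file extension against the formats listed above. Both the helper name has_supported_extension and the extension-to-format mapping are assumptions for illustration; Leopard itself decides from the file contents whether it can decode a file, so the status code it returns remains authoritative.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of two NUL-terminated strings. */
static bool equals_ignore_case(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b)) {
            return false;
        }
        a++;
        b++;
    }
    return *a == *b; /* true only if both strings ended together */
}

/* Hypothetical pre-check (not part of the Leopard SDK): true if the path
 * ends in an extension commonly used for one of the listed formats. */
static bool has_supported_extension(const char *audio_path) {
    /* Assumed mapping from the documented formats to file extensions. */
    static const char *const EXTENSIONS[] = {
        "3gp", "flac", "mp3", "mp4", "m4a", "ogg", "wav", "webm"
    };
    assert(audio_path != NULL);

    const char *dot = strrchr(audio_path, '.');
    if (dot == NULL) {
        return false;
    }
    for (size_t i = 0; i < sizeof(EXTENSIONS) / sizeof(EXTENSIONS[0]); i++) {
        if (equals_ignore_case(dot + 1, EXTENSIONS[i])) {
            return true;
        }
    }
    return false;
}
```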
pv_leopard_transcript_delete()
Deletes transcript returned from pv_leopard_process() or pv_leopard_process_file().
Parameters
- transcript char * : Transcription string returned from pv_leopard_process() or pv_leopard_process_file().
pv_leopard_words_delete()
Deletes words returned from pv_leopard_process() or pv_leopard_process_file().
Parameters
- words pv_word_t * * : Transcribed words returned from pv_leopard_process() or pv_leopard_process_file().
pv_leopard_version()
Getter for version.
Returns
- const char * : Leopard version.
pv_sample_rate()
Audio sample rate accepted by Leopard.
Returns
- int32_t : Sample rate.
pv_leopard_list_hardware_devices()
Gets a list of hardware devices that can be specified when calling pv_leopard_init().
Parameters
- hardware_devices const char * * * : Array of available hardware devices. Devices are NULL-terminated strings. The array must be freed using pv_leopard_free_hardware_devices().
- num_hardware_devices int32_t * : The number of devices in the hardware_devices array.
Returns
- pv_status_t : Returned status code.
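Once the device list has been retrieved, an application has to choose one entry to hand to pv_leopard_init(). The sketch below prefers the first entry that starts with gpu and otherwise falls back to best. The helper name pick_device is hypothetical, and the assumption that GPU entries in the list begin with the text gpu is inferred from the device string format accepted by pv_leopard_init(), not stated by this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the Leopard SDK).
 * Returns the first device whose name starts with "gpu", or "best" if there
 * is none. The returned pointer refers into the devices array when a GPU is
 * found, so use or copy it before the array is freed. */
static const char *pick_device(const char *const *devices, int32_t num_devices) {
    assert(num_devices == 0 || devices != NULL);

    for (int32_t i = 0; i < num_devices; i++) {
        if (devices[i] != NULL && strncmp(devices[i], "gpu", 3) == 0) {
            return devices[i];
        }
    }
    return "best";
}
```

Since the result may point into the list, the call to pv_leopard_init() should come before pv_leopard_free_hardware_devices().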
pv_leopard_free_hardware_devices()
This function frees the memory allocated by pv_leopard_list_hardware_devices().
Parameters
- hardware_devices const char * * * : Array of available hardware devices allocated by pv_leopard_list_hardware_devices().
- num_hardware_devices int32_t * : The number of devices in the hardware_devices array.
pv_word_t
Struct for a transcribed word and its associated metadata.
pv_status_t
Status code enum.
pv_status_to_string()
Parameters
- status int32_t : Status code.
Returns
- const char * : String representation of status code.
pv_get_error_stack()
If a function returns a failure (any pv_status_t other than PV_STATUS_SUCCESS), this function can be
called to get a series of error messages related to the failure. It can be called only once per
failure status returned by another function. The memory for message_stack must be freed using pv_free_error_stack().
Regardless of the return status of this function, if message_stack is not NULL, then message_stack
contains valid memory. However, a failure status on this function indicates that future error messages
may not be reported.
Parameters
- message_stack const char * * * : Array of messages relating to the failure. Messages are NULL-terminated strings. The array and messages must be freed using pv_free_error_stack().
- message_stack_depth int32_t * : The number of messages in the message_stack array.
Returns
- pv_status_t : Returned status code.
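The message stack is a plain array of strings plus a depth, so it can be flattened into one log entry before it is freed. The following is a minimal sketch; format_error_stack is a hypothetical helper, not an SDK function, and it simply lists the messages in index order, one per line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Leopard SDK).
 * Writes each message as "[index] text\n" into out and returns the number of
 * characters written (excluding the terminating NUL). Stops before the first
 * message that would not fit completely. */
static size_t format_error_stack(
        char *out,
        size_t out_size,
        const char *const *message_stack,
        int32_t message_stack_depth) {
    assert(out != NULL && out_size > 0);

    size_t used = 0;
    out[0] = '\0';
    for (int32_t i = 0; i < message_stack_depth; i++) {
        int n = snprintf(out + used, out_size - used, "[%d] %s\n", (int) i, message_stack[i]);
        if (n < 0 || (size_t) n >= out_size - used) {
            out[used] = '\0'; /* discard the partially written line */
            break;
        }
        used += (size_t) n;
    }
    return used;
}
```

After logging the formatted text, the stack would still need to be released with pv_free_error_stack().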
pv_free_error_stack()
This function frees the memory used by error messages allocated by pv_get_error_stack().
Parameters
- message_stack const char * * * : Array of messages relating to the failure.