Leopard Speech-to-Text
C API

API Reference for the Leopard C SDK.


pv_leopard_t

typedef struct pv_leopard pv_leopard_t;

Container representing the Leopard Speech-to-Text engine.


pv_leopard_init()

pv_status_t pv_leopard_init(
    const char *access_key,
    const char *model_path,
    const char *device,
    bool enable_automatic_punctuation,
    bool enable_diarization,
    pv_leopard_t **object);

Creates a Leopard instance. When you are done with the instance, release its resources using pv_leopard_delete().

Parameters

  • access_key const char * : AccessKey obtained from Picovoice Console.
  • model_path const char * : Absolute path to the file containing model parameters (.pv).
  • device const char * : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine runs on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
  • enable_automatic_punctuation bool : Set to true to enable automatic punctuation insertion.
  • enable_diarization bool : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.
  • object pv_leopard_t * * : Constructed instance of Leopard.

Returns

  • pv_status_t : Status code.
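
The device string forms described above (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}) can be assembled with a small helper. The following is a minimal sketch: make_device_string() is a hypothetical function that is not part of the Leopard SDK, and rejecting a thread count of 0 is an assumption of this sketch rather than documented behavior.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the Leopard SDK): builds a `device` string.
 * `kind` is "best", "gpu" or "cpu". A negative `n` omits the ":N" suffix;
 * otherwise `n` is the GPU index or the CPU thread count.
 * Returns 0 on success, -1 on invalid arguments or if `buf` is too small. */
static int make_device_string(char *buf, size_t size, const char *kind, int n) {
    if (buf == NULL || size == 0 || kind == NULL) {
        return -1;
    }
    const int is_best = strcmp(kind, "best") == 0;
    const int is_gpu = strcmp(kind, "gpu") == 0;
    const int is_cpu = strcmp(kind, "cpu") == 0;
    if (!is_best && !is_gpu && !is_cpu) {
        return -1;
    }

    int written;
    if (n < 0) {
        /* No suffix requested: "best", "gpu" or "cpu". */
        written = snprintf(buf, size, "%s", kind);
    } else if (is_best || (is_cpu && n == 0)) {
        /* "best" takes no suffix; a thread count of 0 is rejected (assumption). */
        return -1;
    } else {
        /* "gpu:<index>" or "cpu:<threads>". */
        written = snprintf(buf, size, "%s:%d", kind, n);
    }
    return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```

For example, make_device_string(buf, sizeof(buf), "cpu", 4) fills buf with cpu:4, which can then be passed as the device argument of pv_leopard_init().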

pv_leopard_delete()

void pv_leopard_delete(pv_leopard_t *object);

Releases resources acquired by Leopard.

Parameters

  • object pv_leopard_t * : Leopard object.

pv_leopard_process()

pv_status_t pv_leopard_process(
    pv_leopard_t *object,
    const int16_t *pcm,
    int32_t num_samples,
    char **transcript,
    int32_t *num_words,
    pv_word_t **words);

Processes the given audio data and returns its transcription. The caller is responsible for freeing the transcript and words buffers using pv_leopard_transcript_delete() and pv_leopard_words_delete(), respectively. The audio must have a sample rate equal to pv_sample_rate() and be 16-bit linearly-encoded. This function operates on single-channel audio.

Parameters

  • object pv_leopard_t * : Leopard object.
  • pcm const int16_t * : Buffer of audio samples.
  • num_samples int32_t : Number of audio samples to process.
  • transcript char * * : Inferred transcription.
  • num_words int32_t * : Number of transcribed words.
  • words pv_word_t * * : Transcribed words and their associated metadata.

Returns

  • pv_status_t : Status code.
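
Because pv_leopard_process() expects single-channel, 16-bit audio, a stereo recording has to be reduced to one channel before it is passed in. The sketch below shows one way to do that. downmix_stereo_to_mono() is a hypothetical helper, not part of the Leopard SDK, and averaging the two channels is a common choice rather than something the SDK prescribes. Resampling to pv_sample_rate() is not covered here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Leopard SDK): averages interleaved
 * stereo samples (L, R, L, R, ...) into a mono buffer.
 * `stereo` holds 2 * num_frames samples; `mono` must have room for num_frames. */
static void downmix_stereo_to_mono(
        const int16_t *stereo,
        int32_t num_frames,
        int16_t *mono) {
    for (int32_t i = 0; i < num_frames; i++) {
        /* Sum in 32 bits so the addition cannot overflow int16_t. */
        const int32_t sum = (int32_t) stereo[2 * i] + (int32_t) stereo[2 * i + 1];
        /* The average of two int16_t values always fits in int16_t. */
        mono[i] = (int16_t) (sum / 2);
    }
}
```

The resulting mono buffer and num_frames can then be used as the pcm and num_samples arguments.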

pv_leopard_process_file()

pv_status_t pv_leopard_process_file(
    pv_leopard_t *object,
    const char *audio_path,
    char **transcript,
    int32_t *num_words,
    pv_word_t **words);

Processes a given audio file and returns its transcription. The caller is responsible for freeing the transcript and words buffers using pv_leopard_transcript_delete() and pv_leopard_words_delete(), respectively. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.

Parameters

  • object pv_leopard_t * : Leopard object.
  • audio_path const char * : Absolute path to the audio file.
  • transcript char * * : Inferred transcription.
  • num_words int32_t * : Number of transcribed words.
  • words pv_word_t * * : Transcribed words and their associated metadata.

Returns

  • pv_status_t : Status code.

pv_leopard_transcript_delete()

void pv_leopard_transcript_delete(char *transcript);

Deletes transcript returned from pv_leopard_process() or pv_leopard_process_file().

Parameters

  • transcript char * : transcription string returned from pv_leopard_process() or pv_leopard_process_file().

pv_leopard_words_delete()

void pv_leopard_words_delete(pv_word_t *words);

Deletes words returned from pv_leopard_process() or pv_leopard_process_file().

Parameters

  • words pv_word_t * : Transcribed words returned from pv_leopard_process() or pv_leopard_process_file().

pv_leopard_version()

const char *pv_leopard_version(void);

Getter for version.

Returns

  • const char * : Leopard version.

pv_sample_rate()

int32_t pv_sample_rate(void);

Audio sample rate accepted by Leopard.

Returns

  • int32_t : Sample rate.

pv_leopard_list_hardware_devices()

pv_status_t pv_leopard_list_hardware_devices(
    char ***hardware_devices,
    int32_t *num_hardware_devices);

Gets a list of hardware devices that can be specified when calling pv_leopard_init().

Parameters

  • hardware_devices char * * * : Array of available hardware devices. Devices are null-terminated strings. The array must be freed using pv_leopard_free_hardware_devices().
  • num_hardware_devices int32_t * : The number of devices in the hardware_devices array.

Returns

  • pv_status_t : Returned status code.
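
Once the list has been retrieved, an application can choose one of its entries to pass to pv_leopard_init(). The sketch below assumes that the entries use the same string forms as the device argument (for example gpu:0 or cpu); the exact contents of the list are not specified here, so treat the prefix check as an assumption. pick_device() is a hypothetical helper and not part of the Leopard SDK.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the Leopard SDK): returns the first entry
 * that starts with "gpu" (assumed naming), or "best" if there is none so that
 * the engine selects a device automatically. */
static const char *pick_device(char *const *devices, int32_t num_devices) {
    if (devices != NULL) {
        for (int32_t i = 0; i < num_devices; i++) {
            if (devices[i] != NULL && strncmp(devices[i], "gpu", 3) == 0) {
                return devices[i];
            }
        }
    }
    return "best";
}
```

The returned pointer may refer to memory owned by the device list, so it should be used (or copied) before the list is released with pv_leopard_free_hardware_devices().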

pv_leopard_free_hardware_devices()

void pv_leopard_free_hardware_devices(
    char ***hardware_devices,
    int32_t *num_hardware_devices);

This function frees the memory allocated by pv_leopard_list_hardware_devices().

Parameters

  • hardware_devices char * * * : Array of available hardware devices allocated by pv_leopard_list_hardware_devices().
  • num_hardware_devices int32_t * : The number of devices in the hardware_devices array.

pv_word_t

typedef struct {
    const char *word; /** Transcribed word. */
    float start_sec; /** Start of word in seconds. */
    float end_sec; /** End of word in seconds. */
    float confidence; /** Transcription confidence. It is a number within [0, 1]. */
    int32_t speaker_tag; /** The speaker tag is `-1` if diarization is not enabled during initialization;
                          * otherwise, it's a non-negative integer identifying unique speakers, with `0` reserved for
                          * unknown speakers.
                          */
} pv_word_t;

Struct for a transcribed word and its associated metadata.
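
When diarization is enabled, the speaker_tag field can be used to group consecutive words into speaker turns. The following is a minimal sketch of that idea. print_speaker_turns() is a hypothetical helper, not part of the Leopard SDK, and the struct is redeclared locally only so that the sketch compiles without the SDK header.

```c
#include <stdint.h>
#include <stdio.h>

/* Local copy of the struct documented above, so this sketch compiles on its
 * own. In a real project, include the Leopard header instead. */
typedef struct {
    const char *word;
    float start_sec;
    float end_sec;
    float confidence;
    int32_t speaker_tag;
} pv_word_t;

/* Hypothetical helper: prints one line per run of consecutive words that share
 * a speaker_tag and returns the number of such runs (speaker turns). */
static int32_t print_speaker_turns(FILE *out, const pv_word_t *words, int32_t num_words) {
    int32_t num_turns = 0;
    for (int32_t i = 0; i < num_words; i++) {
        if (i == 0 || words[i].speaker_tag != words[i - 1].speaker_tag) {
            /* A new turn starts: close the previous line, print the label. */
            if (i > 0) {
                fputc('\n', out);
            }
            fprintf(out, "[%.2fs] speaker %d:", words[i].start_sec, (int) words[i].speaker_tag);
            num_turns++;
        }
        fprintf(out, " %s", words[i].word);
    }
    if (num_words > 0) {
        fputc('\n', out);
    }
    return num_turns;
}
```

The words and num_words values returned by pv_leopard_process() or pv_leopard_process_file() can be passed to this helper directly, before the words buffer is released with pv_leopard_words_delete().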


pv_status_t

typedef enum {
    PV_STATUS_SUCCESS = 0,
    PV_STATUS_OUT_OF_MEMORY,
    PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT,
    PV_STATUS_STOP_ITERATION,
    PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE,
    PV_STATUS_RUNTIME_ERROR,
    PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED,
    PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;

Status code enum.


pv_status_to_string()

const char *pv_status_to_string(pv_status_t status);

Parameters

  • status pv_status_t : Status code.

Returns

  • const char * : String representation of status code.

pv_get_error_stack()

pv_status_t pv_get_error_stack(
    char ***message_stack,
    int32_t *message_stack_depth);

If a function returns a failure (any pv_status_t other than PV_STATUS_SUCCESS), this function can be called to get a series of error messages related to the failure. This function can be called only once per failure status returned by another function. The memory for message_stack must be freed using pv_free_error_stack().

Regardless of the return status of this function, if message_stack is not NULL, then message_stack contains valid memory. However, a failure status on this function indicates that future error messages may not be reported.

Parameters

  • message_stack char * * * : Array of messages relating to the failure. Messages are null-terminated strings. The array and messages must be freed using pv_free_error_stack().
  • message_stack_depth int32_t * : The number of messages in the message_stack array.

Returns

  • pv_status_t : Returned status code.
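
After a failure, the retrieved messages are typically written to a log. The sketch below formats a message stack of the shape filled in by pv_get_error_stack(). print_error_stack() is a hypothetical helper, not part of the Leopard SDK, and it deliberately makes no SDK calls so that it compiles on its own.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Leopard SDK): writes each message as
 * "  [index] message" and returns the number of messages written.
 * It does not free anything; the caller still owns `message_stack`. */
static int32_t print_error_stack(FILE *out, char *const *message_stack, int32_t depth) {
    if (out == NULL || message_stack == NULL) {
        return 0;
    }
    int32_t printed = 0;
    for (int32_t i = 0; i < depth; i++) {
        /* Skip empty slots defensively rather than passing NULL to printf. */
        if (message_stack[i] != NULL) {
            fprintf(out, "  [%d] %s\n", (int) i, message_stack[i]);
            printed++;
        }
    }
    return printed;
}
```

A typical sequence would be to call pv_get_error_stack(&message_stack, &message_stack_depth) right after the failing call, pass both values to the helper, and then release the memory with pv_free_error_stack(message_stack).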

pv_free_error_stack()

void pv_free_error_stack(char **message_stack);

This function frees the memory used by error messages allocated by pv_get_error_stack().

Parameters

  • message_stack char * * : Array of messages relating to the failure.
