
Leopard Speech-to-Text
C API

API Reference for the Leopard C SDK.


pv_leopard_t

typedef struct pv_leopard pv_leopard_t;

Container representing the Leopard Speech-to-Text engine.


pv_leopard_init()

pv_status_t pv_leopard_init(
    const char *access_key,
    const char *model_path,
    bool enable_automatic_punctuation,
    bool enable_diarization,
    pv_leopard_t **object);

Creates a Leopard instance. When you are done with the instance, release its resources using the pv_leopard_delete() function.

Parameters

  • access_key const char * : AccessKey obtained from Picovoice Console.
  • model_path const char * : Absolute path to the file containing model parameters (.pv).
  • enable_automatic_punctuation bool : Set to true to enable automatic punctuation insertion.
  • enable_diarization bool : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.
  • object pv_leopard_t * * : Constructed instance of Leopard.

Returns

  • pv_status_t : Status code.

pv_leopard_delete()

void pv_leopard_delete(pv_leopard_t *object);

Releases resources acquired by Leopard.

Parameters

  • object pv_leopard_t * : Leopard object.

pv_leopard_process()

pv_status_t pv_leopard_process(
    pv_leopard_t *object,
    const int16_t *pcm,
    int32_t num_samples,
    char **transcript,
    int32_t *num_words,
    pv_word_t **words);

Processes the given audio data and returns its transcription. The caller is responsible for freeing the transcript and words buffers using pv_leopard_transcript_delete() and pv_leopard_words_delete(), respectively. The audio must be single-channel, 16-bit linearly-encoded PCM with a sample rate equal to pv_sample_rate().

Parameters

  • object pv_leopard_t * : Leopard object.
  • pcm const int16_t * : Audio samples to process.
  • num_samples int32_t : Number of audio samples to process.
  • transcript char * * : Inferred transcription.
  • num_words int32_t * : Number of transcribed words.
  • words pv_word_t * * : Transcribed words and their associated metadata.

Returns

  • pv_status_t : Status code.
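
Since pv_leopard_process() operates on single-channel audio, a multi-channel recording has to be reduced to one channel before it is passed in. The following is a minimal sketch of one common way to do that, averaging interleaved stereo samples. The helper name downmix_stereo_to_mono is hypothetical and not part of the Leopard SDK, and the input is assumed to be interleaved left/right 16-bit samples already at the rate reported by pv_sample_rate().

```c
#include <stdint.h>

/* Hypothetical helper (not part of the Leopard SDK): averages interleaved
 * stereo samples (L0, R0, L1, R1, ...) into a mono buffer.
 * `mono` must have room for `num_frames` samples. */
void downmix_stereo_to_mono(
        const int16_t *stereo,
        int32_t num_frames,
        int16_t *mono) {
    for (int32_t i = 0; i < num_frames; i++) {
        /* Sum in 32 bits so that adding two int16_t values cannot overflow. */
        int32_t sum = (int32_t) stereo[2 * i] + (int32_t) stereo[2 * i + 1];
        mono[i] = (int16_t) (sum / 2);
    }
}
```

The resulting mono buffer and num_frames can then be used as the pcm and num_samples arguments.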

pv_leopard_process_file()

pv_status_t pv_leopard_process_file(
    pv_leopard_t *object,
    const char *audio_path,
    char **transcript,
    int32_t *num_words,
    pv_word_t **words);

Processes a given audio file and returns its transcription. The caller is responsible for freeing the transcript and words buffers using pv_leopard_transcript_delete() and pv_leopard_words_delete(), respectively. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.

Parameters

  • object pv_leopard_t * : Leopard object.
  • audio_path const char * : Absolute path to the audio file.
  • transcript char * * : Inferred transcription.
  • num_words int32_t * : Number of transcribed words.
  • words pv_word_t * * : Transcribed words and their associated metadata.

Returns

  • pv_status_t : Status code.

pv_leopard_transcript_delete()

void pv_leopard_transcript_delete(char *transcript);

Deletes transcript returned from pv_leopard_process() or pv_leopard_process_file().

Parameters

  • transcript char * : Transcription string returned from pv_leopard_process() or pv_leopard_process_file().

pv_leopard_words_delete()

void pv_leopard_words_delete(pv_word_t *words);

Deletes words returned from pv_leopard_process() or pv_leopard_process_file().

Parameters

  • words pv_word_t * : Transcribed words returned from pv_leopard_process() or pv_leopard_process_file().
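
The functions described so far are typically used together: create an instance, transcribe, free the transcript and words, then delete the instance. The sketch below illustrates that order under stated assumptions. The wrapper transcribe_file is hypothetical, and the block marked "stand-ins" contains made-up stub versions of the library functions and types so that the sketch builds and runs without the SDK. In a real program, that block would be removed in favor of the SDK header and library, and the stub output has nothing to do with what Leopard returns.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* --- Stand-ins (assumption): replace with the SDK header and library. --- */
typedef enum {
    PV_STATUS_SUCCESS = 0, PV_STATUS_OUT_OF_MEMORY, PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT, PV_STATUS_STOP_ITERATION, PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE, PV_STATUS_RUNTIME_ERROR, PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED, PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;
typedef struct pv_leopard { int unused; } pv_leopard_t;
typedef struct {
    const char *word;
    float start_sec;
    float end_sec;
    float confidence;
    int32_t speaker_tag;
} pv_word_t;

pv_status_t pv_leopard_init(const char *access_key, const char *model_path,
        bool enable_automatic_punctuation, bool enable_diarization,
        pv_leopard_t **object) {
    (void) access_key; (void) model_path;
    (void) enable_automatic_punctuation; (void) enable_diarization;
    *object = calloc(1, sizeof(pv_leopard_t));
    return (*object != NULL) ? PV_STATUS_SUCCESS : PV_STATUS_OUT_OF_MEMORY;
}
void pv_leopard_delete(pv_leopard_t *object) { free(object); }

pv_status_t pv_leopard_process_file(pv_leopard_t *object, const char *audio_path,
        char **transcript, int32_t *num_words, pv_word_t **words) {
    static const char TEXT[] = "hello world"; /* made-up stub output */
    (void) object; (void) audio_path;
    *transcript = malloc(sizeof(TEXT));
    *words = calloc(2, sizeof(pv_word_t));
    if (*transcript == NULL || *words == NULL) {
        free(*transcript);
        free(*words);
        return PV_STATUS_OUT_OF_MEMORY;
    }
    memcpy(*transcript, TEXT, sizeof(TEXT));
    (*words)[0] = (pv_word_t) {"hello", 0.0f, 0.4f, 1.0f, -1};
    (*words)[1] = (pv_word_t) {"world", 0.5f, 0.9f, 1.0f, -1};
    *num_words = 2;
    return PV_STATUS_SUCCESS;
}
void pv_leopard_transcript_delete(char *transcript) { free(transcript); }
void pv_leopard_words_delete(pv_word_t *words) { free(words); }
/* --- End of stand-ins. --------------------------------------------------- */

/* Hypothetical wrapper: transcribes one file, prints the result, and returns
 * the number of words, or -1 on failure. All resources are released. */
int32_t transcribe_file(const char *access_key, const char *model_path,
        const char *audio_path) {
    pv_leopard_t *leopard = NULL;
    pv_status_t status = pv_leopard_init(access_key, model_path, true, false, &leopard);
    if (status != PV_STATUS_SUCCESS) {
        fprintf(stderr, "pv_leopard_init failed with status %d\n", (int) status);
        return -1;
    }

    char *transcript = NULL;
    int32_t num_words = 0;
    pv_word_t *words = NULL;
    status = pv_leopard_process_file(leopard, audio_path, &transcript, &num_words, &words);
    if (status != PV_STATUS_SUCCESS) {
        fprintf(stderr, "pv_leopard_process_file failed with status %d\n", (int) status);
        pv_leopard_delete(leopard); /* the instance still has to be released */
        return -1;
    }

    printf("%s\n", transcript);
    for (int32_t i = 0; i < num_words; i++) {
        printf("%s [%.2f, %.2f]\n", words[i].word, words[i].start_sec, words[i].end_sec);
    }

    /* Free the output buffers with their dedicated functions, then the instance. */
    pv_leopard_transcript_delete(transcript);
    pv_leopard_words_delete(words);
    pv_leopard_delete(leopard);
    return num_words;
}
```

The access key, model path, and audio path are placeholders supplied by the caller; both paths are expected to be absolute, as described in the parameter lists.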

pv_leopard_version()

const char *pv_leopard_version(void);

Getter for version.

Returns

  • const char * : Leopard version.

pv_sample_rate()

int32_t pv_sample_rate(void);

Audio sample rate accepted by Leopard.

Returns

  • int32_t : Sample rate.
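
Because pv_leopard_process() takes its sample count as an int32_t, it can be useful to convert between durations and sample counts explicitly. The following is a small sketch of such conversions. Both helper names are hypothetical, and the sample rate is passed in as a parameter; in a real program it would be the value returned by pv_sample_rate().

```c
#include <stdint.h>

/* Hypothetical helper: number of samples needed to hold `seconds` of audio at
 * `sample_rate` Hz. Returns -1 if the inputs are invalid or the count does not
 * fit in an int32_t (the type of num_samples in pv_leopard_process()). */
int32_t samples_for_duration(int32_t sample_rate, int32_t seconds) {
    if (sample_rate <= 0 || seconds < 0) {
        return -1;
    }
    /* Multiply in 64 bits so the overflow check itself cannot overflow. */
    int64_t count = (int64_t) sample_rate * (int64_t) seconds;
    return (count > INT32_MAX) ? -1 : (int32_t) count;
}

/* Hypothetical helper: duration in seconds of `num_samples` samples at
 * `sample_rate` Hz, or 0.0 if the sample rate is not positive. */
double duration_sec(int32_t sample_rate, int32_t num_samples) {
    return (sample_rate > 0) ? (double) num_samples / (double) sample_rate : 0.0;
}
```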

pv_word_t

typedef struct {
    const char *word; /** Transcribed word. */
    float start_sec; /** Start of word in seconds. */
    float end_sec; /** End of word in seconds. */
    float confidence; /** Transcription confidence. It is a number within [0, 1]. */
    int32_t speaker_tag; /** The speaker tag is `-1` if diarization is not enabled during initialization;
                          * otherwise, it's a non-negative integer identifying unique speakers, with `0` reserved for
                          * unknown speakers.
                          */
} pv_word_t;

Struct for a transcribed word and its associated metadata.
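
When diarization is enabled, the speaker_tag field makes it possible to aggregate word metadata per speaker. The sketch below shows one way this might be done. The struct is copied from the definition above so the example stands alone, and the helper names speaker_talk_time and max_speaker_tag are hypothetical, not part of the Leopard SDK.

```c
#include <stdint.h>

/* Copied from this reference; in a real program it comes from the SDK header. */
typedef struct {
    const char *word;
    float start_sec;
    float end_sec;
    float confidence;
    int32_t speaker_tag;
} pv_word_t;

/* Hypothetical helper: sums the duration (end_sec - start_sec) of every word
 * whose speaker_tag equals `speaker_tag`. */
float speaker_talk_time(const pv_word_t *words, int32_t num_words, int32_t speaker_tag) {
    float total = 0.0f;
    for (int32_t i = 0; i < num_words; i++) {
        if (words[i].speaker_tag == speaker_tag) {
            total += words[i].end_sec - words[i].start_sec;
        }
    }
    return total;
}

/* Hypothetical helper: largest speaker_tag found in `words`. Returns -1 when
 * there are no words or when every tag is -1 (diarization not enabled). */
int32_t max_speaker_tag(const pv_word_t *words, int32_t num_words) {
    int32_t max_tag = -1;
    for (int32_t i = 0; i < num_words; i++) {
        if (words[i].speaker_tag > max_tag) {
            max_tag = words[i].speaker_tag;
        }
    }
    return max_tag;
}
```

Iterating from 1 to the result of max_speaker_tag() and calling speaker_talk_time() for each value gives a rough per-speaker summary, with tag 0 covering words from unknown speakers.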


pv_status_t

typedef enum {
    PV_STATUS_SUCCESS = 0,
    PV_STATUS_OUT_OF_MEMORY,
    PV_STATUS_IO_ERROR,
    PV_STATUS_INVALID_ARGUMENT,
    PV_STATUS_STOP_ITERATION,
    PV_STATUS_KEY_ERROR,
    PV_STATUS_INVALID_STATE,
    PV_STATUS_RUNTIME_ERROR,
    PV_STATUS_ACTIVATION_ERROR,
    PV_STATUS_ACTIVATION_LIMIT_REACHED,
    PV_STATUS_ACTIVATION_THROTTLED,
    PV_STATUS_ACTIVATION_REFUSED
} pv_status_t;

Status code enum.


pv_status_to_string()

const char *pv_status_to_string(pv_status_t status);

Parameters

  • status pv_status_t : Status code.

Returns

  • const char * : String representation of status code.

pv_get_error_stack()

pv_status_t pv_get_error_stack(
    char ***message_stack,
    int32_t *message_stack_depth);

If a function returns a failure (any pv_status_t other than PV_STATUS_SUCCESS), this function can be called to get a series of error messages related to the failure. It can be called only once per failure status returned by another function. The memory for message_stack must be freed using pv_free_error_stack().

Regardless of the return status of this function, if message_stack is not NULL, then message_stack contains valid memory. However, a failure status on this function indicates that future error messages may not be reported.

Parameters

  • message_stack char * * * : Array of messages relating to the failure. Messages are null-terminated strings. The array and messages must be freed using pv_free_error_stack().
  • message_stack_depth int32_t * : The number of messages in the message_stack array.

Returns

  • pv_status_t : Returned status code.

pv_free_error_stack()

void pv_free_error_stack(char **message_stack);

This function frees the memory used by error messages allocated by pv_get_error_stack().

Parameters

  • message_stack char * * : Array of messages relating to the failure.
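
Once an error stack has been retrieved, the messages are usually logged before the stack is freed. The sketch below shows one possible way to print them. The helper name print_error_stack is hypothetical and not part of the SDK, and the numbering format is an arbitrary choice.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes one numbered line per message to `out` and
 * returns the number of lines written. NULL entries are skipped. */
int32_t print_error_stack(FILE *out, char **message_stack, int32_t message_stack_depth) {
    int32_t written = 0;
    if (out == NULL || message_stack == NULL) {
        return 0;
    }
    for (int32_t i = 0; i < message_stack_depth; i++) {
        if (message_stack[i] == NULL) {
            continue;
        }
        fprintf(out, "  [%d] %s\n", (int) i, message_stack[i]);
        written++;
    }
    return written;
}
```

In a real program, the arguments would be the values filled in by pv_get_error_stack(), and the stack would be released with pv_free_error_stack() after printing.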

© 2019-2025 Picovoice Inc.