Leopard Speech-to-Text
Python API

API Reference for the Python Leopard SDK (PyPI).


pvleopard.create()

def create(
        access_key: str,
        model_path: Optional[str] = None,
        library_path: Optional[str] = None,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> Leopard

Factory method for the Leopard Speech-to-Text engine.

Parameters

  • access_key str : AccessKey obtained from Picovoice Console.
  • model_path Optional[str] : Absolute path to the file containing model parameters.
  • library_path Optional[str] : Absolute path to Leopard's dynamic library.
  • enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
  • enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

  • Leopard : An instance of Leopard Speech-to-Text engine.

Throws

  • LeopardError

pvleopard.Leopard

class Leopard(object)

Class for the Leopard Speech-to-Text engine. Leopard can be initialized either with the module-level create() function or directly with the class __init__() method. When you are done with an instance, release its resources by calling the delete() method.


pvleopard.Leopard.version

self.version: str

The version string of the Leopard library.


pvleopard.Leopard.sample_rate

self.sample_rate: int

The audio sample rate that Leopard accepts.


pvleopard.Leopard.__init__()

def __init__(
        self,
        access_key: str,
        model_path: str,
        library_path: str,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> None

Constructor for the Leopard Speech-to-Text engine.

Parameters

  • access_key str : AccessKey obtained from Picovoice Console.
  • model_path str : Absolute path to the file containing model parameters.
  • library_path str : Absolute path to Leopard's dynamic library.
  • enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
  • enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

  • Leopard: An instance of Leopard Speech-to-Text engine.

Throws

  • LeopardError

pvleopard.Leopard.delete()

def delete(self)

Releases resources acquired by Leopard.


pvleopard.Leopard.Word

Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])

Metadata associated with a transcribed word.

  • word str : Transcribed word.
  • start_sec float : Start of word in seconds.
  • end_sec float : End of word in seconds.
  • confidence float : Transcription confidence.
  • speaker_tag int : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
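
As an illustration of how the speaker_tag field can be used, the sketch below merges consecutive words from the same speaker into turns. The Word tuple here is a local stand-in with the same fields as the one documented above so the example runs without the SDK. The speaker_turns helper and the sample values are hypothetical and are not real engine output.

```python
from collections import namedtuple
from itertools import groupby

# Local stand-in with the same fields as pvleopard.Leopard.Word.
Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])


def speaker_turns(words):
    """Merge consecutive words that share a speaker_tag into turns.

    Returns a list of (speaker_tag, start_sec, end_sec, text) tuples.
    """
    turns = []
    for tag, group in groupby(words, key=lambda w: w.speaker_tag):
        group = list(group)
        text = " ".join(w.word for w in group)
        turns.append((tag, group[0].start_sec, group[-1].end_sec, text))
    return turns


# Illustrative values only.
sample = [
    Word("hello", 0.0, 0.4, 0.98, 1),
    Word("there", 0.5, 0.9, 0.95, 1),
    Word("hi", 1.2, 1.4, 0.91, 2),
]
turns = speaker_turns(sample)
```

If diarization was not enabled, every word carries a speaker_tag of -1 and the function returns a single turn covering the whole transcript.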

pvleopard.Leopard.process()

def process(self, pcm: Sequence[int]) -> Tuple[str, Sequence[Word]]

Processes the given audio data and returns its transcription. The audio must have a sample rate equal to .sample_rate, be 16-bit linearly encoded, and be single-channel. To process audio in a different sample rate or format, consider using .process_file().

Parameters

  • pcm Sequence[int] : Audio data.

Returns

  • Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

  • LeopardError
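
One way to obtain a suitable pcm sequence is to read a WAV file with the standard library and check it against the requirements above before calling process(). This is a sketch under the assumption that the input is an uncompressed PCM WAV file. The read_pcm helper is a hypothetical name, not part of pvleopard.

```python
import struct
import wave


def read_pcm(wav_path, expected_sample_rate):
    """Read a WAV file into a list of 16-bit samples suitable for process().

    Raises ValueError if the file is not single-channel, not 16-bit, or not
    at the expected sample rate.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit")
        if wav.getframerate() != expected_sample_rate:
            raise ValueError("sample rate does not match the engine")
        frames = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))


# With an initialized engine named `leopard` (assumed), usage would be:
#   pcm = read_pcm("/absolute/path/to/audio.wav", leopard.sample_rate)
#   transcript, words = leopard.process(pcm)
```

Passing leopard.sample_rate rather than a hard-coded number keeps the check tied to whatever rate the engine reports.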

pvleopard.Leopard.process_file()

def process_file(self, audio_path: str) -> Tuple[str, Sequence[Word]]

Processes a given audio file and returns its transcription. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV, and WebM.

Parameters

  • audio_path str : Absolute path to the audio file.

Returns

  • Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

  • LeopardError
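
The word metadata returned alongside the transcript can be used to flag passages for manual review. The sketch below assumes that a higher confidence value means a more certain transcription. The flag_uncertain helper and the 0.5 threshold are arbitrary choices for illustration, and the Word tuple is a local stand-in with the documented fields.

```python
from collections import namedtuple

# Local stand-in with the same fields as pvleopard.Leopard.Word.
Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])


def flag_uncertain(words, threshold=0.5):
    """Return (word, start_sec, end_sec) for words below the threshold."""
    return [(w.word, w.start_sec, w.end_sec)
            for w in words if w.confidence < threshold]


# With an initialized engine named `leopard` (assumed), usage would be:
#   transcript, words = leopard.process_file("/absolute/path/to/audio.mp3")
#   for word, start, end in flag_uncertain(words):
#       print("%s [%.2fs to %.2fs]" % (word, start, end))

# Illustrative values only, not real engine output.
sample = [
    Word("the", 0.0, 0.1, 0.99, -1),
    Word("quick", 0.1, 0.4, 0.42, -1),
]
uncertain = flag_uncertain(sample)
```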

pvleopard.LeopardError

class LeopardError(Exception)

Base error raised when a failure occurs within the Leopard Speech-to-Text engine.

Exceptions

class LeopardActivationError(LeopardError)
class LeopardActivationLimitError(LeopardError)
class LeopardActivationRefusedError(LeopardError)
class LeopardActivationThrottledError(LeopardError)
class LeopardIOError(LeopardError)
class LeopardInvalidArgumentError(LeopardError)
class LeopardInvalidStateError(LeopardError)
class LeopardKeyError(LeopardError)
class LeopardMemoryError(LeopardError)
class LeopardRuntimeError(LeopardError)
class LeopardStopIterationError(LeopardError)

© 2019-2025 Picovoice Inc.