Leopard Speech-to-Text
Python API

API Reference for the Python Leopard SDK (PyPI).

pvleopard.`create()`

def create(
        access_key: str,
        model_path: Optional[str] = None,
        library_path: Optional[str] = None,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> Leopard

Factory method for Leopard Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path Optional[str] : Absolute path to the file containing model parameters.
library_path Optional[str] : Absolute path to Leopard's dynamic library.
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

Leopard : An instance of Leopard Speech-to-Text engine.

Throws

LeopardError

pvleopard.Leopard

class Leopard(object)

Class for the Leopard Speech-to-Text engine. Leopard can be initialized either using the module-level create() function or directly using the class __init__() method. When you are done with an instance, release its resources by calling the delete() method.

pvleopard.Leopard.`version`

self.version: str

The version string of the Leopard library.

pvleopard.Leopard.`sample_rate`

self.sample_rate: int

The audio sample rate that Leopard accepts.

pvleopard.Leopard.`__init__()`

def __init__(
        self,
        access_key: str,
        model_path: str,
        library_path: str,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> Leopard

__init__ method for Leopard Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path str : Absolute path to the file containing model parameters.
library_path str : Absolute path to Leopard's dynamic library.
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

Leopard: An instance of Leopard Speech-to-Text engine.

Throws

LeopardError

pvleopard.Leopard.`delete()`

def delete(self)

Releases resources acquired by Leopard.
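
Because delete() has to be called explicitly, it can help to wrap creation and cleanup together. The following is a minimal sketch, not part of the SDK: open_leopard and its factory parameter are hypothetical names introduced here, and the helper assumes pvleopard.create() and delete() behave as described in this reference.

```python
from contextlib import contextmanager


@contextmanager
def open_leopard(access_key, factory=None, **kwargs):
    """Yield a Leopard instance and always call delete() on exit.

    `factory` is a hypothetical hook for this sketch (useful for testing);
    when omitted it resolves to pvleopard.create.
    """
    if factory is None:
        import pvleopard  # imported lazily so the helper loads without the SDK
        factory = pvleopard.create
    leopard = factory(access_key=access_key, **kwargs)
    try:
        yield leopard
    finally:
        # Release native resources even if the body raised an exception.
        leopard.delete()
```

A typical use would be `with open_leopard(access_key, enable_diarization=True) as leopard:` followed by a call to process() or process_file() inside the block.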

pvleopard.Leopard.Word

Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])

Metadata associated with a transcribed word.

word str : Transcribed word.
start_sec float : Start of word in seconds.
end_sec float : End of word in seconds.
confidence float : Transcription confidence.
speaker_tag int : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
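
When diarization is enabled, the speaker_tag field can be used to fold the word list into speaker turns. The sketch below is illustrative only: group_by_speaker is a hypothetical helper, and the Word namedtuple is redefined locally with the fields documented above so that the example runs without the SDK installed.

```python
from collections import namedtuple

# Local stand-in mirroring the documented fields; real code would use the
# Word objects returned by Leopard.
Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])


def group_by_speaker(words):
    """Merge consecutive words sharing a speaker_tag into
    (speaker_tag, start_sec, end_sec, text) tuples."""
    turns = []
    for w in words:
        if turns and turns[-1][0] == w.speaker_tag:
            # Same speaker as the previous word: extend the current turn.
            tag, start, _, text = turns[-1]
            turns[-1] = (tag, start, w.end_sec, text + " " + w.word)
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append((w.speaker_tag, w.start_sec, w.end_sec, w.word))
    return turns
```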

pvleopard.Leopard.`process()`

def process(self, pcm: Sequence[int]) -> Tuple[str, Sequence[Word]]

Processes the given audio data and returns its transcription. The audio must have a sample rate equal to .sample_rate and be 16-bit linearly-encoded. This function operates on single-channel audio. To process data in a different sample rate or format, consider using .process_file().

Parameters

pcm Sequence[int] : Audio data.

Returns

Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

LeopardError
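
One way to obtain a Sequence[int] in the format described above is to read a mono 16-bit WAV file with the standard library. This is a sketch under stated assumptions: read_pcm is a hypothetical helper, not part of the SDK, and expected_sample_rate is meant to be the value of the instance's sample_rate attribute.

```python
import struct
import wave


def read_pcm(wav_path, expected_sample_rate):
    """Read a mono 16-bit WAV file into a list of ints for process()."""
    with wave.open(wav_path, "rb") as f:
        if f.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        if f.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if f.getframerate() != expected_sample_rate:
            raise ValueError("sample rate must match the engine's sample_rate")
        frames = f.readframes(f.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))
```

The result could then be passed along as `leopard.process(read_pcm(path, leopard.sample_rate))`.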

pvleopard.Leopard.`process_file()`

def process_file(self, audio_path: str) -> Tuple[str, Sequence[Word]]

Processes a given audio file and returns its transcription. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV, and WebM.

Parameters

audio_path str : Absolute path to the audio file.

Returns

Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

LeopardError
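
Since audio_path is documented as an absolute path, a thin wrapper can resolve relative paths before calling process_file(). The sketch below is an illustration only: transcribe_file is a hypothetical helper, and it expects any object exposing process_file() as described in this section.

```python
import os


def transcribe_file(leopard, audio_path):
    """Call process_file() with an absolute path; return (transcript, lines)."""
    # Resolve relative paths first, since the reference asks for an absolute one.
    transcript, words = leopard.process_file(os.path.abspath(audio_path))
    # One human-readable line per word, built from the documented Word fields.
    lines = [
        "%.2f-%.2f %s (confidence %.2f)" % (w.start_sec, w.end_sec, w.word, w.confidence)
        for w in words
    ]
    return transcript, lines
```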

pvleopard.LeopardError

class LeopardError(Exception)

Base error raised when a failure occurs within the Leopard Speech-to-Text engine.

Exceptions

class LeopardActivationError(LeopardError)
class LeopardActivationLimitError(LeopardError)
class LeopardActivationRefusedError(LeopardError)
class LeopardActivationThrottledError(LeopardError)
class LeopardIOError(LeopardError)
class LeopardInvalidArgumentError(LeopardError)
class LeopardInvalidStateError(LeopardError)
class LeopardKeyError(LeopardError)
class LeopardMemoryError(LeopardError)
class LeopardRuntimeError(LeopardError)
class LeopardStopIterationError(LeopardError)
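
Because every exception listed above derives from LeopardError, catching the base class also catches the more specific ones. The following sketch shows that pattern; try_process_file and its base_error parameter are hypothetical names introduced for illustration, with base_error defaulting to pvleopard.LeopardError.

```python
def try_process_file(leopard, audio_path, base_error=None):
    """Return (result, None) on success or (None, error_class_name) on failure."""
    if base_error is None:
        import pvleopard  # imported lazily so the helper loads without the SDK
        base_error = pvleopard.LeopardError
    try:
        return leopard.process_file(audio_path), None
    except base_error as e:
        # Subclasses of the base error (for example LeopardIOError) land here too.
        return None, type(e).__name__
```

Returning the class name keeps the sketch simple; real code might instead log the exception or handle specific subclasses before falling back to the base class.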
