Leopard Speech-to-Text
Python API

API Reference for the Python Leopard SDK (PyPI).

pvleopard.`create()`

def create(
        access_key: str,
        model_path: Optional[str] = None,
        device: Optional[str] = None,
        library_path: Optional[str] = None,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> Leopard

Factory method for Leopard Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path Optional[str] : Absolute path to the file containing model parameters.
device Optional[str] : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
library_path Optional[str] : Absolute path to Leopard's dynamic library.
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

Leopard : An instance of Leopard Speech-to-Text engine.

Throws

LeopardError
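
The device formats described above (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}) can be assembled with a small helper. This is a minimal sketch; device_string is a hypothetical name and is not part of pvleopard.

```python
from typing import Optional


def device_string(kind: str = "best", value: Optional[int] = None) -> str:
    """Build a value for the `device` argument (hypothetical helper).

    `kind` is "best", "gpu", or "cpu". `value` is the GPU index for "gpu"
    or the number of threads for "cpu"; leave it as None for the defaults.
    """
    if kind not in ("best", "gpu", "cpu"):
        raise ValueError(f"unsupported device kind: {kind!r}")
    if value is None:
        return kind
    if kind == "best":
        raise ValueError("'best' does not take an index or thread count")
    if kind == "gpu" and value < 0:
        raise ValueError("GPU index must be non-negative")
    if kind == "cpu" and value < 1:
        raise ValueError("thread count must be at least 1")
    return f"{kind}:{value}"
```

The result can then be passed as the device argument of create(), for example device=device_string("cpu", 4).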

pvleopard.`available_devices()`

def available_devices(library_path: Optional[str] = None) -> Sequence[str]

Lists all available devices that Leopard can use for inference. Each entry in the list can be passed as the device argument of the create() factory method or the Leopard constructor.

Parameters

library_path Optional[str] : Absolute path to Leopard's dynamic library. If not set, the default location is used.

Returns

Sequence[str]: List of all available devices that Leopard can use for inference.

Throws

LeopardError
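
Because each returned entry is a valid device argument, a caller can choose one before creating the engine. The sketch below assumes that entries follow the gpu... and cpu... naming used by the device parameter; pick_device is a hypothetical helper, not an SDK function.

```python
from typing import Sequence


def pick_device(devices: Sequence[str], prefer_gpu: bool = True) -> str:
    """Pick one entry from a device list, falling back to "best".

    Assumes GPU entries start with "gpu" and CPU entries start with "cpu".
    """
    if prefer_gpu:
        for name in devices:
            if name.startswith("gpu"):
                return name
    for name in devices:
        if name.startswith("cpu"):
            return name
    # Nothing matched: let the engine choose the most suitable device.
    return "best"
```

In practice the list would come from pvleopard.available_devices(), and the chosen string would be passed to create() as device.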

pvleopard.Leopard

class Leopard(object)

Class for the Leopard Speech-to-Text engine. Leopard can be initialized either using the module-level create() function or directly using the class __init__() method. When you are done with the engine, release its resources by calling the delete() method.
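
One way to make sure delete() always runs is to wrap the engine in a context manager. This is a sketch under the assumption that the wrapped object exposes delete() as documented here; released is a hypothetical name. contextlib.closing is not suitable because it calls close() rather than delete().

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield `engine`, then call its delete() method, even if an error occurs."""
    try:
        yield engine
    finally:
        engine.delete()
```

A typical use would be `with released(pvleopard.create(access_key=...)) as leopard:` followed by calls to process() or process_file() inside the block.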

pvleopard.Leopard.`version`

self.version: str

The version string of the Leopard library.

pvleopard.Leopard.`sample_rate`

self.sample_rate: int

The audio sample rate that Leopard accepts.

pvleopard.Leopard.`__init__()`

def __init__(
        self,
        access_key: str,
        model_path: str,
        device: str,
        library_path: str,
        enable_automatic_punctuation: bool = False,
        enable_diarization: bool = False) -> Leopard

__init__ method for Leopard Speech-to-Text engine.

Parameters

access_key str : AccessKey obtained from Picovoice Console.
model_path str : Absolute path to the file containing model parameters.
device str : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.
library_path str : Absolute path to Leopard's dynamic library.
enable_automatic_punctuation bool : Set to True to enable automatic punctuation insertion.
enable_diarization bool : Set to True to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speaker_tag to identify unique speakers.

Returns

Leopard: An instance of Leopard Speech-to-Text engine.

Throws

LeopardError

pvleopard.Leopard.`delete()`

def delete(self)

Releases resources acquired by Leopard.

pvleopard.Leopard.Word

Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])

Metadata associated with a transcribed word.

word str : Transcribed word.
start_sec float : Start of word in seconds.
end_sec float : End of word in seconds.
confidence float : Transcription confidence.
speaker_tag int : Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
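
The word metadata can be merged into speaker turns when diarization is enabled. The sketch below defines a stand-in Word with the same fields so that it runs without the SDK; speaker_turns is a hypothetical helper.

```python
from collections import namedtuple
from typing import List, Sequence, Tuple

# Stand-in with the same fields as pvleopard.Leopard.Word (for illustration only).
Word = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'confidence', 'speaker_tag'])


def speaker_turns(words: Sequence[Word]) -> List[Tuple[int, float, float, str]]:
    """Merge consecutive words with the same speaker_tag.

    Returns a list of (speaker_tag, start_sec, end_sec, text) tuples.
    """
    turns: List[Tuple[int, float, float, str]] = []
    for w in words:
        if turns and turns[-1][0] == w.speaker_tag:
            # Same speaker as the previous word: extend the current turn.
            tag, start, _, text = turns[-1]
            turns[-1] = (tag, start, w.end_sec, text + " " + w.word)
        else:
            # New speaker: start a new turn.
            turns.append((w.speaker_tag, w.start_sec, w.end_sec, w.word))
    return turns
```

With diarization disabled every word carries speaker_tag -1, so the whole transcript collapses into a single turn.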

pvleopard.Leopard.`process()`

def process(self, pcm: Sequence[int]) -> Tuple[str, Sequence[Word]]

Processes the given audio data and returns its transcription. The audio must have a sample rate equal to .sample_rate and be 16-bit linearly-encoded. This function operates on single-channel audio. To process data with a different sample rate or format, consider using .process_file().

Parameters

pcm Sequence[int] : Audio data.

Returns

Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

LeopardError
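
The input requirements above (single channel, 16-bit, sample rate equal to .sample_rate) can be checked while loading a WAV file with the standard library. This is a minimal sketch; read_pcm is a hypothetical helper and only handles uncompressed PCM WAV files.

```python
import struct
import wave
from typing import List


def read_pcm(wav_path: str, expected_sample_rate: int) -> List[int]:
    """Read a mono, 16-bit WAV file into a list of integer samples."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit")
        if wav.getframerate() != expected_sample_rate:
            raise ValueError(
                f"expected {expected_sample_rate} Hz, got {wav.getframerate()} Hz")
        frames = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return list(struct.unpack(f"<{len(frames) // 2}h", frames))
```

The returned list could then be passed to process(), with expected_sample_rate taken from the engine's sample_rate attribute.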

pvleopard.Leopard.`process_file()`

def process_file(self, audio_path: str) -> Tuple[str, Sequence[Word]]

Processes a given audio file and returns its transcription. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV, and WebM.

Parameters

audio_path str : Absolute path to the audio file.

Returns

Tuple[str, Sequence[Word]] : Inferred transcription and sequence of transcribed words and their associated metadata.

Throws

LeopardError
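
Since audio_path is documented as an absolute path, a caller holding a relative path may want to resolve it first. The sketch below assumes only that the engine object exposes process_file() as documented; transcribe_file is a hypothetical wrapper.

```python
import os


def transcribe_file(engine, audio_path: str):
    """Resolve `audio_path` to an absolute path and pass it to engine.process_file()."""
    absolute_path = os.path.abspath(os.path.expanduser(audio_path))
    if not os.path.isfile(absolute_path):
        # Fail early with a standard error instead of handing a bad path to the engine.
        raise FileNotFoundError(absolute_path)
    transcript, words = engine.process_file(absolute_path)
    return transcript, words
```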

pvleopard.`list_hardware_devices()`

def list_hardware_devices(library_path: str) -> Sequence[str]

Lists all available devices that Leopard can use for inference. Each entry in the list can be passed as the device argument of the create() factory method or the Leopard constructor.

Internal method. The higher level pvleopard.available_devices() should be used instead.

Parameters

library_path str : Absolute path to Leopard's dynamic library.

Returns

Sequence[str]: List of all available devices that Leopard can use for inference.

Throws

LeopardError

pvleopard.LeopardError

class LeopardError(Exception)

Error thrown if an error occurs within the Leopard Speech-to-Text engine.

Exceptions

class LeopardActivationError(LeopardError)
class LeopardActivationLimitError(LeopardError)
class LeopardActivationRefusedError(LeopardError)
class LeopardActivationThrottledError(LeopardError)
class LeopardIOError(LeopardError)
class LeopardInvalidArgumentError(LeopardError)
class LeopardInvalidStateError(LeopardError)
class LeopardKeyError(LeopardError)
class LeopardMemoryError(LeopardError)
class LeopardRuntimeError(LeopardError)
class LeopardStopIterationError(LeopardError)
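
Because every class above derives from LeopardError, a handler for a specific subclass has to come before the handler for the base class. The sketch below uses stand-in classes that mirror two entries of the hierarchy so that it runs without the SDK; real code would use the SDK's own classes (assumed to be importable from pvleopard). run_safely is a hypothetical helper.

```python
# Stand-ins mirroring two classes from the hierarchy above (for illustration only).
class LeopardError(Exception):
    pass


class LeopardActivationLimitError(LeopardError):
    pass


def run_safely(action):
    """Call `action()` and return (result, error_message)."""
    try:
        return action(), None
    except LeopardActivationLimitError as e:
        # The specific subclass is matched first.
        return None, f"activation limit: {e}"
    except LeopardError as e:
        # Any other Leopard error falls through to the base class.
        return None, f"leopard error: {e}"
```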
