Leopard Speech-to-Text
Go API
API Reference for the Leopard Go SDK (pkg.go.dev)
leopard.SampleRate
Audio sample rate accepted by Leopard.
leopard.Version
Leopard version.
leopard.Leopard
Struct for the Leopard Speech-to-Text engine.
leopard.Leopard.AccessKey
AccessKey obtained from Picovoice Console (https://console.picovoice.ai/).
leopard.Leopard.ModelPath
Absolute path to the file containing model parameters.
leopard.Leopard.LibraryPath
Absolute path to Leopard's dynamic library.
leopard.Leopard.EnableAutomaticPunctuation
Flag to enable automatic punctuation insertion.
leopard.Leopard.EnableDiarization
Flag to enable speaker diarization. Set to true to let Leopard differentiate speakers as part of the transcription process. Word metadata will then include a SpeakerTag that identifies unique speakers.
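As an illustration of how speaker tags in the word metadata might be consumed, the sketch below merges consecutive words with the same tag into speaker turns. The `word` and `turn` types and the `speakerTurns` helper are hypothetical stand-ins defined here so the example runs without the SDK; only the `Word` and `SpeakerTag` field names come from this reference.

```go
package main

import (
	"fmt"
	"strings"
)

// word is a local stand-in holding the two documented LeopardWord fields
// used here; it is not the SDK type itself.
type word struct {
	Word       string
	SpeakerTag int32
}

// turn is a hypothetical helper type: consecutive words from one speaker.
type turn struct {
	SpeakerTag int32
	Text       string
}

// speakerTurns merges consecutive words that share a SpeakerTag into turns.
func speakerTurns(words []word) []turn {
	var turns []turn
	var cur []string
	for i, w := range words {
		// A change of tag closes the turn collected so far.
		if i > 0 && w.SpeakerTag != words[i-1].SpeakerTag {
			turns = append(turns, turn{words[i-1].SpeakerTag, strings.Join(cur, " ")})
			cur = nil
		}
		cur = append(cur, w.Word)
	}
	if len(words) > 0 {
		turns = append(turns, turn{words[len(words)-1].SpeakerTag, strings.Join(cur, " ")})
	}
	return turns
}

func main() {
	// Made-up sample data, not real engine output.
	words := []word{
		{"hello", 1}, {"there", 1}, {"hi", 2}, {"how", 1}, {"are", 1}, {"you", 1},
	}
	for _, t := range speakerTurns(words) {
		fmt.Printf("speaker %d: %s\n", t.SpeakerTag, t.Text)
	}
}
```

With real output, the same loop would run over the word slice returned by Process or ProcessFile after copying or adapting the fields.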
leopard.Leopard.Delete()
Releases resources acquired by Leopard.
Returns
error : Error produced by the Leopard SDK. nil if no error was encountered.
leopard.Leopard.Init()
Initialization function for Leopard. Must be called before attempting to process audio.
Returns
error : Error produced by the Leopard SDK. nil if no error was encountered.
leopard.Leopard.Process()
Processes the given audio data and returns its transcription. The audio must be single-channel, 16-bit linearly-encoded, and have a sample rate equal to leopard.SampleRate. To process audio in a different sample rate or format, consider using ProcessFile.
Returns
string : Transcription for the given audio.
[]LeopardWord : List of words in the transcription and their associated metadata.
error : Error produced by the Leopard SDK. nil if no error was encountered.
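Audio often arrives as raw bytes rather than as 16-bit samples. The sketch below shows one way to decode headerless, single-channel, 16-bit PCM bytes into a sample slice. It rests on two assumptions that this section does not state: that the bytes are little-endian, and that Process takes its audio as a slice of int16 values. The `bytesToPCM` helper is hypothetical and not part of the SDK.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// bytesToPCM decodes raw little-endian 16-bit mono PCM bytes into samples.
// Assumption: the input has no container header (for example, no WAV header).
func bytesToPCM(raw []byte) ([]int16, error) {
	if len(raw)%2 != 0 {
		return nil, errors.New("raw PCM length must be a multiple of 2 bytes")
	}
	pcm := make([]int16, len(raw)/2)
	for i := range pcm {
		// Each sample occupies two bytes, least significant byte first.
		pcm[i] = int16(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return pcm, nil
}

func main() {
	pcm, err := bytesToPCM([]byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// This slice is what would be handed to a Process-style call.
	fmt.Println(pcm)
}
```

The helper does not resample. Audio at a different sample rate still has to be converted first, or passed through ProcessFile instead.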
leopard.Leopard.ProcessFile()
Processes the given audio file and returns its transcription. The file must have a sample rate equal to or greater than leopard.SampleRate. The supported formats are FLAC, MP3, Ogg, Opus, Vorbis, WAV, and WebM.
Returns
string : Transcription for the given audio file.
[]LeopardWord : List of words in the transcription and their associated metadata.
error : Error produced by the Leopard SDK. nil if no error was encountered.
leopard.NewLeopard()
Creates a Leopard struct with default parameters.
Parameters
accessKey string : AccessKey obtained from Picovoice Console (https://console.picovoice.ai/).
Returns
Leopard : An instance of the Leopard struct.
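The methods documented above suggest a lifecycle of create, Init, process, Delete. The sketch below illustrates that ordering with a fake engine so it runs without the SDK. The `transcriber` interface, `fakeEngine`, and `transcribe` are hypothetical. The file-path parameter of ProcessFile is an assumption, and the word-slice return value is left out for brevity, so the real Leopard struct would not satisfy this interface as written.

```go
package main

import (
	"errors"
	"fmt"
)

// transcriber is a simplified, hypothetical view of the documented methods.
type transcriber interface {
	Init() error
	ProcessFile(path string) (string, error)
	Delete() error
}

// fakeEngine stands in for the real engine so the sketch is runnable.
type fakeEngine struct {
	ready bool
}

func (f *fakeEngine) Init() error   { f.ready = true; return nil }
func (f *fakeEngine) Delete() error { f.ready = false; return nil }

func (f *fakeEngine) ProcessFile(path string) (string, error) {
	if !f.ready {
		return "", errors.New("engine not initialized")
	}
	return "transcript of " + path, nil
}

// transcribe runs Init, one ProcessFile call, and Delete, in that order.
func transcribe(e transcriber, path string) (text string, err error) {
	if err = e.Init(); err != nil {
		return "", fmt.Errorf("init: %w", err)
	}
	// Release resources even when processing fails; keep the first error.
	defer func() {
		if derr := e.Delete(); derr != nil && err == nil {
			err = fmt.Errorf("delete: %w", derr)
		}
	}()
	return e.ProcessFile(path)
}

func main() {
	text, err := transcribe(&fakeEngine{}, "audio.wav")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

Deferring Delete right after a successful Init keeps the release step paired with the acquisition, whichever path the function exits through.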
leopard.LeopardWord
Struct which contains a transcribed word and its associated metadata.
Word string : Transcribed word.
StartSec float32 : Start of word in seconds.
EndSec float32 : End of word in seconds.
Confidence float32 : Transcription confidence. It is a number within [0, 1].
SpeakerTag int32 : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
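A common use of the per-word metadata is flagging words the engine was unsure about. The sketch below filters words by confidence and prints their time spans. The `word` struct is a local stand-in that copies the documented field names, `lowConfidence` is a hypothetical helper, and the 0.5 threshold and the sample values are arbitrary choices for illustration.

```go
package main

import "fmt"

// word is a local stand-in for the documented LeopardWord fields used here.
type word struct {
	Word       string
	StartSec   float32
	EndSec     float32
	Confidence float32
}

// lowConfidence returns the words whose confidence is below threshold.
func lowConfidence(words []word, threshold float32) []word {
	var out []word
	for _, w := range words {
		if w.Confidence < threshold {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	// Made-up sample data, not real engine output.
	words := []word{
		{"set", 0.10, 0.32, 0.97},
		{"a", 0.32, 0.40, 0.41},
		{"timer", 0.40, 0.85, 0.93},
	}
	for _, w := range lowConfidence(words, 0.5) {
		fmt.Printf("%q [%.2fs-%.2fs] confidence %.2f\n",
			w.Word, w.StartSec, w.EndSec, w.Confidence)
	}
}
```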
leopard.LeopardError
Custom error type for errors produced from the Leopard Go SDK.
leopard.LeopardError.Error()
Formats the Leopard error into a string.
Returns
string : Formatted error string.
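Because the SDK reports failures through a custom error type, callers may want to tell SDK errors apart from other errors. The sketch below shows the general Go pattern using `errors.As`. The `sdkError` type is a hypothetical stand-in: the fields of the real LeopardError are not listed in this section, and whether it is returned by pointer or by value is an assumption to check against the package source.

```go
package main

import (
	"errors"
	"fmt"
)

// sdkError is a hypothetical stand-in for a custom SDK error type.
type sdkError struct {
	Message string
}

// Error formats the error into a string, satisfying the error interface.
func (e *sdkError) Error() string { return "sdk error: " + e.Message }

// errNotFound is an unrelated error used for comparison.
var errNotFound = errors.New("file not found")

// describe reports whether err is, or wraps, an *sdkError.
func describe(err error) string {
	var se *sdkError
	if errors.As(err, &se) {
		return "SDK failure: " + se.Message
	}
	return "other failure: " + err.Error()
}

func main() {
	wrapped := fmt.Errorf("transcribing: %w", &sdkError{Message: "invalid access key"})
	fmt.Println(describe(wrapped))
	fmt.Println(describe(errNotFound))
}
```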
leopard.PvStatus
Status return codes from the Leopard library. Possible values are: