Leopard Speech-to-Text
.NET API
API Reference for the .NET Leopard SDK (NuGet)
namespace: Pv
Leopard
Class for the Leopard Speech-to-Text engine.
Leopard.Create()
Factory method that creates an instance of the Leopard Speech-to-Text engine.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
modelPath string : Absolute path to the file containing model parameters (.pv).
enableAutomaticPunctuation bool : Whether to enable automatic punctuation.
enableDiarization bool : Whether to enable speaker diarization. Set to true to have Leopard differentiate speakers as part of the transcription process. Word metadata will then include a speaker tag (see LeopardWord.SpeakerTag) that identifies unique speakers.
Returns
Leopard: An instance of the Leopard Speech-to-Text engine.
Throws
LeopardException: If an error occurs while creating an instance of the Leopard Speech-to-Text engine.
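A minimal sketch of creating an engine instance, assuming the Pv NuGet package is referenced. The environment variable name PICOVOICE_ACCESS_KEY, the ReadAccessKey helper, and the model path are hypothetical placeholders. Releasing the instance with a using block assumes that Leopard implements IDisposable, which this reference does not state.

```csharp
using System;
using Pv;

public static class CreateExample
{
    // Hypothetical helper: reads the AccessKey from an environment variable
    // so that it does not have to be hard-coded in source.
    public static string ReadAccessKey(string variableName)
    {
        string value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable {variableName} is not set.");
        }
        return value.Trim();
    }

    public static void Main()
    {
        // Placeholder (assumption): point this at your own .pv model file.
        const string modelPath = "/absolute/path/to/leopard_params.pv";

        try
        {
            // Assumption: Leopard implements IDisposable, so `using` releases it.
            using (Leopard leopard = Leopard.Create(
                ReadAccessKey("PICOVOICE_ACCESS_KEY"),
                modelPath,
                enableAutomaticPunctuation: true,
                enableDiarization: true))
            {
                Console.WriteLine($"Leopard version: {leopard.Version}");
                Console.WriteLine($"Expected sample rate: {leopard.SampleRate} Hz");
            }
        }
        catch (LeopardException e)
        {
            // Create() reports initialization failures through LeopardException.
            Console.Error.WriteLine($"Could not create Leopard: {e.Message}");
        }
    }
}
```

Keeping the AccessKey outside the source file is a design choice of this sketch, not a requirement of the SDK.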
Leopard.Process()
Processes the given audio data and returns its transcription. The incoming audio must have a sample rate equal
to .SampleRate and be 16-bit linearly-encoded. Leopard operates on
single-channel audio. To process audio in a different sample rate or format, consider
using .ProcessFile().
Parameters
pcm short[] : Audio data.
Returns
LeopardTranscript: object which contains the transcription results of the engine.
Throws
LeopardException: if there is an error while processing the audio frame.
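The input requirements above can be illustrated with a short sketch that turns raw bytes into the short[] buffer Process() expects. It assumes a hypothetical headerless file of mono, little-endian, 16-bit PCM that was already recorded at the engine's sample rate. The BytesToPcm helper, the file paths, and the AccessKey are placeholders, and the using block assumes Leopard implements IDisposable.

```csharp
using System;
using System.IO;
using Pv;

public static class ProcessPcmExample
{
    // Converts little-endian 16-bit PCM bytes into signed 16-bit samples.
    // A trailing odd byte, if any, is ignored.
    public static short[] BytesToPcm(byte[] raw)
    {
        short[] pcm = new short[raw.Length / 2];
        for (int i = 0; i < pcm.Length; i++)
        {
            pcm[i] = (short)(raw[2 * i] | (raw[2 * i + 1] << 8));
        }
        return pcm;
    }

    public static void Main()
    {
        // Hypothetical input: headerless, mono, 16-bit PCM at leopard.SampleRate.
        byte[] raw = File.ReadAllBytes("/absolute/path/to/audio.raw");

        // Placeholders (assumptions): AccessKey and model path.
        using (Leopard leopard = Leopard.Create(
            "YOUR_ACCESS_KEY",
            "/absolute/path/to/leopard_params.pv",
            false,
            false))
        {
            LeopardTranscript result = leopard.Process(BytesToPcm(raw));
            Console.WriteLine(result.TranscriptString);
        }
    }
}
```

Audio that is stereo or sampled at a different rate would need to be converted first, or passed to ProcessFile() instead.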
Leopard.ProcessFile()
Processes a given audio file and returns its transcription.
Parameters
audioPath string : Absolute path to the audio file. The supported audio file formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.
Returns
LeopardTranscript: object which contains the transcription results of the engine.
Throws
LeopardException: if there is an error while processing the audio file.
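A hedged sketch of transcribing a file follows. Because audioPath must be absolute, the example resolves the path with Path.GetFullPath before calling ProcessFile(). The extension check is only a convenience of this sketch: the mapping from file extensions to the formats listed above is an assumption, and the engine itself decides whether a file can be decoded. The AccessKey, model path, and default file name are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Pv;

public static class ProcessFileExample
{
    // Assumption: these extensions correspond to the supported formats listed above.
    private static readonly HashSet<string> KnownExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".3gp", ".flac", ".mp3", ".mp4", ".m4a", ".ogg", ".wav", ".webm"
        };

    // Returns true when the file extension looks like one of the listed formats.
    public static bool HasKnownExtension(string path)
    {
        return KnownExtensions.Contains(Path.GetExtension(path));
    }

    public static void Main(string[] args)
    {
        // ProcessFile() expects an absolute path, so resolve relative input first.
        string audioPath = Path.GetFullPath(args.Length > 0 ? args[0] : "audio.wav");
        if (!HasKnownExtension(audioPath))
        {
            Console.Error.WriteLine($"Warning: unrecognized extension for {audioPath}");
        }

        try
        {
            // Placeholders (assumptions): AccessKey and model path.
            using (Leopard leopard = Leopard.Create(
                "YOUR_ACCESS_KEY",
                "/absolute/path/to/leopard_params.pv",
                true,
                false))
            {
                LeopardTranscript result = leopard.ProcessFile(audioPath);
                Console.WriteLine(result.TranscriptString);
            }
        }
        catch (LeopardException e)
        {
            Console.Error.WriteLine($"Transcription failed: {e.Message}");
        }
    }
}
```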
Leopard.SampleRate
Getter for the audio sample rate accepted by Leopard.
Returns
int: Audio sample rate accepted by Leopard.
Leopard.Version
Getter for version.
Returns
string: Current Leopard version.
LeopardTranscript
Class that contains transcription results returned from Leopard.Process()
and Leopard.ProcessFile().
Parameters
transcriptString String : Inferred transcription.
wordArray LeopardWord[] : Transcribed words and their associated metadata.
LeopardTranscript.TranscriptString
Getter for the inferred transcription.
Returns
String: Inferred transcription.
LeopardTranscript.WordArray
Getter for transcribed words and their associated metadata.
Returns
LeopardWord[]: Transcribed words and their associated metadata.
LeopardWord
Class for storing word metadata.
Parameters
word String : Transcribed word.
confidence float : Transcription confidence. It is a number within [0, 1].
startSec float : Start of word in seconds.
endSec float : End of word in seconds.
speakerTag int : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
LeopardWord.Word
Getter for the transcribed word.
Returns
String: Transcribed word.
LeopardWord.Confidence
Getter for the transcription confidence.
Returns
float: Transcription confidence. It is a number within [0, 1].
LeopardWord.StartSec
Getter for the start of word in seconds.
Returns
float: Start of word in seconds.
LeopardWord.EndSec
Getter for the end of word in seconds.
Returns
float: End of word in seconds.
LeopardWord.SpeakerTag
Getter for the speaker tag.
Returns
int: Speaker tag identifying the speaker of the word. It is -1 if diarization is not enabled and 0 for unknown speakers.
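The word metadata getters can be combined as in the sketch below, which prints one line per transcribed word. The DescribeSpeaker helper is hypothetical and only restates the speaker tag convention described for speakerTag. The AccessKey, model path, and audio path are placeholders, and the using block assumes Leopard implements IDisposable.

```csharp
using System;
using Pv;

public static class WordMetadataExample
{
    // Hypothetical helper: maps a speaker tag to a readable label.
    // -1 means diarization was not enabled, 0 is reserved for unknown speakers.
    public static string DescribeSpeaker(int speakerTag)
    {
        if (speakerTag < 0)
        {
            return "diarization disabled";
        }
        if (speakerTag == 0)
        {
            return "unknown speaker";
        }
        return $"speaker {speakerTag}";
    }

    public static void Main()
    {
        // Placeholders (assumptions): AccessKey, model path, and audio path.
        using (Leopard leopard = Leopard.Create(
            "YOUR_ACCESS_KEY",
            "/absolute/path/to/leopard_params.pv",
            true,
            true))
        {
            LeopardTranscript result = leopard.ProcessFile("/absolute/path/to/audio.wav");
            foreach (LeopardWord w in result.WordArray)
            {
                Console.WriteLine(
                    $"[{w.StartSec:F2}s - {w.EndSec:F2}s] {w.Word} " +
                    $"(confidence {w.Confidence:F2}, {DescribeSpeaker(w.SpeakerTag)})");
            }
        }
    }
}
```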
LeopardException
Exception thrown if an error occurs within the Leopard Speech-to-Text engine.
Exceptions: