Leopard Speech-to-Text
.NET API
API Reference for the .NET Leopard SDK (NuGet)
namespace: Pv
Leopard
Class for the Leopard Speech-to-Text engine.
Leopard.Create()
Leopard constructor.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
modelPath
string : Absolute path to the file containing model parameters (.pv).
enableAutomaticPunctuation
bool : Whether to enable automatic punctuation.
enableDiarization
bool : Whether to enable speaker diarization. Set to true to have Leopard differentiate speakers as part of the transcription process. Word metadata will then include a speaker tag (see LeopardWord.SpeakerTag) that identifies unique speakers.
Returns
Leopard
: An instance of Leopard Speech-To-Text engine.
Throws
LeopardException
: If an error occurs while creating an instance of the Leopard Speech-to-Text engine.
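The following is a minimal sketch of creating an engine instance with the parameters listed above. The AccessKey placeholder and the model path are hypothetical values, and the using block assumes that Leopard implements IDisposable, which this reference does not state.

```csharp
using System;
using Pv;

class CreateExample
{
    static void Main()
    {
        // Placeholder: replace with the AccessKey obtained from Picovoice Console.
        const string accessKey = "${ACCESS_KEY}";
        // Hypothetical path: replace with the absolute path to your .pv model file.
        const string modelPath = "/path/to/leopard_params.pv";

        try
        {
            // Assumption: Leopard implements IDisposable, so `using` releases it.
            using (Leopard leopard = Leopard.Create(
                accessKey: accessKey,
                modelPath: modelPath,
                enableAutomaticPunctuation: true,
                enableDiarization: true))
            {
                Console.WriteLine($"Leopard version: {leopard.Version}");
                Console.WriteLine($"Expected sample rate: {leopard.SampleRate} Hz");
            }
        }
        catch (LeopardException e)
        {
            // Create() throws LeopardException when initialization fails.
            Console.Error.WriteLine($"Failed to create Leopard: {e.Message}");
        }
    }
}
```

Named arguments are used so the call does not depend on parameter order; they rely on the parameter names shown in this reference.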
Leopard.Process()
Processes the given audio data and returns its transcription. The incoming audio must have a sample rate equal to Leopard.SampleRate and be 16-bit linearly-encoded. Leopard operates on single-channel audio. To process audio with a different sample rate or format, consider using Leopard.ProcessFile().
Parameters
pcm
short[] : Audio data.
Returns
LeopardTranscript
: An object that contains the transcription results of the engine.
Throws
LeopardException
: If there is an error while processing the audio data.
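Because Process() takes a short[] of 16-bit samples, raw audio bytes have to be converted first. The helper below is a sketch of that conversion, assuming headerless, little-endian, single-channel PCM; the class and method names are hypothetical and not part of the SDK.

```csharp
using System;

static class PcmConverter
{
    // Converts raw 16-bit little-endian mono PCM bytes into 16-bit samples.
    public static short[] ToInt16Samples(byte[] rawPcm)
    {
        if (rawPcm == null)
            throw new ArgumentNullException(nameof(rawPcm));
        if (rawPcm.Length % 2 != 0)
            throw new ArgumentException("16-bit PCM must contain an even number of bytes.", nameof(rawPcm));

        short[] samples = new short[rawPcm.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Low byte first, then high byte (little-endian).
            int value = rawPcm[2 * i] | (rawPcm[2 * i + 1] << 8);
            // unchecked keeps the two's-complement wrap even in checked builds.
            samples[i] = unchecked((short)value);
        }
        return samples;
    }
}
```

With an existing Leopard instance named leopard, the result could then be passed on as leopard.Process(PcmConverter.ToInt16Samples(bytes)), provided the audio was recorded at leopard.SampleRate.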
Leopard.ProcessFile()
Processes a given audio file and returns its transcription.
Parameters
audioPath
string : Absolute path to the audio file. The supported audio file formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV, and WebM.
Returns
LeopardTranscript
: An object that contains the transcription results of the engine.
Throws
LeopardException
: If there is an error while processing the audio file.
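As an illustration, a file can be transcribed roughly as follows. The command-line argument layout is hypothetical, and the using block again assumes that Leopard implements IDisposable.

```csharp
using System;
using System.IO;
using Pv;

class ProcessFileExample
{
    static void Main(string[] args)
    {
        // Hypothetical argument layout: <accessKey> <modelPath> <audioPath>.
        if (args.Length < 3)
        {
            Console.Error.WriteLine("Usage: <accessKey> <modelPath> <audioPath>");
            return;
        }

        string accessKey = args[0];
        // The reference asks for absolute paths, so resolve relative input first.
        string modelPath = Path.GetFullPath(args[1]);
        string audioPath = Path.GetFullPath(args[2]);

        try
        {
            // Assumption: Leopard implements IDisposable.
            using (Leopard leopard = Leopard.Create(
                accessKey: accessKey,
                modelPath: modelPath,
                enableAutomaticPunctuation: false,
                enableDiarization: false))
            {
                LeopardTranscript result = leopard.ProcessFile(audioPath);
                Console.WriteLine(result.TranscriptString);
            }
        }
        catch (LeopardException e)
        {
            // Both Create() and ProcessFile() report failures as LeopardException.
            Console.Error.WriteLine($"Transcription failed: {e.Message}");
        }
    }
}
```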
Leopard.SampleRate
Getter for audio sample rate accepted by Picovoice.
Returns
int
: Audio sample rate accepted by Picovoice.
Leopard.Version
Getter for version.
Returns
string
: Current Leopard version.
LeopardTranscript
Class that contains transcription results returned from Leopard.Process() and Leopard.ProcessFile().
Parameters
transcriptString
String : Inferred transcription.
wordArray
LeopardWord[] : Transcribed words and their associated metadata.
LeopardTranscript.TranscriptString
Getter for the inferred transcription.
Returns
String
: Inferred transcription.
LeopardTranscript.WordArray
Getter for transcribed words and their associated metadata.
Returns
LeopardWord[]
: Transcribed words and their associated metadata.
LeopardWord
Class for storing word metadata.
Parameters
word
String : Transcribed word.
confidence
float : Transcription confidence. It is a number within [0, 1].
startSec
float : Start of word in seconds.
endSec
float : End of word in seconds.
speakerTag
int : The speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
LeopardWord.Word
Getter for the transcribed word.
Returns
String
: Transcribed word.
LeopardWord.Confidence
Getter for the transcription confidence.
Returns
float
: Transcription confidence. It is a number within [0, 1].
LeopardWord.StartSec
Getter for the start of word in seconds.
Returns
float
: Start of word in seconds.
LeopardWord.EndSec
Getter for the end of word in seconds.
Returns
float
: End of word in seconds.
LeopardWord.SpeakerTag
Getter for the speaker tag.
Returns
int
: Speaker tag associated with speaker.
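The word metadata getters above can be combined into a simple per-word report. The sketch below assumes a LeopardTranscript already returned by Leopard.Process() or Leopard.ProcessFile(); the WordReport class is a hypothetical helper, not part of the SDK.

```csharp
using System;
using Pv;

static class WordReport
{
    // Maps a speaker tag to a label, following the meaning of -1 and 0 described above.
    public static string DescribeSpeaker(int speakerTag)
    {
        if (speakerTag == -1) return "diarization disabled";
        if (speakerTag == 0) return "unknown speaker";
        return $"speaker {speakerTag}";
    }

    // Prints one line per transcribed word with its timing, confidence, and speaker.
    public static void Print(LeopardTranscript transcript)
    {
        foreach (LeopardWord w in transcript.WordArray)
        {
            Console.WriteLine(
                $"[{w.StartSec:F2}s - {w.EndSec:F2}s] {w.Word} " +
                $"(confidence {w.Confidence:F2}, {DescribeSpeaker(w.SpeakerTag)})");
        }
    }
}
```

Speaker labels other than "diarization disabled" appear only when enableDiarization was set to true in Leopard.Create().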
LeopardException
Exception thrown if an error occurs within the Leopard Speech-to-Text engine.
Exceptions: