Leopard Speech-to-Text
.NET Quick Start
Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64)
- NVIDIA Jetson Nano
- Raspberry Pi (3, 4, 5)
Requirements
.NET Framework 4.6.1+ / .NET Standard 2.0+ / .NET Core 3.0+:
- Windows (x86_64)
.NET Standard 2.0+ / .NET Core 3.0+:
- Linux (x86_64)
- macOS (x86_64)
.NET Core 3.0+:
- NVIDIA Jetson Nano
- Raspberry Pi (3, 4)
.NET 6.0+:
- macOS (arm64)
Picovoice Account & AccessKey
Sign up or log in to Picovoice Console to get your AccessKey.
Make sure to keep your AccessKey secret.
Quick Start
Setup
Install .NET.
Install the Leopard NuGet package in Visual Studio or using the .NET CLI, for example with dotnet add package Leopard.
Usage
Create an instance of the engine:
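A minimal sketch of this step, assuming the SDK exposes its types in the Pv namespace and that Leopard.Create takes the AccessKey as its first argument. The ${ACCESS_KEY} value is a placeholder, and the error handling is illustrative rather than required:

```csharp
using System;
using Pv;

class Program
{
    static void Main()
    {
        // Placeholder: substitute the AccessKey from Picovoice Console.
        const string accessKey = "${ACCESS_KEY}";

        try
        {
            // The using block disposes the engine and its native resources when it ends.
            using (Leopard leopard = Leopard.Create(accessKey))
            {
                Console.WriteLine("Leopard engine created.");
            }
        }
        catch (LeopardException e)
        {
            // Raised, for example, when the AccessKey is rejected.
            Console.Error.WriteLine($"Failed to create Leopard: {e.Message}");
        }
    }
}
```

Wrapping the instance in a using block is a design choice, not a requirement: it frees resources immediately instead of waiting for the garbage collector.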
Transcribe an audio file by providing an absolute path to the file:
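A sketch of file transcription, assuming the ProcessFile method and the TranscriptString property as they appear in the SDK's published API. The command-line argument handling is a hypothetical convenience added for illustration:

```csharp
using System;
using System.IO;
using Pv;

class Program
{
    static void Main(string[] args)
    {
        // Placeholder: substitute the AccessKey from Picovoice Console.
        const string accessKey = "${ACCESS_KEY}";

        // Path.GetFullPath turns a relative argument into an absolute path.
        string audioPath = Path.GetFullPath(args[0]);

        using (Leopard leopard = Leopard.Create(accessKey))
        {
            LeopardTranscript result = leopard.ProcessFile(audioPath);
            Console.WriteLine(result.TranscriptString);
        }
    }
}
```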
Transcribe raw audio data (sample rate of 16 kHz, 16-bit linearly encoded and 1 channel):
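A sketch of raw-audio transcription, assuming a Process method that accepts a short[] of samples. PcmConverter is a hypothetical helper, not part of the SDK, and the input is assumed to be a headerless little-endian PCM file that already matches the required format (a WAV header, if present, would have to be skipped first):

```csharp
using System;
using System.IO;
using Pv;

public static class PcmConverter
{
    // Interprets consecutive byte pairs as little-endian signed 16-bit samples.
    // A trailing odd byte, if any, is ignored.
    public static short[] ToPcm(byte[] raw)
    {
        short[] pcm = new short[raw.Length / 2];
        for (int i = 0; i < pcm.Length; i++)
        {
            pcm[i] = unchecked((short)(raw[2 * i] | (raw[2 * i + 1] << 8)));
        }
        return pcm;
    }
}

class Program
{
    static void Main(string[] args)
    {
        // Placeholder: substitute the AccessKey from Picovoice Console.
        const string accessKey = "${ACCESS_KEY}";

        // Assumption: args[0] is a headerless 16 kHz, 16-bit, mono PCM file.
        short[] pcm = PcmConverter.ToPcm(File.ReadAllBytes(args[0]));

        using (Leopard leopard = Leopard.Create(accessKey))
        {
            LeopardTranscript result = leopard.Process(pcm);
            Console.WriteLine(result.TranscriptString);
        }
    }
}
```

Audio recorded at another sample rate or with more than one channel would need to be resampled and downmixed before this step.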
Language Model
The Leopard .NET SDK comes preloaded with a default English language model (.pv file).
Default models for other supported languages can be found in the Leopard GitHub repository.
Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.
To switch from the default English model, pass the path of a .pv file to the Leopard.Create() factory method:
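A sketch of loading a non-default model, assuming the optional parameter is named modelPath. Both ${ACCESS_KEY} and ${MODEL_FILE_PATH} are placeholders:

```csharp
using System;
using Pv;

class Program
{
    static void Main()
    {
        // Placeholders: substitute your AccessKey and the path to a .pv model file.
        const string accessKey = "${ACCESS_KEY}";
        const string modelPath = "${MODEL_FILE_PATH}";

        // The named argument keeps the call readable if further options are added.
        using (Leopard leopard = Leopard.Create(accessKey, modelPath: modelPath))
        {
            Console.WriteLine("Leopard created with a custom language model.");
        }
    }
}
```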
Word Metadata
Along with the transcript, Leopard returns metadata for each transcribed word. Available metadata items are:
- Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
- End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
- Confidence: Leopard's confidence that the transcribed word is accurate. It is a number within [0, 1].
- Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
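The metadata items listed above could be read as in the following sketch. It assumes the result exposes a WordArray of LeopardWord objects with Word, StartSec, EndSec, Confidence and SpeakerTag properties, and that diarization is switched on through an enableDiarization argument. Check these names against the installed SDK version:

```csharp
using System;
using System.IO;
using Pv;

class Program
{
    static void Main(string[] args)
    {
        // Placeholder: substitute the AccessKey from Picovoice Console.
        const string accessKey = "${ACCESS_KEY}";
        string audioPath = Path.GetFullPath(args[0]);

        // Assumption: without enableDiarization, every SpeakerTag is -1.
        using (Leopard leopard = Leopard.Create(accessKey, enableDiarization: true))
        {
            LeopardTranscript result = leopard.ProcessFile(audioPath);

            Console.WriteLine($"{"Word",-15} {"Start",8} {"End",8} {"Conf",6} {"Tag",4}");
            foreach (LeopardWord word in result.WordArray)
            {
                // Times are in seconds; confidence lies within [0, 1].
                Console.WriteLine(
                    $"{word.Word,-15} {word.StartSec,8:F2} {word.EndSec,8:F2} " +
                    $"{word.Confidence,6:F2} {word.SpeakerTag,4}");
            }
        }
    }
}
```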
Demo
For the Leopard .NET SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio files.
Setup
- Clone the Leopard repository from GitHub:
- Build the demo:
Usage
Use the --help flag to see the usage options for the demo:
Run the following command to transcribe an audio file:
For more information on our Leopard demos for .NET, head over to our GitHub repository.