Leopard Speech-to-Text
C Quick Start
Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64)
- Raspberry Pi (3, 4, 5)
Requirements
Picovoice Account & AccessKey
Sign up for or log in to Picovoice Console to get your AccessKey. Make sure to keep your AccessKey secret.
Quick Start
Setup
- Clone the repository:
Usage
- Include the public header files (picovoice.h and pv_leopard.h).
- Link the project to an appropriate precompiled library for the target platform and load it.
- Download a custom model from Picovoice Console or use a default language model.
- Construct the Leopard Speech-to-Text object:
- Pass in an audio path to the pv_leopard_process_file function:
- Release resources explicitly when done with Leopard Speech-to-Text:
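The "load it" step in the list above can be sketched as follows. This is a minimal illustration, assuming a POSIX system where dlopen and dlsym are available (on Windows, LoadLibrary and GetProcAddress play the same role). The helper name check_library_symbol is hypothetical and not part of the Leopard SDK; it only confirms that a shared library opens and exports a named symbol such as pv_leopard_process_file. The library path is whatever precompiled library matches your platform.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the SDK): open a shared library and
 * confirm that it exports the given symbol.
 * Returns 0 on success, -1 if the library cannot be opened, and -2 if the
 * symbol is missing. The handle is closed before returning because this is
 * only a check; a real program keeps the handle open while using the library.
 */
int check_library_symbol(const char *library_path, const char *symbol_name) {
    assert(library_path != NULL && symbol_name != NULL);

    void *handle = dlopen(library_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "failed to open library: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error state before calling dlsym */
    void *symbol = dlsym(handle, symbol_name);
    const char *error = dlerror();

    int status = 0;
    if (error != NULL || symbol == NULL) {
        fprintf(stderr, "symbol not found: %s\n", symbol_name);
        status = -2;
    }

    dlclose(handle);
    return status;
}
```

Calling check_library_symbol with the path to the precompiled library and "pv_leopard_process_file" should return 0 when the library matches the platform. On older glibc versions the program may need to be linked with -ldl.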
Word Metadata
Along with the transcript, Leopard Speech-to-Text returns metadata for each transcribed word. Available metadata items are:
- Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
- End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
- Confidence: Leopard Speech-to-Text's confidence that the transcribed word is accurate. It is a number within [0, 1].
- Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
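To illustrate how the metadata items above might be interpreted, here is a small sketch. The struct word_info_t is a hypothetical stand-in defined only for this example; its field names are assumptions and may not match the word type declared in pv_leopard.h. The helpers classify_speaker_tag, word_duration_sec and print_word are likewise illustrative, not SDK functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a word record; field names are assumptions. */
typedef struct {
    const char *word;    /* transcribed word */
    float start_sec;     /* start time in seconds */
    float end_sec;       /* end time in seconds */
    float confidence;    /* value within [0, 1] */
    int32_t speaker_tag; /* -1: diarization off, 0: unknown, >0: speaker id */
} word_info_t;

typedef enum {
    SPEAKER_DISABLED,  /* diarization was not enabled (tag is -1) */
    SPEAKER_UNKNOWN,   /* diarization enabled, speaker unknown (tag is 0) */
    SPEAKER_IDENTIFIED /* diarization enabled, tag identifies a speaker */
} speaker_kind_t;

/* Map a raw speaker tag onto the three cases described in the text. */
speaker_kind_t classify_speaker_tag(int32_t speaker_tag) {
    if (speaker_tag < 0) {
        return SPEAKER_DISABLED;
    }
    return (speaker_tag == 0) ? SPEAKER_UNKNOWN : SPEAKER_IDENTIFIED;
}

/* Duration of a word, derived from its start and end times. */
float word_duration_sec(const word_info_t *info) {
    assert(info != NULL);
    return info->end_sec - info->start_sec;
}

/* Print one word with its timing, confidence and speaker information. */
void print_word(const word_info_t *info) {
    assert(info != NULL);
    printf("%s [%.2f-%.2f s] confidence=%.2f ",
           info->word, info->start_sec, info->end_sec, info->confidence);
    switch (classify_speaker_tag(info->speaker_tag)) {
        case SPEAKER_DISABLED:
            printf("(diarization off)\n");
            break;
        case SPEAKER_UNKNOWN:
            printf("(unknown speaker)\n");
            break;
        case SPEAKER_IDENTIFIED:
            printf("(speaker %d)\n", (int) info->speaker_tag);
            break;
    }
}
```

In an application, the values would be copied from the word metadata returned alongside the transcript, and print_word would be called once per word.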
Demo
For the Leopard Speech-to-Text SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio recordings.
Setup
- Clone the Leopard Speech-to-Text repository from GitHub using HTTPS:
- Build the demo:
Usage
To see the usage options for the demo:
Run the command corresponding to your platform from the root of the repository:
For more information on our Leopard Speech-to-Text demos for C, head over to our GitHub repository.