Leopard Speech-to-Text
Go Quick Start
Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64)
- Raspberry Pi (3, 4, 5)
Requirements
- Picovoice Account & AccessKey
- Go 1.16+
- Windows only: a gcc compiler such as MinGW available in $PATH
Picovoice Account & AccessKey
Sign up or log in to Picovoice Console to get your AccessKey. Make sure to keep your AccessKey secret.
Quick Start
Setup
Download and install Go.
Install the Leopard Speech-to-Text Go Package using the Go CLI:
Usage
Create an instance of the Leopard Speech-to-Text engine:
Transcribe an audio file:
When done, be sure to explicitly release the resources using leopard.Delete().
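The create, transcribe, release sequence described above can be sketched without the SDK installed. The block below uses a hypothetical `stubEngine` type in place of the real engine so that it compiles on its own. Only the `Delete()` call is taken from the text above; the `Init` and `ProcessFile` names, the `transcriber` interface, and the `audio.wav` path are assumptions made for illustration. The point is the `defer` pattern, which releases the engine even when transcription fails.

```go
package main

import (
	"errors"
	"fmt"
)

// transcriber captures the lifecycle described above. Method names other
// than Delete are assumptions for illustration, not the confirmed SDK API.
type transcriber interface {
	Init() error
	ProcessFile(path string) (string, error)
	Delete() error
}

// stubEngine is a hypothetical stand-in that records whether it was released.
type stubEngine struct {
	initialized bool
	released    bool
}

func (s *stubEngine) Init() error {
	s.initialized = true
	return nil
}

func (s *stubEngine) ProcessFile(path string) (string, error) {
	if !s.initialized {
		return "", errors.New("engine not initialized")
	}
	if path == "" {
		return "", errors.New("empty audio path")
	}
	return "transcript of " + path, nil
}

func (s *stubEngine) Delete() error {
	s.released = true
	return nil
}

// transcribe initializes the engine, processes one file, and always releases
// the engine's resources via defer, even when processing returns an error.
func transcribe(e transcriber, path string) (text string, err error) {
	if err = e.Init(); err != nil {
		return "", fmt.Errorf("init: %w", err)
	}
	defer func() {
		// Report a release failure only if nothing else went wrong first.
		if derr := e.Delete(); derr != nil && err == nil {
			err = fmt.Errorf("delete: %w", derr)
		}
	}()
	return e.ProcessFile(path)
}

func main() {
	e := &stubEngine{}
	text, err := transcribe(e, "audio.wav") // hypothetical file name
	fmt.Println(text, err, e.released)
}
```

With the real engine, the same shape applies: place the deferred release immediately after a successful initialization so that no return path can skip it.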
Model File
The Leopard Speech-to-Text Go SDK comes preloaded with a default English language model (.pv file).
Default models for other supported languages can be found in the Leopard Speech-to-Text GitHub repository.
Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.
Pass in the .pv file by setting .ModelPath on an instance of Leopard Speech-to-Text before initializing:
Word Metadata
Along with the transcript, Leopard Speech-to-Text returns metadata for each transcribed word. Available metadata items are:
- Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
- End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
- Confidence: Leopard Speech-to-Text's confidence that the transcribed word is accurate. It is a number within [0, 1].
- Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
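The metadata items listed above lend themselves to simple post-processing, such as flagging low-confidence words for review or grouping words by speaker. The sketch below is self-contained and does not use the SDK. The `Word` struct, its field names, and the sample values are assumptions made for illustration and may differ from the type the SDK returns.

```go
package main

import (
	"fmt"
	"strings"
)

// Word mirrors the metadata items listed above. The struct and its field
// names are illustrative assumptions, not the SDK's own type.
type Word struct {
	Text       string
	StartSec   float32 // start time in seconds
	EndSec     float32 // end time in seconds
	Confidence float32 // within [0, 1]
	SpeakerTag int32   // -1: diarization disabled, 0: unknown speaker
}

// lowConfidence returns the words whose confidence is below threshold.
func lowConfidence(words []Word, threshold float32) []Word {
	var out []Word
	for _, w := range words {
		if w.Confidence < threshold {
			out = append(out, w)
		}
	}
	return out
}

// bySpeaker joins the text of each speaker's words, preserving word order
// within each speaker tag.
func bySpeaker(words []Word) map[int32]string {
	parts := make(map[int32][]string)
	for _, w := range words {
		parts[w.SpeakerTag] = append(parts[w.SpeakerTag], w.Text)
	}
	out := make(map[int32]string, len(parts))
	for tag, p := range parts {
		out[tag] = strings.Join(p, " ")
	}
	return out
}

func main() {
	// Made-up sample values, not real engine output.
	words := []Word{
		{"hello", 0.10, 0.45, 0.98, 1},
		{"there", 0.50, 0.80, 0.42, 1},
		{"hi", 1.20, 1.40, 0.91, 2},
	}
	for _, w := range lowConfidence(words, 0.5) {
		fmt.Printf("review %q (%.2fs-%.2fs, confidence %.2f)\n",
			w.Text, w.StartSec, w.EndSec, w.Confidence)
	}
	for tag, text := range bySpeaker(words) {
		fmt.Printf("speaker %d: %s\n", tag, text)
	}
}
```

When diarization is disabled, every word carries the tag -1, so `bySpeaker` returns a single entry containing the whole transcript.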
Demo
For the Leopard Speech-to-Text Go SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio files.
Setup
Clone the Leopard Speech-to-Text repository from GitHub using HTTPS:
Usage
To see the usage options for the demos, use the -h flag:
Run the following command to transcribe an audio file:
For more information on our Leopard Speech-to-Text demos for Go, head over to our GitHub repository.