Leopard Speech-to-Text

Build accurate and private transcription software.

On-device transcription with cloud-level accuracy, bringing control back to enterprises


What is Leopard Speech-to-Text?

Leopard Speech-to-Text is software that converts audio and video recordings into text with cloud-level accuracy without sacrificing privacy.

Leopard Speech-to-Text brings speech recognition to where the data resides, enabling transcription on devices, in mobile apps and web browsers, on-premises, or in the cloud.

Build with intuitive Speech-to-Text SDKs

o = pvleopard.create(access_key)
transcript, words =
Build with Python
const o = new Leopard(accessKey)
const { transcript, words } =
Build with NodeJS
Leopard o = new Leopard.Builder()
LeopardTranscript r =
Build with Android
let o = Leopard(
accessKey: accessKey,
modelPath: modelPath)
let r = o.processFile(path)
Build with iOS
o := NewLeopard(accessKey)
err := o.Init()
transcript, words, err := o.ProcessFile(path)
Build with Go
Leopard o = new Leopard.Builder()
LeopardTranscript r =
Build with Java
Leopard o =
LeopardTranscript result =
Build with .NET
let o: Leopard =
if let Ok(result) = o.process_file(path) { }
Build with Rust
Leopard o = await Leopard.create(
LeopardTranscript result =
await o.processFile(path);
Build with Flutter
const o = await Leopard.create(
const {transcript, words} =
await o.processFile(path)
Build with React Native
pv_leopard_t *leopard = NULL;
char *transcript = NULL;
int32_t num_words = 0;
pv_word_t *words = NULL;
Build with C
const leopard =
await LeopardWorker.
const {
} =
await leopard.process(pcm);
Build with Web

Why Leopard Speech-to-Text?

Speech-to-text APIs require enterprises to send their data to a third-party cloud, giving up control over their data and product.

Leopard Speech-to-Text offers the same performance with no compromises.

Don't leave any data behind

Creating new possibilities for your content, product, and database

Just like the “best” cloud transcription APIs…

Leopard Speech-to-Text offers cloud-level accuracy, model customization, and cross-platform support…

…with no compromises

…without sacrificing privacy, reliability, or affordability, enabling use cases that were impossible before.

Scientifically Proven Accuracy

Your product, your decision

Compare the accuracy of Leopard Speech-to-Text against other transcription APIs with the open-source speech-to-text benchmark, so you can make decisions confidently based on your own data.
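
Accuracy comparisons between speech-to-text engines are commonly reported as word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that metric, not the benchmark's own code. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization for this sketch: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Running the same function over transcripts of your own recordings from each engine gives a like-for-like comparison; lower values mean fewer errors.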

Speech-to-Text Model Adaptation

Boost accuracy with custom models

Customize pre-trained speech-to-text models instantly on a self-service platform by adding domain-specific vocabulary and boosting frequently used words, achieving the highest possible accuracy for your use case.

Speech-to-text APIs transfer voice input to the cloud to transcribe it into text, creating privacy and reliability issues and additional costs.

Cross-platform support

Create seamless experiences

Deploy Leopard Speech-to-Text anywhere and offer seamless experiences across devices, mobile apps, web browsers, on-premises servers, the cloud, or all of them.

Privacy by design

Do not rely on “check the box” compliance models

Process voice data without sharing it with third parties, which simplifies compliance with GDPR, HIPAA, CCPA, and more, including any policies introduced in the future.

No downtime and zero network latency

Develop dependable products

Build reliable products with predictable response times by bringing speech-to-text closer to your data to bypass network latency, congestion, outages, or throttling.

Cost-effective at scale

Scale your business, not cloud providers’

Do not bear the cost of running bulky models in the cloud. Big Tech uses on-device speech-to-text for their products because running large models in the cloud is costly, even for them.

Get started with Leopard Speech-to-Text

The best way to learn about Leopard Speech-to-Text is to use it!

Forever Free Plan
  • Pre-trained models
  • Custom vocabulary
  • Keyword boosting
  • Intuitive SDKs
  • Truecasing and Punctuation
  • Word-level Confidence Scores
  • Word-level Timestamps
  • English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish
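
Word-level timestamps and confidence scores make it possible to post-process a transcript without touching the audio again. The sketch below shows one possible use under stated assumptions: it groups words into SRT subtitle cues and lists words that fall below a confidence threshold. The `Word` dataclass is a hypothetical stand-in for whatever word records the SDK returns, and its field names (`word`, `start_sec`, `end_sec`, `confidence`) are assumptions for illustration, as are the helper names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    # Hypothetical stand-in for an SDK word record; field names are assumptions.
    word: str
    start_sec: float
    end_sec: float
    confidence: float


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group consecutive words into numbered SRT cues of at most max_words words."""
    cues = []
    for index, start in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[start:start + max_words]
        text = " ".join(w.word for w in chunk)
        # Each cue spans from the first word's start to the last word's end.
        span = f"{format_timestamp(chunk[0].start_sec)} --> {format_timestamp(chunk[-1].end_sec)}"
        cues.append(f"{index}\n{span}\n{text}\n")
    return "\n".join(cues)


def low_confidence_words(words: List[Word], threshold: float = 0.5) -> List[str]:
    """Return the words whose confidence falls below the threshold, for manual review."""
    return [w.word for w in words if w.confidence < threshold]
```

The cue size and the confidence threshold are arbitrary defaults chosen for the example; tune both to your content.
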
Learn more about Leopard Speech-to-Text

What is speech-to-text?

Speech-to-text (STT), also known as Automatic Speech Recognition (ASR) and Open-Domain Large Vocabulary Continuous Speech Recognition (LVCSR), refers to the technology and methodologies that convert voice data into text.

How does on-device speech-to-text differ from cloud-based speech-to-text?

Cloud-based speech-to-text APIs send voice data to vendors’ servers, where the transcription engine resides. On-device voice processing brings voice recognition to where the voice data is, eliminating the steps involved in cloud processing.