Leopard Speech-to-Text

Accurately convert voice to text with future-ready compliance.

On-device transcription with cloud-level accuracy, bringing control back to enterprises

What is Leopard Speech-to-Text?

Leopard Speech-to-Text is software that converts audio and video recordings into text with cloud-level accuracy without sacrificing privacy.

Leopard Speech-to-Text brings speech recognition to where the data resides, enabling transcription on devices, mobile, web browsers, on-premises, or in the cloud.

Build with intuitive Speech-to-Text SDKs

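import pvleopard

# access_key: your Picovoice AccessKey; path: the audio file to transcribe
leopard = pvleopard.create(access_key=access_key)

# Returns the full transcript and per-word metadata
transcript, words = leopard.process_file(path)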
Build with Python

Why Leopard Speech-to-Text?

Speech-to-text APIs require enterprises to send their data to a third-party cloud, giving away control over their data and product.

Leopard Speech-to-Text offers the same performance with no compromises.

Compliant Transcription that Keeps Voice Data Private and Confidential

Creating new possibilities for your content, product, and database

Cloud Speech-to-Text Performance

✅ Accurate
🎚 Custom models
🤸 Platform-agnostic

…with On-device Benefits

⚡ No downtime
🔒 Private by design
💰 Cost-effective

Scientifically-Proven Accuracy

Your product, your decision

Evaluate the accuracy of Leopard Speech-to-Text against other transcription APIs with the open-source speech-to-text benchmark, so you can make decisions confidently using your own data.

Speech-to-Text Model Adaptation

Boost accuracy with custom models

Customize pre-trained speech-to-text models instantly by adding domain-specific vocabulary and boosting frequently-used words on a self-service platform, achieving the highest possible accuracy.

Speech-to-text APIs transfer voice input to the cloud to transcribe it into text, creating privacy and reliability issues and additional costs.

Cross-platform support

Create seamless experiences

Deploy Leopard Speech-to-Text anywhere and offer seamless experiences across devices, mobile apps, web browsers, on-premises, the cloud, or all of them.

Privacy by design

Do not rely on “check the box” compliance models

Process voice data without sharing it with third parties, ensuring compliance with GDPR, HIPAA, CCPA, and more, including any policies that come in the future.

No downtime and zero network latency

Develop dependable products

Build reliable products with predictable response times by bringing speech-to-text closer to your data to bypass network latency, congestion, outages, or throttling.

Cost-effective at scale

Grow your business, not cloud providers’

Do not bear the cost of running bulky models in the cloud. Big Tech uses on-device speech-to-text for their products because running large models in the cloud is costly, even for them.

Get started with Leopard Speech-to-Text

The best way to learn about Leopard Speech-to-Text is to use it!

Pre-trained models
Custom vocabulary
Keyword boosting
Intuitive SDKs
Speaker Diarization
Truecasing and Punctuation
Word-level Confidence Scores
Word-level Timestamps
English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish

More from Picovoice

Everything You Need to Know About Speech-to-Speech Translation

On-device voice AI for French to build AI Agents

How do Voice AI Agents work?

On-device AI Models to Convert Voice to Text in Spanish

Multilingual On-device Speech-to-Text for Real-time Applications

AI Voice Assistant for iOS Powered by Local LLM

FAQ

What is speech-to-text?

Speech-to-text (STT), also known as Automatic Speech Recognition (ASR) and Open-Domain Large Vocabulary Speech Recognition (LVSR), refers to the technology and methodologies that convert voice data into text.

How does on-device speech-to-text differ from cloud-based speech-to-text?

Cloud-based speech-to-text APIs send voice data to vendors’ servers, where the transcription engine resides. On-device voice processing brings voice recognition to where the voice data resides, eliminating all the unnecessary steps related to cloud processing.

What are the benefits of on-device speech-to-text over cloud speech-to-text APIs?

On-device speech-to-text empowers enterprises to retain ownership and control over their data and products. Sending voice data to the cloud has privacy, latency, reliability, and cost implications. On-device speech-to-text overcomes these challenges, bringing control back to enterprises.

Does Leopard Speech-to-Text support real-time transcription?

Leopard Speech-to-Text doesn’t, but Cheetah Streaming Speech-to-Text does. Cheetah is Picovoice’s on-device streaming speech-to-text engine that provides text output in real time.

Can I use Leopard Speech-to-Text in the cloud?

Yes. You can run Leopard Speech-to-Text in the cloud, whether private, public, or hybrid. Because Picovoice’s voice recognition technology runs wherever you deploy it, data doesn’t have to leave the enterprise’s premises, and the cloud is an option rather than a requirement. See the tutorials for serverless speech-to-text with AWS Lambda and a transcription microservice with gRPC.

Does Leopard Speech-to-Text support Speaker Diarization?

Yes. Leopard Speech-to-Text embeds an optimized version of Falcon Speaker Diarization to simplify the development process. Please check the Leopard Speech-to-Text documentation for more information.
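
As a rough illustration of what can be done with diarized output, the sketch below groups consecutive words by speaker into turns. The Word tuple is a stand-in defined here so the example runs without the SDK; its field names (word, start_sec, end_sec, confidence, speaker_tag) are an assumption about the per-word metadata, and speaker_turns is a hypothetical helper, so check the documentation before relying on either.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in for per-word metadata; the field names are assumed, not taken from the SDK.
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def speaker_turns(words):
    """Collapse consecutive words that share a speaker tag into (tag, text) turns."""
    return [
        (tag, " ".join(w.word for w in group))
        for tag, group in groupby(words, key=lambda w: w.speaker_tag)
    ]


# Hand-written sample data, not real engine output.
words = [
    Word("hello", 0.0, 0.4, 0.98, 1),
    Word("there", 0.4, 0.8, 0.95, 1),
    Word("hi", 1.0, 1.2, 0.91, 2),
]
turns = speaker_turns(words)
```

groupby only merges adjacent items, so a speaker who talks again later starts a new turn instead of being folded into an earlier one.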

Does Leopard Speech-to-Text perform Truecasing and Punctuation?

Yes. Leopard Speech-to-Text performs Truecasing and Punctuation. Please refer to the Leopard Speech-to-Text documentation to enable or disable automatic punctuation.

Does Leopard Speech-to-Text return Word-level Confidence Scores?

Leopard Speech-to-Text returns Word-level Confidence Scores. Please refer to the Leopard Speech-to-Text documentation for more information.
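
One possible use of confidence scores is flagging words for human review. The sketch below assumes each word carries a confidence value between 0 and 1; the Word tuple, the low_confidence helper, and the 0.6 threshold are all illustrative choices made for this example, not part of the SDK.

```python
from collections import namedtuple

# Stand-in for per-word metadata; the field names are assumed, not taken from the SDK.
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence"])


def low_confidence(words, threshold=0.6):
    """Return the words whose confidence falls below the threshold."""
    return [w.word for w in words if w.confidence < threshold]


# Hand-written sample data, not real engine output.
words = [
    Word("transcribe", 0.0, 0.6, 0.98),
    Word("leopard", 0.6, 1.1, 0.42),
    Word("audio", 1.1, 1.5, 0.91),
]
flagged = low_confidence(words)
```

The right threshold depends on the audio and the use case, so it is worth tuning on your own recordings.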

Does Leopard Speech-to-Text generate Word-level Timestamps?

Leopard Speech-to-Text generates Word-level Timestamps. Please refer to the Leopard Speech-to-Text documentation for more information.
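
Word-level timestamps make it straightforward to produce subtitle-style output. The sketch below formats start and end times in the HH:MM:SS,mmm style used by SRT files. The Word tuple and its start_sec and end_sec fields are assumptions made for illustration, and to_srt_time and word_lines are hypothetical helpers, not SDK functions.

```python
from collections import namedtuple

# Stand-in for per-word metadata; the field names are assumed, not taken from the SDK.
Word = namedtuple("Word", ["word", "start_sec", "end_sec"])


def to_srt_time(seconds):
    """Format a time in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def word_lines(words):
    """Render one 'start --> end word' line per word."""
    return [
        f"{to_srt_time(w.start_sec)} --> {to_srt_time(w.end_sec)} {w.word}"
        for w in words
    ]


# Hand-written sample data, not real engine output.
words = [Word("hello", 0.0, 0.4), Word("world", 1.25, 1.9)]
lines = word_lines(words)
```

Converting to whole milliseconds before splitting into hours, minutes, and seconds avoids floating-point drift in the formatted output.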

How do I choose the best speech-to-text for my project?

“Best” is a subjective term. Every use case has different business requirements. Several factors, such as accuracy, availability of features, the total cost of ownership, and data privacy and governance, have different weights in different use cases.

Which platforms does Leopard Speech-to-Text support?

Desktop and Servers: Linux, macOS, and Windows
Web Browsers: Chrome, Safari, Edge, and Firefox
Mobile Devices: Android and iOS
Single Board Computers: Raspberry Pi

Which languages does Leopard Speech-to-Text support?

Leopard Speech-to-Text supports English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish.

What should I do if I need support for other languages?

Reach out to Picovoice Sales to tell us about your commercial endeavor.

How do I get technical support for Leopard Speech-to-Text?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to start building transcription products. Enterprise customers get dedicated support specific to their applications from the Picovoice Product & Engineering teams. Existing customers can reach out to their Picovoice contacts; prospects can purchase Enterprise Support before committing to any paid plan.

How can I get informed about updates and upgrades?

Version changes appear in the and LinkedIn. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Leopard Speech-to-Text, show it by giving a GitHub star!
