Leopard Speech-to-Text

Accurately convert voice to text with future-ready compliance.

On-device transcription with cloud-level accuracy, bringing control back to enterprises


What is Leopard Speech-to-Text?

Leopard Speech-to-Text is software that converts audio and video recordings into text with cloud-level accuracy without sacrificing privacy.

Leopard Speech-to-Text brings speech recognition to where data resides, enabling transcription on devices, in mobile apps and web browsers, on-premises, or in the cloud.

Build with intuitive Speech-to-Text SDKs

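import pvleopard

# access_key is your Picovoice AccessKey; path points to an audio file
leopard = pvleopard.create(access_key=access_key)
transcript, words = leopard.process_file(path)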

Why Leopard Speech-to-Text?

Speech-to-text APIs require enterprises to send their data to a third-party cloud, giving up control over their data and product.

Leopard Speech-to-Text offers the same performance with no compromises.

Compliant Transcription that Keeps Voice Data Private and Confidential

Creating new possibilities for your content, product, and database

Cloud Speech-to-Text Performance

  • Accurate
  • 🎚 Custom models
  • 🤸 Platform-agnostic

…with On-device Benefits

  • No downtime
  • 🔒 Private by design
  • 💰 Cost-effective

Scientifically-Proven Accuracy

Your product, your decision

Evaluate the accuracy of Leopard Speech-to-Text vs other transcription APIs scientifically with the open-source speech-to-text benchmark, enabling you to make decisions confidently with your data.
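
Speech-to-text accuracy is commonly reported as word error rate (WER): the word-level edit distance between a reference transcript and the engine's output, divided by the number of reference words. The sketch below illustrates that metric only; it is not the benchmark's own code, and the normalization (lowercasing, splitting on whitespace) and the function name `word_error_rate` are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("the quick brown fox", "the quick fox"))
```

Running the same function over every engine's output for the same audio gives a like-for-like comparison on your own data.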
Speech-to-Text Model Adaptation

Boost accuracy with custom models

Customize pre-trained speech-to-text models instantly by adding domain-specific vocabulary and boosting frequently-used words on a self-service platform, achieving the highest possible accuracy.
Speech-to-text APIs transfer voice input to the cloud to transcribe it into text, creating privacy and reliability issues as well as additional costs.
Cross-platform support

Create seamless experiences

Deploy Leopard Speech-to-Text anywhere and offer seamless experiences across devices, mobile apps, web browsers, on-premises, cloud, or all of them.
Privacy by design

Do not rely on “check the box” compliance models

Process voice data without sharing it with third parties, ensuring compliance with GDPR, HIPAA, CCPA, and more, including any policies introduced in the future.
No downtime and zero network latency

Develop dependable products

Build reliable products with predictable response times by bringing speech-to-text closer to your data to bypass network latency, congestion, outages, or throttling.
Cost-effective at scale

Grow your business, not cloud providers’

Do not bear the cost of running bulky models in the cloud. Big Tech uses on-device speech-to-text for their products because running large models in the cloud is costly, even for them.
Get started with Leopard Speech-to-Text

The best way to learn about Leopard Speech-to-Text is to use it!

Start Now
Forever Free Plan
  • Pre-trained models
  • Custom vocabulary
  • Keyword boosting
  • Intuitive SDKs
  • Speaker Diarization
  • Truecasing and Punctuation
  • Word-level Confidence Scores
  • Word-level Timestamps
  • English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish
Learn more about Leopard Speech-to-Text

What is speech-to-text?

Speech-to-text (STT), also known as Automatic Speech Recognition (ASR) and Open-Domain Large Vocabulary Speech Recognition (LVSR), refers to the technology and methodologies that convert voice data into text.

How does on-device speech-to-text differ from cloud-based speech-to-text?

Cloud-based speech-to-text APIs send voice data to vendors’ servers, where the transcription engine resides. On-device voice processing brings voice recognition to where the voice data resides, eliminating all the unnecessary steps related to cloud processing.

What are the benefits of on-device speech-to-text over cloud speech-to-text APIs?

On-device speech-to-text empowers enterprises to retain ownership and control over their data and products. Sending voice data to the cloud has privacy, latency, reliability, and cost implications. On-device speech-to-text overcomes these challenges, bringing control back to enterprises.

Does Leopard Speech-to-Text support real-time transcription?

Leopard Speech-to-Text doesn’t, but Cheetah Streaming Speech-to-Text does. Cheetah is Picovoice’s on-device streaming speech-to-text engine that provides text output in real time.

Can I use Leopard Speech-to-Text in the cloud?

Yes. You can run Leopard Speech-to-Text in the cloud, whether private, public, or hybrid. Because Picovoice’s voice recognition technology runs wherever it is deployed, the cloud is an option rather than a requirement, and data does not have to leave the enterprise’s premises. Don’t forget to check the tutorials for serverless speech-to-text with AWS Lambda and a transcription microservice with gRPC.

Does Leopard Speech-to-Text support Speaker Diarization?

Yes. Leopard Speech-to-Text embeds an optimized version of Falcon Speaker Diarization to simplify development. Please check the Leopard Speech-to-Text documentation for more information.
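
With diarization enabled, each transcribed word can be attributed to a speaker. The sketch below shows one way to merge consecutive words from the same speaker into turns. It assumes each word exposes `word` and `speaker_tag` attributes, which is how the pvleopard Python SDK is understood to report them; check the documentation before relying on these names. The `Word` tuple and `speaker_turns` helper are hypothetical stand-ins for this example.

```python
from collections import namedtuple
from itertools import groupby

# Stand-in for the word objects returned by the engine (assumed field names).
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def speaker_turns(words):
    """Merge consecutive words with the same speaker tag into (tag, text) turns."""
    return [
        (tag, " ".join(w.word for w in group))
        for tag, group in groupby(words, key=lambda w: w.speaker_tag)
    ]


sample = [
    Word("hello", 0.0, 0.4, 0.98, 1),
    Word("there", 0.4, 0.8, 0.95, 1),
    Word("hi", 1.0, 1.2, 0.91, 2),
    Word("welcome", 1.5, 2.0, 0.93, 1),
]
for tag, text in speaker_turns(sample):
    print(f"Speaker {tag}: {text}")
```

Grouping only consecutive words keeps the conversation order intact, so a speaker who talks twice produces two separate turns.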

Does Leopard Speech-to-Text perform Truecasing and Punctuation?

Yes. Leopard Speech-to-Text performs Truecasing and Punctuation. Please refer to the Leopard Speech-to-Text documentation to enable or disable automatic punctuation.

Does Leopard Speech-to-Text return Word-level Confidence Scores?

Leopard Speech-to-Text returns Word-level Confidence Scores. Please refer to the Leopard Speech-to-Text documentation for more information.
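
Confidence scores are useful for routing uncertain words to human review. The following is a minimal sketch, assuming each word carries a `confidence` value between 0 and 1 alongside its text. The `Word` tuple, the `mark_uncertain` helper, and the 0.5 threshold are illustrative assumptions, not part of the SDK.

```python
from collections import namedtuple

# Stand-in for the word objects returned by the engine (assumed field names).
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence"])


def mark_uncertain(words, threshold=0.5):
    """Rebuild the transcript, wrapping words below the threshold in [brackets]."""
    return " ".join(
        f"[{w.word}]" if w.confidence < threshold else w.word for w in words
    )


sample = [
    Word("send", 0.0, 0.3, 0.97),
    Word("the", 0.3, 0.4, 0.99),
    Word("invoice", 0.4, 0.9, 0.42),
]
print(mark_uncertain(sample))
```

The right threshold depends on the audio quality and how costly a missed error is, so it is worth tuning on your own recordings.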

Does Leopard Speech-to-Text generate Word-level Timestamps?

Leopard Speech-to-Text generates Word-level Timestamps. Please refer to the Leopard Speech-to-Text documentation for more information.
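
A common use of word-level timestamps is generating subtitles. The sketch below converts a list of timed words into SRT text. It assumes each word exposes `word`, `start_sec`, and `end_sec` attributes, as the pvleopard Python SDK is understood to provide; the `Word` tuple, the helper names, and the fixed seven-words-per-cue grouping are assumptions made for this example.

```python
from collections import namedtuple

# Stand-in for the word objects returned by the engine (assumed field names).
Word = namedtuple("Word", ["word", "start_sec", "end_sec"])


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, words_per_cue=7):
    """Group words into fixed-size cues and render them as SRT text."""
    cues = []
    for index, start in enumerate(range(0, len(words), words_per_cue), start=1):
        chunk = words[start:start + words_per_cue]
        text = " ".join(w.word for w in chunk)
        span = f"{srt_timestamp(chunk[0].start_sec)} --> {srt_timestamp(chunk[-1].end_sec)}"
        cues.append(f"{index}\n{span}\n{text}\n")
    return "\n".join(cues)


sample = [Word("hello", 0.0, 0.5), Word("world", 0.5, 1.25)]
print(words_to_srt(sample))
```

Splitting cues at pauses or sentence boundaries instead of a fixed word count usually reads better, but needs more logic than this sketch shows.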

How do I choose the best speech-to-text for my project?

“Best” is a subjective term. Every use case has different business requirements. Several factors, such as accuracy, availability of features, the total cost of ownership, and data privacy and governance, have different weights in different use cases.

Which platforms does Leopard Speech-to-Text support?

Leopard Speech-to-Text runs on devices, in mobile apps and web browsers, on-premises, and in the cloud.

Which languages does Leopard Speech-to-Text support?

Leopard Speech-to-Text supports English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish.

What should I do if I need support for other languages?

Reach out to Picovoice Sales to tell us about your commercial endeavor.

How do I get technical support for Leopard Speech-to-Text?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to start building transcription products. Picovoice also offers an optional Enterprise Support Add-on for Forever-Free plan users.

How can I get informed about updates and upgrades?

Version changes appear in the and LinkedIn. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Leopard Speech-to-Text, show it by giving a GitHub star!