Transcribe voice data locally with cloud-level accuracy
Adaptive, efficient, cost-effective, and no compromises
Train an optimized speech-to-text model by adding custom vocabulary and boosting words relevant to your use case. Simply type or bulk import words to achieve the highest possible accuracy. Test in the browser and download models for local transcription.
Add Leopard Speech-to-Text with your favourite SDK, including Python, NodeJS, Android, iOS, and React Native. Transcribing speech data locally on a device is just a few lines of code away!
Build with Python
    import pvleopard
    o = pvleopard.create(access_key)
    transcript, words = o.process_file(path)
Build with NodeJS
    const o = new Leopard(accessKey)
    const { transcript, words } = o.processFile(path)
Build with Android
    Leopard o = new Leopard.Builder()
        .setAccessKey(accessKey)
        .setModelPath(modelPath)
        .build(appContext);
    LeopardTranscript r = o.processFile(path);
Build with iOS
    let o = Leopard(accessKey: accessKey, modelPath: modelPath)
    let r = o.processFile(path)
Build with Go
    o := NewLeopard(accessKey)
    err := o.Init()
    transcript, words, err := o.ProcessFile(path)
Build with Java
    Leopard o = new Leopard.Builder()
        .setAccessKey(accessKey)
        .build();
    LeopardTranscript r = o.processFile(path);
Build with .NET
    Leopard o = Leopard.Create(accessKey);
    LeopardTranscript result = o.ProcessFile(path);
Build with Rust
    let o: Leopard = LeopardBuilder::new()
        .access_key(access_key)
        .init()
        .expect("");
    if let Ok(result) = o.process_file(path) { }
Build with Flutter
    Leopard o = await Leopard.create(accessKey, modelPath);
    LeopardTranscript result = await o.processFile(path);
Build with React Native
    const o = await Leopard.create(accessKey, modelPath)
    const { transcript, words } = await o.processFile(path)
Build with C
    pv_leopard_t *leopard = NULL;
    pv_leopard_init(access_key, model_path, enable_automatic_punctuation, &leopard);
    char *transcript = NULL;
    int32_t num_words = 0;
    pv_word_t *words = NULL;
    pv_leopard_process_file(leopard, path, &transcript, &num_words, &words);
Build with Web
    const leopard = await LeopardWorker.fromPublicDirectory(accessKey, modelPath);
    const { transcript, words } = await leopard.process(pcm);
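The snippets above return both a transcript and a list of words. As a minimal Python sketch of what can be done with that word list, the example below prints one line per word with its timing and confidence. It assumes each word exposes `word`, `start_sec`, `end_sec`, and `confidence` fields, and that `pvleopard.create` accepts an optional `model_path` for a custom-vocabulary model; check both against the current SDK documentation. The helper names `format_words` and `transcribe` are illustrative and not part of the SDK.

```python
from types import SimpleNamespace


def format_words(words):
    """Render one line per word: start time, end time, confidence, and text."""
    return [
        f"[{w.start_sec:6.2f}s - {w.end_sec:6.2f}s] ({w.confidence:.2f}) {w.word}"
        for w in words
    ]


def transcribe(access_key, audio_path, model_path=None):
    """Transcribe a file on-device and return (transcript, words)."""
    # Imported here so format_words can be used without the SDK installed.
    import pvleopard

    # Assumption: model_path may point at a custom model downloaded from the console.
    leopard = pvleopard.create(access_key=access_key, model_path=model_path)
    try:
        return leopard.process_file(audio_path)
    finally:
        leopard.delete()  # release the engine's native resources


# Stand-in data shaped like the assumed word objects, so the sketch runs offline.
sample_words = [
    SimpleNamespace(word="hello", start_sec=0.0, end_sec=0.42, confidence=0.98),
    SimpleNamespace(word="world", start_sec=0.5, end_sec=0.91, confidence=0.87),
]

for line in format_words(sample_words):
    print(line)
```

With a real access key, `transcript, words = transcribe(access_key, "audio.wav")` followed by `format_words(words)` would produce the same kind of listing for actual audio; `audio.wav` is a placeholder path.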
Deploy Leopard Speech-to-Text anywhere. Offer seamless experiences across platforms such as web browsers, mobile devices, on-prem servers, or all of them. Leopard brings state-of-the-art voice recognition to where data resides.
Transcribe more for less! Leopard is not cheaper by a few percentage points, but by a factor of 10 to 20. Transcribe all your data instead of just a portion of it because of STT costs. Check out the cost comparison of the best-known speech-to-text engines: Amazon Transcribe, Azure, Google Speech-to-Text, IBM Watson, and Leopard.
Let the data decide what’s accurate. No voice AI vendor claims “mediocre accuracy.” At least we haven’t seen it. However, accuracy depends on various factors. More importantly, if an Automatic Speech Recognition engine with 90% accuracy, i.e. a 10% Word Error Rate, misses the most important words, that accuracy doesn’t mean much. That is why we’ve published an open-source benchmark. It compares Leopard’s accuracy against Amazon Transcribe, Microsoft Azure, Google Speech-to-Text, and IBM Watson.
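To make the Word Error Rate figure concrete, here is a minimal Python sketch of the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference. It is not the benchmark's own code, and the text normalization here (lowercasing and splitting on whitespace) is a simplifying assumption. The example sentences are invented to show how a single wrong word in ten gives a 10% WER while changing the one word that matters.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


reference = "please transfer five hundred dollars to my savings account today"
hypothesis = "please transfer nine hundred dollars to my savings account today"
print(f"WER: {word_error_rate(reference, hypothesis):.0%}")  # prints "WER: 10%"
```

The transcript is 90% accurate by this measure, yet the amount is wrong, which is the kind of error an aggregate score hides.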
Do not hinder user experience due to network outages or latency. Leopard’s edge-first architecture ensures reliable processing time by cutting the connectivity dependency. Unlike cloud APIs, Leopard Speech-to-Text offers on-device voice recognition.
Do not lose control over your data! Leopard processes voice data on the device without sending it to a third-party cloud. Sending data to the cloud is risky, especially in highly regulated industries such as healthcare and financial services. Google Speech-to-Text charges 50% extra if you don’t share your data with Google. Even if the audio recordings are not stored, transmission over the Internet is still a risk factor. Data can be intercepted en route to the third-party cloud unless it’s encrypted and sent over a secure connection. The controversial privacy policies of big tech are well known. However, they are not alone. As alternative transcription providers such as Otter.ai grow, the privacy flaws of cloud-dependent transcription become more visible.
English
German
Deutsch
Spanish
Español
French
Français
Search By Voice: Add voice for truly hands-free search experiences on websites, mobile applications, and devices.
Voice Search: Add voice search to mobile applications, websites, and devices. Find keywords and phrases in audio, video, and streams.
Speech Analytics: Deliver transformative customer and employee experiences with speech analytics and intelligence tools powered by the only end-to-end Voice AI platform.
Voice Command: Add voice commands to devices, mobile or web applications to elevate user experience.
Leopard doesn’t, but Cheetah does. Cheetah is Picovoice’s on-device streaming speech-to-text engine that provides text output in real time for visual feedback. Like Leopard, Cheetah is private, fast, and cost-effective because voice data is processed locally on the device. Cheetah also runs across platforms, thanks to its compact and computationally efficient model. Although Cheetah is less accurate than Leopard, it outperforms IBM Watson and Google Speech-to-Text. Learn more about Cheetah Speech-to-Text.
Achieving state-of-the-art accuracy within the limitations of the edge is a challenging problem. Large models require more resources, which only the cloud can offer. However, relying on the cloud comes with costs: hefty bills at scale, privacy and environmental costs, and productivity losses due to network outages or delays. Picovoice aims to be the developer’s first choice for adding voice to anything. We developed local speech-to-text to give control back to you, so you can add voice to anything, on your terms, with no compromises.
Yes! By processing voice data locally on the device, Picovoice brings voice recognition technology to where the data resides instead of sending the data to where the processing happens. You can process voice data with Picovoice Speech-to-Text in a private, public, or hybrid cloud. Don’t forget to check out the tutorials on serverless speech-to-text with Leopard and AWS Lambda, and on building a microservice with Leopard and gRPC, for inspiration.
“Best” is a subjective term. Every use case has different business requirements. Also, the available resources vary from one enterprise to another. Accuracy, availability of features, time-to-market, the total cost of ownership, data privacy and governance, and reliability of the providers are some factors to consider, and they do not have the same weight for everyone. Read our guidelines to evaluate top FOSS (free and open-source) and production-grade speech-to-text solutions.
Picovoice Speech-to-Text only supports the English language for now.
Reach out to Picovoice Sales to tell us about your commercial endeavour. Don’t forget to add the use case, business requirements, and project details. The Picovoice team will respond to you.
Picovoice docs, blogs, Medium posts, and GitHub are great resources to learn about voice recognition, Picovoice engines, and how to start building transcription products. Picovoice also offers GitHub community support to all Free Tier users.
Version changes appear in the Picovoice Newsletter, LinkedIn, and Twitter. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Leopard, don’t forget to give it a star when you’re on GitHub!