Design, develop, and ship useful voice features.

The end-to-end platform for embedding private voice AI into any software in a few lines of code

Trusted by thousands of enterprises - from startups to Fortune 500s
Loved by 200,000+ developers

What is the end-to-end on-device voice AI platform?

The Picovoice end-to-end on-device voice AI platform consists of voice AI engines and models that empower enterprises to design, develop, and ship voice products without sacrificing user privacy or experience.

The platform features a complete set of modular voice AI engines delivered as cross-platform SDKs, plus a no-code platform to instantly train bespoke voice AI models that boost accuracy and efficiency.

End-to-end voice AI platform

Design

Design with no limits on top of a modular platform. Create use-case-specific voice AI models in seconds.

Develop

Develop voice features with a few lines of code using intuitive and cross-platform SDKs.

Ship

Deliver voice AI everywhere: on-device, mobile, web browsers, on-premise, or cloud.

Iterate

Measure adoption, learn, and iterate. Continuously re-design and re-train to optimize engagement.

Offerings

Leopard
Speech-to-Text
A transcription engine that automatically converts audio and video recordings into text with high accuracy without sacrificing privacy.

Why choose Picovoice?

Building accurate, responsive, and private voice technology is difficult.
We learned the hard way, so you don’t have to.

Innovative

Picovoice heavily invests in R&D to offer superior voice AI that surpasses even Big Tech in accuracy and efficiency. Picovoice researchers do not follow recent frameworks and techniques but build them.

Developer-first

Picovoice empowers developers to prototype, build, and evangelize with no strings attached. Builders can start free without a limited trial, credit card, or endless sales meetings.

Private

Picovoice returns control to enterprises as voice data never leaves the premises. Enterprises enjoy high accuracy without compromising privacy and reliability.

Don't just take our word!

Put us to the test with a Forever-Free Account

Start Free

What does Picovoice offer?

Everything to design, develop, and ship voice products: a complete set of modular voice AI engines delivered as cross-platform SDKs and a no-code platform to instantly train bespoke voice AI models to boost accuracy and efficiency.

Speech-to-Text

A transcription engine that automatically converts audio and video recordings into text with high accuracy without sacrificing privacy.
Leopard Speech-to-Text

Streaming Speech-to-Text

A real-time transcription engine that automatically converts conversations into text with zero latency.
Cheetah Streaming Speech-to-Text

Noise Suppression and Cancellation

Noise cancellation software that removes background noise from audio in real time while preserving human speech.
Koala Noise Suppression

Speaker Recognition and Identification

Speaker recognition and identification software that distinguishes individuals using their unique voice characteristics.
Eagle Speaker Recognition

Speaker Diarization

A speaker diarization engine that identifies “who spoke when” in an audio stream by finding speaker changes and grouping them.
Falcon Speaker Diarization

Speech-to-Index

A search engine that indexes speech directly without converting it into text, enabling keyword and phrase search within audio and video files.
Octopus Speech-to-Index

Wake Word Detection

A wake word detection engine that recognizes unique signals to transition software from passive to active listening.
Porcupine Wake Word

Speech-to-Intent

Natural Language Understanding engine fused with speech-to-text, allowing users to interact with applications via voice commands.
Rhino Speech-to-Intent

Voice Activity Detection

Voice activity detection (VAD) software scans audio streams to identify the presence of human speech in real time.
Cobra Voice Activity Detection

Text-to-Speech

A voice generator that converts written text into spoken audio output without network latency or jeopardizing user privacy.
Orca Text-to-Speech

The Enterprise Voice AI

Secure and Flexible Deployment

Secure and flexible deployment with embedded, mobile, web browsers, on-premise, and cloud options. Expert help to choose the best deployment and platform for unique needs.

Picovoice Consulting

Customizable Voice AI Models

Performant out-of-the-box voice AI models, proven by open-source benchmarks. Purpose-built models for specialized applications, use cases, domains, and industries.

Picovoice Consulting

Dedicated Support

Easy-to-follow docs covering 99% of questions and an active GitHub community addressing technical issues. Dedicated support through the Support Add-on and Enterprise Plans.

Support Add-on
The Edge Voice AI Platform

Private, reliable and powerful
voice products

Start Free

FAQ

Feature

The on-device voice AI platform offers everything that developers need to design, develop, and ship voice products: a complete set of modular voice AI engines delivered as cross-platform SDKs and a no-code platform to instantly train bespoke voice AI models to boost accuracy and efficiency.

We recommend Cheetah Streaming Speech-to-Text to transcribe real-time conversations such as live events, conferences, and meetings, or to enable note-taking and voice typing.

Please note that every use case is unique and the nuances may affect the performance of your product. If you’re a Picovoice customer, please reach out to your Picovoice contact to get dedicated support. If you’re a Free Plan user, you can purchase an Enterprise Support Add-on to discuss your specific use case.

We recommend Leopard Speech-to-Text to convert audio and video files such as recordings of interviews, meetings, or calls, podcasts, and voicemails into text.

We recommend Koala Noise Suppression to achieve crisp and clear conversations by removing background noise and enhancing speech.

We recommend Falcon Speaker Diarization to label who spoke when, making transcripts readable and analyzable.
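As a rough illustration of how diarization output makes a transcript readable, the sketch below merges word timestamps with speaker segments into speaker turns. The Segment and Word types and their field names are assumptions made for this example, not the Falcon or Leopard SDK types; check the SDK docs for the actual result objects.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical diarization result: one speaker from start_sec to end_sec."""
    start_sec: float
    end_sec: float
    speaker_tag: int


@dataclass
class Word:
    """Hypothetical transcribed word with its timestamps."""
    text: str
    start_sec: float
    end_sec: float


def speaker_at(segments, time_sec):
    """Return the tag of the segment containing time_sec, or None if no match."""
    for seg in segments:
        if seg.start_sec <= time_sec < seg.end_sec:
            return seg.speaker_tag
    return None


def label_transcript(words, segments):
    """Group consecutive words by speaker into (speaker_tag, text) turns."""
    turns = []
    for word in words:
        # Use the word midpoint so words straddling a boundary pick one side.
        midpoint = (word.start_sec + word.end_sec) / 2
        tag = speaker_at(segments, midpoint)
        if turns and turns[-1][0] == tag:
            turns[-1] = (tag, turns[-1][1] + " " + word.text)
        else:
            turns.append((tag, word.text))
    return turns


segments = [Segment(0.0, 1.5, 1), Segment(1.5, 3.0, 2)]
words = [Word("hello", 0.1, 0.5), Word("there", 0.6, 1.0), Word("hi", 1.6, 1.9)]
print(label_transcript(words, segments))  # [(1, 'hello there'), (2, 'hi')]
```

Assigning each word by its midpoint is one simple design choice; overlapping speech would need a more careful rule.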

We recommend Eagle Speaker Recognition to identify and verify speakers and personalize experiences simply by recognizing the user’s voice.

We recommend Orca Streaming Text-to-Speech to convert written text into spoken audio output.

We recommend Orca Streaming Text-to-Speech to convert streaming LLM text output into voice.
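A language model emits text token by token, so a voice pipeline has to decide when enough text has arrived to start speaking. The sketch below is a generic illustration that buffers tokens until a sentence boundary; the function name and the terminator set are assumptions for this example and are not part of the Orca API. An engine that accepts text incrementally may not need this buffering at all.

```python
def sentence_chunks(token_stream, terminators=".!?"):
    """Yield text chunks that end at sentence boundaries as tokens arrive."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Flush as soon as the buffered text ends with a sentence terminator.
        if buffer.rstrip().endswith(tuple(terminators)):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the stream ends mid-sentence.
    if buffer.strip():
        yield buffer.strip()


tokens = ["Hello", " there", ".", " How", " are", " you", "?", " Fine"]
for chunk in sentence_chunks(tokens):
    # A real application would hand each chunk to the synthesizer here.
    print(chunk)
```

This naive boundary check also splits on periods inside numbers and abbreviations, so production code would need a more careful rule.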

We recommend Porcupine Wake Word to detect wake words (Alexa) and always-listening commands (turn the lights on), and to monitor conversations for specific keywords (product name).

We recommend Rhino Speech-to-Intent to add custom voice commands to software (set the brightness to 60%), create voicebots and IVRs, and navigate menus (2022 Hyundai IONIQ 5 AWD).
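A speech-to-intent engine turns a spoken command into a structured result that application code can act on. The sketch below shows one way to route such a result to handlers. The Inference class, the intent names, and the slot names are all hypothetical stand-ins chosen for this example, not the Rhino SDK types.

```python
from dataclasses import dataclass, field


@dataclass
class Inference:
    """Hypothetical stand-in for a speech-to-intent result."""
    is_understood: bool
    intent: str = ""
    slots: dict = field(default_factory=dict)


def set_brightness(slots):
    return f"brightness set to {slots['level']}%"


def turn_lights(slots):
    return f"lights turned {slots['state']}"


# Map each (hypothetical) intent name to the function that handles it.
HANDLERS = {
    "setBrightness": set_brightness,
    "turnLights": turn_lights,
}


def handle(inference):
    """Route an understood inference to its handler; reject anything else."""
    if not inference.is_understood:
        return "command not understood"
    handler = HANDLERS.get(inference.intent)
    if handler is None:
        return f"no handler for intent {inference.intent!r}"
    return handler(inference.slots)


print(handle(Inference(True, "setBrightness", {"level": "60"})))
# brightness set to 60%
```

Keeping the handlers in a dictionary means a new voice command only needs a new entry, not a longer if/else chain.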

We recommend Cobra Voice Activity Detection to detect when someone starts or stops speaking and trigger action accordingly.

We recommend Cobra Voice Activity Detection to detect and clean silence in audio and video data.
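Both uses of voice activity detection, finding where speech starts and stops and removing silence, can be sketched as follows. The sketch assumes the VAD engine yields one voice probability between 0 and 1 per audio frame; the function names and the 0.5 threshold are illustrative choices for this example, not SDK values.

```python
def speech_spans(probabilities, threshold=0.5):
    """Return half-open (start_frame, end_frame) spans where speech is detected.

    Assumes one voice probability in [0, 1] per audio frame.
    """
    spans = []
    start = None
    for i, p in enumerate(probabilities):
        if p >= threshold and start is None:
            start = i  # speech begins at this frame
        elif p < threshold and start is not None:
            spans.append((start, i))  # speech ended before this frame
            start = None
    if start is not None:
        spans.append((start, len(probabilities)))  # still speaking at the end
    return spans


def drop_silence(frames, spans):
    """Keep only the frames that fall inside the detected speech spans."""
    kept = []
    for start, end in spans:
        kept.extend(frames[start:end])
    return kept


probs = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7, 0.9]
spans = speech_spans(probs)
print(spans)  # [(2, 4), (5, 7)]
print(drop_silence(["f0", "f1", "f2", "f3", "f4", "f5", "f6"], spans))
# ['f2', 'f3', 'f5', 'f6']
```

A real application would usually add some hysteresis, for example requiring several consecutive frames below the threshold before closing a span, so brief pauses do not split an utterance.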

We recommend Octopus Speech-to-Index to make audio and video libraries searchable by keyword, including proper nouns and slang, even without knowing the exact spelling.

We recommend Picovoice Voice Recorders to record and process audio, creating the audio streams used by Picovoice voice AI engines.

You can train voice AI models on the Picovoice Console. Picovoice Console is a no-code platform with a web-based type-and-train interface. You can create a Picovoice Console account immediately and start building without engaging with the Picovoice team.

Before signing up for the Console, you can watch our tutorials to learn how to train custom voice AI models:

Usage

  • Desktop & Server: Linux, Windows & macOS
  • Mobile: Android & iOS
  • Web Browsers: Chrome, Safari, Edge and Firefox
  • Single Board Computers: Raspberry Pi
  • Cloud Providers: AWS, Azure, Google, IBM, Oracle, and others.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) in the cloud.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) on-prem.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) in serverless environments.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) on mobile devices.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) within web browsers.

Yes, you can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, Speech-to-Index) on embedded devices.

No, Picovoice voice AI engines do not require a GPU.

The Picovoice on-device voice AI platform provides SDKs for Android, Angular, C, .NET, Flutter, Go, iOS, Java, Node.js, Python, React, React Native, Rust, Unity, Vue, and Web. If you need another SDK, you can check our open-source SDKs and build it yourself or contact Picovoice Consulting. Picovoice Consulting experts can create a public or private library for the SDK of your choice and maintain it.

Picovoice engines require an AccessKey, and therefore internet connectivity, so that Picovoice can offer its services according to your plan limits. The engines call home to validate the AccessKey and check your plan limits.

Picovoice tracks usage by the amount of data processed (in hours or characters) or by the number of users, depending on the engine and project setup.

Your usage automatically resets every 30 days.

Picovoice tracks the usage accumulated in the last 30 days. You can see the real-time consumption on your Picovoice Console Profile.
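One way to read "usage accumulated in the last 30 days" is as a trailing window over timestamped usage events. The sketch below illustrates that reading only; the event format and the function name are assumptions for this example, and it is not Picovoice's metering logic.

```python
from datetime import datetime, timedelta


def rolling_usage(events, now, window=timedelta(days=30)):
    """Sum usage amounts whose timestamp falls inside the trailing window.

    events is a list of (timestamp, amount) pairs, e.g. hours of audio.
    """
    cutoff = now - window
    return sum(amount for timestamp, amount in events if cutoff < timestamp <= now)


events = [
    (datetime(2024, 1, 1), 2.0),   # older than 30 days on Feb 15, so excluded
    (datetime(2024, 1, 20), 1.5),
    (datetime(2024, 2, 10), 0.5),
]
print(rolling_usage(events, now=datetime(2024, 2, 15)))  # 2.0
```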

Downloaded models are counted toward your monthly model allowance. Once you hit download, your training usage will increase.

Picovoice tracks the training usage accumulated in the last 30 days. You can see the total number of models you trained in the last 30 days on your Picovoice Console Profile.

Yes, you can use Picovoice Voice AI engines for research, non-commercial, and commercial purposes as long as you are within your plan limits and compliant with the Picovoice Terms of Use.

Technical Questions

Picovoice voice AI SDKs, voice recorders, and benchmarks are open-source and free to use.

Picovoice researchers continuously improve techniques and frameworks used to train algorithms. Picovoice applies transfer learning, hardware-aware training, and neural compression principles, resulting in efficient models competing with cloud-dependent AI models.

It depends on your tech stack and design. Given the number of engines Picovoice offers and the platforms it supports, it’s hard to communicate one number. We encourage developers to do their own tests and evaluations in their real environments.

Picovoice currently supports seventeen languages: English, Arabic, Dutch, Farsi, French, German, Hindi, Italian, Japanese, Korean, Mandarin, Polish, Portuguese, Russian, Spanish, Swedish, and Vietnamese. Please check the product page if you’re looking for engine-specific information. If you have an opportunity requiring another language, engage with Picovoice Consulting to get a custom model trained for you!

Yes, Picovoice technology works well across accents and dialects. The best way to learn about it is to test Picovoice engines with your dataset. Picovoice offers a Free Plan that allows enterprises to evaluate and become familiar with the technology, as well as a Developer Plan to run thorough tests before committing to an Enterprise Plan.

Picovoice aims to provide realistic benchmarks by leveraging various accents and noise. Yet, we encourage developers to test the engines in their real-world environments.

Picovoice engines expect audio with a 16kHz sampling rate. PSTN networks usually sample at 8kHz. It is possible to upsample but the frequency content above 4kHz is gone, and performance will be suboptimal. It is possible to train acoustic models for telephony applications for enterprise customers. Engage with Picovoice Consulting to find the best solution that works for you.

Picovoice software expects a 16kHz sampling rate. You will need to downsample. Typically, operating systems or sound cards (audio codecs) provide such functionality; otherwise, you will need to implement it.
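If downsampling has to be implemented by hand, the simplest case is an integer ratio such as 48kHz to 16kHz. The sketch below is a minimal pure-Python illustration that averages each block of samples before keeping one value per block; the averaging acts only as a crude low-pass filter. The function name is an assumption for this example. Production code would normally use a proper resampler with a real anti-aliasing filter (for instance scipy.signal.resample_poly), which also handles non-integer ratios such as 44.1kHz to 16kHz.

```python
def downsample(samples, src_rate, dst_rate=16000):
    """Downsample by an integer factor using block averaging.

    Averaging each block is a crude low-pass filter applied before decimation.
    """
    if src_rate % dst_rate != 0:
        raise ValueError("src_rate must be an integer multiple of dst_rate")
    factor = src_rate // dst_rate
    # Drop the trailing samples that do not fill a whole block.
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# 48 kHz -> 16 kHz keeps one averaged value for every 3 input samples.
print(downsample([0, 3, 6, 9, 12, 15], src_rate=48000))  # [3.0, 12.0]
```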

Picovoice software expects a 16kHz sampling rate for two reasons. First, 16kHz strikes a balance between quality and file size, which is why it is widely used in voice command and speech recognition technologies: audio files are small enough to store and transmit while offering reasonable audio quality. Second, the human voice's most critical frequencies lie between 300Hz and 3400Hz. The Nyquist-Shannon sampling theorem states that a sampling rate of at least twice the highest frequency is required for accurate signal representation. 16kHz is more than twice 3400Hz and therefore sufficient for processing the human voice. That’s why 16kHz has become a standard in applications using human speech and voice.

Several factors affect the performance of voice AI engines: the quality of the audio data, the environment (noise, echo, reverberation), the tech stack, and the design.
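The arithmetic behind the sampling-rate argument is short enough to check directly. The helper names below are made up for this example.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def min_sample_rate_hz(max_frequency_hz):
    """Lowest sampling rate needed to represent a given frequency."""
    return 2 * max_frequency_hz


print(min_sample_rate_hz(3400))  # 6800: well below 16000
print(nyquist_limit_hz(16000))   # 8000.0: 16 kHz audio covers up to 8 kHz
print(nyquist_limit_hz(8000))    # 4000.0: 8 kHz telephony audio stops at 4 kHz
```

The last line is why upsampled telephony audio cannot recover content above 4kHz: it was never captured in the first place.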

Custom Models & Support

You can leverage the self-service Picovoice Console to fine-tune voice AI models or engage with Picovoice Consulting for further improvement.

See how to fine-tune models on the Picovoice Console:

Custom speech recognition models are created for specific tasks, specific use cases, and sometimes for specific environments. General-purpose models are jacks-of-all-trades and masters-of-none. For example, if you need a medical dictation app, you need a fine-tuned speech-to-text to be able to capture the jargon correctly. If you’re building a sales enablement app, just like you train your salesforce to learn about your product names, you should adapt the general speech recognition model accordingly.

You can engage with Picovoice Consulting to discuss the opportunity.

Picovoice voice AI engines support the most popular and widely-used hardware and software out-of-the-box - from web, mobile, desktop, and on-prem to private cloud. However, there may be certain chipsets we do not currently support. (There are so many of them, yet only so much time and money, making it impossible to support everything.) You can engage with Picovoice Consulting and get any Picovoice voice AI engine ported to the platform of your choice.

Picovoice voice AI engines support the most popular and widely used SDKs. If you need another SDK, you can check our open-source SDKs and build it yourself or contact Picovoice Consulting. Picovoice Consulting experts can create a public or private library for the SDK of your choice and maintain it.

Picovoice currently supports seventeen languages: English, Arabic, Dutch, Farsi, French, German, Hindi, Italian, Japanese, Korean, Mandarin, Polish, Portuguese, Russian, Spanish, Swedish, and Vietnamese. Please check the product page if you’re looking for engine-specific information. If you have an opportunity requiring another language, engage with Picovoice Consulting to get a custom model trained for you!

Picovoice engines have lexicons of hundreds of thousands of words. However, there might be some special words we missed. You can add a custom word to Leopard Speech-to-Text and Cheetah Streaming Speech-to-Text on the self-service Picovoice Console. To add a new word to the Porcupine Wake Word and Rhino Speech-to-Intent lexicon, reach out to your Picovoice contact if you’re a Picovoice customer. If you’re not a customer, why don’t you become one?

You can create a GitHub issue under the relevant repository/demo.

Enterprises face several challenges while building PoCs. Finding talent experienced in machine learning is one of the biggest challenges to start with. We learned this the hard way, and experience it every day. On top of that, executives and clients may have unrealistic deadlines.

Experts at Picovoice Consulting help enterprises build PoCs, develop their AI strategy, and work with them hand-in-hand offering the guidance they need.

Data Security & Privacy

Picovoice voice AI engines process data in your environment, whether it’s public or private cloud, on-prem, web, mobile, desktop, or embedded.

Picovoice is private by design and has no access to user data. Thus, Picovoice doesn’t retain user data, as it never tracks or stores it in the first place.

Yes. Enterprises using Picovoice don’t need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice voice AI platform intrinsically HIPAA-compliant.

Yes. Enterprises using Picovoice don’t need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice voice AI platform intrinsically GDPR-compliant.

Yes. Enterprises using Picovoice don’t need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice voice AI platform intrinsically CCPA-compliant.

Building with Picovoice

Yes, you can use voice AI with local LLMs and create private, accurate, and reliable AI agents.

The answer is “it depends”. Voice AI is complex technology and building products for production requires diligent work. It depends on your use case, other tools, and the tech stack used, along with hardware and software choice. Given the variables, it can be challenging. You can experiment with different scenarios using Picovoice’s free resources or engage with experts from Picovoice Consulting to find the best approach to deploying language models for production.

Yes! Picovoice engines are modular and work with other Picovoice products or competing products. Check the Picovoice Blog or GitHub to find more information, tutorials, and demos.