Accelerating the adoption of voice AI through innovation

Picovoice brings control back to enterprises with accurate, private, and fast voice AI technology that runs on-device, on mobile, in web browsers, on-premises, and in the cloud.

Story

In 2018, the approach to voice AI was to gather data and train a model for each project. Any requirement changes would reset the process. Model sizes correlated with accuracy - the larger, the better. Only a handful of companies could afford to build or buy it.

The launch of Porcupine Wake Word challenged these industry dogmas - and this was only the beginning!

Porcupine Wake Word was more accurate and efficient than alternatives. Training a custom wake word would take hours, not months. It was groundbreaking. Today, it’s even better and Picovoice keeps innovating:

The only full-stack voice AI platform with modular, accurate, and private voice AI products.
The only Free Plan with all deployment options, products, and SDKs, enabling PoCs in seconds.
A driver of the industry with innovative products and open-source benchmarks and datasets.

Free Plan

Picovoice’s innovative and distinctive Free Plan accelerates the adoption of voice AI:

The only free plan that allows on-device and on-prem deployment.
The only free plan that provides access to accurate, private, and cross-platform voice AI models.
Enables “real” PoCs for production and commercial purposes.

Investing in the Free Plan was risky, given the limited resources of a startup and the technical challenges involved. Yet long sales cycles had to be removed to let anyone start building in seconds and to accelerate the adoption of voice AI.

Today, thousands of developers can access private, accurate, and cross-platform voice AI models through Picovoice, while other players still work with “selected partners only.”

Offerings

Each Picovoice offering has a unique advantage, creating new opportunities for enterprises to bring their vision to life.

Speech-to-Text

The only easy-to-customize and cross-platform on-device speech-to-text engines with cloud accuracy. Brings control back to enterprises.

Leopard Speech-to-Text

Cheetah Streaming Speech-to-Text

Noise Suppression & Cancellation

The only freely-available, real-time, and cross-platform high-quality noise suppression engine.

Koala Noise Suppression

Speaker Recognition

The only language-agnostic, text-independent, cross-platform commercial engine that is ready in seconds.

Eagle Speaker Recognition

Speaker Diarization

The only modular and cross-platform Speaker Diarization software that works with any Speech-to-Text engine.

Falcon Speaker Diarization

Text-to-Speech

Cross-platform voice generator that enables human-like interactions without network latency.

Orca Text-to-Speech

Wake Word Detection

The #1 wake word detection repository on GitHub for years with nothing comparable since its launch.

Porcupine Wake Word

Speech-to-Intent

The best Speech-to-Text alternative for use-case-specific voice commands.

Rhino Speech-to-Intent

Voice Activity Detection

The only enterprise-grade and cross-platform voice activity detection engine.

Cobra Voice Activity Detection

Speech-to-Index

The best and only freely available Speech-to-Text alternative for search.

Octopus Speech-to-Index

Self-Service Console

The first no-code self-service interface to design and train voice AI models in seconds.

Picovoice Console

Professional Services

Inventors of voice AI helping enterprises disrupt their industry with breakthrough Picovoice technology.

Picovoice Consulting

No-code platform

The first no-code platform for building voice interfaces on microcontrollers.

Picovoice Shepherd

Tools

Voice Recorders

Picovoice Voice Recorders eliminate one of the biggest problems in voice AI: audio processing.

Voice AI engines receive audio streams and process them to generate the desired output. Voice AI vendors focus on processing the audio streams, while creating those streams is a challenge left to developers. Finding a solution for real-time audio processing, in particular, blocks many developers.

We initially built voice recorders for Picovoice engines to simplify the development process. Acknowledging the challenges, we created separate libraries, enabling developers to use them freely.

Today, developers who start building voice products with Picovoice without any experience with other vendors do not even realize the challenge because:

Picovoice Voice Recorders allow developers to choose the frame size.
Picovoice Voice Recorders support real-time audio processing.
Picovoice Voice Recorders are cross-platform.
Picovoice Voice Recorders are simple and ready to use with a few lines of code.
The Picovoice team maintains Picovoice Voice Recorders, giving enterprise developers peace of mind.
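
To make the idea of a frame size concrete, here is a minimal sketch in plain Python of how a stream of 16-bit PCM samples can be cut into fixed-length frames and handed to a per-frame computation. The function names `split_into_frames` and `frame_rms` are hypothetical and are not part of any Picovoice library, and the frame length of 512 is only an illustrative choice.

```python
import math
from typing import Iterator, List, Sequence


def split_into_frames(samples: Sequence[int], frame_length: int) -> Iterator[List[int]]:
    """Yield consecutive fixed-length frames; a trailing partial frame is dropped."""
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        yield list(samples[start:start + frame_length])


def frame_rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


if __name__ == "__main__":
    # Synthetic input (an assumption for illustration): 0.1 s of a 440 Hz tone at 16 kHz.
    sample_rate = 16000
    samples = [
        int(10000 * math.sin(2 * math.pi * 440 * n / sample_rate))
        for n in range(1600)
    ]
    for index, frame in enumerate(split_into_frames(samples, frame_length=512)):
        print(index, len(frame), round(frame_rms(frame)))
```

A smaller frame length lowers the delay before each frame can be processed, while a larger one reduces per-frame overhead, which is why being able to choose it matters for real-time use.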

Although Picovoice Voice Recorders shine when used with our engines and for speech processing, we have seen them used for other purposes, including with competitor products.

Recording Audio on PC, Server and Embedded

Cross-platform Voice Recorder for Python

Finding the right package and overcoming platform and version compatibility issues remain big challenges for audio processing in Python, even with popular packages such as PyAudio and python-sounddevice.

Cross-platform Voice Recorder for Python is a better alternative to PyAudio and python-sounddevice for voice processing and works across Linux, macOS, and Windows.

Voice Recording and Audio Processing in Python
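
As a rough illustration of what a frame-based capture loop looks like in Python, the sketch below reads a fixed number of frames from a recorder object and passes each one to a callback. The `FrameRecorder` interface, the `capture` function, and the `SilentRecorder` stand-in are all hypothetical names invented for this example. They only assume that a recorder offers `start`, `read`, and `stop` methods, and they do not describe the actual API of any particular library.

```python
from typing import Callable, List, Protocol


class FrameRecorder(Protocol):
    """Hypothetical minimal recorder interface assumed for this sketch."""

    def start(self) -> None: ...

    def read(self) -> List[int]: ...

    def stop(self) -> None: ...


def capture(
    recorder: FrameRecorder,
    num_frames: int,
    on_frame: Callable[[List[int]], None],
) -> None:
    """Read num_frames frames and pass each to on_frame, always stopping the recorder."""
    recorder.start()
    try:
        for _ in range(num_frames):
            on_frame(recorder.read())
    finally:
        # Stop the recorder even if the callback raises.
        recorder.stop()


class SilentRecorder:
    """Stand-in that returns frames of silence so the sketch runs without a microphone."""

    def __init__(self, frame_length: int) -> None:
        self.frame_length = frame_length
        self.running = False

    def start(self) -> None:
        self.running = True

    def read(self) -> List[int]:
        return [0] * self.frame_length

    def stop(self) -> None:
        self.running = False


if __name__ == "__main__":
    frames: List[List[int]] = []
    capture(SilentRecorder(frame_length=512), num_frames=3, on_frame=frames.append)
    print(len(frames), len(frames[0]))
```

With a real microphone, an object from a recorder library would take the place of `SilentRecorder`, provided it offers equivalent methods.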

Cross-platform Voice Recorder for C

Available tools to record audio in C, such as PortAudio, show their age and come with compatibility, documentation, support, or maintenance challenges.

Cross-platform Voice Recorder for C is a practical alternative to PortAudio that runs across Linux, macOS, and Windows.

Voice Recording and Audio Processing in C

Cross-platform Voice Recorder for .NET

.NET is popular for enterprise applications. However, popular .NET audio recorder tools, such as NAudio and CSCore, are developed by individuals. Maintenance, support, documentation, and compatibility issues (especially on Windows) increase the cost of enterprise projects.

Cross-platform Voice Recorder for .NET is an alternative to NAudio and CSCore, giving developers a head start and peace of mind while building cross-platform voice products on Linux, macOS, and Windows in .NET.

Cross-platform Voice Recorder and Audio Processor in .NET

Cross-platform Voice Recorder for Node.js

Node-record-lpcm16 and node-sox are the most popular Node.js audio recording libraries. However, they do not offer cross-platform support. Some developers create workarounds with Web Audio API Node.js bindings.

Cross-platform Voice Recorder for Node.js is a superior alternative to node-record-lpcm16, node-sox, and Web Audio API for cross-platform and reliable experiences.

Voice Recording and Audio Processing in Node.js

Cross-platform Voice Recorder for Rust

Audio is a big problem in Rust. Compatibility, latency, buffer size, and support issues restrict developers. Even the best-known library, cpal, has limitations.

Cross-platform Voice Recorder for Rust is a superior alternative to cpal to record and process audio in Rust.

Voice Recording and Audio Processing in Rust

Cross-platform Voice Recorder for Golang

Popular Go audio libraries, such as go-audio and portaudio, can be complex and overwhelming when trying to achieve unified cross-platform experiences.

Cross-platform Voice Recorder for Go is a simple and better alternative to go-audio and portaudio to build voice products running across Linux, macOS, and Windows in Go.

Voice Recording and Audio Processing in Golang

Recording Audio in the Web Browser

Cross-platform Voice Recorder for Web

Web Audio API is a powerful tool to record and process audio. However, cross-browser compatibility, latency, sample rate limitations, and overall complexity restrain what developers can do on the web.

Cross-platform Voice Recorder for Web is a simple, flexible, and cross-browser compatible alternative to Web Audio API.

Recording Audio on Mobile

Voice Recorder for Android

MediaRecorder and AudioRecord are built-in Android recorders. MediaRecorder is easy to use but does not provide control over advanced parameters, such as sample rate or real-time manipulation. AudioRecord gives control but is more difficult to use than MediaRecorder. The variety of Android devices with different audio capabilities makes audio recording on Android even more difficult.

Voice Recorder for Android is a viable alternative to Android AudioRecord and MediaRecorder. It’s easy to use and manages recordings on a different thread, offering improved efficiency in seconds.

Cross-platform Voice Recorder for Flutter

Achieving unified cross-platform experiences on mobile is itself challenging. While Flutter makes it easier, audio is still a problem. Licensing, support, update, and platform compatibility limitations of flutter_audio_recorder or flutter_sound make them unappealing for enterprise applications.

Cross-platform Voice Recorder for Flutter is a simple alternative to flutter_audio_recorder and flutter_sound.

Cross-platform Voice Recorder for React-Native

React-native-audio-recorder-player, react-native-audio-toolkit, and react-native-audio offer audio recording and playback options. However, they come with implementation complexity and limited maintenance, support, documentation, and updates, hindering the adoption of voice AI.

Cross-platform Voice Recorder for React-Native is an alternative to open-source React-Native audio toolkits.

Voice Recorder for iOS (Swift)

AVAudioRecorder is Apple’s native but complex audio processing tool. Getting started using the right settings or getting just the raw audio data from AVAudioRecorder can be challenging.

Voice Recorder for iOS is a simple and efficient alternative to the iOS native AVAudioRecorder. Voice Recorder for iOS runs on a different thread, resulting in faster and better iOS apps.

Recording Audio in Gaming and Spatial Computing

Voice Recorder for Unity

Voice is integral to voice chats in games and voice command and control in spatial computing. Unity’s built-in Unity Microphone is complex to use, hindering its adoption.

Unity Voice Recorder is a viable and simple alternative to Unity Microphone.

Compare

Voice AI is a complex and expensive technology that advances fast. Vendor claims such as “the best,” “revolutionary,” and “the most accurate” do not help enterprises make informed decisions.

Speech-to-Text Comparison

Open-source speech-to-text benchmark is a scalable framework to compare Amazon Transcribe, Azure Speech-to-Text, Google Speech-to-Text, IBM Watson Speech-to-Text, and Picovoice Leopard Speech-to-Text.

Amazon Transcribe

Azure Speech-to-Text

Google Speech-to-Text

IBM Watson Speech-to-Text

Leopard Speech-to-Text

Noise Suppression Comparison

Open-source speech enhancement and noise suppression comparison brings a scientific, transparent, and objective framework to compare noise cancellation solutions.

Mozilla RNNoise Noise Suppression

Koala Noise Suppression

Speaker Diarization Comparison

Open-source speaker diarization comparison evaluates the speaker diarization capabilities of Amazon Transcribe, Azure Speech-to-Text, Google Speech-to-Text, and IBM Watson Speech-to-Text against Falcon Speaker Diarization and pyannote Speaker Diarization.

Amazon Transcribe Speaker Diarization

Azure Speech-to-Text Speaker Diarization

Google Speech-to-Text Speaker Diarization

pyannote Speaker Diarization

Falcon Speaker Diarization

Speaker Recognition Comparison

Open-source speaker recognition comparison enables data-driven decision making while choosing the best speaker verification and identification SDK.

pyannote Speaker Recognition

SpeechBrain Speaker Recognition

WeSpeaker Speaker Recognition

Eagle Speaker Recognition

Keyword & Phrase Search Comparison

Open-source keyword and phrase search comparison helps enterprises evaluate Speech-to-Text and Speech-to-Text Alternatives for Search, enabling informed decisions.

Google Speech-to-Text

Mozilla DeepSpeech

Octopus Speech-to-Index

Wake Word Comparison

Open-source wake word benchmark evaluates the performance of freely available wake word detection engines. Enterprises can add other alternatives to the comparison framework.

PocketSphinx Wake Word

Snowboy Wake Word

Porcupine Wake Word

Natural Language Understanding Comparison

Open-source natural language understanding benchmark is a scalable framework to compare the voice command acceptance performance of Amazon Lex, Google Dialogflow, IBM Watson, Microsoft LUIS, and Picovoice Rhino Speech-to-Intent.

Rhino Speech-to-Intent

Voice Activity Detection Comparison

Open-source voice activity detection benchmark compares WebRTC Voice Activity Detection by Google and Cobra Voice Activity Detection by Picovoice and lets enterprises see what works for them.

WebRTC Voice Activity Detection (VAD)

Cobra Voice Activity Detection (VAD)

Build with Open-source Voice AI Comparisons

Open-source voice AI comparisons were built internally to ensure we always offer the best-in-class voice AI products. We later open-sourced them as a part of our commitment to:

providing the best technology to our customers
fact-based marketing
bringing transparency to the industry

Our comparison tools have also been adopted by researchers.

We’re aware that our frameworks do not include every vendor, including some well-known players. To offer reproducible and transparent comparisons, we only work with available APIs and SDKs and respect our competitors’ Terms of Use.