Accelerating the adoption of voice AI through innovation

Bring control back to enterprises with accurate, private, and fast voice AI technology that runs on-device, on mobile, in web browsers, on-premises, and in the cloud.

Story

In 2018, the approach to voice AI was to gather data and train a model for each project. Any requirement changes would reset the process. Model sizes correlated with accuracy - the larger, the better. Only a handful of companies could afford to build or buy it.

The launch of Porcupine Wake Word challenged these industry dogmas - and this was only the beginning!

Porcupine Wake Word was more accurate and efficient than alternatives. Training a custom wake word would take hours, not months. It was groundbreaking. Today, it's even better and Picovoice keeps innovating:

Only full-stack voice AI platform with modular, accurate, and private voice AI products.
Only local LLM platform, empowering enterprises to deploy language models on any device.
Driver of the industry with innovative products, open-source benchmarks and datasets.

Developers building with Picovoice

Fortune 50 Enterprises using Picovoice

Time required to customize voice AI models

Offerings

On-device Voice AI Offerings

Each Picovoice offering has a unique advantage, creating new opportunities for enterprises to bring their vision to life.

Streaming Speech-to-Text

The only state-of-the-art streaming speech-to-text engine with cloud-level accuracy, eliminating network latency for "real" real-time transcription.

Cheetah Streaming Speech-to-Text

Speech-to-Text

The only easy-to-customize, cross-platform on-device speech-to-text engine with cloud-level accuracy. It brings control back to enterprises.

Leopard Speech-to-Text

Streaming Text-to-Speech

The only cross-platform, production-ready on-device streaming text-to-speech engine, with support for dual streaming.

Orca Streaming Text-to-Speech

Noise Suppression & Cancellation

The only production-ready, real-time, and cross-platform high-quality noise suppression engine.

Koala Noise Suppression

Speaker Recognition

The only language-agnostic, text-independent, cross-platform commercial engine that is ready in seconds.

Eagle Speaker Recognition

Speaker Diarization

The only modular and cross-platform Speaker Diarization software that works with any Speech-to-Text engine.

Falcon Speaker Diarization

Wake Word Detection

The #1 wake word detection repository on GitHub for years with nothing comparable since its launch.

Porcupine Wake Word

Speech-to-Intent

The best alternative to Speech-to-Text for use-case-specific voice commands.

Rhino Speech-to-Intent

Voice Activity Detection

The only enterprise-grade and cross-platform voice activity detection engine.

Cobra Voice Activity Detection

Local LLM Offerings

The world's only end-to-end local LLM platform empowering enterprises to deploy language models on any device.

LLM Inference

Production-ready, enterprise-grade LLM Inference Engine that runs across Linux, Windows, macOS, Android, iOS, Chrome, Safari, Edge, Firefox, and embedded systems.

picoLLM Inference

LLM Compression

Compression algorithm with unmatched accuracy, reducing runtime and storage requirements of any LLM.

picoLLM Compression

Services

Professional Services

Inventors of voice AI helping enterprises disrupt their industry with breakthrough Picovoice technology.

Picovoice Consulting

Enterprise Support

Access Picovoice's technical expertise to get enterprise-level support even before becoming a customer.

Enterprise Support

Tools

Self-Service Console

The first no-code self-service interface to design and train voice AI models in seconds.

Picovoice Console

Voice Recorders

Picovoice Voice Recorders eliminate one of the biggest problems in voice AI: audio processing.

Voice AI engines receive audio streams and process them to generate the desired output. Voice AI vendors focus on processing the audio streams. Creating the audio streams is a challenge left to developers. Finding a solution for real-time audio processing, in particular, blocks many developers.

We initially built voice recorders for Picovoice engines to simplify the development process. Acknowledging the challenges, we created separate libraries, enabling developers to use them freely.

Today, developers who start building voice products with Picovoice without any experience with other vendors do not even realize the challenge because:

Picovoice Voice Recorders allow developers to choose the frame size.
Picovoice Voice Recorders support real-time audio processing.
Picovoice Voice Recorders are cross-platform.
Picovoice Voice Recorders are simple and ready to use with a few lines of code.
The Picovoice team maintains Picovoice Voice Recorders, giving enterprise developers peace of mind.
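
To make the frame size point above concrete, here is a minimal sketch of frame-based audio handling in plain Python. It does not use the Picovoice Voice Recorder API: the helper names (`frames_from_pcm`, `frame_rms`, `synth_tone`) are hypothetical, and the audio is a synthesized tone standing in for microphone input. The assumptions are 16-bit little-endian mono PCM and that a trailing partial frame is dropped.

```python
import math
import struct
from typing import Iterator, List


def frames_from_pcm(pcm_bytes: bytes, frame_length: int) -> Iterator[List[int]]:
    """Yield fixed-size frames of 16-bit little-endian mono samples.

    Assumption: a trailing partial frame is dropped, so every frame
    has exactly `frame_length` samples.
    """
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    sample_count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % sample_count, pcm_bytes[: sample_count * 2])
    for start in range(0, sample_count - frame_length + 1, frame_length):
        yield list(samples[start : start + frame_length])


def frame_rms(frame: List[int]) -> float:
    """Root-mean-square amplitude of one frame, a simple per-frame measure."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def synth_tone(num_samples: int, sample_rate: int = 16000,
               freq: float = 440.0, amplitude: int = 10000) -> bytes:
    """Hypothetical stand-in for recorded audio: a sine tone as raw PCM bytes."""
    samples = [
        int(amplitude * math.sin(2.0 * math.pi * freq * n / sample_rate))
        for n in range(num_samples)
    ]
    return struct.pack("<%dh" % num_samples, *samples)


if __name__ == "__main__":
    pcm = synth_tone(1600)  # 0.1 s of audio at 16 kHz
    for index, frame in enumerate(frames_from_pcm(pcm, frame_length=512)):
        print(index, len(frame), round(frame_rms(frame), 1))
```

With a frame length of 512, the 1600 synthesized samples yield three full frames and the remaining 64 samples are discarded. A real recorder would deliver frames continuously from an input device, but the consuming loop keeps the same shape: read one frame, process it, repeat.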

Although Picovoice Voice Recorders shine when used with our engines and for speech processing, we have seen them used for other purposes, including with competitor products.

Recording Audio on PC, Server and Embedded

Cross-platform Voice Recorder for Python

Finding the right package or overcoming platform and version compatibility issues remains a big challenge for audio processing in Python, even with popular packages such as PyAudio and python-sounddevice.

Cross-platform Voice Recorder for Python is a better alternative to PyAudio and python-sounddevice for voice processing and works across Linux, macOS, and Windows.

Voice Recording and Audio Processing in Python

Cross-platform Voice Recorder for C

Available tools to record audio in C, such as PortAudio, show their age and come with compatibility, documentation, support, or maintenance challenges.

Cross-platform Voice Recorder for C is a practical alternative to PortAudio that runs across Linux, macOS, and Windows.

Voice Recording and Audio Processing in C

Cross-platform Voice Recorder for .NET

.NET is popular for enterprise applications. However, popular .NET audio recorder tools, such as NAudio and CSCore, are developed by individuals. Maintenance, support, documentation, and compatibility issues (especially on Windows) increase the cost of enterprise projects.

Cross-platform Voice Recorder for .NET is an alternative to NAudio and CSCore, giving developers a head start and peace of mind while building cross-platform voice products on Linux, macOS, and Windows in .NET.

Cross-platform Voice Recorder and Audio Processor in .NET

Cross-platform Voice Recorder for Node.js

Node-record-lpcm16 and node-sox are the most popular Node.js audio recording libraries. However, they do not offer cross-platform support. Some developers create workarounds with Web Audio API Node.js bindings.

Cross-platform Voice Recorder for Node.js is a superior alternative to node-record-lpcm16, node-sox, and Web Audio API for cross-platform and reliable experiences.

Voice Recording and Audio Processing in Node.js

Cross-platform Voice Recorder for Rust

Audio is a big problem in Rust. Compatibility, latency, buffer size, and support issues restrict developers. Even the best-known library, cpal, has limitations.

Cross-platform Voice Recorder for Rust is a superior alternative to cpal to record and process audio in Rust.

Voice Recording and Audio Processing in Rust

Recording Audio in the Web Browser

Cross-platform Voice Recorder for Web

Web Audio API is a powerful tool to record and process audio. However, cross-browser compatibility, latency, sample rate limitations, and overall complexity restrain what developers can do on the web.

Cross-platform Voice Recorder for Web is a simple, flexible, and cross-browser compatible alternative to Web Audio API.

Recording Audio on Mobile

Voice Recorder for Android

MediaRecorder and AudioRecord are built-in Android recorders. MediaRecorder is easy to use but does not provide control over advanced parameters, such as sample rate or real-time manipulation. AudioRecord gives control but is more difficult to use than MediaRecorder. The variety of Android devices with different audio capabilities makes audio recording on Android even more difficult.

Voice Recorder for Android is a viable alternative to Android AudioRecord and MediaRecorder. It's easy to use and manages recordings on a different thread, offering improved efficiency in seconds.

Cross-platform Voice Recorder for Flutter

Achieving unified cross-platform experiences on mobile is itself challenging. While Flutter makes it easier, audio is still a problem. Licensing, support, update, and platform compatibility limitations of flutter_audio_recorder or flutter_sound make them unappealing for enterprise applications.

Cross-platform Voice Recorder for Flutter is a simple alternative to flutter_audio_recorder and flutter_sound.

Cross-platform Voice Recorder for React-Native

React-native-audio-recorder-player, react-native-audio-toolkit, and react-native-audio offer audio recording and playback options. However, they come with limited maintenance, support, documentation, and updates, as well as implementation complexity, hindering the adoption of voice AI.

Cross-platform Voice Recorder for React-Native is an alternative to open-source React-Native audio toolkits.

Voice Recorder for iOS (Swift)

AVAudioRecorder is Apple's native but complex audio processing tool. Getting started using the right settings or getting just the raw audio data from AVAudioRecorder can be challenging.

Voice Recorder for iOS is a simple and efficient alternative to the iOS native AVAudioRecorder. Voice Recorder for iOS runs on a different thread, resulting in faster and better iOS apps.

Recording Audio in Gaming and Spatial Computing

Voice Recorder for Unity

Voice is integral to voice chats in games and voice command and control in spatial computing. Unity's built-in Unity Microphone is complex to use, hindering its adoption.

Unity Voice Recorder is a viable and simple alternative to Unity Microphone.

Compare

Voice AI is a complex and expensive technology that advances fast. Vendor claims of "the best," "revolutionary," and "the most accurate" do not help enterprises make informed decisions.

Text-to-Speech Latency Comparison

Open-source text-to-speech latency benchmark compares the response times of different voice generators when used in LLM-based voice assistants.

Picovoice Orca Text-to-Speech

Speech-to-Text Comparison

Open-source speech-to-text benchmark is a scalable framework to compare Amazon Transcribe, Azure Speech-to-Text, Google Speech-to-Text, IBM Watson Speech-to-Text, and Picovoice Leopard Speech-to-Text.

Amazon Transcribe

Azure Speech-to-Text

Google Speech-to-Text

IBM Watson Speech-to-Text

OpenAI Whisper Speech-to-Text

Cheetah Streaming Speech-to-Text

Leopard Speech-to-Text

Noise Suppression Comparison

Open-source speech enhancement and noise suppression comparison brings a scientific, transparent, and objective framework to compare noise cancellation solutions.

Mozilla RNNoise Noise Suppression

Koala Noise Suppression

Speaker Diarization Comparison

Open-source speaker diarization comparison compares speaker diarization capabilities of Amazon Transcribe, Azure Speech-to-Text, Google Speech-to-Text, IBM Watson Speech-to-Text with Falcon Speaker Diarization and Pyannote Speaker Diarization.

Amazon Transcribe Speaker Diarization

Azure Speech-to-Text Speaker Diarization

Google Speech-to-Text Speaker Diarization

pyannote Speaker Diarization

Falcon Speaker Diarization

Speaker Recognition Comparison

Open-source speaker recognition comparison enables data-driven decision making while choosing the best speaker verification and identification SDK.

pyannote Speaker Recognition

SpeechBrain Speaker Recognition

WeSpeaker Speaker Recognition

Eagle Speaker Recognition

Wake Word Comparison

Open-source wake word benchmark evaluates the performance of freely available wake word detection engines. Enterprises can add other alternatives to the comparison framework.

PocketSphinx Wake Word

Snowboy Wake Word

Porcupine Wake Word

Natural Language Understanding Comparison

Open-source natural language understanding benchmark is a scalable framework to compare the voice command acceptance performance of Amazon Lex, Google Dialogflow, IBM Watson, Microsoft LUIS, and Picovoice Rhino Speech-to-Intent.

Rhino Speech-to-Intent

Voice Activity Detection Comparison

Open-source voice activity detection benchmark compares WebRTC Voice Activity Detection by Google and Cobra Voice Activity Detection by Picovoice and allows enterprises to find what works for them.

WebRTC Voice Activity Detection (VAD)

Cobra Voice Activity Detection (VAD)

LLM Compression Comparison

Open-source LLM Compression Benchmark compares compression techniques that are used to reduce the size and memory usage of large language models (LLMs) while preserving quality.

GPTQ

picoLLM Compression

Build with Open-source Benchmarks

Open-source AI model comparisons were built internally to ensure we always offer the best-in-class on-device products. We later open-sourced them as a part of our commitment to:

providing the best technology to our customers
fact-based marketing
bringing transparency to the industry

Our comparison tools have also been adopted by researchers.

We're aware that our frameworks do not include every vendor, including some well-known players. To offer reproducible and transparent comparisons, we only work with available APIs and SDKs and respect our competitors' Terms of Use.
