Picovoice AI Frequently Asked Questions

Find answers to frequently asked questions on the Picovoice on-device Voice AI and local LLM platforms, Console, and Pricing. For software-specific questions, please refer to the dedicated FAQs at the bottom of each product page:

Local LLM:

  • picoLLM Local LLM Platform

On-device Voice AI:

  • Leopard Speech-to-Text
  • Cheetah Streaming Speech-to-Text
  • Koala Noise Suppression
  • Eagle Speaker Recognition
  • Falcon Speaker Diarization
  • Orca Text-to-Speech
  • Porcupine Wake Word
  • Rhino Speech-to-Intent
  • Cobra Voice Activity Detection

FAQ


Business Model & Pricing
Voice AI Platform Features
Usage
Technical Questions
Custom Models & Support
Data Security & Privacy
Building with Picovoice
Business Model & Pricing
What's Picovoice's business model?

Picovoice sells its proprietary voice AI and LLM technology, enabling enterprises to build AI-powered products in a few lines of code without sacrificing privacy or accuracy.

Picovoice's subscription model:

  • Offers access to support, updates, and upgrades during the engagement.
  • Helps enterprises manage their working capital effectively.
  • Automates usage tracking, resulting in efficiency gains and cost savings.
What subscription packages does Picovoice offer?

You can find more information about Picovoice's introductory packages on the pricing page.

How do Picovoice on-device AI models achieve cloud-level accuracy with minimal resources?

Picovoice offers highly accurate and lightweight on-device AI engines using deep neural networks trained in real-world environments.

Picovoice's proprietary algorithms are developed by Picovoice researchers using transfer learning and hardware-aware training principles. Transfer learning enables zero-shot learning and removes the need for extensive per-model data collection and training, resulting in dramatically simplified product development, reduced time-to-market, and more accurate voice models compared to traditional methods that rely on data gathering. Hardware-aware training optimizes on-device engines and models for the target platform, resulting in resource- and power-efficient models even under stringent power-consumption requirements.

What type of support does Picovoice offer?

Picovoice offers several types of support options:

  • Enterprise Plan Customers: Can customize the level of support to fit the unique needs of their organization.
  • Foundation Plan Customers: Get six (6) hours of email support with a 3-day SLA.
  • Enterprise Prospects: Can get dedicated support by booking a meeting with the Product and Engineering team.
  • Free Plan Account Owners: Can create GitHub issues to report bugs.
Can I use the Free Plan to build a PoC to present to my management team, fundraise, or any project that may not generate revenue?

The Free Plan is for personal and non-commercial projects only. Any commercial project requires a paid plan. Examples of commercial use include client projects, MVPs, and internal testing or evaluation, i.e., any project in which founders, employees, contractors, or consultants write code.

How can we evaluate Picovoice before committing to a paid plan?

Picovoice offers a Free Trial for enterprise developers. No credit card is required. You can sign up on Picovoice Console.

How can I check my plan limit and usage?

Visit the dashboard or your profile page on Picovoice Console.

How does Picovoice track engine usage?

Usage tracking depends on the engine:

  • Audio processed (per second): Cheetah, Leopard, Koala, Eagle, Falcon
  • Text data (per character): Orca
  • Tokens (per token): picoLLM Inference
  • Monthly active users: Porcupine, Rhino, Cobra

A "user" refers to any entity that activates an engine, such as a device or an app instance. It is not necessarily an account owner or end-user.
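
The mapping above can be sketched as follows. This is an illustrative sketch only, not Picovoice's actual metering code; the function names are hypothetical, and the 16kHz sample rate is the rate Picovoice engines expect.

```python
SAMPLE_RATE = 16000  # Picovoice engines expect 16 kHz audio

def audio_seconds(num_samples: int) -> float:
    """Cheetah, Leopard, Koala, Eagle, and Falcon meter audio per second."""
    return num_samples / SAMPLE_RATE

def orca_characters(text: str) -> int:
    """Orca meters text data per character."""
    return len(text)

print(audio_seconds(480_000))            # 30.0 -> 30 seconds of 16 kHz audio
print(orca_characters("Hello, world!"))  # 13 characters
```

Porcupine, Rhino, and Cobra are metered per monthly active user rather than per unit of data processed, so no per-request conversion applies to them.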

When does my usage reset?

Usage resets every 30 days. You can view real-time consumption on your Picovoice Console Profile.

How does Picovoice track model downloads?

Once you download a model, it's counted toward your monthly model download usage.

When does the model download usage reset?

Model download usage resets every 30 days. You can view your usage on your Picovoice Console Profile.

Can I reset my AccessKey on Picovoice Console?

No, you cannot reset your AccessKey. Do not share it with third parties.

Can I reset my usage without waiting for 30 days?

No, usage is reset automatically every 30 days.

Can I create another account until my usage resets?

No, creating multiple accounts violates the Terms of Use. You may be asked to pay fees or have access terminated. To get higher usage, upgrade to a paid plan.

Can I get my Free Trial period extended?

No, the Free Trial is a one-time offer. Make sure to upgrade before the trial ends.

Can my colleague send another trial request?

No, the Free Trial is a one-time offer. If your organization wants extended use, you must upgrade.

Where can I ask more questions?

Most answers are available on the Picovoice website. For additional help:

  • Enterprise sales team
  • Technical support
  • GitHub issues
  • Email: [email protected]
Voice AI Platform Features
What does Picovoice on-device Voice AI Platform offer?

The on-device Voice AI platform offers everything that developers need to design, develop, and ship voice products: a complete set of modular voice AI engines delivered as cross-platform SDKs, and a no-code platform to instantly train bespoke voice AI models that boost accuracy and efficiency.

What should I use to transcribe real-time conversations, such as live events, conferences, and meetings, or enable note-taking and voice typing?

We recommend Cheetah Streaming Speech-to-Text for transcribing real-time conversations such as live events, conferences, and meetings, or for enabling note-taking and voice typing.

Please note that every use case is unique, and the nuances may affect the performance of your product. If you're a Picovoice customer, please reach out to your Picovoice contact to get dedicated support. If you are not a customer yet, you can purchase Enterprise Support to discuss your use case and get your technical questions answered by the experts.

What should I use to convert audio and video files, such as recordings of interviews, meetings, calls, podcasts, and voicemails, into text?

We recommend Leopard Speech-to-Text to convert audio and video files, such as recordings of interviews, meetings, calls, podcasts, and voicemails, into text.

What should I use to achieve crisp and clear conversations by removing background noise and enhancing speech?

We recommend Koala Noise Suppression to achieve crisp and clear conversations by removing background noise and enhancing speech.

What should I use to diarize speakers in conversations to make transcripts readable and analyzable?

We recommend Falcon Speaker Diarization for speaker diarization to make transcripts readable and analyzable.

What should I use to identify and verify speakers, and personalize experiences simply by recognizing the user's voice?

We recommend Eagle Speaker Recognition to identify and verify speakers and personalize experiences simply by recognizing the user's voice.

What should I use to convert written text into spoken audio output?

We recommend Orca Streaming Text-to-Speech to convert written text into spoken audio output.

What should I use to add voice to an LLM-powered application to build an AI agent?

We recommend the picoLLM On-device LLM Platform, paired with Orca Streaming Text-to-Speech, to convert streaming LLM text output into voice.

What should I use to detect wake words and always-listening commands, and to monitor conversations for specific keywords?

We recommend Porcupine Wake Word to detect wake words (Alexa), detect always-listening commands (turn the lights on), and monitor conversations for specific keywords (product name).

What should I use to add custom voice commands to software, create voicebots and IVRs, and navigate menus?

We recommend Rhino Speech-to-Intent to add custom voice commands to software (set the brightness at 60%), create voicebots and IVRs, and navigate menus (2022 Hyundai IONIQ 5 AWD).

What should I use to activate software when someone starts or stops speaking?

We recommend Cobra Voice Activity Detection to detect when someone starts or stops speaking and trigger action accordingly.

What should I do to detect and clean silence in audio and video data?

We recommend Cobra Voice Activity Detection to detect and clean silence in audio and video data.

What should I use to record and process audio files?

We recommend Picovoice Voice Recorders to record and process audio, creating the audio streams consumed by Picovoice Voice AI engines.

What should I do to quantize LLMs to shrink their size and memory requirements?

You can download quantized versions of the open-weight Llama, Mistral, Mixtral, Phi, and Gemma models, compressed by picoLLM Compression, from Picovoice Console. For use-case-specific custom LLM quantization requests, please reach out to your Picovoice contact to work with the large language model experts who developed picoLLM, Picovoice's novel LLM quantization algorithm.

How can I run quantized Large Language Models locally on embedded devices, mobile, laptops, or within web browsers?

picoLLM comes with an inference engine that runs X-bit quantized LLMs. The picoLLM inference engine:

  • runs on-device LLMs across Linux, macOS, Windows, Android, iOS, Raspberry Pi, Chrome, Safari, Edge, and Firefox.
  • supports CPU and GPU out-of-the-box and has the architecture to tap into other forms of accelerated computing.
  • works with any LLM architecture.
Does picoLLM offer on-device Llama models that run locally?

Yes, picoLLM offers quantized Llama models to run locally on-device for free. Quantized Llama language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Mistral models that run locally?

Yes, picoLLM offers quantized Mistral models to run locally on-device for free. Quantized Mistral language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Microsoft Phi models that run locally?

Yes, picoLLM offers quantized Microsoft Phi models to run locally on-device for free. Quantized Microsoft Phi language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Gemma models that run locally?

Yes, picoLLM offers quantized Gemma models to run locally on-device for free. Quantized Gemma models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Usage
What are the hardware and software platforms supported by Picovoice on-device voice AI engines?
  • Desktop & Server: Linux, Windows & macOS
  • Mobile: Android & iOS
  • Web Browsers: Chrome, Safari, Edge and Firefox
  • Single Board Computers: Raspberry Pi
  • Cloud Providers: AWS, Azure, Google, IBM, Oracle, and others.
Do Picovoice voice AI engines run in the cloud?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) in the cloud.

Do Picovoice voice AI engines run on-prem?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) on-prem.

Do Picovoice voice AI engines run in serverless environments?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) in serverless environments.

Do Picovoice voice AI engines run on mobile devices?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) on mobile devices.

Do Picovoice voice AI engines run within web browsers?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) within web browsers.

Do Picovoice voice AI engines run on embedded devices?

Yes. You can run all Picovoice voice AI engines (Speech-to-Text, Streaming Speech-to-Text, Noise Suppression, Speaker Recognition, Speaker Diarization, Text-to-Speech, Wake Word, Speech-to-Intent, Voice Activity Detection, LLM Inference) on embedded devices.

Do Picovoice voice AI engines need a GPU?

No. Picovoice voice AI engines do not require a GPU. However, you can run picoLLM Inference on a GPU for better performance.

Which SDKs are supported by Picovoice?

The Picovoice on-device Voice AI platform supports a wide range of modern SDKs, including Android, C, .NET, Flutter, iOS, Java, Node.js, Python, React, React Native, and Web. For details on available SDKs for each engine, please refer to the respective platform or documentation page.

If your preferred SDK isn't currently supported, Picovoice Consulting can develop and maintain it for you as part of the Enterprise Plan offering.

Technical Questions
Is Picovoice open-source?

Picovoice voice AI SDKs, voice recorders, and benchmarks are open-source and free to use.

How accurate are Picovoice on-device voice AI models?

To enable data-driven decision-making and communicate its engines' accuracy, Picovoice publishes open-source benchmarks for each engine. You can reproduce them or run them with your data.

  • Open-source Natural Language Understanding Benchmark
  • Open-source Noise Suppression Benchmark
  • Open-source Speaker Diarization Benchmark
  • Open-source Speech-To-Text Benchmark
  • Open-source Text-to-Speech Benchmark
  • Open-source Voice Activity Detection Benchmark
  • Open-source Wake Word Benchmark
How accurate is picoLLM Compression?

We compared the accuracy of the picoLLM Compression algorithm against popular quantization techniques. All else being equal, at a given size and model, picoLLM offers better accuracy than popular quantization techniques such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM. You can check the open-source compression benchmark to compare the performance of picoLLM Compression against GPTQ.

Please note that there is no single widely used framework to evaluate LLM accuracy, as LLMs are relatively new and capable of performing various tasks. One metric can be more important for a certain task, and irrelevant to others. Taking "accuracy" metrics at face value and comparing two figures calculated in different settings may lead to wrong conclusions.

Also, picoLLM Compression's value add is retaining the original quality while making LLMs available across platforms, i.e., offering the most efficient models without sacrificing accuracy, not offering the most accurate model.

We highly encourage enterprises to compare the accuracy against the original models, e.g., llama-2 70B vs. pico.llama-2 70B at different sizes.

How are Picovoice's small voice AI models more accurate than large, cloud-dependent AI models?

The secret behind Picovoice's super lightweight yet accurate models is end-to-end optimization. Most edge voice AI models use post-training optimization of pre-trained models. Since these models were not designed for edge deployment in the first place, potential optimizations are restricted.

Furthermore, they depend on open-source runtimes like PyTorch or TensorFlow, which again restrict performance improvements. As a result, achieving cloud-level accuracy on the edge remains a challenge.

By owning the entire data pipeline and training process, Picovoice enables full end-to-end optimization. Furthermore, Picovoice researchers continuously improve techniques and frameworks used to train algorithms. Picovoice applies transfer learning, hardware-aware training, and neural compression principles, resulting in efficient models competing with cloud-dependent AI models.

How fast are Picovoice on-device voice AI engines?

It depends on your tech stack and design. Given the number of engines Picovoice offers and the platforms it supports, it's hard to communicate one number. We encourage developers to do their own tests and evaluations in their real environments.

How fast is picoLLM?

The smaller the model and the more powerful the system, the faster a language model runs.

Speed tests (tokens/second) are generally done in a controlled environment and, unsurprisingly, in favor of the model or vendor. Several factors affect speed: hardware (GPU, CPU, RAM, motherboard), software (background processes and programs), the language model itself and its original size, and so on.

At Picovoice, our communication has always been fact-based and scientific. Since speed tests are easy to manipulate and it is impossible to create a reproducible framework, we do not publish any metrics. We strongly suggest everyone run their own tests in their environment.

Which languages does Picovoice support?

Picovoice on-device voice AI models currently support: English, French, German, Italian, Japanese, Korean, Chinese, Portuguese, and Spanish. Please check the product page if you're looking for engine-specific information. If you have an opportunity requiring another language, engage with Picovoice Consulting to get a custom model trained for you!

Does Picovoice technology work across various accents and dialects?

Yes, Picovoice technology works well across accents and dialects. The best way to learn about it is to test Picovoice technology with your dataset. Picovoice offers a Free Trial that allows enterprises to evaluate and become familiar with the technology before committing to a paid plan.

Can I use Picovoice software for telephony applications?

Picovoice engines expect audio with a 16kHz sampling rate. PSTN networks usually sample at 8kHz. It is possible to upsample, but the frequency content above 4kHz is gone, and performance will be suboptimal.

It is possible to train acoustic models for telephony applications for enterprise customers. Engage with Picovoice Consulting to find the best solution that works for you.

My audio source is 48kHz/44.1kHz. Does Picovoice software support that?

Picovoice software expects a 16kHz sampling rate, so you will need to downsample. Typically, operating systems or sound cards (audio codecs) provide such functionality; otherwise, you will need to implement it.
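
As a minimal sketch of what downsampling means, 48kHz audio can be reduced to 16kHz by keeping every 3rd sample. This naive decimation is illustrative only: production code should apply an anti-aliasing low-pass filter first, or use the OS/audio-codec resampler mentioned above.

```python
def downsample_48k_to_16k(samples):
    # 48000 / 16000 = 3, so keep every 3rd sample (naive decimation,
    # no anti-aliasing filter -- for illustration only).
    factor = 48000 // 16000
    return samples[::factor]

one_second_48k = list(range(48000))  # stand-in for 1 s of 48 kHz samples
out = downsample_48k_to_16k(one_second_48k)
print(len(out))  # 16000 samples, i.e. 1 s at 16 kHz
```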

What's the 16kHz sampling rate?

Picovoice software expects a 16kHz sampling rate, a standard in voice-command and speech-recognition technologies because it strikes a balance between quality and file size.

At 16kHz, audio files are small enough to store and transmit while offering reasonable audio quality. Secondly, the human voice's most critical frequencies lie between 300Hz and 3400Hz. The Nyquist-Shannon sampling theorem states that a sampling rate of at least twice the highest frequency is required for accurate signal representation. 16kHz is more than twice 3400Hz and sufficient for processing the human voice. That's why 16kHz has become a standard in applications using human speech and voice.
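
The Nyquist argument above reduces to simple arithmetic: 16kHz comfortably exceeds twice the ~3400Hz upper bound of the critical voice band.

```python
voice_band_hz = (300, 3400)              # critical band of the human voice
sample_rate_hz = 16000
nyquist_rate_hz = 2 * voice_band_hz[1]   # minimum rate per Nyquist-Shannon
print(nyquist_rate_hz)                   # 6800
print(sample_rate_hz >= nyquist_rate_hz) # True
```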

What are the other factors that affect the performance of voice AI engines?

Several factors affect the performance of voice AI engines: the quality of the audio data; the environment (noise, echo, reverberation); and the tech stack and design.

What are the advantages of using quantized models over non-quantized models?

There are several advantages of running quantized models:

  • Reduced Model Size: Quantization decreases the model size of large language models, resulting in:
    • Smaller download size: Quantized LLMs require less time and bandwidth to download. For example, a mobile app bundling a large model may not be approved for the App Store.
    • Smaller storage size: Quantized LLMs occupy less storage space. For example, an Android app using a small language model will take up less storage space, improving the usability of your application and the experience of users.
    • Less memory usage: Quantized LLMs use less RAM, which speeds up LLM inference and your application and frees up memory for other parts of your application to use, resulting in better performance and stability.
  • Reduced Latency: Total latency consists of compute latency and network latency.
    • Reduced Compute Latency: Compute latency is the time between a machine receiving a request and returning a response. LLMs require powerful infrastructure to run with minimal compute latency; otherwise, they may take minutes or even hours to respond. Reduced computational requirements allow quantized LLMs to respond faster given the same resources (reducing latency) or to achieve the same latency using fewer resources.
    • Zero Network Latency: Network latency (delay, or lag) is the time that data takes to transfer across the network. Since quantized LLMs can run where the data is generated rather than requiring data to be sent to a third-party cloud, there is no data transfer and hence zero network latency.

Quantization can be used to reduce the size of models and latency, potentially at the expense of some accuracy. Choosing the right quantized model is important to ensure small to no accuracy loss. Our Deep Learning Researchers explain why picoLLM Compression is different from other quantization techniques.
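
The size reduction is easy to quantify. The following back-of-the-envelope sketch uses a hypothetical 7B-parameter model and counts weight storage only (ignoring metadata and activations); the numbers are illustrative, not a measurement of any specific picoLLM model.

```python
def model_size_gb(num_params: int, bits_per_param: float) -> float:
    # bits -> bytes (divide by 8) -> gigabytes (divide by 1e9)
    return num_params * bits_per_param / 8 / 1e9

params = 7_000_000_000
print(model_size_gb(params, 16))  # 14.0 GB in fp16
print(model_size_gb(params, 4))   # 3.5 GB at 4 bits per weight
```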

How does picoLLM Compression differ from other compression techniques such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM?

Quantization techniques, such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM are developed by researchers for research. picoLLM is developed by researchers for production to enable enterprise-grade applications.

At any given size, picoLLM retains more of the original quality. In other words, picoLLM compresses models more efficiently than the others, offering efficient models without sacrificing accuracy compared to these techniques.

Read more from our deep learning research team about our approach to LLM quantization.

How does picoLLM Inference differ from other inference engines?

picoLLM Inference is specifically developed for the picoLLM platform.

Existing inference engines can handle models with a known bit distribution (4 or 8-bit) across model weights. picoLLM-compressed weights contain 1, 2, 3, 4, 5, 6, 7, and 8-bit quantized parameters to retain intelligence while minimizing model size. Hence, existing inference engines built for pre-defined bit distributions cannot match the dynamic nature of picoLLM.
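
To see why a fixed-bit engine cannot represent such a model, consider a hypothetical mixed-bit allocation (the fractions below are made up for illustration and are not picoLLM's actual allocation). The effective bits per weight is a size-weighted average that need not be a whole number like 4 or 8.

```python
# Hypothetical allocation: bit depth -> fraction of weights at that depth
allocation = {2: 0.25, 4: 0.50, 8: 0.25}

avg_bits = sum(bits * frac for bits, frac in allocation.items())
print(avg_bits)  # 4.5 bits per weight on average
```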

Read more from our engineering team, who explained why and how we developed the picoLLM Inference engine.

Can I use picoLLM offerings with another LLM Inference engine?

There are three major issues with the existing LLM inference engines.

  1. They are not versatile. They only support certain platforms or model types.
  2. They are not ready-to-use, requiring machine learning knowledge.
  3. They cannot handle X-bit quantization, as this innovative approach is unique to picoLLM Compression.

Hugging Face Transformers works with transformer models only. TensorFlow Serving works with TensorFlow models only and has a steep learning curve. TorchServe is designed for PyTorch and integrates well with AWS. NVIDIA Triton Inference Server is designed for NVIDIA GPUs. OpenVINO is optimized for Intel hardware.

In reality, your software can and will be run on different platforms. That's why we had to develop picoLLM Inference. It's the only ready-to-use and hardware-agnostic engine.

Custom Models & Support
How can I fine-tune Picovoice on-device voice AI models?

You can leverage the self-service Picovoice Console to fine-tune voice AI models or engage with Picovoice Consulting for further improvement.

See how to fine-tune models on the Picovoice Console:

  • Custom Wake Words
  • Custom Voice Commands
  • Custom Speech-to-Text
How do custom speech recognition models compare with general models?

Custom speech recognition models are created for specific tasks, specific use cases, and sometimes for specific environments. General-purpose models are jacks-of-all-trades and masters-of-none.

For example, if you are building a medical dictation app, you need a fine-tuned speech-to-text model to capture the jargon correctly. If you are building a sales enablement app, then, just as you train your salesforce to learn your product names, you should adapt the general speech recognition model accordingly.

How can I fine-tune Picovoice on-device Large Language Models?

At the moment, custom language model training is available through picoLLM GYM for selected enterprise customers. Please engage with your account manager if you're already a Picovoice customer. If you're not a customer, become one!

How do custom large language models compare with general open LLMs?

Custom LLMs are created for specific tasks and specific use cases. General-purpose large language models are jacks-of-all-trades and masters-of-none. In other words, they can help a student with their homework, but not a knowledge worker with company-specific information.

General-purpose LLMs are offered by foundation model providers, such as OpenAI, Google, Meta, Microsoft, Cohere, Anthropic, Mistral, Databricks, and so on. They're well-suited for products such as chatbots, translation services, and content creation apps. Developers building hobby projects or one-size-fits-all applications, or those with no access to training datasets, can choose general-purpose LLMs.

Custom LLMs can offer distinctive feature sets and increased domain expertise, resulting in unmatched precision and relevance. Hence, custom LLMs have become popular in enterprise applications in several industries, including healthcare, law, and finance. They're used in various applications, such as medical diagnosis, legal document analysis, and financial risk assessment. Unlike general-purpose LLMs, custom LLMs are not ready to use; they require special training that leverages domain-specific data to perform better in certain use cases.

Why shouldn't we just use big vendors' closed-source models, such as GPT-4 or Claude, instead of custom large language models?

If you think they're a better fit, you should. Especially in the beginning, using an API can be a better way to understand what LLMs can achieve, as long as control over data, models, infrastructure, and inference cost is not a concern. The drawbacks of closed-source models surface when enterprises want control over their specific use case. If customizability, privacy, ownership, reliability, or inference cost at scale matters to you, be more cautious about choosing a closed-source model.

  • Customizability: Each vendor has different criteria and processes for developing custom models. To send an inquiry to OpenAI, one has to acknowledge that training custom models may take months and that pricing starts at $2-3 million.
  • Privacy: The default business model for closed-source models is to run inference in the cloud. Hence, it requires enterprises to send their user data and confidential information to the cloud.
  • Ownership: You never have ownership of a closed-source model. If your LLM is critical for the success of your product, or in other words, if you view your LLM as an asset rather than a simple tool, it should be owned and controlled by you.
  • Reliability: You are at the mercy of closed-source model providers. When their API goes down or has an increase in traffic, the performance of your software, hence user experience and productivity, is negatively affected.
  • Cost at scale: Cloud computing at scale is costly. That's why cloud repatriation has become popular among large enterprises. Large Language Model APIs are no different, if not more costly, given the size of the models. If your growth estimate involves high-volume inference, do your math carefully.
We have a custom LLM. How can we use picoLLM Compression?

Picovoice Consulting works with Enterprise Plan customers to compress their custom or fine-tuned LLMs so they can run on the picoLLM inference engine.

We need a new voice AI engine or model that the Picovoice voice AI platform doesn't offer. How can we get a new engine/model developed?

Enterprise Plan customers can engage with Picovoice Consulting to discuss custom development needs.

My platform is not currently supported by Picovoice. How can I get Picovoice to support it?

Picovoice voice AI engines support the most popular and widely used hardware and software out of the box, from web, mobile, and desktop to on-prem and private cloud. However, there are countless platforms and only so much time and money, making it impossible to support everything.

You can engage with Picovoice Consulting and get any Picovoice voice AI engine ported to the platform of your choice once you become an Enterprise Plan customer.

Picovoice doesn't offer the SDK we're using in production. How can I get a new SDK added?

Picovoice supports the most popular and widely used SDKs. If you need another SDK, you can check our open-source SDKs and build it yourself, or contact Picovoice Consulting once you become an Enterprise Plan customer. Picovoice Consulting experts can create a public or private library for the SDK of your choice and maintain it.

Current Picovoice Voice AI dictionaries do not include the words that I need. How can I add a new word?

Picovoice engines have hundreds of thousands of words in their lexicons. However, there might be some special words we missed. You can add a custom word to Leopard Speech-to-Text and Cheetah Streaming Speech-to-Text on the self-service Picovoice Console. For Porcupine Wake Word and Rhino Speech-to-Intent, Enterprise Plan customers can engage Picovoice Consulting.

  • Add Custom Vocabulary to Leopard Speech-to-Text
  • Add Custom Vocabulary to Cheetah Streaming Speech-to-Text
I am using the official Picovoice voice AI demos, but I get an error. How do I report bugs?

You can create a GitHub issue under the relevant repository/demo.

I need help with developing my PoC and product. How do I get help?

Enterprises face several challenges while building PoCs. Finding talented and experienced machine learning practitioners is one of the biggest challenges to start with. We learned this the hard way and experience it every day. On top of that, executives and clients may have unrealistic deadlines.

Experts at Picovoice Consulting help enterprises build PoCs, develop their AI strategy, and work with them hand-in-hand, offering the guidance they need.

Data Security & Privacy
Where does Picovoice process data?

Picovoice on-device AI engines process data in your environment, whether it's public or private cloud, on-prem, web, mobile, desktop, or embedded.

For how long do on-device AI engines retain user data, audio, or text files?

Picovoice is private by design and has no access to user data. Thus, Picovoice doesn't retain user data as it never tracks or stores it in the first place.

Is the Picovoice on-device AI platform HIPAA-compliant?

Yes. Enterprises using Picovoice don't need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice on-device voice AI platform intrinsically HIPAA-compliant.

Is the Picovoice on-device AI platform GDPR-compliant?

Yes. Enterprises using Picovoice don't need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice on-device voice AI platform intrinsically GDPR-compliant.

Is the Picovoice on-device AI platform CCPA-compliant?

Yes. Enterprises using Picovoice don't need to share their user data with Picovoice or any other 3rd party to run voice AI models, making the Picovoice on-device voice AI platform intrinsically CCPA-compliant.

Building with Picovoice
Can I use Picovoice Voice AI engines with picoLLM to build voice AI agents?

Yes, you can combine Picovoice voice AI engines with local LLMs to create private, accurate, and reliable AI agents. Check the Picovoice Blog or GitHub for more information, tutorials, and demos. Some examples are:

  • LLM-powered voice AI agent in Python
  • LLM-powered voice AI agent in Web
  • LLM-powered voice AI agent in iOS
  • LLM-powered voice AI agent in Android
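As a rough sketch of how the listening half of such an agent fits together, the Python SDKs can be chained roughly as follows. This is not official sample code: `pvporcupine`, `pvcheetah`, and `pvrecorder` are real Picovoice packages, but the access key handling, keyword choice, and helper names here are illustrative assumptions.

```python
# Hedged sketch, not official sample code: a wake-word-triggered transcription
# loop. pvporcupine, pvcheetah, and pvrecorder are real Picovoice packages;
# the access key, keyword choice, and helper names below are illustrative.

def frame_chunks(pcm, frame_length):
    """Split a 16 kHz PCM sample buffer into fixed-size frames (remainder dropped)."""
    return [pcm[i:i + frame_length]
            for i in range(0, len(pcm) - frame_length + 1, frame_length)]

def transcribe_buffer(cheetah, pcm):
    """Feed a prerecorded PCM buffer to Cheetah frame by frame."""
    transcript = ""
    for frame in frame_chunks(pcm, cheetah.frame_length):
        partial, _ = cheetah.process(frame)
        transcript += partial
    return transcript + cheetah.flush()

def listen_for_command(access_key):
    import pvporcupine                  # pip install pvporcupine
    import pvcheetah                    # pip install pvcheetah
    from pvrecorder import PvRecorder   # pip install pvrecorder

    porcupine = pvporcupine.create(access_key=access_key, keywords=["porcupine"])
    cheetah = pvcheetah.create(access_key=access_key)
    # Porcupine and Cheetah both consume 512-sample frames of 16 kHz audio,
    # so a single recorder can feed either engine.
    recorder = PvRecorder(frame_length=porcupine.frame_length)
    recorder.start()
    try:
        # Phase 1: block until the wake word is heard.
        while porcupine.process(recorder.read()) < 0:
            pass
        # Phase 2: stream audio to Cheetah until it reports an endpoint.
        transcript = ""
        while True:
            partial, is_endpoint = cheetah.process(recorder.read())
            transcript += partial
            if is_endpoint:
                return transcript + cheetah.flush()
    finally:
        recorder.stop()
        recorder.delete()
        cheetah.delete()
        porcupine.delete()
```

Frame lengths and the shape of each `process()` return value can change between SDK versions, so check the per-SDK docs before relying on this sketch.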
What are the best practices to develop and deploy on-device AI engines and models?

The answer is "it depends". Voice AI is a complex technology, and building production-ready products requires diligent work. The best approach depends on your use case, the other tools and tech stack involved, and your hardware and software choices. Given all these variables, it can be challenging.

You can experiment with different scenarios leveraging Picovoice's free resources or engage with experts from Picovoice Consulting to find the best approach to deploying models in production.

Can I use multiple Picovoice products together?

Yes! Picovoice engines are modular and work with other Picovoice products or with competing products. Check the Picovoice Blog or GitHub for more information, tutorials, and demos. The examples below use Porcupine Wake Word, Cheetah Streaming Speech-to-Text, picoLLM, and Orca Streaming Text-to-Speech together:

  • LLM-powered voice AI agent in Python
  • LLM-powered voice AI agent in Web
  • LLM-powered voice AI agent in iOS
  • LLM-powered voice AI agent in Android
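To give a feel for the chaining, here is a hedged sketch of the response half of such an agent, assuming the picoLLM and Orca Python SDKs (`picollm.create`/`generate` and `pvorca.create`/`synthesize`; exact signatures and return shapes may differ between versions, and the prompt layout is purely illustrative, not a documented Picovoice convention):

```python
# Hedged sketch, not official sample code: turn a transcript into a spoken
# answer with picoLLM and Orca. The prompt layout and function names here
# are illustrative assumptions.

def format_prompt(history, user_msg, system="You are a helpful voice assistant."):
    """Flatten a (role, text) dialogue history into a single plain-text prompt."""
    lines = [system]
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"user: {user_msg}")
    lines.append("assistant:")
    return "\n".join(lines)

def respond(access_key, model_path, transcript, history=()):
    import picollm  # pip install picollm
    import pvorca   # pip install pvorca

    pllm = picollm.create(access_key=access_key, model_path=model_path)
    orca = pvorca.create(access_key=access_key)
    try:
        answer = pllm.generate(format_prompt(list(history), transcript)).completion
        # Note: the return shape of synthesize() varies across Orca SDK versions.
        pcm = orca.synthesize(answer)
        return answer, pcm
    finally:
        orca.delete()
        pllm.release()
```

Pairing this with a wake-word-plus-transcription front end gives the full Porcupine → Cheetah → picoLLM → Orca loop described above; the linked tutorials show the official, end-to-end versions.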
I'm struggling to build a PoC to present to my management team (or client) in the upcoming days. Can you help with it?

Enterprises face several challenges while building PoCs. Finding talented and experienced machine learning practitioners is one of the biggest challenges to start with. We learned this the hard way and experience it every day. On top of that, executives and clients may have unrealistic deadlines.

Experts at Picovoice Consulting help enterprises build PoCs, develop their AI strategy, and work with them hand-in-hand, offering the guidance they need.

© 2019-2025 Picovoice Inc.