Picovoice AI Frequently Asked Questions

Find answers to frequently asked questions about the Picovoice on-device AI platform for real-time voice, language, and vision understanding, and about Picovoice Console. For software-specific questions, please refer to the dedicated FAQs at the bottom of each product page:

On-device Voice AI:

On-device Language Understanding:

On-device Vision Processing:

FAQ

Business Model & Pricing

On-device voice AI Platform Features

Usage

Technical Questions

Custom Models & Support

Data Security & Privacy

Building with Picovoice

Business Model & Pricing

What's Picovoice's business model?

Picovoice sells its proprietary on-device voice AI, language, and vision technologies to enable enterprises to build AI-powered products without sacrificing privacy or accuracy. Picovoice's subscription model:

Offers access to support, updates, and upgrades during the engagement.
Helps enterprises manage their working capital effectively.
Automates usage tracking, resulting in efficiency gains and cost savings.

How do Picovoice on-device AI models achieve cloud-level accuracy with minimal resources?

Most edge voice AI models use post-training optimization of pre-trained models. Since these models were not designed for edge deployment in the first place, potential optimizations are restricted. Furthermore, they depend on open-source runtimes like PyTorch or TensorFlow, which again restrict performance improvements. As a result, achieving cloud-level accuracy on the edge remains a challenge. By owning the entire data pipeline and training process, Picovoice enables full end-to-end optimization. This approach makes cloud-quality voice AI on edge devices possible.

Picovoice's proprietary runtime, picoInference, on-device AI training algorithms, and tools, such as picoGYM and picoCompression, are developed by Picovoice researchers from scratch for on-device AI.

What type of support does Picovoice offer?

Picovoice offers several types of support options:

Enterprise Plan Customers: Can customize the level of support to fit the unique needs of their organization.
Enterprise Prospects: Can get dedicated support by contacting sales.
Enterprise Developers: Can create GitHub issues to report bugs or errors in our code and docs.

How can we evaluate Picovoice before committing to a paid plan?

Picovoice offers a Free Trial for enterprise developers. No credit card is required. You can sign up at this link.

How can I check my plan limit and usage?

Visit the homepage or your usage page on Picovoice Console.

How does Picovoice track engine usage?

Usage tracking depends on the engine:

Audio processed (per second): Cheetah, Leopard, Koala, Eagle, Falcon, and Bat
Text data (per character): Orca and Zebra
Tokens (per token): picoLLM, picoOCR, picoVLM
Monthly active users: Porcupine, Rhino, Cobra

A "user" is whatever activates and uses Picovoice software, not a user of your software. Depending on your platform, it is typically a unique device, app, or browser instance that initializes the engine within a 30-day period. The count depends on how your software activates Picovoice engines, and therefore on your design and tech stack, so it may not match the number of end users or end-user accounts.
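To reason about how many "users" your own deployment might generate, you can count distinct activating instances in a 30-day window. The sketch below is only an illustration: the log format, the instance IDs, and the `count_active_users` helper are hypothetical, and it does not describe how Picovoice itself measures usage.

```python
from datetime import datetime, timedelta

def count_active_users(activations, window_end, window_days=30):
    """Count distinct instance IDs that activated an engine within the window.

    `activations` is an iterable of (instance_id, timestamp) pairs.
    Both the log format and the IDs are hypothetical.
    """
    window_start = window_end - timedelta(days=window_days)
    return len({
        instance_id
        for instance_id, timestamp in activations
        if window_start < timestamp <= window_end
    })

log = [
    ("device-a", datetime(2024, 1, 2)),
    ("device-a", datetime(2024, 1, 20)),   # same device again: still one user
    ("browser-b", datetime(2024, 1, 15)),
    ("device-c", datetime(2023, 11, 1)),   # outside the 30-day window
]
active = count_active_users(log, window_end=datetime(2024, 1, 31))
```

Here one device that initializes an engine twice counts once, which is why the figure can differ from your number of end-user accounts.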

When does my usage reset?

Usage resets every 30 days. You can view real-time consumption on your Picovoice Usage Page.

How does Picovoice track model downloads?

Once you download a model, it's counted toward your monthly model download usage.

When does the model download usage reset?

Model download usage resets every 30 days. You can view your usage on your Picovoice Usage Page.

Can I reset my AccessKey on Picovoice Console?

No, you cannot reset your AccessKey. Do not share it with third parties.

Can I reset my usage without waiting for 30 days?

No, usage is reset automatically every 30 days.

Can I get my Free Trial period extended?

No, the Free Trial is a one-time offer, and it doesn't renew automatically once the trial ends. Make sure you contact sales during your trial to avoid service interruptions.

Can my teammates send another trial request?

No, the Free Trial is a one-time offer. If your project team wants extended use, you must contact sales.

Can I use Picovoice for personal projects?

Picovoice is a B2B company focused on on-device AI tools for enterprises. At this time, there are no dedicated free or paid plans for personal or non-commercial use.

Where can I ask more questions?

Most answers are available on the Picovoice website. For additional help:

On-device voice AI Platform Features

What does Picovoice on-device Voice AI Platform offer?

On-device voice AI platform offers everything that developers need to design, develop, and ship voice products: a complete set of modular voice AI engines delivered as cross-platform SDKs and a no-code platform to instantly train bespoke voice AI models to boost accuracy and efficiency.

What should I use to transcribe real-time conversations, such as live events, conferences, and meetings, or enable note-taking and voice typing?

We recommend Cheetah Streaming Speech-to-Text to transcribe real-time conversations, such as live events, conferences, and meetings, or to enable note-taking and voice typing.

Please note that every use case is unique, and the nuances may affect the performance of your product. If you're a Picovoice customer, please reach out to your Picovoice contact to get dedicated support. If you are not a customer yet, you can contact sales to discuss your use case and get your technical questions answered by the experts.

What should I use to convert audio and video files, such as recordings of interviews, meetings, calls, podcasts, and voicemails, into text?

We recommend Leopard Speech-to-Text to convert audio and video files, such as recordings of interviews, meetings, calls, podcasts, and voicemails, into text.

What should I use to achieve crisp and clear conversations by removing background noise and enhancing speech?

We recommend Koala Noise Suppression to achieve crisp and clear conversations by removing background noise and enhancing speech.

Please note that every use case is unique, and the nuances may affect the performance of your product. If you're a Picovoice customer, please reach out to your Picovoice contact to get dedicated support. If you are not a customer yet, you can contact sales to discuss your use case and get your technical questions answered by the experts.

What should I use to diarize speakers in conversations to make transcripts readable and analyzable?

We recommend Falcon Speaker Diarization for speaker diarization to make transcripts readable and analyzable.

What should I use to diarize speakers live, in a streaming conversation, rather than after the fact?

We recommend Bluebird Streaming Speaker Diarization to label who is speaking in real time, such as tagging speakers live during a call or meeting as it happens.

What should I use to identify and verify speakers, and personalize experiences simply by recognizing the user's voice?

We recommend Eagle Speaker Recognition to identify and verify speakers and personalize experiences simply by recognizing the user's voice.

What should I use to convert written text into spoken audio output?

We recommend Orca Streaming Text-to-Speech to convert written text into spoken audio output.

What should I use to add voice to an LLM-powered application to build an AI agent?

We recommend Orca Streaming Text-to-Speech to convert streaming LLM text output into voice.

What should I use to detect wake words, always listening commands, and monitor conversations for specific keywords?

We recommend Porcupine Wake Word to detect wake words (Alexa) and always-listening commands (turn the lights on), and to monitor conversations for specific keywords (product name).

What should I use to detect wake words uttered by only one speaker?

We recommend Porcupine Wake Word and Eagle Speaker Recognition to detect wake words (Alexa) and always-listening commands (turn the lights on) uttered by only one speaker. Check the personalized wake word recipe for details.

What should I use to add custom voice commands to software, create voicebots and IVRs, and navigate menus?

We recommend Rhino Speech-to-Intent to add custom voice commands to software (set the brightness at 60%), create voicebots and IVRs, and navigate menus (2022 Hyundai IONIQ 5 AWD).

What should I use to offer personalized voice assistants based on user, i.e., speaker profile?

We recommend combining Eagle Speaker Recognition with Porcupine Wake Word and/or Rhino Speech-to-Intent to detect wake words (Alexa) and always-listening commands (turn the lights on) uttered by only one speaker. Check out the speaker-aware voice assistant recipe for details.

What should I use to activate software when someone starts or stops speaking?

We recommend Cobra Voice Activity Detection to detect when someone starts or stops speaking and trigger action accordingly.
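As a minimal sketch of the "trigger action accordingly" step, the function below turns a stream of per-frame voice probabilities into start and stop events. It assumes the voice activity detector yields one probability between 0 and 1 per audio frame; the `speech_events` helper, the 0.5 threshold, and the three-frame hangover are illustrative choices, not values taken from the Picovoice documentation.

```python
def speech_events(probabilities, threshold=0.5, hangover_frames=3):
    """Yield ("start", i) and ("stop", i) events from per-frame voice probabilities.

    A stop is reported only after `hangover_frames` consecutive frames fall
    below the threshold, so short pauses do not end the utterance.
    """
    speaking = False
    silent_run = 0
    for i, p in enumerate(probabilities):
        if p >= threshold:
            silent_run = 0
            if not speaking:
                speaking = True
                yield ("start", i)
        elif speaking:
            silent_run += 1
            if silent_run >= hangover_frames:
                speaking = False
                silent_run = 0
                yield ("stop", i)

# Hypothetical probabilities for ten consecutive frames
probs = [0.1, 0.2, 0.9, 0.8, 0.3, 0.9, 0.1, 0.1, 0.1, 0.0]
events = list(speech_events(probs))
```

The single low frame at index 4 is bridged by the hangover, so the example produces one start event and one stop event rather than two short utterances.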

What should I do to detect and clean silence in audio and video data?

We recommend Cobra Voice Activity Detection to detect and clean silence in audio and video data.

What should I use to detect which language is being spoken before routing it to the right model?

We recommend Bat Spoken Language Identification to detect which language is being spoken in an audio stream, so you can route it to the appropriate speech-to-text, translation, or voice model.
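The routing step can be sketched as a lookup from the detected language to a handler, with a fallback for languages you have no model for. Everything here is hypothetical: the language codes, the handler functions, and the `route_by_language` helper are placeholders for whatever your application uses.

```python
def route_by_language(language, handlers, default="en"):
    """Return the handler registered for `language`, falling back to `default`."""
    return handlers.get(language, handlers[default])

# Placeholder handlers standing in for per-language speech-to-text models
handlers = {
    "en": lambda audio: f"english-stt({audio})",
    "de": lambda audio: f"german-stt({audio})",
}

result = route_by_language("de", handlers)("clip.wav")
fallback = route_by_language("xx", handlers)("clip.wav")
```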

What should I use to record and process audio files?

We recommend Picovoice Voice Recorders to record and process audio files to create audio streams and use Picovoice Voice AI engines.

What should I use to translate speech or text between languages on-device?

We recommend Zebra Translate to translate text between languages entirely on-device, without sending data to the cloud. You can use Zebra Translate along with other Picovoice products for speech-to-speech translation, live conversation translation, or live captioning and translation.

What should I use to extract text from images and documents?

We recommend picoOCR Optical Character Recognition to extract text from scanned documents, receipts, forms, and photos, entirely on-device.

What should I use to understand and answer questions about images or visual content?

We recommend picoVLM Vision-Language Models to understand image content and answer questions about it on-device, such as describing a scene or reading a chart.

What should I do to quantize LLMs to shrink their size and memory requirements?

You can download the quantized open-weight, publicly available Llama, Mistral, Mixtral, Phi, and Gemma models compressed by picoLLM Compression from Picovoice Console. For use case specific, custom LLM quantization requests, please reach out to your Picovoice contact to work with large language model experts who developed Picovoice's novel large language model (LLM) quantization algorithm, picoLLM Compression. If you're not a customer, please contact sales.

How can I run quantized Large Language Models locally on embedded devices, mobile devices, laptops, or within web browsers?

picoLLM comes with an inference engine that runs X-bit quantized LLMs. picoLLM inference engine:

runs on-device LLMs across Linux, macOS, Windows, Android, iOS, Raspberry Pi, Chrome, Safari, Edge, and Firefox.
supports CPU and GPU out-of-the-box and has the architecture to tap into other forms of accelerated computing.
works with any LLM architecture.

Does picoLLM offer on-device Llama models that run locally?

Yes, picoLLM offers quantized Llama models to run locally on-device. Quantized Llama language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Mistral models that run locally?

Yes, picoLLM offers quantized Mistral models to run locally on-device. Quantized Mistral language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Microsoft Phi models that run locally?

Yes, picoLLM offers quantized Microsoft Phi models to run locally on-device. Quantized Microsoft Phi language models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Gemma models that run locally?

Yes, picoLLM offers quantized Gemma models to run locally on-device. Quantized Gemma models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device Qwen-VL models that run locally?

Yes, picoLLM offers quantized Qwen-VL models to run locally on-device. Quantized Qwen-VL models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Does picoLLM offer on-device DeepSeek OCR models that run locally?

Yes, picoLLM offers quantized DeepSeek OCR models to run locally on-device. Quantized DeepSeek OCR models can be downloaded from Picovoice Console and deployed locally across platforms within your plan limits.

Usage

What are the hardware and software platforms supported by Picovoice on-device AI engines?

Desktop & Server: Linux, Windows & macOS
Mobile: Android & iOS
Web Browsers: Chrome, Safari, Edge and Firefox
Single Board Computers: Raspberry Pi
Cloud Providers: AWS, Azure, Google, IBM, Oracle, and others.

Do Picovoice on-device AI engines run in the cloud?

Yes. You can run all Picovoice AI engines (Voice Activity Detection, Wake Word, Speech-to-Intent, Streaming Speech-to-Text, Speech-to-Text, Streaming Text-to-Speech, Speaker Recognition, Streaming Speaker Diarization, Speaker Diarization, Spoken Language Identification, Noise Suppression, Translation, and LLM Inference) in the cloud.

Do Picovoice on-device AI engines run on-prem?

Do Picovoice on-device AI engines run in serverless environments?

Do Picovoice on-device AI engines run on mobile devices?

Do Picovoice on-device AI engines run within web browsers?

Do Picovoice on-device AI engines run on embedded devices?

Do Picovoice on-device AI engines need a GPU?

No. Picovoice AI engines do not require a GPU. However, you can run all Picovoice AI engines (Voice Activity Detection, Wake Word, Speech-to-Intent, Streaming Speech-to-Text, Speech-to-Text, Streaming Text-to-Speech, Speaker Recognition, Streaming Speaker Diarization, Speaker Diarization, Spoken Language Identification, Noise Suppression, Translation, and LLM Inference) on a GPU for better performance.

Which SDKs are supported by Picovoice?

The Picovoice on-device Voice AI platform supports a wide range of modern SDKs, including Android, C, .NET, Flutter, iOS, Java, Node.js, Python, React, React Native, and Web. For details on available SDKs for each engine, please refer to the respective platform or documentation page.

If your preferred SDK isn't currently supported, contact sales with your commercial and technical requirements.

Technical Questions

Is Picovoice open-source?

Picovoice on-device AI SDKs, voice recorders, and benchmarks are open-source. Picovoice models and inference engines are proprietary.

How accurate are Picovoice on-device AI models?

To enable data-driven decision-making and communicate its engines' accuracy, Picovoice publishes open-source benchmarks for each engine. You can reproduce them or run them with your data.

Wake Word Benchmark (KWS & hotword)
Speech-to-Intent Benchmark (VUI & NLU)
Real-time Transcription Benchmark (Streaming ASR & STT)
Text-to-Speech Latency Benchmark (Speech Synthesis)
Speaker Recognition Benchmark (Voice ID, Speaker Identification)
Voice Activity Detection Benchmark (VAD)
Noise Suppression Benchmark (Speech Enhancement)
Speaker Diarization Benchmark (Speaker Labels)
Spoken Language Identification Benchmark (LangID, LID)
Speech-to-Text Benchmark (ASR & STT)
LLM Compression Benchmark (LLM Quantization)
Translation Benchmark (Text Translation)

How accurate is picoLLM Compression?

We compared the picoLLM Compression algorithm accuracy against popular quantization techniques. Ceteris paribus - at a given size and model - picoLLM offers better accuracy than the popular quantization techniques, such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM. You can check the open-source compression benchmark to compare the performance of picoLLM Compression against GPTQ.

Please note that there is no single widely used framework to evaluate LLM accuracy, as LLMs are relatively new and capable of performing various tasks. One metric can be more important for a certain task, and irrelevant to others. Taking "accuracy" metrics at face value and comparing two figures calculated in different settings may lead to wrong conclusions.

Also, picoLLM Compression's value add is retaining the original quality while making LLMs available across platforms, i.e., offering the most efficient models without sacrificing accuracy, not offering the most accurate model.

We highly encourage enterprises to compare the accuracy against the original models, e.g., llama-2 70B vs. pico.llama-2 70B at different sizes.

How are Picovoice's small voice AI models more accurate than large, cloud-dependent AI models?

The key to Picovoice's lightweight and accurate models is end-to-end optimization. Most edge voice AI models use post-training optimization of pre-trained models. Since these models were not designed for edge deployment in the first place, potential optimizations are restricted.

Furthermore, they depend on open-source runtimes like PyTorch or TensorFlow, which again restrict performance improvements. As a result, achieving cloud-level accuracy on the edge remains a challenge.

By owning the entire data pipeline and training process, Picovoice enables full end-to-end optimization. Furthermore, Picovoice researchers continuously improve techniques and frameworks used to train algorithms. Picovoice applies transfer learning, hardware-aware training, and neural compression principles, resulting in efficient models competing with cloud-dependent AI models.

How fast are Picovoice on-device AI engines?

It depends on your tech stack and design. Given the number of engines Picovoice offers and the platforms it supports, it's hard to communicate one number. We encourage developers to do their own tests and evaluations in their real environments.

How fast is picoLLM?

The smaller the models and more powerful the systems are, the faster language models run.

picoLLM tokens per second across SIMD kernels:

C: 4.7
SSE: 5.5
AVX: 6.1
AVX2: 7.5
AVX512F: 10.0

Speed tests (tokens/second) are generally run in a controlled environment and, unsurprisingly, tend to favor the model or vendor running them. Several factors affect speed, including hardware (GPU, CPU, RAM, motherboard), software (background processes and programs), and the language model itself, including its original size.

At Picovoice, our communication has always been fact-based and scientific. Since speed tests are easy to manipulate and it's impossible to create a reproducible framework, we cannot publish any metrics. We strongly suggest everyone run their own tests in their environment.
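If you want to run your own speed test, a simple approach is to time several generations and divide the number of tokens by the elapsed time. The sketch below assumes a `generate` callable that returns the list of generated tokens; that callable, the `tokens_per_second` helper, and the `fake_generate` stand-in are hypothetical and not part of any Picovoice SDK.

```python
import time

def tokens_per_second(generate, prompt, runs=3):
    """Average tokens/second of `generate(prompt)` over several runs.

    `generate` is any callable that returns the list of generated tokens.
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        total_seconds += time.perf_counter() - start
        total_tokens += len(tokens)
    return total_tokens / total_seconds

def fake_generate(prompt):
    # Stand-in for a real model call: sleeps briefly, then returns "tokens"
    time.sleep(0.01)
    return prompt.split()

rate = tokens_per_second(fake_generate, "one two three four")
```

Run the measurement on the target hardware, with the same background load you expect in production, since both change the result.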

Which languages does Picovoice support?

Picovoice on-device voice AI models currently support: English, French, German, Italian, Japanese, Korean, Chinese, Portuguese, and Spanish. Please check the product page if you're looking for engine-specific information. If you have an opportunity requiring another language, contact sales to get a custom model trained for you!

Does Picovoice technology work across various accents and dialects?

Yes, Picovoice technology works well across accents and dialects. The best way to learn about it is to test Picovoice technology with your dataset. Picovoice offers a Free Trial that allows enterprises to evaluate and become familiar with the technology before committing to a paid plan.

Can I use Picovoice software for telephony applications?

Picovoice engines expect audio with a 16kHz sampling rate. PSTN networks usually sample at 8kHz. It is possible to upsample, but the frequency content above 4kHz is gone, and performance will be suboptimal.

It is possible to train acoustic models for telephony applications for enterprise customers. Contact sales to find the best solution that works for you.

My audio source is 48kHz/44.1kHz. Does Picovoice software support that?

Picovoice software expects a 16kHz sampling rate, so you will need to downsample. Typically, operating systems or sound cards (audio codecs) provide this functionality; otherwise, you will need to implement it yourself.
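If you do have to implement downsampling yourself, the 48kHz case reduces to low-pass filtering followed by keeping every third sample. The following is a minimal, dependency-free sketch of that idea, not production code: the helper names, the 63-tap filter length, and the 7kHz cutoff are assumptions chosen for illustration. A 44.1kHz source needs a non-integer ratio (160/441), which calls for a rational resampler from an established audio or signal-processing library instead.

```python
import math

def lowpass_taps(cutoff_hz, sample_rate, num_taps=63):
    """Hamming-windowed sinc low-pass FIR filter, normalized to unity DC gain."""
    fc = cutoff_hz / sample_rate  # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        sinc = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]

def downsample_48k_to_16k(samples):
    """Low-pass filter with a cutoff near 7 kHz, then keep every third sample."""
    taps = lowpass_taps(cutoff_hz=7000, sample_rate=48000)
    half = len(taps) // 2
    out = []
    for center in range(0, len(samples), 3):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = center + k - half
            if 0 <= idx < len(samples):  # treat samples beyond the edges as zero
                acc += tap * samples[idx]
        out.append(acc)
    return out

# One second of a 440 Hz tone sampled at 48 kHz
tone = [math.sin(2 * math.pi * 440 * n / 48000) for n in range(48000)]
resampled = downsample_48k_to_16k(tone)
```

The filter runs before decimation because any content above 8kHz would otherwise fold back into the speech band as aliasing.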

What's the 16kHz sampling rate?

Picovoice software expects a 16kHz sampling rate because it strikes a balance between audio quality and file size and is widely used in voice command and speech recognition technologies.

First, at 16kHz, audio files are small enough to store and transmit while offering reasonable audio quality. Second, the human voice's most critical frequencies lie between 300Hz and 3400Hz. The Nyquist-Shannon sampling theorem states that a sampling rate of at least twice the highest frequency is required for accurate signal representation. 16kHz is more than twice 3400Hz and sufficient for processing the human voice. That's why 16kHz has become a standard in applications using human speech and voice.
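The arithmetic behind this, taking the 3400Hz figure quoted above as the highest frequency of interest and writing f_s for the sampling rate and f_N for the resulting Nyquist frequency:

```latex
f_s \ge 2 f_{\max}, \qquad
2 \times 3400\,\text{Hz} = 6800\,\text{Hz} \le 16000\,\text{Hz}, \qquad
f_N = \frac{f_s}{2} = 8000\,\text{Hz}
```

In other words, 16kHz audio can represent frequencies up to 8kHz, which leaves headroom above the 3400Hz band.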

What are the other factors that affect the performance of voice AI engines?

Several factors affect the performance of voice AI engines: the quality of the audio data, the environment (noise, echo, and reverberation), the tech stack, and the design.

What are the advantages of using quantized models over non-quantized models?

There are several advantages of running quantized models:

Reduced Model Size: Quantization decreases the model size of large language models, resulting in:

Smaller download size: Quantized LLMs require less time and bandwidth to download. For example, a mobile app using a large model may not be approved to be on the App Store.
Smaller storage size: Quantized LLMs occupy less storage space. For example, an Android app using a small language model will take up less storage space, improving the usability of your application and the experience of users.
Less memory usage: Quantized LLMs use less RAM, which speeds up LLM inference and your application and frees up memory for other parts of your application to use, resulting in better performance and stability.

Reduced Latency: Total latency consists of compute latency and network latency.

Reduced Compute Latency: Compute latency is the time between a machine receiving a request and returning a response. LLMs require powerful infrastructure to run with minimal compute latency. Otherwise, it may take minutes, even hours, or days to respond. Reduced computational requirements allow quantized LLMs to respond faster given the same resources (reduces latency) or to achieve the same latency using fewer resources.
Zero Network Latency: Network latency, delay, or lag shows the time that data takes to transfer across the network. Since quantized LLMs can run where the data is generated rather than requiring data to be sent to a 3rd party cloud, there is no need for the data transfer, hence zero network latency.

Quantization can be used to reduce the size of models and latency, potentially at the expense of some accuracy. Choosing the right quantized model is important to ensure small to no accuracy loss. Our Deep Learning Researchers explain why picoLLM Compression is different from other quantization techniques.
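The size reduction follows from simple arithmetic: weight storage is roughly the number of parameters times the bits per parameter. The sketch below uses a hypothetical 7-billion-parameter model and ignores activations, caches, and file metadata, so treat the numbers as rough estimates rather than measured figures for any specific model.

```python
def model_size_gb(num_parameters, bits_per_parameter):
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9

params = 7_000_000_000  # hypothetical 7-billion-parameter model
size_fp16 = model_size_gb(params, 16)  # 16-bit floating-point weights
size_4bit = model_size_gb(params, 4)   # weights quantized to 4 bits
```

Under these assumptions, the weights shrink from about 14 GB to about 3.5 GB, which affects download size, storage, and RAM usage alike.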

How does picoLLM Compression differ from other compression techniques such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM?

Quantization techniques, such as AWQ, GPTQ, LLM.int8(), and SqueezeLLM are developed by researchers for research. picoLLM is developed by researchers for production to enable enterprise-grade applications.

At any given size, picoLLM retains more of the original quality. In other words, picoLLM compresses models more efficiently than the others, offering efficient models without sacrificing accuracy compared to these techniques.

Read more from our deep learning research team about our approach to LLM quantization.

How does picoLLM Inference differ from other inference engines?

picoLLM Inference is specifically developed for the picoLLM platform.

Existing inference engines can handle models with a known bit distribution (4 or 8-bit) across model weights. picoLLM-compressed weights contain 1, 2, 3, 4, 5, 6, 7, and 8-bit quantized parameters to retain intelligence while minimizing the model size. Hence, existing inference engines built for a pre-defined bit distribution are not able to match the dynamic nature of picoLLM.
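To see why variable bit widths need dedicated handling, consider storing integers that are not byte-aligned. The sketch below is a generic illustration of bit packing for any width from 1 to 8 bits; it is not picoLLM's storage format, and the `pack` and `unpack` helpers are hypothetical. A reader that assumes a fixed 4-bit or 8-bit layout would decode such data incorrectly.

```python
def pack(values, bits):
    """Pack unsigned integers of width `bits` (1-8) into bytes, low bits first."""
    buffer, filled, out = 0, 0, bytearray()
    for v in values:
        if not 0 <= v < (1 << bits):
            raise ValueError(f"{v} does not fit in {bits} bits")
        buffer |= v << filled
        filled += bits
        while filled >= 8:           # flush every complete byte
            out.append(buffer & 0xFF)
            buffer >>= 8
            filled -= 8
    if filled:                       # flush the final partial byte
        out.append(buffer & 0xFF)
    return bytes(out)

def unpack(data, bits, count):
    """Inverse of `pack`: read `count` values of width `bits`."""
    buffer, filled, values = 0, 0, []
    stream = iter(data)
    mask = (1 << bits) - 1
    while len(values) < count:
        while filled < bits:         # refill until one full value is available
            buffer |= next(stream) << filled
            filled += 8
        values.append(buffer & mask)
        buffer >>= bits
        filled -= bits
    return values

weights_3bit = [5, 0, 7, 2, 1, 6, 3, 4]
packed = pack(weights_3bit, bits=3)
restored = unpack(packed, bits=3, count=len(weights_3bit))
```

Eight 3-bit values occupy three bytes here instead of eight, and values straddle byte boundaries, so the decoder has to know the width of each block in advance.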

Read more from our engineering team, who explain why and how we developed the picoLLM Inference engine.

Can I use picoLLM offerings with another LLM Inference engine?

There are three major issues with the existing LLM inference engines.

They are not versatile. They only support certain platforms or model types.
They are not ready-to-use, requiring machine learning knowledge.
They cannot handle X-bit quantization, as this innovative approach is unique to picoLLM Compression.

HuggingFace transformers work with transformers only. TensorFlow Serving works with TensorFlow models only and has a steep learning curve to get started. TorchServe is designed for PyTorch and integrates well with AWS. NVIDIA Triton Inference Server is designed for NVIDIA GPUs only. OpenVINO is optimized for Intel hardware.

In reality, your software can and will be run on different platforms. That's why we had to develop picoLLM Inference. It's the only ready-to-use and hardware-agnostic engine.

Custom Models & Support

How can I fine-tune Picovoice on-device voice AI models?

You can leverage the self-service Picovoice Console to fine-tune voice AI models or contact sales to engage with our deep learning researchers for further improvement.

See how to fine-tune models on the Picovoice Console:

How do custom speech recognition models compare with general models?

Custom speech recognition models are created for specific tasks, specific use cases, and sometimes for specific environments. General-purpose models are jacks-of-all-trades and masters-of-none.

For example, if you need a medical dictation app, you need a fine-tuned speech-to-text to be able to capture the jargon correctly. If you're building a sales enablement app, just like you train your salesforce to learn about your product names, you should adapt the general speech recognition model accordingly.

How can I fine-tune Picovoice on-device Large Language Models?

At the moment, custom language model training is available through picoLLM GYM for selected enterprise customers. Please engage with your account manager if you're already a Picovoice customer. If you're not a customer, contact sales!

How do custom large language models compare with general open LLMs?

Custom LLMs are created for specific tasks and specific use cases. General-purpose large language models are jacks-of-all-trades and masters-of-none. In other words, they can help a student with their homework, but not a knowledge worker with company-specific information.

General-purpose LLMs are offered by foundation model providers, such as OpenAI, Google, Meta, Microsoft, Cohere, Anthropic, Mistral, Databricks, and so on. They're good at developing products such as chatbots, translation services, and content creation apps. Developers building hobby projects, one-size-fits-all applications, or with no access to training datasets, can choose general-purpose LLMs.

Custom LLMs can offer distinctive feature sets and increased domain expertise, resulting in unmatched precision and relevance. Hence, custom LLMs have become popular in enterprise applications in several industries, including healthcare, law, and finance. They're used in various applications, such as medical diagnosis, legal document analysis, and financial risk assessment. Unlike general-purpose LLMs, custom LLMs are not ready to use; they require special training that leverages domain-specific data to perform better in certain use cases.

Why shouldn't we just use big vendors' closed-source models, such as GPT-4 or Claude, instead of custom large language models?

If you think they're a better fit, you should. Especially in the beginning, when you are still learning what LLMs can achieve, using an API can be a better approach, as control over data, model, infrastructure, or inference cost is not yet a concern. The drawbacks of closed-source models become a concern when enterprises want control over their specific use case. If customizability, privacy, ownership, reliability, or inference cost at scale is a concern, then you should be more cautious about choosing a closed-source model.

Customizability: Each vendor has different criteria and processes to develop custom models. In order to send an inquiry to OpenAI, one has to acknowledge that it may take months to train custom models, and pricing starts at $2-3 million.
Privacy: The default business model for closed-source models is to run inference in the cloud. Hence, it requires enterprises to send their user data and confidential information to the cloud.
Ownership: You never have ownership of a closed-source model. If your LLM is critical for the success of your product, or in other words, if you view your LLM as an asset rather than a simple tool, it should be owned and controlled by you.
Reliability: You are at the mercy of closed-source model providers. When their API goes down or has an increase in traffic, the performance of your software, hence user experience and productivity, is negatively affected.
Cost at scale: Cloud computing at scale is costly. That's why cloud repatriation has become popular among large enterprises. Large Language Model APIs are not different, if not more costly, given the size of the models. If your growth estimation involves high-volume inference, do your math carefully.

We have a custom LLM. How can we use picoLLM Compression?

Contact sales to discuss compressing custom LLMs or fine-tuning them using picoLLM Compression.

We need a new voice AI engine or model that the Picovoice voice AI platform doesn't offer. How can we get a new engine/model developed?

Contact sales to discuss your custom development needs.

My platform is not currently supported by Picovoice. How can I get Picovoice to support it?

Picovoice voice AI engines support the most popular and widely-used hardware and software out-of-the-box - from web, mobile, desktop, and on-prem to private cloud. However, there are so many platforms, yet only so much time and money, making it impossible to support everything.

Contact sales to discuss your custom development needs, including getting on-device AI engines ported to the platform of your choice.

Picovoice doesn't offer the SDK we're using in production. How can I get a new SDK added?

Picovoice supports the most popular and widely used SDKs. If you need another SDK, you can contact sales to discuss your custom development needs, including the creation and maintenance of a public or private library for the SDK of your choice.

Current Picovoice Voice AI dictionaries do not include the words that I need. How can I add a new word?

Picovoice engines have hundreds of thousands of words in their lexicons. However, there might be some special words we missed. You can add a custom word to Leopard Speech-to-Text and Cheetah Streaming Speech-to-Text on the self-service Picovoice Console. In order to add a new word to the Porcupine Wake Word and Rhino Speech-to-Intent lexicon, contact sales to discuss your customization needs.

I am using the official Picovoice voice AI demos, but I get an error. How do I report bugs?

You can create a GitHub issue under the relevant repository/demo.

I need help with developing my PoC and product. How do I get help?

Enterprises face several challenges while building PoCs. Finding talented and experienced individuals in machine learning is one of the biggest challenges to start with. We learned this the hard way, and experience it every day. On top of it, executives and clients may have unrealistic deadlines.

Contact sales to work with Picovoice on-device AI experts who have helped enterprises build PoCs and products ready for production.

Data Security & Privacy

Where does Picovoice process data?

Picovoice on-device AI engines process data in your environment, whether it's public or private cloud, on-prem, web, mobile, desktop, or embedded.

For how long do on-device AI engines retain user data, audio, or text files?

Picovoice is private by design and has no access to user data. Thus, Picovoice doesn't retain user data as it never tracks or stores it in the first place.

Is Picovoice on-device AI platform HIPAA-compliant?

Yes. Enterprises using Picovoice don't need to share their user data with Picovoice or any other 3rd-party to run voice AI models, making Picovoice on-device voice AI platform intrinsically HIPAA-compliant.

Is Picovoice on-device AI platform CCPA-compliant?

Building with Picovoice

Can I use Picovoice Voice AI engines with picoLLM to build voice AI agents?

Yes, you can use voice AI with local LLMs and create private, accurate, and reliable AI agents. Check Picovoice Blog or GitHub to find more information, tutorials, and demos. Some examples are:

What are the best practices to develop and deploy on-device AI engines and models?

The answer is "it depends". On-device AI is complex technology, and building products for production requires diligent work. It depends on your use case, other tools, and the tech stack used, along with hardware and software choices. Given the variables, it can be challenging.

You can experiment with different scenarios or contact sales to work with on-device AI experts and find the best approach to deploying language models for production.

Can I use multiple Picovoice products together?

Yes! Picovoice engines are modular and work with other Picovoice products or competitive products. Check the Picovoice cookbook recipes, blog posts or GitHub to find more information, tutorials, and demos. Some examples are below:
