Design Context-Aware Voice User Interfaces
Design, test and train custom voice commands on Picovoice Console. Build models supporting intent classification and entity resolution with multiple slots. Beat cloud NLU accuracy by wide margins by tuning models to your domain of interest. Instantly download trained models for edge inference.
Add custom voice commands with your favourite SDK, including Android, iOS, Python, Flutter, and React. Add only a few lines of code, and let the SDK handle audio capture and inference.
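As a rough sketch of what "a few lines of code" tends to look like, the loop below feeds audio frames to an engine until it finalizes an inference. The process / get_inference shape mirrors the pattern used by the Rhino Python SDK, but FakeEngine, the Inference class, the orderBeverage intent name, and the hard-coded result are hypothetical stand-ins so the example runs without a microphone, an AccessKey, or a trained context.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Inference:
    """Illustrative result type: whether a command was understood, plus details."""
    is_understood: bool
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)


class FakeEngine:
    """Hypothetical stand-in for a speech-to-intent engine (not the real SDK)."""

    def __init__(self, frames_until_final: int, result: Inference) -> None:
        self._remaining = frames_until_final
        self._result = result

    def process(self, frame: List[int]) -> bool:
        """Consume one audio frame; return True once the inference is final."""
        self._remaining -= 1
        return self._remaining <= 0

    def get_inference(self) -> Inference:
        return self._result


def infer_intent(engine, frames: Iterable[List[int]]) -> Optional[Inference]:
    """Feed frames until the engine finalizes; return None if audio ends first."""
    for frame in frames:
        if engine.process(frame):
            return engine.get_inference()
    return None


if __name__ == "__main__":
    engine = FakeEngine(
        frames_until_final=3,
        result=Inference(True, "orderBeverage", {"size": "large", "beverage": "espresso"}),
    )
    silence = [[0] * 512 for _ in range(5)]  # five dummy frames of 512 samples
    print(infer_intent(engine, silence))
```

In a real application the SDK would supply the engine object and the audio frames; only the loop in infer_intent is the part an application would typically own.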
Deploy Unified Voice User Interfaces across Platforms
Offer seamless user experiences across all platforms. Deploy domain-specific custom voice AI models across all platforms, including embedded, mobile, web, on-premise, and cloud.
Grow product and user engagement with unlimited voice interactions without worrying about the cost. API-based pricing grows out of hand as user engagement increases, while Picovoice’s pricing remains constant!
Highly accurate — backed by data, not fancy slides
Choose the best engine based on data! Accuracy depends on various factors. In a market where numerous vendors claim to offer “the best engine”, we published an open-source benchmark. Compare Rhino against the most popular conversational AI engines, such as Amazon Lex, Google Dialogflow, IBM Watson, and Microsoft LUIS, or any other Natural Language Understanding (NLU) engine. Rhino outperforms them across various accents and in the presence of noise and reverberation.
Real-time — no network delay, no downtime and zero latency
Build truly real-time experiences with Rhino. Rhino’s edge-first architecture infers intents from utterances directly on the device, with no network latency. Relying on cloud APIs hinders user experience due to fluctuating latency and network performance. Milliseconds matter in many applications, such as automotive, smart TV, or the metaverse.
Private — intrinsically compliant with GDPR, HIPAA and more!
Ensure user privacy and stay compliant! Rhino processes voice commands locally on-device, without recording data or sending it to the cloud. Put Rhino in meeting rooms, warehouses or examination rooms, knowing that no one will ever have access to the conversations.
Multilingual - supports polyglot experiences.
Create polyglot experiences with Rhino Speech-to-Intent! Grow globally and train voice AI models in English, French, German, Italian, Japanese, Korean, Portuguese, Spanish, and more on the Picovoice Console. Every user still has access to unlimited voice interactions in all languages.
English
German Deutsch
Spanish Español
French Français
Italian Italiano
Japanese 日本語
Korean 한국어
Portuguese Português
Mandarin 普通话
Dutch Nederlands
Russian Русский
Hindi हिन्दी
Polish Język polski
Vietnamese Tiếng Việt
Swedish Svenska
Arabic اَلْعَرَبِيَّةُ
Use Cases
Search By Voice
Add voice for truly hands-free search experiences on websites, mobile applications, and devices.
Is Rhino a Natural Language Understanding (NLU) Engine?
NLU engines infer intents and slots (entities) from speech transcribed by a speech-to-text engine. Rhino Speech-to-Intent understands the intention directly from the spoken utterance. We coined the term Speech-to-Intent when developing Rhino to indicate the end-to-end nature of its inference.
How does Rhino Speech-to-Intent achieve such high accuracy with small model sizes compared to other edge and cloud-based solutions in the market?
The standard approach to intent inference (i.e. understanding voice commands) is to break it down into two tasks. First, a speech-to-text engine converts the spoken utterance into text. Then a natural language understanding (NLU) engine processes the transcription and infers the topic, intent, and slots. However, if the speech-to-text engine is inaccurate, the output of the NLU engine will be poor, too. Therefore, some solutions tune speech-to-text engines for the domain of interest to improve overall performance. This approach requires significant resources such as computing power, memory, and storage. When implemented as a cloud solution, this is not an issue.
However, the cloud is not always the best option. Also, not every use case requires open-domain understanding of millions of variants of spoken commands. One does not need to discuss the meaning of life with a coffee machine or a surgical robot. Most use cases have a confined domain (context) that covers thousands of spoken commands.
Picovoice’s Speech-to-Intent engine suits these use cases because it fuses automatic speech recognition and NLU into a single engine tuned for the specific domain of interest. This end-to-end approach results in small, efficient models with high accuracy.
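A toy illustration of why a confined domain helps: when a general-purpose transcriber emits a near-miss word, an exact-match lookup fails, while a lookup restricted to a small in-domain vocabulary can still recover the intended value. This sketch uses text similarity from Python's difflib purely for illustration; it is not how Rhino works internally, since Rhino infers from audio end to end. The vocabulary, the misheard word, and the 0.6 cutoff are assumptions.

```python
import difflib
from typing import List, Optional

# Assumed in-domain vocabulary for a coffee maker context.
DOMAIN_VOCAB: List[str] = ["americano", "cappuccino", "espresso"]


def exact_lookup(word: str) -> Optional[str]:
    """Open-domain style: accept the transcribed word only if it matches exactly."""
    return word if word in DOMAIN_VOCAB else None


def constrained_lookup(word: str, cutoff: float = 0.6) -> Optional[str]:
    """Domain-constrained style: snap to the closest in-domain word, if close enough."""
    matches = difflib.get_close_matches(word, DOMAIN_VOCAB, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    misheard = "expresso"  # hypothetical transcription error
    print(exact_lookup(misheard))        # None: the cascaded pipeline loses the slot
    print(constrained_lookup(misheard))  # espresso: the domain constraint recovers it
```

The same idea cuts the other way: a word far from anything in the vocabulary is rejected instead of being forced into the domain.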
How do I learn more about the terminology used for Natural Language Understanding (NLU) Engines?
Intents, expressions, and slots are terms commonly used in conversational AI and across various engines such as Amazon Lex, IBM Watson, Google Dialogflow or Rasa NLU. They’re used to build voice assistants and bots. Check out the Picovoice Glossary to learn more, or the Rhino Syntax Cheat Sheet to start building contexts with intents, slots, macros and expressions.
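To make the terms concrete, here is a minimal sketch of how intents, expressions, and slots relate, using a toy text matcher. The {slot} placeholder notation, the orderBeverage intent, and the slot names are made up for this example and are not Rhino's expression syntax; the matcher also works on text, whereas Rhino infers directly from speech.

```python
import re
from typing import Dict, List, Optional, Tuple

# Slots: named types with a closed set of allowed values.
SLOTS: Dict[str, List[str]] = {
    "size": ["small", "medium", "large"],
    "shots": ["single shot", "double shot", "triple shot"],
    "beverage": ["americano", "cappuccino", "espresso"],
}

# Intents: each intent lists the expressions (phrasings) that trigger it.
INTENTS: Dict[str, List[str]] = {
    "orderBeverage": [
        "make me a {size} {beverage}",
        "i want a {shots} {beverage}",
        "{size} {shots} {beverage}",
    ],
}


def compile_expression(expression: str) -> re.Pattern:
    """Turn each '{slot}' placeholder into a named group limited to that slot's values."""

    def to_group(match: re.Match) -> str:
        name = match.group(1)
        options = "|".join(re.escape(value) for value in SLOTS[name])
        return f"(?P<{name}>{options})"

    return re.compile(re.sub(r"\{(\w+)\}", to_group, expression))


def parse(utterance: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (intent, slots) for the first matching expression, else (None, {})."""
    text = utterance.lower().strip()
    for intent, expressions in INTENTS.items():
        for expression in expressions:
            found = compile_expression(expression).fullmatch(text)
            if found:
                return intent, found.groupdict()
    return None, {}


if __name__ == "__main__":
    print(parse("Make me a large espresso"))
    print(parse("tell me a joke"))  # outside the context, so not understood
```

An utterance outside the context yields no intent, which corresponds to the engine reporting that it did not understand the command.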
How can I add custom commands to voice control mobile or web applications?
The Picovoice docs are a great resource for learning how to add custom voice commands to Android and iOS applications and modern web browsers.
Does Rhino Speech-to-Intent process voice data locally on the device?
Rhino processes voice data locally on the device. If you haven’t yet, try the voice-activated coffee maker demo offline. After allowing microphone access, turn off your internet connection before running the demo. Rhino Speech-to-Intent infers intents from your utterances directly within your web browser.
Which platforms does Rhino Speech-to-Intent support?
Almost any vertical can benefit from conversational AI. Rhino is even getting ready for space exploration. Don’t forget to check out the use case pages, including Search by Voice and Voice Command and Control. If you’re looking for inspiration, check Picovoice's YouTube and Medium pages.
How do I get technical support for Rhino Speech-to-Intent?
Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice recognition, Picovoice engines, and how to start adding voice control to anything. Picovoice also offers GitHub community support to all Free Tier users.
What should I do if I need support for other languages?
Reach out to Picovoice Sales with details about the opportunity, including the use case, requirements, and project details.
How can I get informed about the updates and upgrades?
Version changes are announced in the Picovoice Newsletter and on LinkedIn and Twitter. Subscribing to the GitHub repository is the best way to get notified of patch releases. If you enjoy building with Rhino, don’t forget to give it a star when you’re on GitHub!