Rhino Speech-to-Intent

Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real-time. For example, given a spoken command:

Can I have a small double-shot espresso?

Rhino infers what the user wants and emits the following inference result:

{
  "isUnderstood": "true",
  "intent": "orderBeverage",
  "slots": {
    "beverage": "espresso",
    "size": "small",
    "numberOfShots": "2"
  }
}
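
As a minimal sketch of how an application might consume a result shaped like the one above, the Python snippet below parses the JSON and turns it into an order summary. The function name `summarize_order` and the wording of the summary are illustrative assumptions, not part of Rhino; the field names and the string-valued `isUnderstood` simply follow the example as printed.

```python
import json

# Inference result copied from the example above.
RESULT_JSON = """
{
  "isUnderstood": "true",
  "intent": "orderBeverage",
  "slots": {
    "beverage": "espresso",
    "size": "small",
    "numberOfShots": "2"
  }
}
"""


def summarize_order(result: dict) -> str:
    """Return a readable order summary, or a fallback when not understood."""
    # The example encodes the flag as the string "true"; a real bool also works.
    understood = str(result.get("isUnderstood", "")).lower() == "true"
    if not understood or result.get("intent") != "orderBeverage":
        return "Sorry, I did not understand that."
    slots = result.get("slots", {})
    shots = int(slots.get("numberOfShots", "1"))
    return f"{slots.get('size', 'regular')} {slots['beverage']} with {shots} shot(s)"


if __name__ == "__main__":
    print(summarize_order(json.loads(RESULT_JSON)))
```

Checking `isUnderstood` before touching `intent` or `slots` keeps the application from acting on a command that fell outside the context.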

Rhino is:

  • built on deep neural networks trained in real-world environments.
  • compact and computationally efficient, which makes it well suited to IoT devices.
  • cross-platform:
    • Arm Cortex-M, STM32, Arduino, and i.MX RT
    • Raspberry Pi, NVIDIA Jetson Nano, and BeagleBone
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
  • self-service. Developers can train custom contexts using Picovoice Console.

Arabic (اَلْعَرَبِيَّةُ)
Dutch (Nederlands)
English (English)
Farsi (فارسی)
French (Français)
German (Deutsch)
Hindi (हिन्दी)
Italian (Italiano)
Japanese (日本語)
Korean (한국어)
Mandarin (普通话)
Polish (Polski)
Portuguese (Português)
Russian (Русский)
Spanish (Español)
Swedish (Svenska)
Vietnamese (Tiếng Việt)

Afrikaans (Afrikaans)
Bengali (বাংলা)
Bulgarian (български)
Croatian (Hrvatski)
Czech (Čeština)
Danish (Dansk)
Estonian (Eesti keel)
Finnish (Suomi)
Greek (Ελληνικά)
Hebrew (עִברִית)
Hungarian (Magyar)
Icelandic (Íslenska)
Indonesian (Bahasa Indonesia)
Irish (Gaeilge)
Norwegian (Norsk Bokmål)
Romanian (Daco-Romanian)
Serbian (Српски језик)
Slovak (Slovenčina)
Slovenian (Slovenski jezik)
Thai (ภาษาไทย)
Turkish (Türkçe)
Ukrainian (Українська мова)
Urdu (اردو)

Get Started

Every Picovoice user needs a valid AccessKey. The AccessKey is your authentication and authorization token for Picovoice, and it also verifies that your usage stays within the limits of your account. Keep your AccessKey secret!
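
One common way to keep a secret such as the AccessKey out of source control is to read it from an environment variable at startup. The sketch below illustrates that pattern in Python; the variable name `PICOVOICE_ACCESS_KEY` and the helper `load_access_key` are hypothetical choices for this example, not names defined by Picovoice.

```python
import os

# Hypothetical variable name; use whatever fits your deployment.
ENV_VAR = "PICOVOICE_ACCESS_KEY"


def load_access_key(env=os.environ) -> str:
    """Read the AccessKey from the environment instead of hard-coding it."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {ENV_VAR} before starting the application.")
    return key


if __name__ == "__main__":
    try:
        load_access_key()
        print("AccessKey loaded.")  # never print the key itself
    except RuntimeError as error:
        print(error)
```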

Sign up for Picovoice Console

Sign up for Picovoice Console. It is free, and no credit card is required.

Retrieve AccessKey

Log in to your account, then click Show AccessKey to reveal your AccessKey.

Download SDK

Picovoice SDKs are available both on GitHub and via SDK-specific package managers. Follow one of the quick starts to use Rhino with your newly-created AccessKey.

Create a Context

A context represents the set of expressions (spoken commands), intents, and intent arguments (slots) within a domain of interest. Let's create a context for a "smart lighting system". This context will understand voice commands for controlling lights in a home.

First, navigate to the Rhino Speech-to-Intent console from the Picovoice Console's landing page:

Go to Rhino Console

Create a new Rhino context named PicoSmartLighting from an "Empty" template:

Create a new Rhino Context

Once created, the context appears in your list of contexts. Click the context name to open the context editor.

Create an intent

At the top level, a context is a collection of intents. For example, "turning lights on and off" can be a user intent in the context of PicoSmartLighting. Create an intent called turnLight:

Create turnLight intent

Adding expressions to an intent

A user can utter an intent in several ways. Each variation of a spoken command is called an expression. For example, "turn off all lights" and "turn the lights off" both map to the turnLight intent. Add these expressions to the turnLight intent:

Add Expressions
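
The relationship between an intent and its expressions can be pictured as a simple lookup table. The Python sketch below is a text-only illustration under that assumption: Rhino itself infers intent from audio, and the names `EXPRESSIONS_BY_INTENT`, `normalize`, and `infer_intent` are invented for this example.

```python
import re
from typing import Optional

# The two expressions from the text, grouped under the turnLight intent.
EXPRESSIONS_BY_INTENT = {
    "turnLight": ["turn off all lights", "turn the lights off"],
}


def normalize(utterance: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", utterance.lower())
    return " ".join(text.split())


def infer_intent(utterance: str) -> Optional[str]:
    """Return the intent whose expression list contains the utterance."""
    text = normalize(utterance)
    for intent, expressions in EXPRESSIONS_BY_INTENT.items():
        if text in expressions:
            return intent
    return None  # the utterance is outside the context


if __name__ == "__main__":
    print(infer_intent("Turn off all lights."))
    print(infer_intent("Open the garage door."))
```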

Use the Microphone to test the context

A unique feature of the Picovoice Console is that you can test the context at design time within the browser. Click on the microphone icon on the right. Picovoice Console then saves the context to your account, trains a Rhino model for it, and loads the model into your browser. When the microphone turns red, say:

Turn off all lights.

Test Rhino Context with Microphone

Use slots to capture variables within utterances

If a user can turn the lights off by saying "turn all lights off", it makes sense to allow turning them on using a command such as "turn all lights on". The only portion of the command that has changed is the state variable "on/off". We can capture this variable using a slot in Rhino contexts. Create a slot named lightState:

Create a slot

Modify the existing expressions to include the newly-created slot:

Add lightState Slot to Expressions

When entering a slot, use $ to instruct the editor that you are adding a slot. The autocomplete dropdown shows the list of available slot types. Pick the desired slot type and assign it a name. Click on the microphone and say:

Turn on all lights.

Test Slot with Microphone
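
To illustrate what a slot captures, the sketch below translates expressions containing a `$slotType:slotName` token into regular expressions with a named group. This is a text-only approximation written for this guide, not Rhino's implementation: the helper names are invented, the slot values "on" and "off" follow the text, and the second expression is an assumed example of a modified expression.

```python
import re

# Slot type and its values, as described in the text.
SLOTS = {"lightState": ["on", "off"]}

# The second expression is an assumption made for illustration.
EXPRESSIONS = [
    "turn $lightState:state all lights",
    "turn the lights $lightState:state",
]


def compile_expression(expression):
    """Turn '$type:name' tokens into named regex groups over the slot values."""
    parts = []
    for token in expression.split():
        if token.startswith("$"):
            slot_type, slot_name = token[1:].split(":")
            values = "|".join(re.escape(v) for v in SLOTS[slot_type])
            parts.append(f"(?P<{slot_name}>{values})")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts))


def match_slots(utterance):
    """Return captured slot values, or None if no expression matches."""
    text = re.sub(r"[^\w\s]", "", utterance.lower()).strip()
    for expression in EXPRESSIONS:
        match = compile_expression(expression).fullmatch(text)
        if match:
            return match.groupdict()
    return None


if __name__ == "__main__":
    print(match_slots("Turn on all lights."))
```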

Built-in slots

Some frequently used slots are included in the Picovoice Console Rhino editor by default. The names of built-in slot types start with the prefix "pv." and appear in the autocomplete of the expression editor when you type $ (see also: Rhino Expression Syntax Cheat Sheet).

Let's say we want to set the intensity of the lights. For example:

Set the lights to 72%.

We can use the built-in slot pv.Percent to achieve this:

Use Built-in Slot to Address Common Use Cases

Optionals

Often there are words within a sentence that can be omitted without changing the intent and meaning of the utterance:

  • turn $lightState:state lights
  • turn $lightState:state the lights
  • turn $lightState:state all lights
  • turn $lightState:state all the lights
  • please turn $lightState:state lights
  • please turn $lightState:state all lights
  • please turn $lightState:state all the lights

We can handle these variations with Rhino's optional phrase syntax, capturing all the above expressions with a single entry:

(please) turn $lightState:state (all) (the) lights

Phrases in parentheses can be omitted from an utterance, and the utterance will still match the intent. This allows the context to handle a large number of phrase variations without the number of expressions becoming unwieldy.
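
A quick way to see how much one entry covers is to enumerate its variations. The Python sketch below expands single-word optional phrases into every combination; `expand_optionals` is a hypothetical helper written for this guide, and it only handles optionals that contain no spaces.

```python
import itertools


def expand_optionals(expression):
    """List every variation of an expression with single-word '(optional)' phrases."""
    choices = []
    for token in expression.split():
        if token.startswith("(") and token.endswith(")"):
            choices.append(["", token[1:-1]])  # omitted or spoken
        else:
            choices.append([token])
    return [
        " ".join(word for word in combo if word)
        for combo in itertools.product(*choices)
    ]


if __name__ == "__main__":
    entry = "(please) turn $lightState:state (all) (the) lights"
    for variation in expand_optionals(entry):
        print(variation)
```

With three optional phrases the entry expands to 2 × 2 × 2 = 8 variations, which include all seven expressions listed above.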

Choice

Synonyms and alternate phrasing can be captured using Rhino choice syntax:

(please) [switch, turn] $lightState:state (all) (the) lights

The square brackets indicate a logical "OR": speaking any one of the listed phrases matches the expression.
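
The combined effect of choices and optionals can be sketched with a small tokenizer that treats a bracketed group as a list of alternatives and a parenthesized group as "present or absent". The code below is an illustrative approximation, not Rhino's parser; the names `TOKEN`, `alternatives`, and `expand` are assumptions, and nested groups are not handled.

```python
import itertools
import re

# A token is a [choice group], an (optional group), or a plain word.
TOKEN = re.compile(r"\[[^\]]*\]|\([^)]*\)|\S+")


def alternatives(token):
    """Return the phrases a single token can contribute."""
    if token.startswith("["):
        return [phrase.strip() for phrase in token[1:-1].split(",")]
    if token.startswith("("):
        return ["", token[1:-1].strip()]
    return [token]


def expand(expression):
    """Enumerate every phrase variation the expression can match."""
    options = [alternatives(token) for token in TOKEN.findall(expression)]
    return [
        " ".join(part for part in combo if part)
        for combo in itertools.product(*options)
    ]


if __name__ == "__main__":
    entry = "(please) [switch, turn] $lightState:state (all) (the) lights"
    print(len(expand(entry)))
```

Adding the two-way choice doubles the eight optional-only variations to sixteen, still from a single entry.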

Additional syntax

See the Rhino Expression Syntax Cheat Sheet for the complete list of supported syntax in Rhino expressions.

Training and Downloading a context model

When you have finished designing the context, click on the download button, select a target platform, and click "Download". The Picovoice Console then trains a Rhino model for the active context to run on the target platform. Training takes approximately 5-10 seconds.

Download Rhino Model
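
Once the model is downloaded, it can be loaded by one of the SDKs. The following is a minimal sketch assuming the Python SDK (`pip install pvrhino`) and its `create`, `process`, and `get_inference` calls; the file name `PicoSmartLighting.rhn` and the helper names `load_rhino` and `first_inference` are hypothetical, and the audio source is left to the caller. Consult the Python quick start for the authoritative usage.

```python
def load_rhino(access_key, context_path):
    """Create a Rhino engine from a downloaded context file."""
    import pvrhino  # imported lazily so the helper below works without the SDK

    return pvrhino.create(access_key=access_key, context_path=context_path)


def first_inference(engine, frames):
    """Feed audio frames to the engine until it finalizes an inference.

    `frames` is any iterable of PCM frames in the format the engine expects.
    Returns the inference object, or None if the audio ends first.
    """
    for frame in frames:
        if engine.process(frame):  # True once the engine has finalized
            return engine.get_inference()
    return None


# Example wiring (assumes an audio source that yields suitable frames):
#   rhino = load_rhino(access_key, "PicoSmartLighting.rhn")
#   inference = first_inference(rhino, audio_frames)
#   if inference is not None and inference.is_understood:
#       print(inference.intent, inference.slots)
#   rhino.delete()
```

Keeping the processing loop separate from engine creation makes it possible to exercise the loop with a stand-in engine during development.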

© 2019-2022 Picovoice Inc.