Rhino - Unity API

  • Speech-to-Intent Engine
  • Domain Specific NLU
  • Offline NLU
  • Local Voice Recognition
  • Unity
  • C#
  • Desktop
  • Mobile
  • Windows
  • macOS
  • Linux
  • Android
  • iOS

This document outlines how to integrate the Rhino Speech-to-Intent engine within an application using its Unity API.

Requirements

  • Unity 2017.4+

To deploy to iOS or Android, ensure you install the relevant Unity build support modules using Unity Hub.

Compatibility

  • Android 4.1+ (API 16+) (ARM only)
  • iOS 9.0+
  • Windows (x86_64)
  • macOS (x86_64)
  • Linux (x86_64)

Installation

The easiest way to install the Rhino Unity SDK is to import rhino.unitypackage into your Unity project by either dropping it into the Unity editor or going to Assets>Import Package>Custom Package...

Packaging

To build the package from source, you first have to clone the repo with submodules:

git clone --recurse-submodules [email protected]:Picovoice/rhino.git
# or
git clone --recurse-submodules https://github.com/Picovoice/rhino.git

You then have to run the copy.sh script to copy the package resources from various locations in the repo to the Unity project located at /binding/unity.
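
For instance, from the root of the cloned repo (assuming a Unix-like shell and that the script sits alongside the Unity project):

cd binding/unity
./copy.sh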

Then, open the Unity project, right-click the Assets folder, and select "Export Package". The resulting Unity package can be imported into other Unity projects as desired.

Usage

The module provides you with two levels of API to choose from depending on your needs.

High-Level API

RhinoManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

NOTE: If running on iOS, you must fill in the Microphone Usage Description under Project Settings>Other Settings in order to enable audio recording.

The static constructor RhinoManager.Create will create an instance of RhinoManager using the provided context file.

using System;
using Pv.Unity;

try
{
    RhinoManager _rhinoManager = RhinoManager.Create(
        "/path/to/context/file.rhn",
        OnInferenceResult);
}
catch (Exception ex)
{
    // handle rhino init error
}

The inferenceCallback parameter is a function that you want executed when Rhino makes an inference. The function should accept an Inference object that represents the inference result.

private void OnInferenceResult(Inference inference)
{
    if (inference.IsUnderstood)
    {
        string intent = inference.Intent;
        Dictionary<string, string> slots = inference.Slots;
        // add code to take action based on inferred intent and slot values
    }
    else
    {
        // add code to handle unsupported commands
    }
}

You can override the default Rhino model file and/or the inference sensitivity. There is also an optional errorCallback that is called if a problem is encountered while processing audio. These optional parameters can be passed in like so:

RhinoManager _rhinoManager = RhinoManager.Create(
    "/path/to/context/file.rhn",
    OnInferenceResult,
    modelPath: "/path/to/model/file.pv",
    sensitivity: 0.75f,
    errorCallback: OnError);

void OnError(Exception ex)
{
    Debug.LogError(ex.ToString());
}

Once you have instantiated a RhinoManager, you can start audio capture and intent inference by calling:

_rhinoManager.Process();

Audio capture stops and Rhino resets once an inference result is returned via the inference callback. When you wish to make another inference, call .Process() again.
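
For continuous interaction, one option is to restart audio capture from within the inference callback once the result has been handled (a sketch, reusing the OnInferenceResult callback and _rhinoManager instance from above):

private void OnInferenceResult(Inference inference)
{
    // .. take action based on the inference result ..

    // start listening for the next command
    _rhinoManager.Process();
}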

Once your app is done using an instance of RhinoManager, you can explicitly release the audio resources and the resources allocated to Rhino:

_rhinoManager.Delete();

There is no need to deal with audio capture to enable intent inference with RhinoManager. This is because it uses our unity-voice-processor Unity package to capture frames of audio and automatically pass them to the inference engine.

Low-Level API

Rhino provides low-level access to the inference engine for those who want to incorporate speech-to-intent into an existing audio processing pipeline.

To create an instance of Rhino, use the .Create static constructor with a context file.

using System;
using Pv.Unity;

try
{
    Rhino _rhino = Rhino.Create("path/to/context/file.rhn");
}
catch (Exception ex)
{
    // handle rhino init error
}

To feed audio into Rhino, pass frames of audio to its Process function until it has made an inference. You can then call GetInference to access the Inference object, which contains the following properties:

  • IsUnderstood - whether Rhino understood what it heard based on the context
    • Intent - if IsUnderstood, the name of the intent that was inferred
    • Slots - if IsUnderstood, a dictionary of the slot keys and values that were inferred

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

try
{
    bool isFinalized = _rhino.Process(GetNextAudioFrame());
    if (isFinalized)
    {
        Inference inference = _rhino.GetInference();
        if (inference.IsUnderstood)
        {
            string intent = inference.Intent;
            Dictionary<string, string> slots = inference.Slots;
            // .. code to take action based on inferred intent and slot values
        }
        else
        {
            // .. code to handle unsupported commands
        }
    }
}
catch (Exception ex)
{
    Debug.LogError(ex.ToString());
}

For Process to work correctly, the audio data must be in the audio format required by Picovoice. The required sample rate is specified by the SampleRate property, and the required number of audio samples in each frame is specified by the FrameLength property.
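
For example, audio captured in Unity typically arrives as floats in the range [-1, 1], while Process takes 16-bit samples as a short[]. A minimal conversion sketch, assuming the incoming buffer already holds FrameLength normalized samples:

short[] ConvertFrame(float[] samples)
{
    // convert normalized float samples to 16-bit PCM
    short[] frame = new short[samples.Length];
    for (int i = 0; i < samples.Length; i++)
    {
        frame[i] = (short)(samples[i] * short.MaxValue);
    }
    return frame;
}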

Rhino implements the IDisposable interface, so you can use Rhino in a using block. If you don't use a using block, resources will be released by the garbage collector automatically, or you can explicitly release them like so:

_rhino.Dispose();
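
Alternatively, a using block disposes the engine automatically when it goes out of scope:

using (Rhino _rhino = Rhino.Create("path/to/context/file.rhn"))
{
    // .. process audio frames
}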

Custom Contexts

You can create custom Rhino context models using Picovoice Console.

Custom Model Integration

To add a custom context to your Unity app, you'll need to add the .rhn file to your project root under /StreamingAssets. Then, in a script, retrieve it like so:

string contextPath = Path.Combine(Application.streamingAssetsPath, "context.rhn");
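
The resulting path can then be passed to either constructor; for instance, with the high-level API from above:

RhinoManager _rhinoManager = RhinoManager.Create(contextPath, OnInferenceResult);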

Non-English Contexts

In order to run inference on non-English contexts, you need to use the corresponding model file. The model files for all supported languages are available in the Rhino GitHub repository.
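
For example, a German context might be paired with its model file via the modelPath parameter (the paths and file names here are illustrative):

RhinoManager _rhinoManager = RhinoManager.Create(
    "/path/to/context_de.rhn",
    OnInferenceResult,
    modelPath: "/path/to/rhino_params_de.pv");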

