Picovoice Platform - Unity API

  • End-to-End Voice Platform
  • Offline Voice Recognition
  • Local Speech Recognition
  • Speech-to-Intent
  • Domain-Specific NLU
  • Wake Word Detection
  • Unity
  • C#
  • Desktop
  • Mobile
  • Windows
  • macOS
  • Linux
  • Android
  • iOS

This document outlines how to integrate the Picovoice end-to-end voice platform within an application using its Unity API.

Requirements

  • Unity 2017.4+

To deploy to iOS or Android, ensure you install the relevant Unity build support modules using Unity Hub.

Installation

The easiest way to install the Picovoice Unity SDK is to import picovoice.unitypackage into your Unity project by either dropping it into the Unity editor or going to Assets>Import Package>Custom Package...

Packaging

To build the package from source, you first have to clone the repo with submodules:

git clone --recurse-submodules [email protected]:Picovoice/picovoice.git
# or
git clone --recurse-submodules https://github.com/Picovoice/picovoice.git

You then have to run the copy.sh script to copy the package resources from various locations in the repo to the Unity project located at /sdk/unity.

Then, open the Unity project, right-click the Assets folder, and select "Export Package". The resulting Unity package can be imported into other Unity projects as desired.

Usage

The module provides you with two levels of API to choose from depending on your needs.

High-Level API

PicovoiceManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

NOTE: If running on iOS, you must fill in the Microphone Usage Description under Project Settings>Other Settings in order to enable audio recording.

The constructor will create an instance of the PicovoiceManager using the Porcupine keyword and Rhino context files that you pass to it.

using Pv.Unity;

PicovoiceManager _picovoiceManager = new PicovoiceManager(
    "/path/to/keyword/file.ppn",
    OnWakeWordDetected,
    "/path/to/context/file.rhn",
    OnInferenceResult);

The wakeWordCallback and inferenceCallback arguments are the functions invoked when a wake word is detected and when an inference is made, respectively.

private void OnWakeWordDetected()
{
    // wake word detected!
}

private void OnInferenceResult(Inference inference)
{
    if (inference.IsUnderstood)
    {
        string intent = inference.Intent;
        Dictionary<string, string> slots = inference.Slots;
        // add code to take action based on inferred intent and slot values
    }
    else
    {
        // add code to handle unsupported commands
    }
}

You can override the default model files and sensitivities. There is also an optional errorCallback that is called if there is a problem encountered while processing audio. These optional parameters can be passed in like so:

PicovoiceManager _picovoiceManager = new PicovoiceManager(
    "/path/to/keyword/file.ppn",
    OnWakeWordDetected,
    "/path/to/context/file.rhn",
    OnInferenceResult,
    porcupineModelPath: "/path/to/porcupine/model.pv",
    porcupineSensitivity: 0.75f,
    rhinoModelPath: "/path/to/rhino/model.pv",
    rhinoSensitivity: 0.6f,
    errorCallback: OnError);

void OnError(Exception ex)
{
    Debug.LogError(ex.ToString());
}

Once you have instantiated a PicovoiceManager, you can start audio capture and processing by calling:

try
{
    _picovoiceManager.Start();
}
catch (Exception ex)
{
    Debug.LogError(ex.ToString());
}

And then stop it by calling:

_picovoiceManager.Stop();

PicovoiceManager uses our unity-voice-processor Unity package to capture frames of audio and automatically pass them to the Picovoice platform.
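Putting the pieces above together, a minimal MonoBehaviour that starts capture on scene load and stops it on teardown might look like the following sketch (the keyword and context file paths are placeholders):

```csharp
using System;
using System.Collections.Generic;
using UnityEngine;
using Pv.Unity;

public class VoiceController : MonoBehaviour
{
    private PicovoiceManager _picovoiceManager;

    void Start()
    {
        _picovoiceManager = new PicovoiceManager(
            "/path/to/keyword/file.ppn",
            OnWakeWordDetected,
            "/path/to/context/file.rhn",
            OnInferenceResult);

        try
        {
            // begin audio capture and processing
            _picovoiceManager.Start();
        }
        catch (Exception ex)
        {
            Debug.LogError(ex.ToString());
        }
    }

    void OnDestroy()
    {
        // stop audio capture when this object is torn down
        _picovoiceManager.Stop();
    }

    private void OnWakeWordDetected()
    {
        // wake word detected!
    }

    private void OnInferenceResult(Inference inference)
    {
        // add code to act on the inference result
    }
}
```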

Low-Level API

Picovoice provides low-level access to the Picovoice platform for those who want to incorporate it into an existing audio processing pipeline.

Picovoice is created by passing a Porcupine keyword file and Rhino context file to the Create static constructor.

using Pv.Unity;

Picovoice _picovoice;
try
{
    _picovoice = Picovoice.Create(
        "path/to/keyword/file.ppn",
        OnWakeWordDetected,
        "path/to/context/file.rhn",
        OnInferenceResult);
}
catch (Exception ex)
{
    // handle Picovoice init error
}
private void OnWakeWordDetected()
{
    // wake word detected!
}

private void OnInferenceResult(Inference inference)
{
    if (inference.IsUnderstood)
    {
        string intent = inference.Intent;
        Dictionary<string, string> slots = inference.Slots;
        // add code to take action based on inferred intent and slot values
    }
    else
    {
        // add code to handle unsupported commands
    }
}

To use Picovoice, you must pass frames of audio to the Process function. The callbacks will automatically trigger when the wake word is detected and then when the follow-on command is detected.

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

short[] buffer = GetNextAudioFrame();
try
{
    _picovoice.Process(buffer);
}
catch (Exception ex)
{
    Debug.LogError(ex.ToString());
}

For Process to work correctly, the audio data must be in the format required by Picovoice: the required sample rate is exposed by the SampleRate property, and the required number of samples per frame by the FrameLength property.
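As a sketch of a continuous processing loop (GetNextAudioFrame is a hypothetical helper that returns 16-bit PCM audio recorded at the engine's required sample rate, and _isListening is an assumed flag controlled by your app):

```csharp
// feed audio to Picovoice until stopped
while (_isListening)
{
    short[] frame = GetNextAudioFrame();

    // guard against mis-sized frames before handing them to the engine
    if (frame.Length != _picovoice.FrameLength)
    {
        Debug.LogError("Audio frame size does not match Picovoice FrameLength");
        continue;
    }

    _picovoice.Process(frame);
}
```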

Picovoice implements the IDisposable interface, so you can use it in a using block. If you don't use a using block, resources will eventually be released by the garbage collector, or you can release them explicitly:

_picovoice.Dispose();
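Alternatively, scoping the instance with a using statement releases the underlying resources as soon as the block exits, e.g.:

```csharp
using (Picovoice picovoice = Picovoice.Create(
    "path/to/keyword/file.ppn",
    OnWakeWordDetected,
    "path/to/context/file.rhn",
    OnInferenceResult))
{
    // pass frames of audio to picovoice.Process here
} // Dispose is called automatically when the block exits
```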

Custom Wake Words & Contexts

You can create custom Porcupine wake word and Rhino context models using the Picovoice Console.

Custom Model Integration

To add custom models to your Unity app, you'll need to place them in your project's StreamingAssets folder (Assets/StreamingAssets). Then, in a script, retrieve them like so:

string keywordPath = Path.Combine(Application.streamingAssetsPath, "keyword.ppn");
string contextPath = Path.Combine(Application.streamingAssetsPath, "context.rhn");
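Note that on Android, Application.streamingAssetsPath points inside the compressed APK, so the model files cannot be opened with ordinary file APIs. One common workaround (a sketch, not part of the Picovoice API) is to copy them to Application.persistentDataPath with UnityWebRequest first and pass the copied path to Picovoice:

```csharp
using System.Collections;
using System.IO;
using UnityEngine;
using UnityEngine.Networking;

// hypothetical coroutine: extract a StreamingAssets file on Android
IEnumerator ExtractAsset(string fileName)
{
    string src = Path.Combine(Application.streamingAssetsPath, fileName);
    string dst = Path.Combine(Application.persistentDataPath, fileName);

    using (UnityWebRequest req = UnityWebRequest.Get(src))
    {
        yield return req.SendWebRequest();
        File.WriteAllBytes(dst, req.downloadHandler.data);
    }
    // dst is now a regular file path that can be passed to
    // PicovoiceManager or Picovoice.Create
}
```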

Non-English Models

In order to detect wake words and run inference in other languages, you need to use the corresponding model file. The model files for all supported languages are available in the Porcupine and Rhino GitHub repositories.
