This article is a comprehensive guide to adding on-device Speech Recognition to a Unity project.

When used casually, Speech Recognition usually refers solely to Speech-to-Text. However, Speech-to-Text is only one facet of Speech Recognition, which also covers features such as Wake Word Detection, Voice Command Recognition, and Voice Activity Detection (VAD). In the context of Unity projects, Speech Recognition can be used to implement a Voice Interface.

Fortunately, Picovoice offers a few tools to help implement Voice Interfaces. If all that is needed is to recognize when specific words or phrases are spoken, use Porcupine Wake Word. If Voice Commands need to be understood and intents extracted along with their details (i.e., slot values), Rhino Speech-to-Intent is more suitable. Keep reading to see how to get started quickly with both.

Picovoice Unity SDKs have cross-platform support for Linux, macOS, Windows, Android and iOS!

Porcupine Wake Word

  1. To integrate the Porcupine Wake Word SDK into your Unity project, download and import the latest Porcupine Unity package.

  2. Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.

  3. Create a custom wake word model using Picovoice Console.

  4. Download the .ppn model file and copy it into your project's StreamingAssets folder.

  5. Write a callback that takes action when a keyword is detected, as shown in the sketch after this list.

  6. Initialize the Porcupine Wake Word engine with the callback and the .ppn file name (or a path relative to the StreamingAssets folder).

  7. Start detecting.
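
The following is a minimal sketch tying steps 5-7 together, built around the high-level `PorcupineManager` class from the Porcupine Unity package. The AccessKey string, the MonoBehaviour name, and the `my-wake-word.ppn` file name are placeholders; check the imported package for the exact method signatures, as they can vary slightly between SDK versions.

```csharp
using System.Collections.Generic;
using UnityEngine;
using Pv.Unity;

public class WakeWordExample : MonoBehaviour
{
    // Placeholders: replace with your Picovoice Console AccessKey and .ppn file name.
    private const string AccessKey = "${YOUR_ACCESS_KEY}";
    private const string KeywordFileName = "my-wake-word.ppn"; // relative to StreamingAssets

    private PorcupineManager _porcupineManager;

    private void Start()
    {
        // Step 6: initialize the engine with the callback and the keyword file.
        _porcupineManager = PorcupineManager.FromKeywordPaths(
            AccessKey,
            new List<string> { KeywordFileName },
            OnWakeWordDetected);

        // Step 7: start listening to the microphone for the wake word.
        _porcupineManager.Start();
    }

    // Step 5: callback invoked when a keyword is detected.
    private void OnWakeWordDetected(int keywordIndex)
    {
        Debug.Log($"Wake word #{keywordIndex} detected!");
        // Take action here, e.g. open a menu or start listening for a voice command.
    }

    private void OnDestroy()
    {
        // Release audio and engine resources when the scene object is destroyed.
        if (_porcupineManager != null)
        {
            _porcupineManager.Stop();
            _porcupineManager.Delete();
        }
    }
}
```

Attach the script to a GameObject in the scene; detection runs continuously until `Stop()` is called.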

For further details, visit the Porcupine Wake Word product page or refer to Porcupine's Unity SDK quick start guide.

Rhino Speech-to-Intent

  1. To integrate the Rhino Speech-to-Intent SDK into your Unity project, download and import the latest Rhino Unity package.

  2. Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is only required for authentication and authorization.

  3. Create a custom context model using Picovoice Console.

  4. Download the .rhn model file and copy it into your project's StreamingAssets folder.

  5. Write a callback that takes action when a user's intent is inferred, as shown in the sketch after this list.

  6. Initialize the Rhino Speech-to-Intent engine with the callback and the .rhn file name (or a path relative to the StreamingAssets folder).

  7. Start inferring.
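
The following is a minimal sketch tying steps 5-7 together, built around the high-level `RhinoManager` class from the Rhino Unity package. The AccessKey string, the MonoBehaviour name, and the `my-context.rhn` file name are placeholders; check the imported package for the exact method signatures, as they can vary slightly between SDK versions.

```csharp
using System.Collections.Generic;
using UnityEngine;
using Pv.Unity;

public class VoiceCommandExample : MonoBehaviour
{
    // Placeholders: replace with your Picovoice Console AccessKey and .rhn file name.
    private const string AccessKey = "${YOUR_ACCESS_KEY}";
    private const string ContextFileName = "my-context.rhn"; // relative to StreamingAssets

    private RhinoManager _rhinoManager;

    private void Start()
    {
        // Step 6: initialize the engine with the callback and the context file.
        _rhinoManager = RhinoManager.Create(
            AccessKey,
            ContextFileName,
            OnInferenceResult);

        // Step 7: start listening; Rhino finalizes the inference on its own.
        _rhinoManager.Process();
    }

    // Step 5: callback invoked once the user's intent has been inferred.
    private void OnInferenceResult(Inference inference)
    {
        if (inference.IsUnderstood)
        {
            string intent = inference.Intent;
            Dictionary<string, string> slots = inference.Slots;
            Debug.Log($"Inferred intent: {intent}");
            // Take action based on the intent and slot values here.
        }
        else
        {
            Debug.Log("Command not understood.");
        }

        // Call _rhinoManager.Process() again whenever the next command should be captured.
    }

    private void OnDestroy()
    {
        // Release audio and engine resources when the scene object is destroyed.
        if (_rhinoManager != null)
        {
            _rhinoManager.Delete();
        }
    }
}
```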

For further details, visit the Rhino Speech-to-Intent product page or refer to Rhino's Unity SDK quick start guide.