Picovoice - Web API


This document outlines how to integrate the Picovoice SDK within an application using its Web API.

Requirements

  • yarn (or npm)
  • Secure browser context (i.e. HTTPS or localhost)

Compatibility

  • Chrome, Edge
  • Firefox
  • Safari

The Picovoice SDKs for Web are powered by WebAssembly (WASM), the Web Audio API, and Web Workers. All audio processing is performed in-browser, providing intrinsic privacy and reliability.

All modern browsers are supported, including on mobile. Internet Explorer is not supported.

Using the Web Audio API requires a secure context (an HTTPS connection). The one exception is localhost, which is permitted for local development.
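
The requirements above can be checked before any voice model is loaded. The following is a minimal sketch, not part of the Picovoice SDK: the function name findMissingPrerequisites is hypothetical, and it only inspects standard browser globals (isSecureContext, WebAssembly, Worker, navigator.mediaDevices.getUserMedia) on whatever object it is given.

```javascript
// Hypothetical helper (not part of the Picovoice SDK).
// Returns the names of missing browser prerequisites; an empty array
// means the basic requirements listed above appear to be met.
function findMissingPrerequisites(env) {
  const missing = [];
  if (!env.isSecureContext) {
    missing.push("secure context (HTTPS or localhost)");
  }
  if (!env.WebAssembly || typeof env.WebAssembly !== "object") {
    missing.push("WebAssembly");
  }
  if (typeof env.Worker !== "function") {
    missing.push("Web Workers");
  }
  const mediaDevices = env.navigator && env.navigator.mediaDevices;
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== "function") {
    missing.push("microphone access (navigator.mediaDevices.getUserMedia)");
  }
  return missing;
}

// In a browser, pass the global object:
//   const missing = findMissingPrerequisites(window);
//   if (missing.length > 0) { /* show a message instead of starting Picovoice */ }
```

Taking the environment as a parameter keeps the check easy to exercise outside a browser; in a page it would simply be called with window.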

JavaScript Frameworks

Looking to use Picovoice with React, Angular, or Vue? Framework-specific packages are available.

The framework-specific packages operate at a higher level of abstraction and are meant to integrate as quickly and easily as possible, following each framework's conventions and best practices. Furthermore, the demo applications show the complete lifecycle of using Picovoice in a component, including setup and teardown. Interaction with Web Workers is hidden behind a facade, and the Web Voice Processor is coordinated behind the scenes to set up the microphone automatically.

Otherwise, this doc will provide instruction on how to use Picovoice with "Vanilla" JavaScript and HTML, connecting it to the Web Voice Processor for microphone use.

Introduction

Picovoice for Web is available in two flavors, Worker and Factory:

  • The Worker packages are all-in-one Web Workers which wrap Picovoice instances that will work with the @picovoice/web-voice-processor (and the Angular, React, and Vue packages).
  • The Factory packages give you access to instances directly. This is useful if you wish to build your own worker/worklet, or perhaps use Picovoice in some other custom scenario.

Structure

The Picovoice SDK for Web is provided in several npm packages, due to the logistics and size of shipping ~4-6MB voice models.

Workers

For typical cases, use the worker packages. Worker packages create complete self-contained Picovoice Web Worker instances that can be immediately used with @picovoice/web-voice-processor and with the Angular, React, and Vue packages.

Factories

Factory packages allow you to create instances of Picovoice directly. This is useful for building your own custom Worker/Worklet, or for some other bespoke purpose.

Installation & Usage

Worker: Using modern JavaScript, ES Modules, Bundlers (e.g. Webpack)

To obtain a Picovoice Worker, we can use the static create factory method from the PicovoiceWorkerFactory. Here is a complete example that:

  1. Obtains a Worker from the PicovoiceWorkerFactory (in this case, English) to listen for the built-in English wake word "Picovoice"
  2. Responds to the wake word detection and inference events by setting the worker's onmessage event handler
  3. Starts up the WebVoiceProcessor to forward microphone audio to the Picovoice Worker

E.g.:

yarn add @picovoice/web-voice-processor @picovoice/picovoice-web-en-worker

import { WebVoiceProcessor } from "@picovoice/web-voice-processor";
import { PicovoiceWorkerFactory } from "@picovoice/picovoice-web-en-worker";

const RHINO_CONTEXT_BASE64 = /* Base64 representation of a Rhino .rhn file, omitted for brevity */

// Declared at module scope so that both can be released later
let picovoiceWorker;
let webVp;

async function startPicovoice() {
  // Create a Picovoice Worker (English language) to listen for
  // the built-in keyword "Picovoice" and follow-on commands in the given Rhino context.
  // Note: you receive a Worker object, _not_ an individual Picovoice instance
  picovoiceWorker = await PicovoiceWorkerFactory.create({
    porcupineKeyword: { builtin: "Picovoice" },
    rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
    start: true,
  });

  // The worker will send a message with data.command = "ppn-keyword" upon a detection event
  // and data.command = "rhn-inference" when the follow-on inference concludes.
  // Here, we tell it to log both to the console:
  picovoiceWorker.onmessage = (msg) => {
    switch (msg.data.command) {
      case "ppn-keyword":
        // Wake word detection
        console.log("Wake word: " + msg.data.keywordLabel);
        break;
      case "rhn-inference":
        // Follow-on command inference concluded (the inference is an object)
        console.log("Inference: " + JSON.stringify(msg.data.inference));
        break;
      default:
        break;
    }
  };

  // Start up the web voice processor. It will request microphone permission
  // and immediately (start: true) start listening.
  // It downsamples the audio to voice recognition standard format (16-bit 16kHz linear PCM, single-channel)
  // The incoming microphone audio frames will then be forwarded to the Picovoice Worker
  // n.b. This promise will reject if the user refuses permission! Make sure you handle that possibility.
  webVp = await WebVoiceProcessor.init({
    engines: [picovoiceWorker],
    start: true,
  });
}

// Handle the rejection, e.g. when the user refuses microphone permission
startPicovoice().catch((error) => {
  console.error("Picovoice failed to start: " + error);
});

...

// Finished with Picovoice? Release the WebVoiceProcessor and the worker.
if (done) {
  webVp.release();
  picovoiceWorker.postMessage({ command: "release" });
}
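
As the number of handled events grows, the onmessage switch from the worker example can be factored into a small dispatcher. The sketch below is illustrative only: createMessageHandler and its callback names (onKeyword, onInference, onOther) are hypothetical, while the command names "ppn-keyword" and "rhn-inference" and the keywordLabel and inference fields are taken from the example above.

```javascript
// Hypothetical helper (not part of the Picovoice SDK).
// Builds an onmessage handler that routes worker messages to callbacks.
function createMessageHandler({ onKeyword, onInference, onOther = () => {} }) {
  return (msg) => {
    switch (msg.data.command) {
      case "ppn-keyword":
        // Wake word detected
        onKeyword(msg.data.keywordLabel);
        break;
      case "rhn-inference":
        // Follow-on command inference concluded
        onInference(msg.data.inference);
        break;
      default:
        // Any other message is passed through untouched
        onOther(msg.data);
        break;
    }
  };
}

// Usage with a stand-in message; a real worker delivers MessageEvent objects.
const events = [];
const handler = createMessageHandler({
  onKeyword: (label) => events.push("wake:" + label),
  onInference: (inference) => events.push("inference:" + JSON.stringify(inference)),
});
handler({ data: { command: "ppn-keyword", keywordLabel: "Picovoice" } });
```

With a real worker, the handler would be assigned directly: picovoiceWorker.onmessage = handler.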

Worker: Script Tag / IIFE / CDN

Picovoice's worker and factory packages are also available in IIFE format, intended for direct inclusion into HTML instead of a bundler. You can use local node modules, or use the CDN unpkg version for rapid prototyping:

<!DOCTYPE html>
<html lang="en">
  <head>
    <script src="node_modules/@picovoice/picovoice-web-en-worker/dist/iife/index.js"></script>
    <script src="node_modules/@picovoice/web-voice-processor/dist/iife/index.js"></script>
    <script type="application/javascript">
      const RHINO_CONTEXT_BASE64 = /* Base64 representation of a Rhino .rhn file, omitted for brevity */

      // Log a message to the console and append it to the page
      function writeMessage(message) {
        console.log(message)
        let p = document.createElement("p")
        let text = document.createTextNode(message)
        p.appendChild(text)
        document.body.appendChild(p)
      }

      async function startPicovoice() {
        writeMessage("Picovoice is loading. Please wait...")
        const picovoiceWorker = await PicovoiceWebEnWorker.PicovoiceWorkerFactory.create(
          {
            porcupineKeyword: { builtin: "Picovoice" },
            rhinoContext: { base64: RHINO_CONTEXT_BASE64 },
            start: true,
          }
        )
        writeMessage("Picovoice worker ready!")

        picovoiceWorker.onmessage = msg => {
          switch (msg.data.command) {
            case "ppn-keyword": {
              writeMessage(
                "Wake word detected: " + JSON.stringify(msg.data.keywordLabel)
              )
              break
            }
            case "rhn-inference": {
              writeMessage(
                "Inference detected: " + JSON.stringify(msg.data.inference)
              )
              break
            }
          }
        }

        writeMessage(
          "WebVoiceProcessor initializing. Microphone permissions requested ..."
        )
        try {
          let webVp = await WebVoiceProcessor.WebVoiceProcessor.init({
            engines: [picovoiceWorker],
            start: true,
          })
          writeMessage(
            "WebVoiceProcessor ready! Say 'Picovoice' to start the interaction."
          )
        } catch (e) {
          writeMessage("WebVoiceProcessor failed to initialize: " + e)
        }
      }

      document.addEventListener("DOMContentLoaded", function () {
        startPicovoice()
      })
    </script>
  </head>
  <body>
  </body>
</html>

Factory

If you wish to build your own worker, or perhaps not use workers at all, use the factory packages. This will let you instantiate Picovoice engine instances directly.

The audio passed to the engine must be in the correct format (16-bit, 16 kHz linear PCM, single-channel). The WebVoiceProcessor handles downsampling in the examples above. If you are not using it, you must convert the audio yourself.

E.g.:

// Factory (not worker) package
import { Picovoice } from "@picovoice/picovoice-web-en-factory"

// Tracks which engine is listening: 'ppn' (wake word) or 'rhn' (follow-on command)
let engineControl = 'ppn'

async function startPicovoice() {
  const handle = await Picovoice.create({
    porcupineKeyword: { builtin: "Bumblebee", sensitivity: 0.7 },
    rhinoContext: { base64: /* Base64 representation of a .rhn file */ },
  })
  return handle
}

const porcupineCb = (keywordLabel) => { console.log("Keyword detected: " + keywordLabel) }
const rhinoCb = (inference) => { console.log("Inference concluded: " + JSON.stringify(inference)) }

// Process a single frame of audio and invoke the callback for the active engine
function processFrame(handle, inputFrame, porcupineCallback, rhinoCallback) {
  switch (engineControl) {
    case 'ppn': {
      const keywordIndex = handle.process(inputFrame)
      if (keywordIndex !== -1) {
        engineControl = 'rhn'
        porcupineCallback(handle.keywordLabels.get(keywordIndex))
      }
      break
    }
    case 'rhn': {
      const inference = handle.process(inputFrame)
      if (inference.isFinalized) {
        engineControl = 'ppn'
        rhinoCallback(inference)
      }
      break
    }
  }
}

startPicovoice().then((handle) => {
  // Send Picovoice frames of audio (check handle.frameLength for size of array)
  const audioFrames = new Int16Array(/* Provide data with correct format and size */)
  processFrame(handle, audioFrames, porcupineCb, rhinoCb)
})
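
When the WebVoiceProcessor is not used, the audio has to be converted to 16-bit, 16 kHz, single-channel PCM and cut into frames by hand. The following is a rough sketch of that conversion under stated assumptions: resampleToInt16 and splitIntoFrames are hypothetical helpers, the resampler uses plain linear interpolation with no anti-aliasing filter (the WebVoiceProcessor's own downsampler may work differently), and the frame length of 512 in the usage lines is only a placeholder for whatever handle.frameLength reports.

```javascript
// Hypothetical helper: convert mono Float32 samples in [-1, 1] at inputRate
// to 16-bit PCM at outputRate using linear interpolation.
// A sketch only: there is no low-pass filter, so some aliasing is possible.
function resampleToInt16(input, inputRate, outputRate = 16000) {
  const ratio = inputRate / outputRate;
  const outputLength = Math.floor(input.length / ratio);
  const output = new Int16Array(outputLength);
  for (let i = 0; i < outputLength; i++) {
    const position = i * ratio;
    const left = Math.floor(position);
    const right = Math.min(left + 1, input.length - 1);
    const fraction = position - left;
    // Interpolate between the two nearest input samples
    const sample = input[left] * (1 - fraction) + input[right] * fraction;
    // Clamp, then scale to the signed 16-bit range
    const clamped = Math.max(-1, Math.min(1, sample));
    output[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  }
  return output;
}

// Hypothetical helper: cut samples into complete frames of frameLength.
// A trailing partial frame is dropped; a real pipeline would buffer it
// and prepend it to the next chunk of audio.
function splitIntoFrames(samples, frameLength) {
  const frames = [];
  for (let start = 0; start + frameLength <= samples.length; start += frameLength) {
    frames.push(samples.subarray(start, start + frameLength));
  }
  return frames;
}

// Usage: one second of 48 kHz audio becomes 16000 samples of 16 kHz PCM.
const oneSecond = new Float32Array(48000);
const pcm = resampleToInt16(oneSecond, 48000);
const frames = splitIntoFrames(pcm, 512); // 512 is a placeholder for handle.frameLength
```

Each element of frames could then be passed to handle.process in turn.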

Picovoice Factory Parameters

Picovoice (via the factory or worker factory) accepts one PorcupineKeyword and one RhinoContext. Both are passed to the static create method, which returns a promise that resolves to a Picovoice instance or a Picovoice Web Worker instance, respectively.

Porcupine Keyword

Picovoice accepts a single PorcupineKeyword object, which can be either a PorcupineKeywordBuiltin or a PorcupineKeywordCustom:

export type PorcupineKeywordBuiltin = {
  /** Name of a builtin keyword for the specific language (e.g. "Grasshopper" for English, or "Ananas" for German) */
  builtin: string
  /** Value in range [0,1] that trades off miss rate for false alarm */
  sensitivity?: number
}

The builtin keywords available for each language are:

English

  • "Americano"
  • "Blueberry"
  • "Bumblebee"
  • "Grapefruit"
  • "Grasshopper"
  • "Hey Google"
  • "Hey Siri"
  • "Jarvis"
  • "Okay Google"
  • "Picovoice"
  • "Porcupine"
  • "Terminator"

French

  • "Framboise"
  • "Mon Chouchou"
  • "Parapluie"
  • "Perroquet"
  • "Tournesol"

German

  • "Ananas"
  • "Heuschrecke"
  • "Himbeere"
  • "Leguan"
  • "Stachelschwein"

Spanish

  • "Emparedado"
  • "Leopardo"
  • "Manzana"
  • "Murcielago"

Passing the name of a builtin keyword as a plain string, instead of an object, also works.

Use PorcupineKeywordCustom for custom keywords:

export type PorcupineKeywordCustom = {
  /** Base64 representation of a trained Porcupine keyword (`.ppn` file) */
  base64: string
  /** An arbitrary label that you want Picovoice to report when the detection occurs */
  custom: string
  /** Value in range [0,1] that trades off miss rate for false alarm */
  sensitivity?: number
}
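
Because a keyword may be given as a plain string, a builtin object, or a custom object, it can help to normalize and validate it before handing it to the factory. The sketch below is one possible way to do that: normalizeKeyword is a hypothetical helper, not an SDK function, and it only enforces what the type definitions above state (the required fields and the [0,1] sensitivity range).

```javascript
// Hypothetical helper (not part of the Picovoice SDK).
// Accepts a builtin keyword name, a PorcupineKeywordBuiltin-shaped object,
// or a PorcupineKeywordCustom-shaped object, and returns the object form.
function normalizeKeyword(keyword) {
  const normalized =
    typeof keyword === "string" ? { builtin: keyword } : { ...keyword };

  const isBuiltin = typeof normalized.builtin === "string";
  const isCustom =
    typeof normalized.base64 === "string" &&
    typeof normalized.custom === "string";
  if (!isBuiltin && !isCustom) {
    throw new TypeError(
      "Keyword needs either 'builtin', or both 'base64' and 'custom'"
    );
  }

  const { sensitivity } = normalized;
  if (
    sensitivity !== undefined &&
    !(typeof sensitivity === "number" && sensitivity >= 0 && sensitivity <= 1)
  ) {
    throw new RangeError("Keyword sensitivity must be a number in [0, 1]");
  }
  return normalized;
}

// Usage: both calls yield an object with a 'builtin' field.
const fromString = normalizeKeyword("Picovoice");
const fromObject = normalizeKeyword({ builtin: "Bumblebee", sensitivity: 0.7 });
```

Validating up front gives a clearer error than waiting for the promise returned by create to reject.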

Rhino Context

The PicovoiceFactoryArgs also accepts a rhinoContext field of type RhinoContext:

export type RhinoContext = {
  /** Base64 representation of a trained Rhino context (`.rhn` file) */
  base64: string
  /** Value in range [0,1] that trades off miss rate for false alarm */
  sensitivity?: number
}
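
Both the custom keyword and the Rhino context expect the file contents as a Base64 string. As a minimal sketch of producing one from raw bytes, the hypothetical helper below uses Buffer when it runs under Node.js and falls back to the browser's btoa otherwise; how the bytes are obtained (for example, by reading a .ppn or .rhn file in a build script) is left to the caller.

```javascript
// Hypothetical helper (not part of the Picovoice SDK).
// Encodes raw bytes, such as the contents of a .ppn or .rhn file, as Base64.
function bytesToBase64(bytes) {
  if (typeof Buffer !== "undefined") {
    // Node.js: Buffer handles the encoding directly
    return Buffer.from(bytes).toString("base64");
  }
  // Browsers: build a binary string, then encode it with btoa
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Usage with two stand-in bytes (the ASCII codes for "Hi"):
const encoded = bytesToBase64(new Uint8Array([72, 105]));
```

The resulting string would then be supplied as the base64 field of a PorcupineKeywordCustom or RhinoContext object.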
