Picovoice SDK - NodeJS Quick Start


Requirements

  • NodeJS 10+
  • yarn (or npm)

Compatibility

  • Linux (x86_64)
  • macOS (x86_64)
  • Raspberry Pi (2,3,4)

Web Browsers

Looking to run Picovoice in-browser? See the JavaScript WebAssembly demo instead.

Setup

Cloning the Repository

If using SSH, clone the repository with:

git clone --recurse-submodules [email protected]:Picovoice/picovoice.git

If using HTTPS, clone it with:

git clone --recurse-submodules https://github.com/Picovoice/picovoice.git

Microphone

macOS

See the node-record-lpcm16 documentation for instructions on installing SoX.

Linux & Raspberry Pi

Connect the microphone and get the list of available input audio devices:

arecord -L

The output will be similar to the following:

null
    Discard all samples (playback) or generate zero samples (capture)
default
mic
sysdefault:CARD=Device
    USB PnP Sound Device, USB Audio
    Default Audio Device
hw:CARD=Device,DEV=0
    USB PnP Sound Device, USB Audio
    Direct hardware device without any conversions
plughw:CARD=Device,DEV=0
    USB PnP Sound Device, USB Audio
    Hardware device with all software conversions

In this case, we pick plughw:CARD=Device,DEV=0. Note that this device comes with software conversions, which are handy for resampling. In what follows, we refer to this value as ${INPUT_AUDIO_DEVICE}.

Create ~/.asoundrc with the following content:

pcm.!default {
    type asym
    capture.pcm "mic"
}
pcm.mic {
    type plug
    slave {
        pcm ${INPUT_AUDIO_DEVICE}
    }
}

If you have a speaker, add a section for it to ~/.asoundrc as well.
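For example, a combined capture and playback configuration might look like the following, where ${OUTPUT_AUDIO_DEVICE} is a hypothetical placeholder for the playback device name reported by aplay -L (mirroring ${INPUT_AUDIO_DEVICE} above):

pcm.!default {
    type asym
    capture.pcm "mic"
    playback.pcm "speaker"
}
pcm.mic {
    type plug
    slave {
        pcm ${INPUT_AUDIO_DEVICE}
    }
}
pcm.speaker {
    type plug
    slave {
        pcm ${OUTPUT_AUDIO_DEVICE}
    }
}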

Check if the microphone works properly by recording audio into a file:

arecord --format=S16_LE --duration=5 --rate=16000 --file-type=wav ~/test.wav

If the command above executes without any errors, then the microphone is functioning as expected. We recommend inspecting the recorded file for recording side effects such as clipping.

Demo Applications

The Picovoice NodeJS demo package provides two demonstration command-line applications: a file-based demo, which scans a compatible WAV file for a given wake word followed by a voice command, and a microphone demo, which listens for a chosen wake word and then a follow-on command.

Install the demo package with the global flag so that it is available on the command line:

yarn global add @picovoice/picovoice-node-demo

or

npm install -g @picovoice/picovoice-node-demo

Microphone Demo

The microphone demo monitors microphone input for occurrences of a given wake phrase using Porcupine and then infers the user's intent from the follow-on command using Rhino. Audio recording is handled by the node-record-lpcm16 package; refer to its documentation for setup and troubleshooting. node-record-lpcm16 spawns a different microphone recording process depending on the OS. The underlying recording program (SoX or arecord) must be set up manually and is not included with yarn/npm.

Here is an example which will understand commands from the "Smart Lighting" demo from the Picovoice GitHub repository. Note that context files are platform-dependent. Choose the appropriate one for the platform you are using. This demo uses the "mac" version.

Using the global install methods above should add pv-mic-demo to your system path, which we can use to run the mic demo. Specify the wake word (.ppn file) with --keyword_file_path and the context (.rhn file) with --context_file_path. The following assumes you are running the command from the root of the Picovoice GitHub repository.

pv-mic-demo \
--keyword_file_path resources/porcupine/resources/keyword_files/mac/bumblebee_mac.ppn \
--context_file_path resources/rhino/resources/contexts/mac/smart_lighting_mac.rhn

You can use built-in wake word models with --keyword:

pv-mic-demo \
--keyword bumblebee \
--context_file_path resources/rhino/resources/contexts/mac/smart_lighting_mac.rhn

See the list of built-in keywords:

pv-mic-demo --help

The Rhino context source is printed in YAML format so you can see the grammar and options the context supports:

context:
  expressions:
    changeColor:
      - (please) [change, set, switch] (the) $location:location (to) $color:color
      - (please) [change, set, switch] (the) $location:location color (to) $color:color
      - (please) [change, set, switch] (the) $location:location lights (to) $color:color
...

First, the demo listens for the wake word (Porcupine engine). Upon wake word detection, it switches to follow-on command inference (Rhino engine). The demo then listens for a phrase that the context understands and, upon reaching a conclusion (or a timeout), outputs the result.
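The two-stage flow above can be sketched as a small state machine. The stub "engines" below merely react to marker fields on each frame; the real demo drives Porcupine and Rhino through the Picovoice class from @picovoice/picovoice-node, so everything except the wake-word-then-inference control flow is a stand-in.

```javascript
// Sketch of the demo's control flow: process audio frame by frame,
// run wake word detection until it fires, then run intent inference.
function makePipeline(wakeWordCallback, inferenceCallback) {
  let wakeWordDetected = false;
  return {
    process(frame) {
      if (!wakeWordDetected) {
        // Stand-in for Porcupine: "detects" the wake word on a marker frame.
        if (frame.keyword) {
          wakeWordDetected = true;
          wakeWordCallback();
        }
      } else if (frame.command) {
        // Stand-in for Rhino: concludes on a marker frame, then resets
        // so the pipeline listens for the wake word again.
        wakeWordDetected = false;
        inferenceCallback({
          isUnderstood: true,
          intent: frame.command.intent,
          slots: frame.command.slots,
        });
      }
    },
  };
}

const pipeline = makePipeline(
  () => console.log("Wake word detected"),
  (inference) => console.log(JSON.stringify(inference, null, 2))
);

// Simulated frames: silence, the wake word, then a follow-on command.
pipeline.process({});
pipeline.process({ keyword: true });
pipeline.process({
  command: { intent: "changeLightState", slots: { state: "on", location: "kitchen" } },
});
```

Note that frames reaching the inference stage are never seen by the wake word stage, which is why speech spoken before the wake word cannot influence the inference result.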

Platform: 'mac'; attempting to use 'sox' to access microphone ...
Listening for speech within the context of 'smart lighting'. Please speak your phrase into the microphone.
# (say "bumblebee", or the custom Porcupine keyword you chose)
Wake word 'bumblebee' detected
# (say e.g. "please turn on the lights in the kitchen")
...
Inference result:
{
    "isUnderstood": true,
    "intent": "changeLightState",
    "slots": {
        "state": "on",
        "location": "kitchen"
    }
}

Now try again, but this time say something that the context is not designed to understand, like "tell me a joke":

Platform: 'mac'; attempting to use 'sox' to access microphone ...
Listening for speech within the context of 'smart_lighting_mac'. Please speak your phrase into the microphone.
# (say "bumblebee", or the custom Porcupine keyword you chose)
Wake word 'bumblebee' detected
# (say e.g. "tell me a joke")
Inference result:
{
    "isUnderstood": false
}

File Demo

The file-based demo allows you to scan a compatible WAV file with Picovoice. To run the file-based demo, we need to provide a Porcupine keyword file and Rhino context file, along with a path to a compatible WAV file.

We can use the WAV file that is in the Picovoice GitHub repository. It is meant to be used with the sample "Coffee Maker" context and the "Picovoice" wake word, also available in the repository. Note that keyword and context files are platform-dependent. Choose the appropriate one for the platform you are using. This demo uses the "mac" version of each file.

Run the file demo:

pv-file-demo \
--input_audio_file_path resources/audio_samples/picovoice-coffee.wav \
--keyword_file_path resources/porcupine/resources/keyword_files/mac/picovoice_mac.ppn \
--context_file_path resources/rhino/resources/contexts/mac/coffee_maker_mac.rhn

Wake word 'picovoice' detected
Listening for speech within the context of 'coffee maker'
Inference:
{
    "isUnderstood": true,
    "intent": "orderDrink",
    "slots": {
        "size": "large",
        "coffeeDrink": "coffee"
    }
}

Learn about additional capabilities of the demo:

pv-file-demo --help

Custom Wake Word & Context

You can create custom Porcupine wake word and Rhino context models using Picovoice Console.