How to Build a Custom Keyword Spotting System in Node.js

Learn how to add custom wake word detection to your Node.js application using on-device keyword spotting—the same technology that powers "Alexa," "Hey Siri," and "Ok Google," but running entirely on your device without cloud connectivity. This tutorial shows you how to build your own voice-activated trigger system that runs across Windows, macOS, Linux, and Raspberry Pi.

You'll use PvRecorder and the Porcupine Wake Word Node.js SDK to capture audio streams and detect custom activation phrases like "Hey Assistant"—all processed locally on your device. Follow along with step-by-step code examples to integrate Alexa-style keyword spotting into your application, whether you're building a voice assistant, hands-free meeting transcription tool, or smart home control system.

Understanding Keyword Spotting

Keyword spotting (also called wake word detection) is a speech recognition process that continuously monitors an audio stream for specific trigger phrases. When a predefined keyword is detected, it triggers a follow-up action, such as starting a full speech or AI processing pipeline.

In on-device systems, this process runs locally without sending data to the cloud, enabling faster response times and enhanced privacy.

Learn more in the complete guide to wake word detection.

Set Up Your Node.js Environment

To follow this tutorial, you need Node.js installed and access to a microphone or other audio input. These steps prepare your local environment for real-time keyword detection.

Install Node.js (version 18 or later)
Install the required Node packages:

npm install @picovoice/porcupine-node @picovoice/pvrecorder-node

Implement Keyword Spotting in Node.js

Now that your environment is ready, let's integrate the keyword spotting engine.

First, create a Picovoice Console account. Find and copy your AccessKey on the main dashboard. You'll need it later to initialize the Porcupine instance.

1. Train a Custom Keyword Model

Porcupine Wake Word includes built-in keywords, but you can also train custom keywords for your unique trigger phrase. Use the Picovoice Console to train and download your .ppn model file.

Refer to the following resources for guidance on training a custom keyword model:

Tutorial-style guide: Creating a Custom Wake Word with Porcupine
Video walkthrough: Picovoice Console Tutorial: Porcupine Wake Word

Porcupine supports listening simultaneously for multiple keywords, so you can train and download multiple keyword files if you wish.
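When listening for several keywords, it may help to keep one list of keyword definitions and derive the parallel arrays from it, so that paths, sensitivities, and display names stay in the same order. The following is a minimal sketch; the file paths, labels, and the helper name toPorcupineArgs are hypothetical placeholders, not part of the Porcupine SDK.

```javascript
// Hypothetical keyword definitions: paths and labels are placeholders.
const keywordConfig = [
  { label: "Hey Assistant", path: "./hey-assistant.ppn", sensitivity: 0.5 },
  { label: "Stop Listening", path: "./stop-listening.ppn", sensitivity: 0.65 },
];

// Derive parallel arrays from one list so the index of a detected keyword
// always maps to the matching path, sensitivity, and label.
function toPorcupineArgs(config) {
  return {
    keywordPaths: config.map((k) => k.path),
    sensitivities: config.map((k) => k.sensitivity),
    labels: config.map((k) => k.label),
  };
}

const porcupineArgs = toPorcupineArgs(keywordConfig);
console.log(porcupineArgs.keywordPaths);
console.log(porcupineArgs.sensitivities);
console.log(porcupineArgs.labels);
```

The keywordPaths and sensitivities arrays can then be passed to the Porcupine constructor, and labels[keywordIndex] gives a readable name for each detection.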

2. Initialize Porcupine

Initialize Porcupine with your AccessKey and keyword model file:

const { Porcupine } = require("@picovoice/porcupine-node");

// Initialize Porcupine with desired keywords
const porcupine = new Porcupine(
  "${ACCESS_KEY}", // AccessKey from Picovoice Console
  ["${KEYWORD_FILE_PATH}"], // Your custom keyword model file(s) (.ppn)
  [0.5] // Keyword sensitivities
);

The sensitivities array holds one value per keyword, each in the range 0 to 1. A higher sensitivity results in fewer misses at the cost of a higher false acceptance rate.
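Since each keyword needs its own sensitivity between 0 and 1, a small check before initialization can catch a mismatched or out-of-range value early. This is only a sketch; validateSensitivities is a hypothetical helper and not part of the SDK.

```javascript
// Check that every keyword has exactly one sensitivity in the range [0, 1].
// Throws a descriptive error so misconfiguration surfaces before audio starts.
function validateSensitivities(keywords, sensitivities) {
  if (keywords.length !== sensitivities.length) {
    throw new RangeError(
      `Expected ${keywords.length} sensitivities, got ${sensitivities.length}`
    );
  }
  sensitivities.forEach((value, i) => {
    if (typeof value !== "number" || Number.isNaN(value) || value < 0 || value > 1) {
      throw new RangeError(
        `Sensitivity at index ${i} must be a number in [0, 1], got ${value}`
      );
    }
  });
  return sensitivities;
}

// Example with placeholder keyword file names.
console.log(validateSensitivities(["a.ppn", "b.ppn"], [0.5, 0.65]));
```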

3. Set Up Audio Capture for Keyword Detection

Read audio frames with PvRecorder in preparation for keyword detection:

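const { PvRecorder } = require("@picovoice/pvrecorder-node");

async function captureAudio() {
  // 1. Initialize and start the audio capture device.
  //    The frame length must match what Porcupine expects (porcupine.frameLength).
  const frameLength = 512;
  const recorder = new PvRecorder(frameLength);
  recorder.start();

  // 2. Continuously read frames of audio, which will be passed to Porcupine.
  //    `await` is only valid inside an async function in a CommonJS script.
  while (true) {
    const audioFrame = await recorder.read();
    // porcupine.process(audioFrame);
  }
}

captureAudio();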
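To get a feel for how often the read loop runs, the frame length can be converted to a duration. The sketch below assumes the 512-sample frame used above and a 16 kHz sample rate; in a real script both values would normally be read from porcupine.frameLength and porcupine.sampleRate. The helper names are hypothetical.

```javascript
// Duration of one audio frame in milliseconds.
function frameDurationMs(frameLength, sampleRate) {
  return (frameLength * 1000) / sampleRate;
}

// Number of whole frames needed to cover a time span, rounded up.
function framesFor(durationMs, frameLength, sampleRate) {
  return Math.ceil(durationMs / frameDurationMs(frameLength, sampleRate));
}

console.log(frameDurationMs(512, 16000)); // 32
console.log(framesFor(1000, 512, 16000)); // 32
```

Under these assumptions each call to recorder.read() yields about 32 ms of audio, so the loop body runs roughly 31 times per second.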

4. Feed Audio to Porcupine for Keyword Spotting

Pass audio frames to Porcupine for keyword spotting. When a keyword is detected, Porcupine returns the index of the detected keyword, corresponding to the array of keywords used during initialization.

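// This loop uses `await`, so run it inside an async function.
while (true) {
  const audioFrame = await recorder.read();
  const keywordIndex = porcupine.process(audioFrame);
  // process() returns -1 when no keyword is present in the frame
  if (keywordIndex >= 0) {
    // take action based on detected keyword
  }
}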
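One way to "take action based on detected keyword" is a handler table indexed the same way as the keyword array. This is a sketch under that assumption; the handlers and the dispatch helper are hypothetical and only return strings for illustration.

```javascript
// Hypothetical actions, listed in the same order as the keywords
// passed to Porcupine during initialization.
const handlers = [
  () => "assistant activated",
  () => "listening stopped",
];

// Run the handler that matches the detected keyword index.
// A negative index means no keyword was detected, so nothing runs.
function dispatch(handlerTable, keywordIndex) {
  if (keywordIndex < 0) return undefined;
  const handler = handlerTable[keywordIndex];
  if (typeof handler !== "function") {
    throw new RangeError(`No handler registered for keyword index ${keywordIndex}`);
  }
  return handler();
}

console.log(dispatch(handlers, 0)); // assistant activated
console.log(dispatch(handlers, -1)); // undefined
```

Inside the detection loop, the call would be dispatch(handlers, keywordIndex) in place of the if block.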

Complete Demo: Keyword Spotting in Node.js

The following complete example combines all previous steps into a functional Node.js script that continuously listens for wake words and logs detections to the console.

const { PvRecorder } = require("@picovoice/pvrecorder-node");
const { Porcupine, BuiltinKeyword } = require("@picovoice/porcupine-node");
const readline = require("readline");

let isRunning = true;

// Listen for spacebar
readline.emitKeypressEvents(process.stdin);
if (process.stdin.isTTY) process.stdin.setRawMode(true);

process.stdin.on("keypress", (str, key) => {
  if (key.name === "space") {
    console.log("Stopping...");
    isRunning = false;
  } else if (key.ctrl && key.name === "c") {
    isRunning = false;
  }
});

async function main() {
  let porcupine = null;
  let recorder = null;
  const keywords = [BuiltinKeyword.GRASSHOPPER, BuiltinKeyword.BUMBLEBEE];

  try {
    console.log("Initializing Porcupine...");
    porcupine = new Porcupine(
      "${ACCESS_KEY}", // AccessKey from Picovoice Console
      keywords, // Your custom keyword model files OR built-in keywords
      [0.5, 0.65] // Keyword sensitivities
    );

    console.log("Starting wake word detection... Press SPACE to stop.");
    recorder = new PvRecorder(porcupine.frameLength);
    recorder.start();

    while (isRunning) {
      const audioFrame = await recorder.read();
      const keywordIndex = porcupine.process(audioFrame);
      if (keywordIndex >= 0) {
        console.log(`Keyword detected: ${keywords[keywordIndex]}`);
      }
    }
  } catch (err) {
    console.error("Error:", err);
    process.exitCode = 1; // report the failure to the calling shell
  } finally {
    if (recorder) {
      try {
        recorder.stop();
        recorder.release();
      } catch (e) {
        console.warn("Failed to stop/release recorder:", e);
      }
    }
    if (porcupine) {
      try {
        porcupine.release();
      } catch (e) {
        console.warn("Failed to release Porcupine:", e);
      }
    }
    console.log("Recorder and Porcupine released. Exiting.");
    process.exit(); // exits with process.exitCode if set, otherwise 0
  }
}

main();

This demo uses the @picovoice/porcupine-node and @picovoice/pvrecorder-node packages.

For a more detailed guide, refer to the documentation for each package.

For a complete demo application, check out the Porcupine Wake Word Node.js Demo on GitHub.
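The demo passes the AccessKey as a string placeholder. One option for keeping the real key out of source control is to read it from an environment variable. The sketch below assumes a variable named PICOVOICE_ACCESS_KEY; that name and the getAccessKey helper are illustrative choices, not something the SDK requires.

```javascript
// Read the AccessKey from an environment-like object.
// PICOVOICE_ACCESS_KEY is an assumed variable name, not an SDK convention.
function getAccessKey(env = process.env) {
  const key = (env.PICOVOICE_ACCESS_KEY || "").trim();
  if (!key) {
    throw new Error("Set the PICOVOICE_ACCESS_KEY environment variable first.");
  }
  return key;
}

// Example with a fake environment object so the snippet runs anywhere.
console.log(getAccessKey({ PICOVOICE_ACCESS_KEY: "example-key" }));
```

In the demo, the first constructor argument would then become getAccessKey() instead of the placeholder string.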

Troubleshooting & Optimization Tips

Sensitivity tuning: If wake word detection seems inconsistent, adjust the sensitivity parameter. Higher sensitivity makes detection more responsive but can also raise the chance of false activations.
Audio device permissions: Make sure the process running Node.js (for example, your terminal) has operating-system permission to use the microphone.
Releasing resources: Always release resources (recorder.release(), porcupine.release()) on exit to avoid memory leaks.
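If detection never fires, it is also worth checking that audio comes from the intended input device. The sketch below picks a device by name from a list; the device names are made up, and on a real machine the list would be expected to come from PvRecorder.getAvailableDevices(), with the resulting index passed as the second argument to the PvRecorder constructor. Treat those details as assumptions to verify against the PvRecorder documentation.

```javascript
// Find the index of the first device whose name contains `fragment`
// (case-insensitive). Returns -1 when nothing matches.
function findDeviceIndex(deviceNames, fragment) {
  const needle = fragment.toLowerCase();
  return deviceNames.findIndex((name) => name.toLowerCase().includes(needle));
}

// Hypothetical device list for illustration only.
const exampleDevices = ["Built-in Microphone", "USB Audio Device"];

console.log(findDeviceIndex(exampleDevices, "usb")); // 1
console.log(findDeviceIndex(exampleDevices, "bluetooth")); // -1
```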

Next Steps and Advanced Integrations

Take your project further with:

Multilingual wake word models: Recognize trigger phrases in multiple languages, enabling voice activation for users worldwide.
Custom command recognition: Detect full voice commands like "Start taking notes" or "Record a summary" after your wake word, letting your application respond intelligently.
Integration with streaming speech-to-text and on-device LLMs: Convert speech into text and feed it into AI models for fully interactive conversational experiences.
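As a rough illustration of chaining a wake word to a later stage, the sketch below opens a fixed-length listening window after each detection and reports which frames should be forwarded. It is a simplified assumption about how such a pipeline might be gated, not a description of any Picovoice product; createWakeGate and its parameters are hypothetical.

```javascript
// Two-state gate: "idle" until a keyword is detected, then "listening"
// for a fixed number of frames before falling back to "idle".
function createWakeGate(listenFrames) {
  let remaining = 0;
  return {
    // Call once per audio frame with the index returned by the wake word engine.
    // Returns true when the frame should be forwarded to the next stage.
    onFrame(keywordIndex) {
      if (keywordIndex >= 0) {
        remaining = listenFrames; // open (or re-open) the listening window
        return false; // the frame containing the wake word is not forwarded
      }
      if (remaining > 0) {
        remaining -= 1;
        return true;
      }
      return false;
    },
    get state() {
      return remaining > 0 ? "listening" : "idle";
    },
  };
}

const gate = createWakeGate(2);
console.log(gate.onFrame(-1), gate.state); // false idle
console.log(gate.onFrame(0), gate.state); // false listening
console.log(gate.onFrame(-1), gate.state); // true listening
console.log(gate.onFrame(-1), gate.state); // true idle
```

Frames for which onFrame returns true would be handed to whatever command recognition or speech-to-text stage follows.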

These extensions can transform your wake word module into a complete voice interaction framework for enterprise systems.
