Real-time Transcription with React.js

Real-time speech-to-text transcribes audio as it is spoken, which enables applications such as live meeting transcripts and live subtitles on video calls. Adding it to an application can improve both user experience and accessibility.

This article walks through integrating speech-to-text into a React application using Picovoice's Cheetah Streaming Speech-to-Text engine.

1. Prerequisites

Sign up for a free Picovoice Console account. Once you've created an account, copy your AccessKey from the main dashboard.

2. Create a React Project

If you don't already have a React project, start by creating one with the following command:

npx create-react-app cheetah-react

3. Install Dependencies

Install @picovoice/cheetah-react and @picovoice/web-voice-processor:

npm install @picovoice/cheetah-react @picovoice/web-voice-processor

4. Cheetah Model

In order to initialize Cheetah, you will need a model file. Download the default model file and place it in the /public directory of your project.

Instead of the default model file, you can create a custom model in the Picovoice Console if your application requires custom vocabulary or boosted words. Refer to our written guide or the equivalent video tutorial for guidance.
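
Create React App serves files in /public from the site root (or from PUBLIC_URL when one is set), so the model path passed to Cheetah is typically just a slash followed by the file name. The snippet below is a minimal sketch of building that value. The file name cheetah_params.pv and the helper modelPublicPath are assumptions made for illustration; use the name of the file you actually downloaded.

```javascript
// Assumed file name: replace with the model file you placed in /public.
const MODEL_FILE_NAME = "cheetah_params.pv";

// Hypothetical helper: joins an optional base URL and the model file name.
// Trailing slashes on the base are stripped so the result has exactly one.
function modelPublicPath(fileName, publicUrl = "") {
  return `${publicUrl.replace(/\/+$/, "")}/${fileName}`;
}

// Object in the same shape as the model argument used in this tutorial.
const cheetahModel = { publicPath: modelPublicPath(MODEL_FILE_NAME) };

console.log(cheetahModel);
```

The resulting publicPath string is what would stand in for the ${MODEL_FILE_PATH} placeholder in the component.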

5. Create Components

Create a file within /src called VoiceWidget.js and paste the following code into it. The code uses Cheetah's React hook, useCheetah, to initialize the engine and transcribe audio. Remember to replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, and ${MODEL_FILE_PATH} with the path to the model file you placed in /public.

import React, { useEffect, useState } from "react";
import { useCheetah } from "@picovoice/cheetah-react";

export default function VoiceWidget() {
  const [transcript, setTranscript] = useState("");
  
  const {
    result,
    isLoaded,
    isListening,
    error,
    init,
    start,
    stop,
  } = useCheetah();

  const initEngine = async () => {
    await init(
      "${ACCESS_KEY}",
      { publicPath: "${MODEL_FILE_PATH}" },
      { enableAutomaticPunctuation: true }
    );
  };

  const toggleRecord = async () => {
    if (isListening) {
      await stop();
    } else {
      await start();
    }
  };
  
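  // Append each new result from the hook to the running transcript.
  useEffect(() => {
    if (result !== null) {
      setTranscript((prev) => {
        let newTranscript = prev + result.transcript;
        // Separate completed segments with a space.
        if (result.isComplete) {
          newTranscript += " ";
        }
        return newTranscript;
      });
    }
  }, [result]);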
  
  return (
    <div>
      {error && <p className="error-message">{error.toString()}</p>}
      <br />
      <button onClick={initEngine} disabled={isLoaded}>Initialize Cheetah</button>
      <br />
      <br />
      <label htmlFor="audio-record">Record audio to transcribe:</label>
      <button id="audio-record" onClick={toggleRecord} disabled={!isLoaded}>
        {isListening ? "Stop Listening" : "Start Listening"}
      </button> 
      <h3>Transcript:</h3>
      <p>{transcript}</p>
    </div>
  );
}
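
The transcript-building logic inside the useEffect above can be isolated as a plain function, which makes it easy to check outside React. The following is a minimal sketch, not part of the Cheetah API: appendResult is a hypothetical name, and the updates array is made-up data standing in for successive values of result.

```javascript
// Hypothetical helper mirroring the useEffect logic in VoiceWidget.
function appendResult(prev, result) {
  if (result === null) {
    return prev; // nothing new to add
  }
  const next = prev + result.transcript;
  // Completed segments are followed by a space, as in the component.
  return result.isComplete ? next + " " : next;
}

// Made-up sequence of results, for illustration only.
const updates = [
  { transcript: "hello ", isComplete: false },
  { transcript: "world.", isComplete: true },
  { transcript: "Next", isComplete: false },
];

// Fold the updates into one string, starting from an empty transcript.
const transcript = updates.reduce(appendResult, "");

console.log(JSON.stringify(transcript));
```

Because each update is appended to the previous value rather than replacing it, the state setter in the component uses the functional form, setTranscript(prev => ...), so no update is computed from a stale transcript.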

Modify App.js to display the VoiceWidget:

import VoiceWidget from "./VoiceWidget";

function App() {
  return (
    <div className="App">
      <h1>
        Cheetah React Demo
      </h1>
      <VoiceWidget />
    </div>
  );
}

export default App;

Start the development server:

npm run start

Once it's running, open http://localhost:3000 in a browser and click the "Initialize Cheetah" button. After Cheetah has loaded, click "Start Listening" to begin recording and transcribing audio.
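
The order of these clicks follows from how the two buttons in VoiceWidget depend on the hook's isLoaded and isListening flags. As a rough sketch, the same rules can be written as a plain function; widgetView and its field names are hypothetical and exist only to illustrate the logic already present in the JSX.

```javascript
// Hypothetical helper: derives the button states used in VoiceWidget
// from the two flags returned by the hook.
function widgetView({ isLoaded, isListening }) {
  return {
    initDisabled: isLoaded, // initialize only once
    recordDisabled: !isLoaded, // recording needs a loaded engine
    recordLabel: isListening ? "Stop Listening" : "Start Listening",
  };
}

// The three states a user passes through in the demo.
const beforeInit = widgetView({ isLoaded: false, isListening: false });
const ready = widgetView({ isLoaded: true, isListening: false });
const recording = widgetView({ isLoaded: true, isListening: true });

console.log(beforeInit, ready, recording);
```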

Source Code

The source code for the complete Cheetah React demo is available in its GitHub repository.

Have you seen our other React.js tutorials? Don’t forget to check out Batch Transcription with React.js and Wake Word Detection with React.js.