
Speech-to-text is one of the most natural ways to interact with devices. From note-taking apps to hands-free controls, it opens up new levels of accessibility and user experience.

Real-time speech-to-text takes it one step further, letting your app transcribe audio instantly without needing to wait for the recording to finish. While Flutter simplifies cross-platform development, handling continuous audio streams and low-latency processing can be challenging.

That's where Picovoice's Cheetah Streaming Speech-to-Text Flutter SDK comes in. It delivers continuous, low-latency transcription directly on-device—no cloud, no delay, and full privacy. In this post, we'll show how to integrate streaming speech-to-text into your Flutter app for fast and reliable real-time voice interaction.

Despite running entirely on the device with a much smaller model, Cheetah Streaming Speech-to-Text delivers higher accuracy than Google's cloud-based Streaming ASR.

This guide shows how to add custom, on-device streaming speech-to-text to a Flutter app with Cheetah Streaming Speech-to-Text.

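What you'll learn:

  - How to request microphone permissions on iOS and Android
  - How to record audio in Flutter with flutter_voice_processor
  - How to transcribe that audio in real time with Cheetah

What you need:

  - A Flutter project
  - A Picovoice AccessKey from Picovoice Console
  - A Cheetah model file (default or custom)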

Enable Microphone Permissions

This tutorial requires recording audio, so before we begin, you'll need to configure your Flutter project to request audio recording permissions from the user. Make sure the appropriate permissions are enabled for each platform:

iOS

Add the NSMicrophoneUsageDescription key to Info.plist, with a string value that tells the user why the app needs microphone access.

Android

Add the android.permission.RECORD_AUDIO and android.permission.INTERNET permissions to AndroidManifest.xml.

Internet is required only for licensing and usage tracking. Audio remains on-device, and is not streamed. Once Cheetah has been initialized, it can run offline.

Recording Audio with VoiceProcessor

As you'll see later, Cheetah is easy to use: just pass it audio, and it returns text. But how do you record that audio? As with many cross-platform frameworks, recording audio in Flutter can be challenging. To simplify this process, we created an audio capture plugin that handles the complexity for you: flutter_voice_processor.

In this section, we'll show how to record audio in Flutter using this plugin. In the next section, we'll show how to pass this audio to Cheetah for transcription.

  1. Add the flutter_voice_processor plugin as a dependency. Open your project's pubspec.yaml file and list flutter_voice_processor under dependencies.

VoiceProcessor: Step-by-Step Code Walkthrough

  1. Create an instance of VoiceProcessor and add frame listeners. Eventually, we'll pass the audio to Cheetah for transcription, but for now, we won't do anything useful with the recorded audio.
  2. Call hasRecordAudioPermission() to prompt the user for audio recording permission. Once permission is granted, call start() to begin recording.
  3. Call stop() to stop recording audio.
  4. When you no longer need to record audio, remove the frame listeners.
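
The four steps above can be sketched in one small class. This is a minimal sketch rather than the plugin's reference usage: MicRecorder and its method names are hypothetical, and the frame length and sample rate are placeholder values picked for illustration (Cheetah supplies its own values later in this guide).

```dart
import 'package:flutter/services.dart';
import 'package:flutter_voice_processor/flutter_voice_processor.dart';

/// Hypothetical helper (not part of the plugin) that groups the four steps.
class MicRecorder {
  // Illustrative values only: 512 samples per frame at 16 kHz.
  static const int frameLength = 512;
  static const int sampleRate = 16000;

  final VoiceProcessor? _voiceProcessor = VoiceProcessor.instance;

  MicRecorder() {
    // Step 1: register listeners for audio frames and recording errors.
    _voiceProcessor?.addFrameListener(_onFrame);
    _voiceProcessor?.addErrorListener(_onError);
  }

  // Receives one frame of 16-bit PCM samples at a time.
  void _onFrame(List<int> frame) {
    // Nothing useful yet; later the frame is handed to Cheetah.
  }

  void _onError(VoiceProcessorException error) {
    print('Recording error: ${error.message}');
  }

  // Step 2: ask for permission, then start recording.
  // Returns true only if recording actually started.
  Future<bool> start() async {
    final granted =
        await _voiceProcessor?.hasRecordAudioPermission() ?? false;
    if (!granted) return false;
    try {
      await _voiceProcessor?.start(frameLength, sampleRate);
      return true;
    } on PlatformException catch (e) {
      print('Failed to start recording: ${e.message}');
      return false;
    }
  }

  // Step 3: stop recording.
  Future<void> stop() async {
    await _voiceProcessor?.stop();
  }

  // Step 4: remove the listeners once recording is no longer needed.
  void dispose() {
    _voiceProcessor?.removeFrameListener(_onFrame);
    _voiceProcessor?.removeErrorListener(_onError);
  }
}
```

Keeping the listeners as methods of one object means the same references can be passed to both the add and remove calls, which is what lets the cleanup in step 4 find them again.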

VoiceProcessor: Complete Widget Example

Below is a fully implemented widget you can add to your project to see flutter_voice_processor in action:
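
A minimal sketch of such a widget, assuming the VoiceProcessor calls described in the steps above. RecorderWidget, its fields, and the 512-sample/16 kHz settings are illustrative choices, not names or values taken from the plugin.

```dart
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:flutter_voice_processor/flutter_voice_processor.dart';

// Hypothetical widget: records audio and shows how many frames arrived.
class RecorderWidget extends StatefulWidget {
  const RecorderWidget({super.key});

  @override
  State<RecorderWidget> createState() => _RecorderWidgetState();
}

class _RecorderWidgetState extends State<RecorderWidget> {
  // Illustrative values only.
  static const int _frameLength = 512;
  static const int _sampleRate = 16000;

  final VoiceProcessor? _voiceProcessor = VoiceProcessor.instance;
  bool _isRecording = false;
  int _framesReceived = 0;
  String? _error;

  @override
  void initState() {
    super.initState();
    _voiceProcessor?.addFrameListener(_onFrame);
    _voiceProcessor?.addErrorListener(_onError);
  }

  // Counts frames so there is something visible on screen.
  void _onFrame(List<int> frame) {
    if (mounted) setState(() => _framesReceived++);
  }

  void _onError(VoiceProcessorException error) {
    if (mounted) setState(() => _error = error.message);
  }

  Future<void> _toggleRecording() async {
    try {
      if (_isRecording) {
        await _voiceProcessor?.stop();
      } else if (await _voiceProcessor?.hasRecordAudioPermission() ?? false) {
        await _voiceProcessor?.start(_frameLength, _sampleRate);
      } else {
        if (mounted) {
          setState(() => _error = 'Microphone permission was not granted.');
        }
        return;
      }
      if (mounted) setState(() => _isRecording = !_isRecording);
    } on PlatformException catch (e) {
      if (mounted) setState(() => _error = e.message);
    }
  }

  @override
  void dispose() {
    if (_isRecording) _voiceProcessor?.stop();
    _voiceProcessor?.removeFrameListener(_onFrame);
    _voiceProcessor?.removeErrorListener(_onError);
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        ElevatedButton(
          onPressed: _toggleRecording,
          child: Text(_isRecording ? 'Stop' : 'Start'),
        ),
        Text(_error ?? 'Frames received: $_framesReceived'),
      ],
    );
  }
}
```

Place the widget inside a MaterialApp and Scaffold so the button and text have the ancestors they expect.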

This is a simplified example that includes all the essential code to get you started. If you'd like to see a complete working app, check out the Flutter Voice Processor demo on our GitHub repository.

Streaming Speech-to-Text with Cheetah

With audio recording in place from the previous section, the next step is to pass the recorded audio to Cheetah for streaming speech-to-text.

  1. Add Cheetah Flutter Plugin: To use Cheetah Streaming Speech-to-Text in your Flutter project, open your project's pubspec.yaml file and list the cheetah_flutter plugin under dependencies.
  2. Get Your Picovoice Access Key: Sign up for a free Picovoice Console account and obtain your AccessKey. The AccessKey is used only for authentication and authorization.

  3. Train Custom Model: Create and download a custom model using the Picovoice Console. This is useful if you need Cheetah to recognize words outside the standard vocabulary, prioritize certain words for easier recognition, or modify pronunciations of certain words. If you do not need a custom model, use one of the default models.

If you'd like to see a video walkthrough for this step, check out Picovoice Console Tutorial: Leopard & Cheetah Speech-to-Text.

  4. Add Model Files to Your Project: Place your model file in your project's assets/ folder and list its path under the flutter: assets: section of your pubspec.yaml.

Cheetah: Step-by-Step Code Walkthrough

  1. Create an instance of Cheetah.
  2. To transcribe a chunk of audio, pass it to process(). Each chunk must contain frameLength samples recorded at sampleRate. We'll use flutter_voice_processor to record audio and pass it to Cheetah, but omit that part here for brevity.
  3. When you no longer need Cheetah, call delete() to release the acquired resources.
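
The three steps above can be sketched as follows. This is a minimal sketch under a few assumptions: StreamingTranscriber and its method names are hypothetical, the caller supplies the AccessKey and the model path as listed in pubspec.yaml, and flush() is called at each endpoint to collect any text Cheetah is still holding.

```dart
import 'package:cheetah_flutter/cheetah.dart';
import 'package:cheetah_flutter/cheetah_error.dart';

/// Hypothetical helper (not part of the plugin) that groups the three steps.
class StreamingTranscriber {
  Cheetah? _cheetah;
  final StringBuffer _text = StringBuffer();

  /// Everything transcribed so far.
  String get text => _text.toString();

  // Step 1: create the Cheetah instance. Returns false if creation fails,
  // for example because of an invalid AccessKey or model path.
  Future<bool> init(String accessKey, String modelPath) async {
    try {
      _cheetah = await Cheetah.create(accessKey, modelPath);
      return true;
    } on CheetahException catch (e) {
      print('Failed to initialize Cheetah: ${e.message}');
      return false;
    }
  }

  // Step 2: pass one frame of audio and append whatever text comes back.
  // The frame is expected to hold `frameLength` samples at `sampleRate`.
  Future<void> processFrame(List<int> frame) async {
    final cheetah = _cheetah;
    if (cheetah == null) return;

    final partial = await cheetah.process(frame);
    _text.write(partial.transcript);

    // At a pause in speech, flush to pick up any remaining text
    // and start a new line.
    if (partial.isEndpoint) {
      final remaining = await cheetah.flush();
      _text.writeln(remaining.transcript);
    }
  }

  // Step 3: release native resources.
  void release() {
    _cheetah?.delete();
    _cheetah = null;
  }
}
```

Partial results usually arrive a few words at a time, so appending them to a buffer gives a transcript that grows as the user speaks.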

Cheetah + VoiceProcessor: Complete Widget Example

Below is a fully implemented widget you can add to your project to see Cheetah and VoiceProcessor in action. Be sure to replace {ACCESS_KEY} with your own AccessKey from Picovoice Console and {MODEL_FILE} with your model file.
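
A minimal sketch of such a widget, combining the recording and transcription steps from the two walkthroughs. TranscriberWidget and its field names are hypothetical, and {MODEL_FILE} is assumed to be the asset path listed in pubspec.yaml.

```dart
import 'package:cheetah_flutter/cheetah.dart';
import 'package:cheetah_flutter/cheetah_error.dart';
import 'package:flutter/material.dart';
import 'package:flutter/services.dart';
import 'package:flutter_voice_processor/flutter_voice_processor.dart';

// Hypothetical widget name; replace the two placeholders before running.
class TranscriberWidget extends StatefulWidget {
  const TranscriberWidget({super.key});

  @override
  State<TranscriberWidget> createState() => _TranscriberWidgetState();
}

class _TranscriberWidgetState extends State<TranscriberWidget> {
  static const String _accessKey = '{ACCESS_KEY}';
  static const String _modelPath = '{MODEL_FILE}';

  final VoiceProcessor? _voiceProcessor = VoiceProcessor.instance;
  Cheetah? _cheetah;
  bool _isRecording = false;
  String _transcript = '';
  String? _error;

  @override
  void initState() {
    super.initState();
    _voiceProcessor?.addFrameListener(_onFrame);
    _voiceProcessor?.addErrorListener(_onError);
    _initCheetah();
  }

  Future<void> _initCheetah() async {
    try {
      final cheetah = await Cheetah.create(_accessKey, _modelPath);
      if (mounted) setState(() => _cheetah = cheetah);
    } on CheetahException catch (e) {
      if (mounted) setState(() => _error = e.message);
    }
  }

  // Each frame goes straight to Cheetah; new text is appended as it arrives.
  void _onFrame(List<int> frame) async {
    final cheetah = _cheetah;
    if (cheetah == null) return;
    try {
      final partial = await cheetah.process(frame);
      var text = partial.transcript;
      if (partial.isEndpoint) {
        text += '${(await cheetah.flush()).transcript} ';
      }
      if (text.isNotEmpty && mounted) setState(() => _transcript += text);
    } on CheetahException catch (e) {
      if (mounted) setState(() => _error = e.message);
    }
  }

  void _onError(VoiceProcessorException error) {
    if (mounted) setState(() => _error = error.message);
  }

  Future<void> _toggleRecording() async {
    final cheetah = _cheetah;
    if (cheetah == null) return;
    try {
      if (_isRecording) {
        await _voiceProcessor?.stop();
      } else if (await _voiceProcessor?.hasRecordAudioPermission() ?? false) {
        // Record with the frame length and sample rate Cheetah expects.
        await _voiceProcessor?.start(cheetah.frameLength, cheetah.sampleRate);
      } else {
        if (mounted) {
          setState(() => _error = 'Microphone permission was not granted.');
        }
        return;
      }
      if (mounted) setState(() => _isRecording = !_isRecording);
    } on PlatformException catch (e) {
      if (mounted) setState(() => _error = e.message);
    }
  }

  @override
  void dispose() {
    if (_isRecording) _voiceProcessor?.stop();
    _voiceProcessor?.removeFrameListener(_onFrame);
    _voiceProcessor?.removeErrorListener(_onError);
    _cheetah?.delete();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: [
        ElevatedButton(
          // Disabled until Cheetah has finished initializing.
          onPressed: _cheetah == null ? null : _toggleRecording,
          child: Text(_isRecording ? 'Stop' : 'Start'),
        ),
        Text(_error ?? _transcript),
      ],
    );
  }
}
```

Starting the recorder with cheetah.frameLength and cheetah.sampleRate, rather than hard-coded numbers, keeps the audio format in step with whatever the loaded model requires.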

This is a simplified example that includes all the essential code to get you started. If you'd like to see a complete working app, check out the Cheetah Flutter demo on our GitHub repository.

This tutorial uses the following packages: flutter_voice_processor and cheetah_flutter.

Batch Transcription

Streaming Speech-to-Text is ideal when you want transcripts in real time, without waiting for the audio to finish. If real-time output isn't critical and you need features such as word-level metadata, consider Leopard Speech-to-Text instead, available for Flutter as the leopard_flutter package.
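
For comparison, a batch transcription with leopard_flutter might look like the sketch below. It assumes that Leopard follows the same create/process/delete pattern as Cheetah and that each word in the result carries start and end times plus a confidence score. The function name transcribeRecording and the {LEOPARD_MODEL_FILE} placeholder are hypothetical.

```dart
import 'package:leopard_flutter/leopard.dart';
import 'package:leopard_flutter/leopard_error.dart';

// Hypothetical function: transcribes a finished recording in one call
// and prints the word-level metadata.
Future<void> transcribeRecording(String audioPath) async {
  Leopard? leopard;
  try {
    leopard = await Leopard.create('{ACCESS_KEY}', '{LEOPARD_MODEL_FILE}');

    // The whole file is processed at once; nothing is returned until
    // transcription has finished.
    final result = await leopard.processFile(audioPath);
    print(result.transcript);

    for (final w in result.words) {
      print('${w.word}: ${w.startSec}s to ${w.endSec}s '
          '(confidence ${w.confidence})');
    }
  } on LeopardException catch (e) {
    print('Leopard error: ${e.message}');
  } finally {
    // Release native resources whether or not transcription succeeded.
    leopard?.delete();
  }
}
```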
