Learn how to transcribe speech to text in real time using the Picovoice Cheetah Streaming Speech-to-Text Python SDK. Cheetah performs speech recognition locally, keeping your voice data private (i.e., GDPR and HIPAA compliant by design). The SDK runs on Linux, macOS, Windows, Raspberry Pi, and NVIDIA Jetson.
Cheetah can also run on Android, iOS, and even inside a web browser!
Speech-to-Text (STT), Automatic Speech Recognition (ASR), Automatic Transcription, and Large-Vocabulary Speech Recognition (LVSR) all refer to the same technology. Similarly, Real-Time, Online, and Streaming STT (ASR) all refer to an engine that makes the transcription available with minimal delay as the user speaks.
Install Streaming Speech-to-Text Python SDK
Install the SDK from PyPI by running pip3 install pvcheetah.
Sign up for Picovoice Console
Log in to (or sign up for) Picovoice Console. It is free, and no credit card is required!
Copy your AccessKey to the clipboard.
Implementation
The transcription implementation has only three steps.
Step 1
Import Cheetah STT package:
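The import itself is a single line, import pvcheetah. The sketch below wraps it in a small helper so that a missing installation produces an actionable message; the helper name import_cheetah is hypothetical and not part of the SDK.

```python
def import_cheetah():
    """Return the pvcheetah module, or raise ImportError with an install hint.

    `import_cheetah` is an illustrative helper, not part of the SDK.
    """
    try:
        import pvcheetah
    except ImportError as exc:
        raise ImportError(
            "pvcheetah is not installed; run `pip3 install pvcheetah` first"
        ) from exc
    return pvcheetah
```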
Step 2
Create an instance of the STT object with your AccessKey:
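A minimal sketch of this step, assuming the AccessKey is stored in an environment variable rather than pasted into the source file. The variable name PICOVOICE_ACCESS_KEY and both helper names are assumptions made for illustration; pvcheetah.create(access_key=...) is the SDK call that builds the engine.

```python
import os


def read_access_key(env_var="PICOVOICE_ACCESS_KEY"):
    """Read the AccessKey from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Picovoice AccessKey"
        )
    return key


def create_cheetah(access_key):
    """Create the Cheetah engine for the given AccessKey."""
    # Imported here so read_access_key() works even before the SDK is installed.
    import pvcheetah

    return pvcheetah.create(access_key=access_key)
```

With the key exported, cheetah = create_cheetah(read_access_key()) yields the engine object. Call cheetah.delete() once transcription is finished to release its resources.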
Step 3
Implement audio recording. The audio can come from a microphone or from a stream produced by another program. The rest of this step assumes a function is available that returns the next audio chunk (frame).
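One possible frame source, sketched here with only the standard library, reads a WAV file and yields fixed-size frames. It assumes 16-bit mono PCM audio; the engine reports the frame size and sample rate it expects through its frame_length and sample_rate attributes, so the 16000 Hz default below is an assumption to check against those values. The function name wav_frames is hypothetical.

```python
import struct
import wave


def wav_frames(path, frame_length, sample_rate=16000):
    """Yield lists of `frame_length` 16-bit samples from a mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        if wav.getframerate() != sample_rate:
            raise ValueError(
                f"expected {sample_rate} Hz audio, got {wav.getframerate()} Hz"
            )
        while True:
            raw = wav.readframes(frame_length)
            if len(raw) < frame_length * 2:
                break  # drop the trailing partial frame
            # Little-endian signed 16-bit samples, as stored in PCM WAV files.
            yield list(struct.unpack(f"<{frame_length}h", raw))
```

For live microphone input, the same shape applies: any callable or generator that hands back one frame of the expected length at a time will do.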
Transcribe an audio stream in real time:
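The loop can be sketched as follows. It assumes the engine's process(frame) returns a (partial_transcript, is_endpoint) pair and that flush() returns any remaining text, which matches the SDK's documented usage; the function name transcribe and the on_text callback are hypothetical.

```python
def transcribe(engine, frames, on_text=print):
    """Feed audio frames to the engine and return the full transcript.

    `engine` needs process(frame) -> (partial_text, is_endpoint) and
    flush() -> str. `on_text` receives each piece of text as it arrives.
    """
    pieces = []

    def emit(text):
        if text:
            pieces.append(text)
            on_text(text)

    for frame in frames:
        partial, is_endpoint = engine.process(frame)
        emit(partial)
        if is_endpoint:
            # The speaker paused: collect whatever the engine still buffers.
            emit(engine.flush())

    # The stream ended: flush once more to collect the tail.
    emit(engine.flush())
    return "".join(pieces)
```

Here frames can be any iterable of audio frames of the length the engine expects. Passing a custom on_text lets an application update a UI or a log as words arrive instead of printing them.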