TLDR: This tutorial walks you through building a voice-enabled Perplexity AI chatbot in Python, with fully on-device speech processing. Unlike cloud-based solutions that send data to remote servers, this approach to create a Perplexity voice assistant reduces latency and ensures real-time processing, making it ideal for Perplexity voice agents and other voice AI applications that require immediate response times and smooth, uninterrupted user interactions.
Voice interfaces are no longer limited to smart speakers or mobile assistants. They’ve become a natural way for users to search, learn, and interact with information through voice-activated AI assistants.
Developers often want to add voice to AI chatbots, but cloud APIs like Google Speech-to-Text or AWS Transcribe can introduce high latency by sending voice recordings to remote servers. For Perplexity AI voice applications, where real-time performance and responsiveness matter, these compromises can become significant.
This tutorial demonstrates how to integrate voice with Perplexity using on-device speech processing with Python. The voice assistant uses Porcupine Wake Word for voice activation, Cheetah Streaming Speech-to-Text to transcribe speech, and Orca Streaming Text-to-Speech to generate voice responses. This keeps voice data fully on-device while still leveraging Perplexity’s intelligence. The architecture removes cloud round-trips for real-time, low-latency performance and scales easily across platforms and use cases.
The entire implementation fits into a single Python script that runs on Windows, macOS, Linux, and Raspberry Pi using Python 3.9+, a microphone, and speakers.
Train Custom Wake Word for Perplexity Voice Assistant
- Sign up on Picovoice Console and open the Porcupine page.
- Enter a wake phrase such as "Hey Perplexity", and test it with the microphone button.
- Click "Train", choose the target platform, and download the
.ppnmodel file for both wake words. - Repeat step 2 & 3 for any additional wake words you would like to support (e.g., "Hey Plex").
With Porcupine Wake Word, the voice assistant can be configured to detect multiple wake words simultaneously, allowing activation with phrases such as "Hey Perplexity" and "Hey Plex." For tips on training effective wake words, refer to the choosing a wake word guide.
Set Up Your Python Environment
Install all required Python SDKs and supporting libraries with a single command in the terminal:
- Porcupine Wake Word Python SDK:
pvporcupine - Cheetah Streaming Speech-to-Text Python SDK:
pvcheetah - Orca Streaming Text-to-Speech Python SDK:
pvorca - Picovoice Python Recorder library:
pvrecorder - Picovoice Python Speaker library:
pvspeaker - Requests HTTP library:
requests— used for sending REST API calls to Perplexity
To use the Picovoice SDKs you will need a Picovoice AccessKey, which authenticates your SDK usage. You can access it in the Picovoice Console.
Embed Wake Word Detection into Perplexity Voice Assistant
The following snippet captures audio from your default microphone and detects your custom wake word locally:
Porcupine Wake Word processes each frame on-device and returns the index of the detected wake word.
Integrate Streaming Speech-to-Text in Perplexity Voice Assistant
Once the wake word has been detected, the transcription loop is activated. The code captures short audio frames and transcribes them using Cheetah Streaming Speech-to-Text:
Each finalized segment returns text that is ready to send to Perplexity AI.
Connect Speech Recognition to Perplexity API
Once the text is transcribed, Perplexity API processes the text prompt:
Add Voice to Perplexity AI Responses
The system transforms the chatbot’s response into natural speech using Orca Streaming Text-to-Speech and PvSpeaker:
Orca Streaming Text-to-Speech synthesizes speech entirely on-device and streams audio as it’s generated, ensuring significantly lower latency than cloud-based alternatives.
Complete Implementation of Voice-Enabled Perplexity AI Assistant
This implementation combines three Picovoice engines: Porcupine Wake Word, Cheetah Streaming Speech-to-Text, and Orca Streaming Text-to-Speech. The voice processing happens entirely on-device, while only text queries are sent to the Perplexity AI API.
Run the Perplexity Voice Assistant
To run the voice-powered Perplexity AI assistant, update the keyword_paths in the command below to match your local wake word model files and ensure both API keys are correctly set:
- Picovoice AccessKey – authenticates your Picovoice SDK usage (copy it from the Picovoice Console)
- Perplexity API key – authorizes requests to the Perplexity API
Looking to integrate voice with other AI platforms? Check out our guides for ChatGPT voice integration and Claude voice integration.
You can start building your own commercial or non-commercial projects leveraging Picovoice's self-service Console.
Start Building






