TLDR: Build a hotel voice assistant that runs entirely on-device using Python. Learn how to create a privacy-first hotel room automation system that responds to guest commands locally. No cloud, no network latency and no data privacy concerns.
Voice AI is transforming hotel guest experiences and hospitality operations. Modern hotel rooms are increasingly equipped with voice-activated controls for lighting, temperature, entertainment, and concierge services. However, most implementations rely on cloud-based voice assistants (like Alexa or Google Home) that introduce latency and significant data privacy concerns. Guest conversations, room preferences, and behavioral data are transmitted to external servers—creating liability risks and potentially undermining guest trust.
On-device voice processing eliminates these issues, enabling smart hotel rooms with zero network latency and full data privacy. All processing happens locally, ensuring GDPR compliance without external data transmission. This tutorial demonstrates how to build a complete hotel room voice assistant using an on-device architecture: it handles structured commands (like lights and temperature) instantly, while seamlessly routing complex conversational queries to a local Large Language Model.
What You'll Build:
A privacy-first hotel assistant that:
- Activates using two custom wake phrases - one for room controls (e.g., "Hey Smart Room") and one for guest queries (e.g., "Hey Concierge")
- Processes room controls instantly without the cloud
- Handles open-ended guest questions (e.g., "When is breakfast?") using a local LLM
What You'll Need:
- Python 3.9+
- Microphone and speakers for testing
- Picovoice AccessKey from the Picovoice Console
On-Device Voice AI Architecture for Hospitality Applications
The hotel voice assistant architecture combines deterministic control with generative AI to handle the full spectrum of hotel interactions:
- Voice Activation: Porcupine Wake Word continuously monitors audio for two distinct wake phrases. Detecting "Hey Smart Room" routes to instant room control, while "Hey Concierge" routes to the conversational AI for guest queries. This dual-keyword approach lets guests choose the right path upfront.
- Precise IoT Control: For smart hotel room automation, Rhino Speech-to-Intent maps spoken commands directly to structured JSON instructions. This engine ensures high accuracy for hardware interactions (e.g., adjusting thermostats or lighting) without the unpredictability of probabilistic models.
- Local Large Language Model (LLM): For dynamic hotel guest services, the system utilizes Cheetah Streaming Speech-to-Text paired with picoLLM. This combination processes natural language inquiries locally, allowing the assistant to function as a knowledgeable concierge (e.g., answering questions about pool hours or checkout times).
This edge AI architecture addresses the unique challenges of hospitality voice assistants: guest privacy expectations, 24/7 reliability requirements, multilingual support, and seamless integration with existing property management systems.
All Picovoice models - Porcupine Wake Word, Rhino Speech-to-Intent, Cheetah Streaming Speech-to-Text, and Orca Streaming Text-to-Speech - support multiple languages, including English, Spanish, French, German, and more.
Smart Hotel Voice Control Workflow:
Conversational Hospitality Workflow:
Train Custom Wake Words for Hotel Voice Assistant
- Sign up for a Picovoice Console account and navigate to the Porcupine page.
- Enter your first wake phrase for room controls (e.g., "Hey Smart Room") and test it using the microphone button.
- Click "Train," select the target platform, and download the
.ppnmodel file. - Repeat Steps 2 & 3 to train an additional wake word for detailed guest queries (e.g., "Hey Concierge").
Porcupine can detect multiple wake words simultaneously. For instance, it can support both "Hey Smart Room" and "Hey Concierge" for different tasks. For tips on designing an effective wake word, review the choosing a wake word guide.
Define Voice Commands for Smart Room Control
- Create an empty Rhino Speech-to-Intent Context.
- Click the "Import YAML" button in the top-right corner of the console and paste the YAML provided below to define intents for structured hotel room commands.
- Test the model with the microphone button and download the `.rhn` context file for your target platform.
You can refer to the Rhino Syntax Cheat Sheet for more details on building custom contexts.
YAML Context for Hotel Room Commands:
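Below is a minimal illustrative context; the intent names (changeLightState, setTemperature), expressions, and slot values are example placeholders that you can extend for your property:

```yaml
context:
  expressions:
    changeLightState:
      - "[turn, switch] $state:state the lights"
      - "[turn, switch] the lights $state:state"
    setTemperature:
      - "set the [temperature, thermostat] to $pv.TwoDigitInteger:temperature (degrees)"
      - "make it $temperatureChange:change in here"
  slots:
    state:
      - "on"
      - "off"
    temperatureChange:
      - "warmer"
      - "cooler"
```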
This context handles the most common structured room control commands. For conversational queries like "What time is breakfast?" or "I'm feeling cold, can you help?", the assistant will use the picoLLM path.
Set Up Conversational AI Model
- Navigate to the picoLLM page in Picovoice Console.
- Select a function-calling compatible model. This tutorial uses `llama-3.2-1b-instruct-505.pllm`.
- Download the `.pllm` file and place it in your project directory.
Set Up the Python Environment
The following Python SDKs provide the complete smart hotel room AI stack for voice-enabled room automation. Install all required packages and dependencies using pip, as shown after the list below:
- Porcupine Wake Word Python SDK: `pvporcupine`
- Rhino Speech-to-Intent Python SDK: `pvrhino`
- Cheetah Streaming Speech-to-Text Python SDK: `pvcheetah`
- picoLLM Python SDK: `picollm`
- Orca Streaming Text-to-Speech Python SDK: `pvorca`
- Picovoice Python Recorder library: `pvrecorder`
- Picovoice Python Speaker library: `pvspeaker`
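For example, all seven packages can be installed with a single command:

```bash
pip3 install pvporcupine pvrhino pvcheetah picollm pvorca pvrecorder pvspeaker
```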
Add Wake Word Detection for Hands-Free Activation
The following code captures audio from your default microphone and detects the custom wake word locally:
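A minimal sketch, assuming the two `.ppn` keyword files trained earlier are in the working directory (the file names and `${ACCESS_KEY}` are placeholders):

```python
import pvporcupine
from pvrecorder import PvRecorder

ACCESS_KEY = "${ACCESS_KEY}"  # your Picovoice AccessKey

# Keyword paths are placeholders for the .ppn files trained in the Console.
porcupine = pvporcupine.create(
    access_key=ACCESS_KEY,
    keyword_paths=["hey-smart-room_en_linux.ppn", "hey-concierge_en_linux.ppn"])

recorder = PvRecorder(frame_length=porcupine.frame_length)
recorder.start()

print("Listening for wake words ...")
while True:
    keyword_index = porcupine.process(recorder.read())
    if keyword_index == 0:
        print("'Hey Smart Room' detected -> route to Rhino Speech-to-Intent")
    elif keyword_index == 1:
        print("'Hey Concierge' detected -> route to Cheetah + picoLLM")
```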
Porcupine Wake Word processes each audio frame on-device and triggers when either keyword is recognized. By listening for multiple wake words simultaneously, it routes guests to the right system path instantly - room control or concierge services - without continuous cloud streaming.
Understand User Voice Commands
Once "Hey Smart Room" is detected, Rhino Speech-to-Intent listens for structured room control commands:
Rhino Speech-to-Intent directly infers intent from speech without requiring a separate transcription step.
Handle Open-Ended Guest Queries
When guests say "Hey Concierge," the system routes directly to streaming speech-to-text and local LLM for natural language queries:
This approach uses Cheetah Streaming Speech-to-Text to transcribe the guest's natural speech, then picoLLM to understand the query and generate an appropriate response based on hotel information.
Add Voice Response Generation
Transform text responses into natural speech:
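A minimal sketch of streaming synthesis played through PvSpeaker; the spoken text chunks are placeholders (in the full assistant they come from picoLLM):

```python
import pvorca
from pvspeaker import PvSpeaker

ACCESS_KEY = "${ACCESS_KEY}"

orca = pvorca.create(access_key=ACCESS_KEY)
speaker = PvSpeaker(sample_rate=orca.sample_rate, bits_per_sample=16)
speaker.start()

# Stream synthesis: audio for each text chunk is played as soon as it is ready.
stream = orca.stream_open()
for chunk in ["Breakfast is served ", "from seven to ten ", "in the lobby restaurant."]:
    pcm = stream.synthesize(chunk)
    if pcm is not None:
        speaker.write(pcm)

pcm = stream.flush()
if pcm is not None:
    speaker.write(pcm)
speaker.flush()

stream.close()
speaker.stop()
orca.delete()
```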
Orca Streaming Text-to-Speech generates natural voice responses with first audio output in under 130ms, creating a seamless conversational experience.
Execute Room Control Voice Commands on IoT Systems
Route user requests from structured JSON to IoT systems:
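A minimal sketch of the dispatch layer; the intent and slot names and the device-layer functions are hypothetical placeholders for your property's actual lighting and HVAC integration:

```python
def set_room_lights(state):
    # Placeholder: call your lighting controller API here.
    print(f"[IoT] lights -> {state}")

def set_thermostat(degrees):
    # Placeholder: call your HVAC controller API here.
    print(f"[IoT] thermostat -> {degrees} degrees")

def handle_room_command(inference):
    # Convert the Rhino inference into a structured, JSON-like command.
    command = {"intent": inference.intent, "slots": dict(inference.slots)}
    if command["intent"] == "changeLightState":
        set_room_lights(command["slots"].get("state"))
    elif command["intent"] == "setTemperature":
        set_thermostat(int(command["slots"]["temperature"]))
    else:
        print(f"Unhandled intent: {command['intent']}")
```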
Complete Python Code for Hotel Room Voice Assistant
This implementation combines all components for a complete hotel room voice assistant:
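A condensed sketch that assembles the pieces above; file paths, the concierge prompt, and the room-control handling are placeholders to adapt to your deployment:

```python
import picollm
import pvcheetah
import pvorca
import pvporcupine
import pvrhino
from pvrecorder import PvRecorder
from pvspeaker import PvSpeaker

ACCESS_KEY = "${ACCESS_KEY}"  # your Picovoice AccessKey

# Model and context paths are placeholders for the files downloaded earlier.
porcupine = pvporcupine.create(
    access_key=ACCESS_KEY,
    keyword_paths=["hey-smart-room_en_linux.ppn", "hey-concierge_en_linux.ppn"])
rhino = pvrhino.create(access_key=ACCESS_KEY, context_path="hotel_room_controls.rhn")
cheetah = pvcheetah.create(access_key=ACCESS_KEY, endpoint_duration_sec=1.0)
pllm = picollm.create(access_key=ACCESS_KEY, model_path="llama-3.2-1b-instruct-505.pllm")
orca = pvorca.create(access_key=ACCESS_KEY)

# Porcupine, Rhino, and Cheetah all consume 16 kHz audio in 512-sample frames,
# so a single recorder can feed whichever engine is active.
recorder = PvRecorder(frame_length=porcupine.frame_length)
speaker = PvSpeaker(sample_rate=orca.sample_rate, bits_per_sample=16)

def speak(text):
    pcm, _ = orca.synthesize(text)  # PCM samples plus word alignments
    speaker.start()
    speaker.write(pcm)
    speaker.flush()
    speaker.stop()

def run_room_control():
    # "Hey Smart Room": structured commands via Rhino.
    while True:
        if rhino.process(recorder.read()):
            inference = rhino.get_inference()
            if inference.is_understood:
                print(f"[IoT] {inference.intent} {inference.slots}")  # hand off to device layer
                speak("Done.")
            else:
                speak("Sorry, I didn't catch that.")
            return

def run_concierge():
    # "Hey Concierge": free-form question via Cheetah + picoLLM.
    transcript = ""
    while True:
        partial, is_endpoint = cheetah.process(recorder.read())
        transcript += partial
        if is_endpoint:
            transcript += cheetah.flush()
            break
    prompt = (
        "You are a hotel concierge. Breakfast is served 7-10 a.m.; checkout is 11 a.m.\n"
        f"Guest: {transcript}\nConcierge:")
    speak(pllm.generate(prompt, completion_token_limit=128).completion)

recorder.start()
print("Say 'Hey Smart Room' or 'Hey Concierge' ...")
try:
    while True:
        keyword_index = porcupine.process(recorder.read())
        if keyword_index == 0:
            run_room_control()
        elif keyword_index == 1:
            run_concierge()
except KeyboardInterrupt:
    pass
finally:
    recorder.stop()
    pllm.release()
    for engine in (porcupine, rhino, cheetah, orca):
        engine.delete()
```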
Run the Hotel Room Voice Assistant
To run the voice assistant, update the model paths to match your local files and have your Picovoice AccessKey ready:
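Assuming the combined script is saved as hotel_assistant.py (a hypothetical file name):

```bash
python3 hotel_assistant.py
```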
Example interactions:
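The exact dialogues depend on your context and prompt; for instance:
- "Hey Smart Room ... turn off the lights" takes the room-control path, and Rhino's changeLightState intent is executed locally on the room hardware.
- "Hey Concierge ... what time is breakfast?" is transcribed by Cheetah, answered by picoLLM from the hotel information in the prompt, and spoken back through Orca.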
You can start building your own commercial or non-commercial projects using Picovoice's self-service Console.
Start Building






