Build a Restaurant Voice Assistant in Python

TLDR: This tutorial shows how to build an on-device restaurant voice assistant in Python that handles customer orders, reservations, service requests, and menu questions, all without cloud APIs. The build combines five components: wake word activation, speech-to-intent for order capture, streaming speech-to-text and a local LLM for menu questions, and text-to-speech for voice responses.

Restaurants need voice ordering systems that are fast, reliable, and predictable, especially during peak service hours. Cloud-based voice stacks introduce network latency that slows down customer interactions and creates awkward pauses during ordering. Traditional speech-to-text and Natural Language Understanding (NLU) pipelines also struggle with complex orders that include modifications, dietary preferences, or substitutions.

This tutorial solves both problems with an on-device voice architecture. For structured voice commands like orders and reservations, speech-to-intent maps voice directly to JSON without any intermediate transcription step.

Voice Input: "I want a burger"
JSON Output: {"intent": "placeOrder", "item": "burger"}

For open-ended questions about menu items, allergens, or recommendations, streaming speech-to-text feeds a local large language model (LLM) that handles unpredictable phrasing.

Voice Input: "Is the salmon gluten-free?"
LLM Output: "Yes, the salmon is gluten-free."

Both paths run entirely on-device with voice feedback via text-to-speech.

This guide demonstrates how to build a restaurant voice agent with:

Wake word activation for hands-free operation
Speech-to-intent for restaurant orders, modifications, reservations, and service requests
Structured JSON output for POS and reservation system integration
Streaming Speech-to-Text and Local LLM for menu questions, dietary inquiries, and recommendations
Complete voice feedback loop with streaming text-to-speech for order confirmation

All processing runs locally, which keeps latency low, supports privacy-compliant speech recognition, and gives you full control over customer voice data.

Prerequisites:

Python 3.9+
A laptop or computer with microphone and speakers for testing
Picovoice AccessKey from the Picovoice Console

Voice AI Architecture for Restaurant Voice Agent

The restaurant voice agent architecture combines deterministic speech-to-intent for structured orders with generative AI for conversational menu inquiries. This provides both the reliability of rule-based systems and the flexibility of LLM-powered interactions.

Porcupine Wake Word for Voice Activation: Listens continuously for wake phrases and routes to different processing paths, e.g., "Hey Restaurant" for orders, "Hey Menu" for questions.
Rhino Speech-to-Intent for Structured Commands: Maps voice directly to structured intents without intermediate transcription. Produces POS-ready JSON in a single inference step, avoiding the error propagation from Speech-to-Text + Natural Language Understanding pipelines.
Cheetah Streaming Speech-to-Text for Open-Ended Questions: Transcribes menu inquiries in real-time for the LLM path. Cheetah delivers cloud-competitive accuracy with sub-second latency.
picoLLM for Local LLM Inference: Generates natural responses to questions about ingredients, allergens, or recommendations — handling unpredictable phrasing.
Orca Streaming Text-to-Speech for Voice Responses: Converts all responses to natural speech. Orca can begin audio output while the LLM is still generating text, minimizing perceived latency.

All Picovoice voice processing models support multiple languages including English, Spanish, French, German, and more.

Order Placement Workflow:

Customer: "Hey Restaurant, I want a steak"
   ↓
[Wake Word: Porcupine] Detects activation → Routes to speech-to-intent
   ↓
[Speech-to-Intent: Rhino] Direct inference → {"intent": "placeOrder", "item": "steak"}
   ↓
[Integration Layer] Structured JSON → POS API call
   ↓
[TTS: Orca] Synthesis + audio output → "Steak added to your order"

Menu Inquiry Workflow:

Customer: "Hey Menu, is the salmon gluten-free?"
   ↓
[Wake Word: Porcupine] Detects activation → Routes to conversational path
   ↓
[Streaming STT: Cheetah] Real-time transcription → "Is the salmon gluten-free?"
   ↓
[Local LLM: picoLLM] Context-aware inference → Generates response from menu data
   ↓
[TTS: Orca] Synthesis + audio output → Natural language answer

Looking for QSR drive-thru voice ordering solutions? See Challenges of QSR Drive-Thrus and Voice AI.

Train Custom Wake Words for Restaurant Voice Assistant

1. Sign up for a Picovoice Console account and navigate to the Porcupine page.
2. Enter your wake phrase for ordering (e.g., "Hey Restaurant") and test it using the microphone button.
3. Click "Train," select the target platform, and download the .ppn model file.
4. Repeat steps 2 and 3 to train an additional wake word for menu queries (e.g., "Hey Menu").

Porcupine Wake Word can detect multiple wake words simultaneously. For instance, it can support both "Hey Restaurant" for placing orders and "Hey Menu" for menu questions. For tips on designing an effective wake word, review the choosing a wake word guide.
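Because Porcupine reports which keyword was detected as an index into the list of keyword files it was given (and -1 when nothing was detected), routing can be expressed as a small lookup table. The sketch below is illustrative only: the `Route` enum and `route_for` helper are hypothetical names, not part of the Porcupine SDK, and it assumes index 0 is "Hey Restaurant" and index 1 is "Hey Menu".

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    ORDER = "order"  # structured commands -> speech-to-intent
    MENU = "menu"    # open-ended questions -> speech-to-text + local LLM


# Order must match the keyword_paths list passed to the wake word engine
# (assumed here: index 0 = "Hey Restaurant", index 1 = "Hey Menu").
ROUTES = [Route.ORDER, Route.MENU]


def route_for(keyword_index: int) -> Optional[Route]:
    """Map a wake word index to a processing path; None if nothing was detected."""
    if 0 <= keyword_index < len(ROUTES):
        return ROUTES[keyword_index]
    return None
```

Keeping the mapping in one place means that adding a third wake word only requires extending `ROUTES` in the same order as the keyword files.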

Define Voice Commands for Orders, Reservations, and Service

1. In Picovoice Console, open the Rhino page and create an empty context for speech-to-intent processing.
2. Click the "Import YAML" button in the top-right corner of the console and paste the YAML provided below to define intents for customer ordering.
3. Test the model with the microphone button and download the .rhn context file for your target platform.

You can refer to the Rhino Syntax Cheat Sheet for more details on building custom contexts.

YAML Context for Customer Order Commands:

context:
  expressions:
    placeOrder:
      - "(@action) @order_action (a) (the) $item:item (please)"
      - "(@action) order (a) (the) $item:item (please)"

    modifyOrder:
      - "$modifier:modifier (on) (the) $item:item"
      - "(please) make it with $modifier:modifier"
      - "$item:item with $modifier:modifier"
      - "(@action) add $modifier:modifier to (the) $item:item"

    makeReservation:
      - "(@action) @reservation_action (a) table for $pv.TwoDigitInteger:party_size (people) (please)"
      - "(@action) @reservation_action for $pv.TwoDigitInteger:party_size (at) $time:time"
      - "table for $pv.TwoDigitInteger:party_size (at) $time:time (please)"

    requestService:
      - "(@action) [get, have] (me) (some) $service:service (please)"
      - "(I) need $service:service (please)"
      - "(@action) bring (me) (some) $service:service (please)"

    cancelItem:
      - "(@action) @cancel_action (the) $item:item"
      - "(I) don't want (the) $item:item (anymore)"
      - "skip (the) $item:item"

    askWaitTime:
      - "how long (is the) wait"
      - "(what's) (the) wait time"
      - "how much longer"

    requestCheck:
      - "(@action) [get, have] (the) @check_action (please)"
      - "(@action) pay (please)"
      - "@check_action (please)"

  slots:
    item:
      - burger
      - steak
      - salmon
      - chicken
      - caesar salad
      - house salad
      - fries
      - soup
      - pasta
      - pizza
      - dessert

    modifier:
      - no onions
      - extra cheese
      - no tomatoes
      - add bacon
      - gluten free bun
      - extra sauce
      - no pickles

    time:
      - six
      - six thirty
      - seven
      - seven thirty
      - eight
      - eight thirty

    service:
      - water
      - napkins
      - silverware
      - extra plates
      - the menu

  macros:
    action:
      - can I
      - could I
      - I want to
      - I'd like to
      - I would like to

    order_action:
      - get
      - have
      - want

    reservation_action:
      - book
      - reserve
      - make a reservation

    check_action:
      - check
      - bill
      - receipt

    cancel_action:
      - cancel
      - remove
      - drop

This context handles:

Customer ordering
Order modifications
Item cancellations
Reservations
Service requests
Wait-time questions
Check requests
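Some expressions in this context leave a slot out (for example, "make it with extra cheese" carries a modifier but no item), so downstream code may want to check which slots actually arrived before acting. A minimal sketch, assuming the intent arrives as a string and the slots as a dict; the `REQUIRED_SLOTS` table and `missing_slots` helper are hypothetical names, derived by reading the expressions above:

```python
from typing import Dict, List

# Slots that appear in every expression of an intent in the YAML context above.
REQUIRED_SLOTS: Dict[str, List[str]] = {
    "placeOrder": ["item"],
    "modifyOrder": ["modifier"],        # item is optional ("make it with ...")
    "makeReservation": ["party_size"],  # time is optional
    "requestService": ["service"],
    "cancelItem": ["item"],
    "askWaitTime": [],
    "requestCheck": [],
}


def missing_slots(intent: str, slots: Dict[str, str]) -> List[str]:
    """Return the required slot names that are absent or empty for this intent."""
    if intent not in REQUIRED_SLOTS:
        raise ValueError(f"Unknown intent: {intent}")
    return [name for name in REQUIRED_SLOTS[intent] if not slots.get(name)]
```

An empty list means the command can be processed as-is; otherwise the assistant could ask a follow-up question for the missing slot.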

For menu questions, dietary inquiries, or recommendations, customers say "Hey Menu" to route to the conversational AI path. That path needs a local LLM model file:

1. Navigate to the picoLLM page in Picovoice Console.
2. Select a model. This tutorial uses llama-3.2-1b-instruct-505.pllm.
3. Download the .pllm file and place it in your project directory.

Install Python SDKs for Restaurant Voice AI

The following Python SDKs provide the complete restaurant voice AI stack for hands-free operations. Install all required Python SDKs and dependencies using pip:

Wake word detection: pvporcupine
Speech-to-intent processing: pvrhino
Streaming speech-to-text: pvcheetah
Local LLM inference: picollm
Text-to-speech synthesis: pvorca
Audio input/output: pvrecorder, pvspeaker

pip install pvporcupine pvrhino pvcheetah picollm pvorca pvrecorder pvspeaker

Add Hands-Free Voice Activation

The following code captures audio from your microphone and detects the custom wake words locally:

import pvporcupine
from pvrecorder import PvRecorder

ACCESS_KEY = "${ACCESS_KEY}"
ORDER_KEYWORD_PATH = "${ORDER_KEYWORD_PATH}"  # Path to "Hey Restaurant" .ppn file
MENU_KEYWORD_PATH = "${MENU_KEYWORD_PATH}"  # Path to "Hey Menu" .ppn file

porcupine = pvporcupine.create(
    access_key=ACCESS_KEY,
    keyword_paths=[ORDER_KEYWORD_PATH, MENU_KEYWORD_PATH]
)

recorder = PvRecorder(frame_length=porcupine.frame_length)
recorder.start()

print("Listening for wake word...")

try:
    while True:
        pcm = recorder.read()
        keyword_index = porcupine.process(pcm)
        
        if keyword_index == 0:
            print("Order wake word detected - routing to order capture")
            # Route to Rhino for customer orders
            break
        elif keyword_index == 1:
            print("Menu wake word detected - routing to menu assistant")
            # Route to Cheetah + picoLLM for menu queries
            break
except KeyboardInterrupt:
    print("\nStopping...")
finally:
    recorder.stop()
    recorder.delete()
    porcupine.delete()

Wake word processing happens on-device, triggering the rest of the pipeline when the wake phrase is recognized. This enables hands-free activation ideal for kiosks, counter service, and dine-in table ordering.

Add Voice Command Recognition to Process Customer Orders

Once the wake word is detected, speech-to-intent processing listens for structured customer orders:

import pvrhino
from pvrecorder import PvRecorder

ACCESS_KEY = "${ACCESS_KEY}"
CONTEXT_PATH = "${CONTEXT_PATH}"  # Path to .rhn file

rhino = pvrhino.create(
    access_key=ACCESS_KEY,
    context_path=CONTEXT_PATH
)

recorder = PvRecorder(frame_length=rhino.frame_length)
recorder.start()

print("Listening for order...")

try:
    while True:
        pcm = recorder.read()
        is_finalized = rhino.process(pcm)
        
        if is_finalized:
            inference = rhino.get_inference()
            
            if inference.is_understood:
                print('{')
                print("  intent : '%s'" % inference.intent)
                print('  slots : {')
                for slot, value in inference.slots.items():
                    print("    %s : '%s'" % (slot, value))
                print('  }')
                print('}\n')
                
                # Process customer order
                process_order(ACCESS_KEY, inference.intent, inference.slots)
            else:
                print("Didn't understand the command. Please try again.")
                speak_response(ACCESS_KEY, "Sorry, I didn't catch that. Could you try again?")
            
            break
except KeyboardInterrupt:
    print("\nStopping...")
finally:
    recorder.stop()
    recorder.delete()
    rhino.delete()

The process_order and speak_response functions used here are defined in the text-to-speech and POS integration sections below.
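The snippet above prints the inference in a human-readable form rather than strict JSON. To hand the result to another system, the intent and slots can be serialized with the standard `json` module. A minimal sketch, assuming the flat shape used in this tutorial's examples; `inference_to_json` is a hypothetical helper, not part of the Rhino SDK:

```python
import json
from typing import Dict


def inference_to_json(intent: str, slots: Dict[str, str]) -> str:
    """Serialize an intent and its slots as one flat JSON object."""
    if "intent" in slots:
        # Guard against a slot name that would overwrite the intent field.
        raise ValueError("a slot named 'intent' would collide with the intent field")
    payload = {"intent": intent, **slots}
    return json.dumps(payload)


print(inference_to_json("placeOrder", {"item": "burger"}))
# {"intent": "placeOrder", "item": "burger"}
```

In the loop above, this would be called as `inference_to_json(inference.intent, inference.slots)`.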

For menu inquiries, the system routes to streaming speech-to-text and local LLM for natural language processing:

import pvcheetah
import picollm
from pvrecorder import PvRecorder

ACCESS_KEY = "${ACCESS_KEY}"
PICOLLM_MODEL_PATH = "${PICOLLM_MODEL_PATH}"  # Path to .pllm file

def handle_menu_query():
    """Process menu queries using Cheetah + picoLLM"""
    
    # Initialize Cheetah for speech-to-text
    cheetah = pvcheetah.create(
        access_key=ACCESS_KEY,
        endpoint_duration_sec=1.0
    )
    
    recorder = PvRecorder(frame_length=cheetah.frame_length)
    recorder.start()
    
    print("Speak your question...")
    transcript = ""
    
    try:
        while True:
            pcm = recorder.read()
            partial_transcript, is_endpoint = cheetah.process(pcm)
            transcript += partial_transcript
            print(partial_transcript, end="", flush=True)
            
            if is_endpoint:
                final_transcript = cheetah.flush()
                transcript += final_transcript
                print(final_transcript)
                break
    except KeyboardInterrupt:
        print("\nStopping...")
    finally:
        recorder.stop()
        recorder.delete()
        cheetah.delete()
    
    if not transcript.strip():
        print("No question detected.")
        speak_response(ACCESS_KEY, "I didn't catch your question. Please try again.")
        return None
    
    # Process with picoLLM
    pllm = picollm.create(
        access_key=ACCESS_KEY,
        model_path=PICOLLM_MODEL_PATH
    )
    
    # Restaurant-specific context for LLM
    menu_info = """
    Menu Information:
    - Grilled Salmon: $28, gluten-free, served with roasted vegetables and lemon butter sauce
    - Ribeye Steak: $42, 12oz, served with mashed potatoes and asparagus
    - Chicken Parmesan: $24, contains gluten and dairy, served with spaghetti
    - Mushroom Risotto: $22, vegetarian, contains dairy, gluten-free
    - House Salad: $12, mixed greens, cherry tomatoes, cucumber, balsamic vinaigrette (vegan)
    - Tiramisu: $10, contains gluten, dairy, eggs, and coffee
    
    Today's Specials:
    - Soup of the Day: Tomato Basil (vegan, gluten-free) - $8
    - Fresh Catch: Seared Tuna with wasabi aioli - $34
    
    Wine Pairings:
    - Salmon: Chardonnay or Pinot Grigio
    - Ribeye: Cabernet Sauvignon or Malbec
    - Chicken: Pinot Noir or Sauvignon Blanc
    """
    
    prompt = f"{menu_info}\n\nCustomer question: {transcript}\n\nProvide a helpful, concise response:"
    
    print("\nGenerating response...")
    
    try:
        response = pllm.generate(
            prompt=prompt,
            completion_token_limit=150,
            temperature=0.3
        )
    finally:
        # Free the model even if generation fails
        pllm.release()
    
    print(f"\nAssistant: {response.completion}")
    speak_response(ACCESS_KEY, response.completion)
    
    return response.completion

This approach uses streaming speech-to-text to transcribe natural speech, then a local LLM to understand the query and generate an appropriate response based on menu information.
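The menu text is hard-coded in the function above. If menu data lives in another system, the same prompt can be assembled from structured records. A minimal sketch, assuming each menu item is a dict with hypothetical `name`, `price`, and `notes` keys; the prompt wording mirrors the one used above:

```python
from typing import Any, Dict, List


def build_menu_prompt(items: List[Dict[str, Any]], question: str) -> str:
    """Assemble the LLM prompt from menu records and the transcribed question."""
    lines = ["Menu Information:"]
    for item in items:
        lines.append(f"- {item['name']}: ${item['price']}, {item['notes']}")
    menu_info = "\n".join(lines)
    return (
        f"{menu_info}\n\n"
        f"Customer question: {question.strip()}\n\n"
        "Provide a helpful, concise response:"
    )
```

The returned string can be passed as the `prompt` argument in place of the hard-coded f-string.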

Generate Voice Responses with Text-to-Speech

Transform text responses into natural speech:

import pvorca
from pvspeaker import PvSpeaker
from collections import deque

def speak_response(access_key: str, text: str):
    """Convert text to speech and play"""
    orca = pvorca.create(access_key=access_key)
    speaker = PvSpeaker(
        sample_rate=orca.sample_rate,
        bits_per_sample=16
    )
    
    try:
        # Synthesize speech
        pcm_out, _ = orca.synthesize(text)
        
        # Play audio
        speaker.start()
        
        pcm_buffer = deque()
        pcm_buffer.append(pcm_out)
        
        while len(pcm_buffer) > 0:
            pcm = pcm_buffer.popleft()
            written = speaker.write(pcm)
            if written < len(pcm):
                pcm_buffer.appendleft(pcm[written:])
        
        speaker.flush()
        speaker.stop()
    except KeyboardInterrupt:
        print("\nStopping playback...")
        speaker.stop()
    finally:
        # Cleanup
        speaker.delete()
        orca.delete()

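Orca turns text responses into natural speech for order confirmations and menu answers. The function above synthesizes each response in a single call, which is enough for short confirmations; Orca's streaming mode can instead begin producing audio while the LLM is still generating text.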
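To start playback before the LLM has finished, text chunks can be forwarded to a streaming synthesizer as they arrive. The sketch below shows only the control flow, with injected callables so that it runs without any SDK: `synthesize_chunk`, `flush`, and `play` are hypothetical stand-ins for a streaming TTS object and a speaker, and the assumption that the synthesizer returns `None` until it has buffered enough text should be checked against the Orca documentation.

```python
from typing import Callable, Iterable, Optional, Sequence

Pcm = Sequence[int]


def stream_tokens_to_speech(
    tokens: Iterable[str],
    synthesize_chunk: Callable[[str], Optional[Pcm]],
    flush: Callable[[], Optional[Pcm]],
    play: Callable[[Pcm], None],
) -> int:
    """Forward text chunks to a streaming synthesizer and play audio as it is produced.

    Returns the number of audio chunks played.
    """
    played = 0
    for token in tokens:
        pcm = synthesize_chunk(token)
        if pcm:  # skip None or empty results while the synthesizer buffers text
            play(pcm)
            played += 1
    pcm = flush()  # retrieve audio for any text still buffered
    if pcm:
        play(pcm)
        played += 1
    return played
```

Passing the callables in, rather than importing a TTS engine directly, keeps the loop easy to test with fakes before wiring in real audio.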

Integrate Voice Stack with POS and Restaurant Management Systems

Route customer orders from structured JSON to restaurant systems:

def process_order(access_key: str, intent: str, slots: dict):
    """Process customer orders and provide voice feedback"""
    
    if intent == "placeOrder":
        item = slots.get('item', 'unknown')
        
        print(f"[POS] Adding {item} to order")
        # Integration point: add_to_order(item)
        
        speak_response(access_key, f"{item.replace('_', ' ').title()} added to your order")
        
    elif intent == "modifyOrder":
        item = slots.get('item', '')
        modifier = slots.get('modifier', '')
        
        print(f"[POS] Modifying: {modifier} on {item}")
        # Integration point: modify_order_item(item, modifier)
        
        speak_response(access_key, f"Got it, {modifier}")
    
    elif intent == "cancelItem":
        item = slots.get('item', 'unknown')
        
        print(f"[POS] Removing {item} from order")
        # Integration point: remove_from_order(item)
        
        speak_response(access_key, f"{item.replace('_', ' ').title()} removed from your order")
    
    elif intent == "makeReservation":
        party_size = slots.get('party_size', '2')
        time = slots.get('time', '')
        
        print(f"[Reservation] Party of {party_size} at {time}")
        # Integration point: create_reservation(party_size, time)
        
        if time:
            speak_response(access_key, f"Table for {party_size} at {time} confirmed")
        else:
            speak_response(access_key, f"Table for {party_size} confirmed")
    
    elif intent == "requestService":
        service = slots.get('service', 'assistance')
        
        print(f"[Service] Customer requested: {service}")
        # Integration point: notify_staff(service)
        
        speak_response(access_key, f"I'll get you {service} right away")
    
    elif intent == "askWaitTime":
        print("[Query] Wait time requested")
        # Integration point: get_current_wait_time()
        
        speak_response(access_key, "Current wait time is about 15 minutes")
    
    elif intent == "requestCheck":
        print("[POS] Check requested")
        # Integration point: print_check()
        
        speak_response(access_key, "I'll bring your check right away")
    
    return "handled"
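The integration points above are left as comments. As a placeholder while wiring up a real POS, an in-memory ticket can apply the order-related intents. This is a hypothetical sketch: the `OrderTicket` class and its behavior (a modifier attaches to the named item, or to the most recently added item when none is named) are assumptions for illustration, not part of any POS API.

```python
from typing import Any, Dict, List, Optional


class OrderTicket:
    """In-memory stand-in for a POS order (illustrative only)."""

    def __init__(self) -> None:
        self.items: List[Dict[str, Any]] = []

    def apply(self, intent: str, slots: Dict[str, str]) -> bool:
        """Apply an order-related intent; return True if the ticket changed."""
        if intent == "placeOrder" and slots.get("item"):
            self.items.append({"item": slots["item"], "modifiers": []})
            return True
        if intent == "modifyOrder" and slots.get("modifier"):
            target = self._find(slots.get("item"))
            if target is None:
                return False
            target["modifiers"].append(slots["modifier"])
            return True
        if intent == "cancelItem" and slots.get("item"):
            target = self._find(slots["item"])
            if target is None:
                return False
            self.items.remove(target)
            return True
        return False

    def _find(self, name: Optional[str]) -> Optional[Dict[str, Any]]:
        """Most recent entry matching name, or the most recent entry if name is None."""
        for entry in reversed(self.items):
            if name is None or entry["item"] == name:
                return entry
        return None
```

A `False` return gives `process_order` a natural point to respond with a clarification instead of a confirmation.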

Complete Python Code for Restaurant Voice Assistant

This implementation combines all components for a complete restaurant voice assistant:

# Restaurant Voice Assistant

import argparse
import os
from collections import deque

import pvporcupine
import pvrhino
import pvcheetah
import picollm
import pvorca
from pvrecorder import PvRecorder
from pvspeaker import PvSpeaker


def speak_response(access_key: str, text: str):
    """Convert text to speech and play"""
    orca = pvorca.create(access_key=access_key)
    speaker = PvSpeaker(
        sample_rate=orca.sample_rate,
        bits_per_sample=16
    )
    
    try:
        pcm_out, _ = orca.synthesize(text)
        speaker.start()
        
        pcm_buffer = deque()
        pcm_buffer.append(pcm_out)
        
        while len(pcm_buffer) > 0:
            pcm = pcm_buffer.popleft()
            written = speaker.write(pcm)
            if written < len(pcm):
                pcm_buffer.appendleft(pcm[written:])
        
        speaker.flush()
        speaker.stop()
    except KeyboardInterrupt:
        print("\nStopping playback...")
        speaker.stop()
    finally:
        speaker.delete()
        orca.delete()


def process_order(access_key: str, intent: str, slots: dict):
    """Process customer orders and provide voice feedback"""
    
    if intent == "placeOrder":
        item = slots.get('item', 'unknown')
        print(f"[POS] Adding {item} to order")
        speak_response(access_key, f"{item.replace('_', ' ').title()} added to your order")
        
    elif intent == "modifyOrder":
        item = slots.get('item', '')
        modifier = slots.get('modifier', '')
        print(f"[POS] Modifying: {modifier} on {item}")
        speak_response(access_key, f"Got it, {modifier}")
    
    elif intent == "cancelItem":
        item = slots.get('item', 'unknown')
        print(f"[POS] Removing {item} from order")
        speak_response(access_key, f"{item.replace('_', ' ').title()} removed from your order")
    
    elif intent == "makeReservation":
        party_size = slots.get('party_size', '2')
        time = slots.get('time', '')
        print(f"[Reservation] Party of {party_size} at {time}")
        if time:
            speak_response(access_key, f"Table for {party_size} at {time} confirmed")
        else:
            speak_response(access_key, f"Table for {party_size} confirmed")
    
    elif intent == "requestService":
        service = slots.get('service', 'assistance')
        print(f"[Service] Customer requested: {service}")
        speak_response(access_key, f"I'll get you {service} right away")
    
    elif intent == "askWaitTime":
        print("[Query] Wait time requested")
        speak_response(access_key, "Current wait time is about 15 minutes")
    
    elif intent == "requestCheck":
        print("[POS] Check requested")
        speak_response(access_key, "I'll bring your check right away")
    
    return "handled"


def handle_customer_order(access_key: str, context_path: str):
    """Process customer orders using Rhino Speech-to-Intent"""
    
    try:
        rhino = pvrhino.create(
            access_key=access_key,
            context_path=context_path)
    except pvrhino.RhinoError as e:
        print("Failed to initialize Rhino")
        raise e

    print(f'Rhino version: {rhino.version}')

    recorder = PvRecorder(frame_length=rhino.frame_length)
    recorder.start()

    print('Listening for order...')

    try:
        while True:
            pcm = recorder.read()
            is_finalized = rhino.process(pcm)

            if is_finalized:
                inference = rhino.get_inference()
                if inference.is_understood:
                    print('{')
                    print("  intent : '%s'" % inference.intent)
                    print('  slots : {')
                    for slot, value in inference.slots.items():
                        print("    '%s' : '%s'" % (slot, value))
                    print('  }')
                    print('}\n')
                    
                    process_order(access_key, inference.intent, inference.slots)
                else:
                    print("Didn't understand the command. Please try again.")
                    speak_response(access_key, "Sorry, I didn't catch that. Could you try again?")
                
                break

    except KeyboardInterrupt:
        print('\nStopping...')

    finally:
        recorder.stop()
        recorder.delete()
        rhino.delete()


def handle_menu_query(access_key: str, pllm_model_path: str):
    """Process menu queries using Cheetah + picoLLM"""
    
    cheetah = pvcheetah.create(
        access_key=access_key,
        endpoint_duration_sec=1.0
    )
    
    recorder = PvRecorder(frame_length=cheetah.frame_length)
    recorder.start()
    
    print("Ask your question...")
    transcript = ""
    
    try:
        while True:
            pcm = recorder.read()
            partial_transcript, is_endpoint = cheetah.process(pcm)
            transcript += partial_transcript
            print(partial_transcript, end="", flush=True)
            
            if is_endpoint:
                final_transcript = cheetah.flush()
                transcript += final_transcript
                print(final_transcript)
                break
    except KeyboardInterrupt:
        print("\nStopping...")
    finally:
        recorder.stop()
        recorder.delete()
        cheetah.delete()
    
    if not transcript.strip():
        print("No question detected.")
        speak_response(access_key, "I didn't catch your question. Please try again.")
        return
    
    pllm = picollm.create(
        access_key=access_key,
        model_path=pllm_model_path
    )
    
    menu_info = """
    Menu Information:
    - Grilled Salmon: $28, gluten-free, served with roasted vegetables and lemon butter sauce
    - Ribeye Steak: $42, 12oz, served with mashed potatoes and asparagus
    - Chicken Parmesan: $24, contains gluten and dairy, served with spaghetti
    - Mushroom Risotto: $22, vegetarian, contains dairy, gluten-free
    - House Salad: $12, mixed greens, cherry tomatoes, cucumber, balsamic vinaigrette (vegan)
    - Tiramisu: $10, contains gluten, dairy, eggs, and coffee
    
    Today's Specials:
    - Soup of the Day: Tomato Basil (vegan, gluten-free) - $8
    - Fresh Catch: Seared Tuna with wasabi aioli - $34
    
    Wine Pairings:
    - Salmon: Chardonnay or Pinot Grigio
    - Ribeye: Cabernet Sauvignon or Malbec
    - Chicken: Pinot Noir or Sauvignon Blanc
    """
    
    prompt = f"{menu_info}\n\nCustomer question: {transcript}\n\nProvide a helpful, concise response:"
    
    print("\nGenerating response...")
    
    try:
        response = pllm.generate(
            prompt=prompt,
            completion_token_limit=150,
            temperature=0.3
        )
    finally:
        # Free the model even if generation fails
        pllm.release()
    
    print(f"\nAssistant: {response.completion}")
    
    speak_response(access_key, response.completion)


def main():
    parser = argparse.ArgumentParser()

    parser.add_argument(
        '--access_key',
        help='AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)',
        required=True)

    parser.add_argument(
        '--order_keyword_path',
        help='Absolute path to order wake word model file (.ppn) for placing orders',
        required=True)

    parser.add_argument(
        '--menu_keyword_path',
        help='Absolute path to menu wake word model file (.ppn) for menu queries',
        required=True)

    parser.add_argument(
        '--context_path',
        help='Absolute path to Rhino context file (.rhn)',
        required=True)

    parser.add_argument(
        '--pllm_model_path',
        help='Absolute path to picoLLM model file (.pllm)',
        required=True)

    args = parser.parse_args()

    print("Restaurant Ordering Assistant")
    print("=" * 50)

    while True:
        try:
            porcupine = pvporcupine.create(
                access_key=args.access_key,
                keyword_paths=[args.order_keyword_path, args.menu_keyword_path])
        except pvporcupine.PorcupineError as e:
            print("Failed to initialize Porcupine")
            raise e

        keywords = []
        for keyword_path in [args.order_keyword_path, args.menu_keyword_path]:
            keyword_phrase_part = os.path.basename(keyword_path).replace('.ppn', '').split('_')
            if len(keyword_phrase_part) > 6:
                keywords.append(' '.join(keyword_phrase_part[0:-6]))
            else:
                keywords.append(keyword_phrase_part[0])

        print(f'Porcupine version: {porcupine.version}')

        recorder = PvRecorder(frame_length=porcupine.frame_length)
        recorder.start()

        print('Listening for wake word... (press Ctrl+C to exit)')
        print(f'  Say "{keywords[0]}" to place an order')
        print(f'  Say "{keywords[1]}" to ask about the menu')

        detected_keyword_index = -1

        try:
            while True:
                pcm = recorder.read()
                result = porcupine.process(pcm)

                if result >= 0:
                    print(f'Detected "{keywords[result]}"')
                    detected_keyword_index = result
                    break

        except KeyboardInterrupt:
            print('\nStopping...')
            break

        finally:
            # Runs on every exit path, including the break above,
            # so resources are released exactly once
            recorder.stop()
            recorder.delete()
            porcupine.delete()

        if detected_keyword_index == 0:
            # Order wake word - route to Rhino for order capture
            handle_customer_order(args.access_key, args.context_path)
        elif detected_keyword_index == 1:
            # Menu wake word - route to Cheetah + picoLLM for menu queries
            handle_menu_query(args.access_key, args.pllm_model_path)


if __name__ == '__main__':
    main()

Run the Restaurant Voice Assistant

To run the restaurant voice assistant, save the complete code above as restaurant_assistant.py, update the model paths to match your local files, and have your Picovoice AccessKey ready:

python3 restaurant_assistant.py \
  --access_key "$ACCESS_KEY" \
  --order_keyword_path ./models/hey-restaurant.ppn \
  --menu_keyword_path ./models/hey-menu.ppn \
  --context_path ./models/customer-orders.rhn \
  --pllm_model_path ./models/llama-3.2-1b-instruct-505.pllm
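A missing or mistyped model path only surfaces when the corresponding engine is created, so it can help to check the files up front. A small sketch using only the standard library; `check_model_files` is a hypothetical helper, and the expected extensions are the ones named in this tutorial:

```python
from pathlib import Path
from typing import Dict, List

# Command-line argument name -> file extension used in this tutorial.
EXPECTED_SUFFIX = {
    "order_keyword_path": ".ppn",
    "menu_keyword_path": ".ppn",
    "context_path": ".rhn",
    "pllm_model_path": ".pllm",
}


def check_model_files(paths: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty if every file looks right."""
    problems = []
    for arg, suffix in EXPECTED_SUFFIX.items():
        value = paths.get(arg)
        if not value:
            problems.append(f"--{arg} is missing")
            continue
        path = Path(value)
        if path.suffix != suffix:
            problems.append(f"--{arg} should end in {suffix}: {value}")
        elif not path.is_file():
            problems.append(f"--{arg} not found: {value}")
    return problems
```

In `main()`, this could be called as `check_model_files(vars(args))` right after argument parsing, exiting early if the list is not empty.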

The hybrid architecture sends structured commands through a single speech-to-intent inference, with no transcription or LLM step, while open-ended questions receive natural, informative responses from the local LLM.

Example interactions:

Order placement:
Customer: "Hey Restaurant, I want a burger"
→ {"intent": "placeOrder", "item": "burger"}
→ "Burger added to your order"

Reservation:
Customer: "Hey Restaurant, table for four at seven"
→ {"intent": "makeReservation", "party_size": "4", "time": "seven"}
→ "Table for 4 at seven confirmed"

Check request:
Customer: "Hey Restaurant, can I get the check"
→ {"intent": "requestCheck"}
→ "I'll bring your check right away"

Menu inquiry:
Customer: "Hey Menu, what's in the house salad?"
→ Streaming transcription → Local LLM inference
→ "The house salad has mixed greens, cherry tomatoes, cucumber, and balsamic vinaigrette. It's vegan."

Dietary question:
Customer: "Hey Menu, is the salmon gluten-free?"
→ Streaming transcription → Local LLM inference
→ "Yes, the grilled salmon is gluten-free."
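In the reservation example, the `time` slot arrives as spoken words such as "seven" or "seven thirty", which a reservation system will usually need as a clock time. A minimal sketch covering only the slot values defined in the YAML context; `parse_time_slot` is a hypothetical helper, and treating every value as an evening time is an assumption, since the context does not distinguish AM from PM:

```python
from datetime import time

# Hours named in the context's `time` slot, mapped to PM (assumption).
EVENING_HOURS = {"six": 18, "seven": 19, "eight": 20}


def parse_time_slot(value: str) -> time:
    """Convert a spoken time slot value such as 'seven thirty' to a clock time."""
    words = value.lower().split()
    if len(words) == 1 and words[0] in EVENING_HOURS:
        return time(hour=EVENING_HOURS[words[0]])
    if len(words) == 2 and words[0] in EVENING_HOURS and words[1] == "thirty":
        return time(hour=EVENING_HOURS[words[0]], minute=30)
    raise ValueError(f"Unrecognized time slot value: {value!r}")


print(parse_time_slot("seven thirty").strftime("%H:%M"))  # 19:30
```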

You can start building your own commercial or non-commercial projects using Picovoice's self-service Console.


Frequently Asked Questions

When should developers choose deterministic speech-to-intent versus LLM-based processing for restaurant ordering?

Use speech-to-intent for structured ordering where customers place, modify, or cancel items—these map to clear POS operations with predictable data structures. Use local LLM processing for open-domain menu questions where customers may ask about ingredients, allergens, recommendations, or pairing suggestions in unpredictable ways. The architecture lets the system optimize both paths independently while maintaining a unified customer experience.

How does this architecture integrate with existing restaurant POS and order management systems?

The system outputs structured JSON for all customer orders, providing clean integration points for POS systems, online ordering platforms, and kitchen display systems. The code demonstrates where to add API calls to existing systems. Most restaurant POS systems expose REST or GraphQL APIs for order management (the assistant's JSON output maps directly to these endpoints). For menu inquiries, the LLM can be prompted with menu data from existing menu management systems or content databases.
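As an illustration of such an integration point, the sketch below builds (but does not send) an HTTP request carrying the order as JSON. The `/orders` path, the base URL, and the bearer-token header are placeholders rather than a real POS API; the actual contract comes from your POS vendor's documentation.

```python
import json
import urllib.request
from typing import Dict


def build_pos_request(
    base_url: str, token: str, intent: str, slots: Dict[str, str]
) -> urllib.request.Request:
    """Build a POST request for a hypothetical POS order endpoint."""
    body = json.dumps({"intent": intent, "slots": slots}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/orders",  # placeholder endpoint path
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # placeholder auth scheme
        },
    )
```

Sending it would be a call such as `urllib.request.urlopen(request, timeout=5)`, ideally wrapped in error handling so that a POS outage produces a spoken apology rather than a crash.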

Will the voice assistant work accurately in a busy restaurant environment?

Yes. Porcupine Wake Word, Rhino Speech-to-Intent, and Cheetah Streaming Speech-to-Text are designed to work reliably in challenging acoustic environments including busy restaurants with background music, conversations, and ambient noise. The models are trained on diverse acoustic conditions with multiple accents to ensure consistent performance during peak service hours.