Orca Streaming Text-to-Speech
Python Quick Start

Platforms

Linux (x86_64)
macOS (x86_64, arm64)
Windows (x86_64, arm64)
Raspberry Pi (3, 4, 5)

Requirements

Picovoice Account & AccessKey
Python 3.9+

Picovoice Account & AccessKey

Sign up or log in to Picovoice Console to get your AccessKey. Make sure to keep your AccessKey secret.

Quick Start

Setup

Install Python 3 (3.9 or higher, as listed under Requirements).
Install the pvorca Python package using pip:

pip3 install pvorca

Usage

Create an instance of the Orca engine:

import pvorca

orca = pvorca.create(access_key='${ACCESS_KEY}')
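Because the AccessKey should stay secret, one option is to read it from the environment instead of hard-coding it in the script. The following is a minimal sketch, not part of the pvorca API: the helper load_access_key and the environment variable name PICOVOICE_ACCESS_KEY are assumptions made for illustration.

```python
import os


def load_access_key(env_var: str = "PICOVOICE_ACCESS_KEY") -> str:
    """Return the AccessKey stored in the given environment variable.

    The variable name is a hypothetical convention; pvorca does not require it.
    """
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your AccessKey")
    return access_key
```

The returned value can then be passed as the access_key argument of pvorca.create.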

Orca supports two modes of operation: streaming synthesis and single synthesis. In streaming synthesis mode, Orca processes an incoming text stream in real time and generates audio in parallel. In single synthesis mode, a complete text is synthesized in a single call to the Orca engine.

Streaming synthesis

To synthesize a text stream, create an Orca.OrcaStream object and add text chunks to it one by one:

stream = orca.stream_open()

for text_chunk in text_generator():
    pcm = stream.synthesize(text_chunk)
    if pcm is not None:
        # Handle the PCM samples here, e.g. pass them to an audio player
        pass

The text_generator() function can be any stream generating text, such as an LLM response.
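For experimenting without an LLM, a stand-in generator can be sketched as follows. The default text, the chunk size, and the optional delay are illustrative assumptions; a real text_generator() would yield chunks from whatever source produces the text.

```python
import time
from typing import Iterator


def text_generator(
    text: str = "Hello from a simulated text stream.",
    delay_sec: float = 0.0,
) -> Iterator[str]:
    """Yield the text a few characters at a time, like tokens arriving from an LLM."""
    chunk_size = 4  # arbitrary size chosen for illustration
    for start in range(0, len(text), chunk_size):
        if delay_sec > 0:
            time.sleep(delay_sec)  # simulate generation latency
        yield text[start:start + chunk_size]
```

Joining the yielded chunks reproduces the original text, so the stand-in changes only how the text arrives, not its content.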

The Orca.OrcaStream object buffers input text until there is enough context to generate audio. If there is not enough text to generate audio, None is returned.

Once the text stream is complete, call the flush method to synthesize the remaining text:

pcm = stream.flush()
if pcm is not None:
    # Handle the remaining PCM samples here
    pass

When done with streaming text synthesis, the Orca.OrcaStream object needs to be closed:

stream.close()
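The streaming steps above (synthesize each chunk, flush, close) can be combined into one helper. This is a sketch under stated assumptions: synthesize_text_stream is a hypothetical name, the stream argument is expected to be the object returned by orca.stream_open(), and the PCM is assumed to be a sequence of integer samples.

```python
from typing import Iterable, List


def synthesize_text_stream(stream, text_chunks: Iterable[str]) -> List[int]:
    """Feed text chunks to an open stream and return all generated PCM samples."""
    samples: List[int] = []
    try:
        for text_chunk in text_chunks:
            pcm = stream.synthesize(text_chunk)
            if pcm is not None:  # None means the stream is still buffering text
                samples.extend(pcm)
        # The text stream is complete: synthesize whatever is still buffered
        pcm = stream.flush()
        if pcm is not None:
            samples.extend(pcm)
    finally:
        # Close the stream even if synthesis raised an exception
        stream.close()
    return samples
```

Collecting all samples in a list keeps the sketch short. In a real-time application, each PCM chunk would more likely be handed to an audio player as soon as it is returned.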

Single synthesis

Synthesize speech with a single call to one of the Orca.synthesize methods:

# Return raw PCM
pcm, alignments = orca.synthesize(text='${TEXT}')

# Save the generated audio to a WAV file directly
alignments = orca.synthesize_to_file(text='${TEXT}', path='${WAV_OUTPUT_PATH}')
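If you need control over how the raw PCM is stored, the samples can also be written out with Python's standard wave module. This sketch assumes the PCM is a sequence of 16-bit signed mono samples and that the sample rate is obtained from the engine (check the pvorca API docs for the exact attribute); write_wav is a hypothetical helper, not part of pvorca.

```python
import struct
import wave
from typing import Sequence


def write_wav(path: str, pcm: Sequence[int], sample_rate: int) -> None:
    """Write 16-bit signed mono PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono (assumed)
        wav_file.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit (assumed)
        wav_file.setframerate(sample_rate)
        # Pack the samples as little-endian signed 16-bit integers
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

When a plain WAV file is all that is needed, orca.synthesize_to_file remains the simpler choice.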

Free resources used by Orca:

orca.delete()
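To make sure resources are freed even when synthesis raises an exception, the create and delete calls can be wrapped in a context manager. This is a sketch only: managed_engine is a hypothetical helper, and it assumes nothing about the engine beyond the delete() method shown above.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def managed_engine(create: Callable[..., Any], **kwargs: Any) -> Iterator[Any]:
    """Create an engine with create(**kwargs) and always call its delete() on exit."""
    engine = create(**kwargs)
    try:
        yield engine
    finally:
        # Runs on normal exit and on exceptions alike
        engine.delete()
```

With this helper, a with statement such as "with managed_engine(pvorca.create, access_key='${ACCESS_KEY}') as orca:" would scope the engine to the block and free it afterwards.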

For more information on our Orca Python SDK, head over to our Orca GitHub repository.

Model File

Orca Streaming Text-to-Speech can synthesize speech in different languages and with a variety of voices. Each voice is defined by a model file (.pv) located in the Orca GitHub repository. The language and gender of the speaker are indicated in the file name.

To create an instance of the engine with a specific language/voice, use:

orca = pvorca.create(access_key='${ACCESS_KEY}', model_path='${MODEL_PATH}')

and replace ${MODEL_PATH} with the path to the model file with the desired language/voice.
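Since the language and gender are encoded in the file name, a model can be selected programmatically from a local folder of downloaded .pv files. The sketch below assumes the file names contain the language code and gender as underscore-separated tokens (for example, a name of the form orca_params_en_female.pv); this naming pattern and the helper find_model_file are assumptions, so check the actual file names in the repository.

```python
from pathlib import Path
from typing import Optional


def find_model_file(models_dir: str, language: str, gender: str) -> Optional[str]:
    """Return the first .pv file whose name contains both tokens, or None."""
    for model_file in sorted(Path(models_dir).glob("*.pv")):
        # Split on underscores so that "male" does not match "female"
        tokens = model_file.stem.lower().split("_")
        if language.lower() in tokens and gender.lower() in tokens:
            return str(model_file)
    return None
```

The returned path can be passed as the model_path argument of pvorca.create; a None result means no matching file was found in the folder.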

Demos

For the Orca Python SDK, we offer demo applications that show how to use the Text-to-Speech engine.

Setup

Install the pvorcademo Python package using pip:

pip3 install pvorcademo

This package installs command-line utilities for the Orca Python demos.

Usage

Use the --help flag to see the usage options for the demos, e.g.:

orca_demo_streaming --help

Streaming synthesis demo

This demo shows how Orca processes an incoming text stream in real time and generates audio in parallel. The synthesized audio is played as soon as it is generated.

orca_demo_streaming --access_key ${ACCESS_KEY} --text_to_stream ${TEXT}

Single synthesis demo

To synthesize a text and save it to a WAV file, run the following:

orca_demo --access_key ${ACCESS_KEY} --text ${TEXT} --output_path ${WAV_OUTPUT_PATH}

For more information on our Orca demo for Python, head over to our GitHub repository.

Resources

Packages

API

pvorca Python API Docs

GitHub

Benchmark

Text-to-Speech Benchmark
