Orca Streaming Text-to-Speech
C Quick Start

Platforms

Linux (x86_64)
macOS (x86_64, arm64)
Windows (x86_64, arm64)
Raspberry Pi (3, 4, 5)

Requirements

C99-compatible compiler
CMake (3.13+)
Windows only: MinGW is required to build the demo

Picovoice Account & AccessKey

Sign up or log in to Picovoice Console to get your AccessKey. Keep your AccessKey secret.

Quick Start

Setup

Clone the repository:

git clone --recurse-submodules https://github.com/Picovoice/orca.git

Usage

Include the public header files (picovoice.h and pv_orca.h).
Link the project to an appropriate precompiled library for the target platform and load it.
Choose a voice by selecting a model file located in the Orca Streaming Text-to-Speech GitHub repository.
Construct the Orca object:

static const char* ACCESS_KEY = "${ACCESS_KEY}";

const char *model_path = "${MODEL_PATH}";

pv_orca_t *orca = NULL;

pv_status_t status = pv_orca_init(ACCESS_KEY, model_path, &orca);

if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

Create a synthesize_params object to control the synthesized speech:

pv_orca_synthesize_params_t *synthesize_params = NULL;
status = pv_orca_synthesize_params_init(&synthesize_params);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
// change the default parameters of synthesize_params as desired

Orca Streaming Text-to-Speech supports two modes of operation: streaming synthesis and single synthesis. In streaming synthesis mode, Orca processes an incoming text stream in real time and generates audio in parallel. In single synthesis mode, a complete text is synthesized in a single call to the Orca engine.

Streaming synthesis

To synthesize a text stream, run the following after step 5 (creating synthesize_params):

Create an orca_stream object using synthesize_params:

pv_orca_stream_t *orca_stream = NULL;
status = pv_orca_stream_open(orca, synthesize_params, &orca_stream);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

Add text chunks to orca_stream one-by-one and handle the synthesized audio:

extern char *get_next_text_chunk(void);

int32_t num_samples_chunk = 0;
int16_t *pcm_chunk = NULL;
status = pv_orca_stream_synthesize(
    orca_stream, 
    get_next_text_chunk(), 
    &num_samples_chunk, 
    &pcm_chunk);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
if (num_samples_chunk > 0) {
    // handle pcm_chunk
}
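
The snippet above only declares get_next_text_chunk. As a minimal sketch of what such a text source could look like, the code below walks over a fixed string and hands out one word at a time. The names text_stream_t, text_stream_init, text_stream_next and MAX_CHUNK_LEN are hypothetical and not part of the Orca API; a real application would more likely pull chunks from a language model or a network stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHUNK_LEN 64

// Hypothetical cursor over a fixed text; it stands in for a real token source.
typedef struct {
    const char *text;           // full text to stream
    size_t pos;                 // index of the next unread character
    char chunk[MAX_CHUNK_LEN];  // storage for the most recent chunk
} text_stream_t;

static void text_stream_init(text_stream_t *stream, const char *text) {
    assert(stream != NULL && text != NULL);
    stream->text = text;
    stream->pos = 0;
    stream->chunk[0] = '\0';
}

// Returns the next word together with its trailing spaces, or NULL once the
// text is exhausted. Words longer than MAX_CHUNK_LEN - 1 characters are split
// across several chunks.
static char *text_stream_next(text_stream_t *stream) {
    assert(stream != NULL);
    const char *start = stream->text + stream->pos;
    if (*start == '\0') {
        return NULL;
    }
    size_t len = 0;
    // consume the word itself
    while (len < MAX_CHUNK_LEN - 1 && start[len] != '\0' && start[len] != ' ') {
        len++;
    }
    // keep the spaces that follow, so word boundaries survive chunking
    while (len < MAX_CHUNK_LEN - 1 && start[len] == ' ') {
        len++;
    }
    memcpy(stream->chunk, start, len);
    stream->chunk[len] = '\0';
    stream->pos += len;
    return stream->chunk;
}
```

In a streaming loop, each non-NULL chunk would be passed to pv_orca_stream_synthesize, and the loop would stop on NULL before calling the flush function.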

Once the text stream is complete, call the flush method to synthesize the remaining text:

status = pv_orca_stream_flush(orca_stream, &num_samples_chunk, &pcm_chunk);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
if (num_samples_chunk > 0) {
    // handle pcm_chunk
}

Once the PCM chunks are handled, make sure to release the acquired resources for each chunk with:

pv_orca_pcm_delete(pcm_chunk);
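
Because each chunk is released after it has been handled, audio that is needed later (for example, to save the whole utterance) has to be copied out first. One possible way to do this is a small growable buffer, sketched below. The type pcm_buffer_t and its functions are hypothetical helpers, not part of the Orca API, and the initial capacity of 4096 samples is an arbitrary choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical growable buffer of 16-bit samples.
typedef struct {
    int16_t *samples;
    size_t length;    // number of samples stored
    size_t capacity;  // number of samples allocated
} pcm_buffer_t;

static void pcm_buffer_init(pcm_buffer_t *buffer) {
    assert(buffer != NULL);
    buffer->samples = NULL;
    buffer->length = 0;
    buffer->capacity = 0;
}

// Copies num_samples samples from chunk to the end of buffer.
// Returns false if memory cannot be allocated; the buffer is left unchanged.
static bool pcm_buffer_append(pcm_buffer_t *buffer, const int16_t *chunk, int32_t num_samples) {
    assert(buffer != NULL);
    if (chunk == NULL || num_samples <= 0) {
        return true;  // nothing to copy
    }
    size_t needed = buffer->length + (size_t) num_samples;
    if (needed > buffer->capacity) {
        size_t new_capacity = (buffer->capacity == 0) ? 4096 : buffer->capacity;
        while (new_capacity < needed) {
            new_capacity *= 2;
        }
        int16_t *grown = realloc(buffer->samples, new_capacity * sizeof(int16_t));
        if (grown == NULL) {
            return false;
        }
        buffer->samples = grown;
        buffer->capacity = new_capacity;
    }
    memcpy(buffer->samples + buffer->length, chunk, (size_t) num_samples * sizeof(int16_t));
    buffer->length = needed;
    return true;
}

static void pcm_buffer_free(pcm_buffer_t *buffer) {
    assert(buffer != NULL);
    free(buffer->samples);
    pcm_buffer_init(buffer);
}
```

With this helper, handling a chunk becomes a call to pcm_buffer_append(&buffer, pcm_chunk, num_samples_chunk) followed by pv_orca_pcm_delete(pcm_chunk), since the copy no longer depends on memory owned by Orca.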

Finally, when done, make sure to close the stream:

pv_orca_stream_close(orca_stream);

Single synthesis

To synthesize a complete text in a single call to Orca, run the following after step 5 (creating synthesize_params):

Use the Orca object and synthesize_params to synthesize speech:

static const char* text = "${TEXT}";

int32_t num_samples = 0;
int16_t *synthesized_pcm = NULL;
int32_t num_alignments = 0;
pv_orca_word_alignment_t **alignments = NULL;
status = pv_orca_synthesize(
    orca,
    text,
    synthesize_params,
    &num_samples,
    &synthesized_pcm,
    &num_alignments,
    &alignments);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}
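
A common next step is to save the synthesized audio to disk. The following is a minimal sketch of a WAV writer, assuming the PCM is single-channel, 16-bit audio as the int16_t buffer above suggests. The function write_wav_mono16 is a hypothetical helper, not part of the Orca API, and sample_rate is assumed to have been obtained from the engine (see the C API docs for the exact call).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

// Write integers byte by byte so the output is little-endian on any host.
static bool write_u16_le(FILE *f, uint16_t v) {
    uint8_t b[2] = {(uint8_t) (v & 0xFF), (uint8_t) ((v >> 8) & 0xFF)};
    return fwrite(b, 1, 2, f) == 2;
}

static bool write_u32_le(FILE *f, uint32_t v) {
    uint8_t b[4] = {
        (uint8_t) (v & 0xFF),
        (uint8_t) ((v >> 8) & 0xFF),
        (uint8_t) ((v >> 16) & 0xFF),
        (uint8_t) ((v >> 24) & 0xFF)};
    return fwrite(b, 1, 4, f) == 4;
}

// Writes mono 16-bit PCM to a WAV file. Returns false on any I/O failure.
static bool write_wav_mono16(
        const char *path,
        const int16_t *pcm,
        int32_t num_samples,
        int32_t sample_rate) {
    assert(path != NULL && num_samples >= 0 && sample_rate > 0);
    assert(pcm != NULL || num_samples == 0);

    FILE *f = fopen(path, "wb");
    if (f == NULL) {
        return false;
    }
    const uint32_t data_bytes = (uint32_t) num_samples * 2u;
    bool ok = fwrite("RIFF", 1, 4, f) == 4
            && write_u32_le(f, 36u + data_bytes)               // size of the rest of the file
            && fwrite("WAVE", 1, 4, f) == 4
            && fwrite("fmt ", 1, 4, f) == 4
            && write_u32_le(f, 16u)                            // size of the fmt chunk
            && write_u16_le(f, 1u)                             // format tag: PCM
            && write_u16_le(f, 1u)                             // channels: mono
            && write_u32_le(f, (uint32_t) sample_rate)
            && write_u32_le(f, (uint32_t) sample_rate * 2u)    // bytes per second
            && write_u16_le(f, 2u)                             // bytes per sample frame
            && write_u16_le(f, 16u)                            // bits per sample
            && fwrite("data", 1, 4, f) == 4
            && write_u32_le(f, data_bytes);
    for (int32_t i = 0; ok && i < num_samples; i++) {
        ok = write_u16_le(f, (uint16_t) pcm[i]);
    }
    // fclose is evaluated first, so the file is always closed
    return (fclose(f) == 0) && ok;
}
```

After a successful call to pv_orca_synthesize, this could be used as write_wav_mono16("output.wav", synthesized_pcm, num_samples, sample_rate), where "output.wav" is an example path.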

Release resources

Finally, when done with Orca Streaming Text-to-Speech, release resources explicitly:

pv_orca_pcm_delete(synthesized_pcm);
pv_orca_word_alignments_delete(num_alignments, alignments);
pv_orca_synthesize_params_delete(synthesize_params);
pv_orca_delete(orca);

Demo

For the Orca Streaming Text-to-Speech C SDK, we offer demo applications that demonstrate how to synthesize speech with Orca.

Setup

Clone the Orca Streaming Text-to-Speech repository from GitHub using HTTPS:

git clone --recurse-submodules https://github.com/Picovoice/orca.git

Usage

Streaming synthesis demo

For the streaming synthesis demo, we simulate a response from a language model by creating a text stream from a user-defined text. After step 1 (cloning the repository), run the following:

Build the demo:

cd orca
cmake -S demo/c/ -B demo/c/build
cmake --build demo/c/build --target orca_demo_streaming

To see the usage options for the demo:

./demo/c/build/orca_demo_streaming

Run the command corresponding to your platform from the root of the repository:

./demo/c/build/orca_demo_streaming \
-l ${LIBRARY_PATH} \
-m ${MODEL_PATH} \
-a ${ACCESS_KEY} \
-t ${TEXT} \
-o ${OUTPUT_PATH}

Single synthesis demo

To synthesize a complete text in a single call to Orca Streaming Text-to-Speech, run the following after step 1 (cloning the repository):

Build the demo:

cd orca
cmake -S demo/c/ -B demo/c/build
cmake --build demo/c/build --target orca_demo

Run the command corresponding to your platform from the root of the repository:

./demo/c/build/orca_demo \
-l lib/${PLATFORM}/${ARCH}/libpv_orca.${LIB_EXTENSION} \
-m lib/common/${MODEL_FILE_PATH} \
-a ${ACCESS_KEY} \
-t ${TEXT} \
-o ${WAV_OUTPUT_PATH}

For more information on our Orca Streaming Text-to-Speech demos for C, head over to our GitHub repository.

Resources

API

C API Docs

GitHub

Orca Streaming Text-to-Speech C demo on GitHub
