Orca Streaming Text-to-Speech
C API
API Reference for the Orca C SDK.
pv_orca_t
Container representing the Orca Streaming Text-to-Speech object.
pv_orca_stream_t
Container representing the OrcaStream object, which synthesizes audio from a stream of text.
pv_orca_init()
Creates an Orca instance. When you are done with it, release its resources with pv_orca_delete().
Parameters
access_key
const char * : AccessKey obtained from Picovoice Console.
model_path
const char * : Absolute path to the file containing model parameters (.pv). This file determines the voice of the synthesized speech.
object
pv_orca_t ** : Constructed instance of Orca.
Returns
- pv_status_t : Status code.
pv_orca_delete()
Releases resources acquired by Orca.
Parameters
object
pv_orca_t * : Orca object.
pv_orca_valid_characters()
Getter for the valid characters accepted as input to the Orca synthesize functions.
Parameters
object
pv_orca_t * : Orca object.
num_characters
int32_t * : The number of valid characters.
characters
const char *** : The array of valid characters.
Returns
- pv_status_t : Status code.
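The list returned by pv_orca_valid_characters() can be used to screen text before it is sent for synthesis. The following is a minimal sketch, not part of the SDK: `first_invalid_offset` and `utf8_length` are hypothetical helpers, and the code assumes that each entry of the array is a NUL-terminated UTF-8 string holding exactly one character. It does not interpret the "{word|pronunciation}" markup, so braces and the pipe are checked like any other character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Byte length of the UTF-8 sequence that starts with `lead`.
// Returns 1 for ASCII and for malformed lead bytes so the scan always advances.
static size_t utf8_length(unsigned char lead) {
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

// Returns the byte offset of the first character of `text` that is not listed
// in `characters`, or -1 if every character is listed.
int32_t first_invalid_offset(const char *text, const char *const *characters, int32_t num_characters) {
    assert(text != NULL && characters != NULL);
    size_t length = strlen(text);
    size_t i = 0;
    while (i < length) {
        size_t n = utf8_length((unsigned char) text[i]);
        if (n > length - i) {
            n = length - i; // truncated sequence at the end of the string
        }
        int found = 0;
        for (int32_t j = 0; j < num_characters; j++) {
            if (strlen(characters[j]) == n && memcmp(text + i, characters[j], n) == 0) {
                found = 1;
                break;
            }
        }
        if (!found) {
            return (int32_t) i;
        }
        i += n;
    }
    return -1;
}
```

A caller would pass the `characters` array and `num_characters` value filled in by pv_orca_valid_characters(), and release the array with pv_orca_valid_characters_delete() once the check is done.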
pv_orca_valid_characters_delete()
Deletes the resources acquired when calling pv_orca_valid_characters().
Parameters
characters
const char ** : The array of valid characters.
pv_orca_sample_rate()
Getter for the sample rate of the audio produced by Orca.
Parameters
object
pv_orca_t * : Orca object.
sample_rate
int32_t * : Sample rate of the audio produced by Orca.
Returns
- pv_status_t : Status code.
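The sample rate is what turns a sample count into a duration. As a small illustration, assuming the audio is single-channel (which is how pv_orca_synthesize_to_file() is described as writing it), the conversion could look like this. Both function names are hypothetical and not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

// Duration in seconds of `num_samples` single-channel samples at `sample_rate` Hz.
double pcm_duration_seconds(int32_t num_samples, int32_t sample_rate) {
    assert(num_samples >= 0 && sample_rate > 0);
    return (double) num_samples / (double) sample_rate;
}

// Number of samples that cover `seconds` of audio, rounded down.
int32_t samples_for_duration(double seconds, int32_t sample_rate) {
    assert(seconds >= 0.0 && sample_rate > 0);
    return (int32_t) (seconds * (double) sample_rate);
}
```

Here `num_samples` would be the value written by one of the synthesize functions and `sample_rate` the value written by pv_orca_sample_rate().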
pv_orca_max_character_limit()
Getter for the maximum number of characters that can be synthesized at once.
Parameters
object
pv_orca_t * : Orca object.
max_character_limit
int32_t * : Maximum number of characters.
Returns
- pv_status_t : Status code.
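Text longer than the limit has to be divided before it is passed to a single synthesize call. One possible approach, sketched below, is to cut at the last space that still fits. `next_chunk_length` is a hypothetical helper, and it makes two simplifying assumptions: every character occupies one byte (plain ASCII), and the text contains no "{word|pronunciation}" markup that a cut could split. Pieces synthesized in separate calls may not sound the same as one continuous utterance, so for incremental text the OrcaStream functions are likely the better fit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Returns how many bytes of `text` to send next so that the piece holds at
// most `max_characters` bytes, preferring to cut right after a space.
int32_t next_chunk_length(const char *text, int32_t max_characters) {
    assert(text != NULL && max_characters > 0);
    size_t length = strlen(text);
    if (length <= (size_t) max_characters) {
        return (int32_t) length; // the remaining text fits as-is
    }
    // Look backwards from the limit for a space so that words stay whole.
    for (int32_t i = max_characters; i > 0; i--) {
        if (text[i - 1] == ' ') {
            return i;
        }
    }
    return max_characters; // no space found: hard cut at the limit
}
```

Calling the helper repeatedly, advancing the text pointer by the returned length each time, walks through the whole input; `max_characters` would be the value written by pv_orca_max_character_limit().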
pv_orca_synthesize()
Generates audio from text. The returned audio contains the speech representation of the text.
This function returns PV_STATUS_INVALID_STATE if an OrcaStream object is open.
The memory of the returned audio and the alignment metadata is allocated by Orca and needs to be deleted with pv_orca_pcm_delete() and pv_orca_word_alignments_delete(), respectively.
If you wish to save the synthesized speech to a file, consider using pv_orca_synthesize_to_file().
Parameters
object
pv_orca_t * : Orca object.
text
const char * : Text to be converted to audio. The maximum length can be obtained by calling pv_orca_max_character_limit(). Allowed characters can be retrieved by calling pv_orca_valid_characters(). Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet format, e.g.: "I {live|L IH V} in {Sevilla|S EH V IY Y AH}".
synthesize_params
pv_orca_synthesize_params_t * : Global parameters that give control over the voice generation. See pv_orca_synthesize_params_t for details.
num_samples
int32_t * : The length of the output audio, in samples.
pcm
int16_t ** : The output audio, represented as a 16-bit linearly-encoded integer array.
num_alignments
int32_t * : The number of word alignments.
alignments
pv_orca_word_alignment_t *** : The word alignments and their associated metadata.
Returns
- pv_status_t : Status code.
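The custom pronunciation syntax described for the text parameter is easy to get wrong when assembled by hand. The sketch below builds one "{word|pronunciation}" token. `format_custom_pronunciation` is a hypothetical helper, not an SDK function. It rejects inputs that contain `{`, `|` or `}` on the assumption that the syntax has no escape mechanism, which the reference does not state either way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Writes "{word|arpabet}" into `out`.
// Returns 0 on success, -1 if an input contains a delimiter or `out` is too small.
int format_custom_pronunciation(char *out, size_t out_size, const char *word, const char *arpabet) {
    assert(out != NULL && word != NULL && arpabet != NULL);
    // Assumption: the delimiters cannot appear inside either part.
    if (strpbrk(word, "{|}") != NULL || strpbrk(arpabet, "{|}") != NULL) {
        return -1;
    }
    int written = snprintf(out, out_size, "{%s|%s}", word, arpabet);
    if (written < 0 || (size_t) written >= out_size) {
        return -1; // formatting failed or the token was truncated
    }
    return 0;
}
```

With the word "live" and the pronunciation "L IH V" from the example above, the helper produces the token `{live|L IH V}`, which can then be placed in the sentence passed as text.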
pv_orca_synthesize_to_file()
Generates audio from text and saves it to a file. The file contains the speech representation of the text.
This function returns PV_STATUS_INVALID_STATE if an OrcaStream object is open.
The memory of the alignment metadata is allocated by Orca and needs to be deleted with pv_orca_word_alignments_delete().
Parameters
object
pv_orca_t * : Orca object.
text
const char * : Text to be converted to audio. For details see the documentation of pv_orca_synthesize().
synthesize_params
pv_orca_synthesize_params_t * : Global parameters that give control over the voice generation. See pv_orca_synthesize_params_t for details.
output_path
const char * : Absolute path to save the generated audio as a single-channel 16-bit PCM WAV file.
num_alignments
int32_t * : The number of word alignments.
alignments
pv_orca_word_alignment_t *** : The word alignments and their associated metadata.
Returns
- pv_status_t : Status code.
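For callers that hold raw samples from pv_orca_synthesize() or the streaming functions and want the same kind of container, the sketch below fills in the canonical 44-byte header of a single-channel 16-bit PCM WAV file. It illustrates the file format only and is not how the SDK writes its output. `build_wav_header` and the two `put_*` helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44

// Stores `value` in little-endian byte order, as the WAV format requires.
static void put_u32_le(uint8_t *dst, uint32_t value) {
    dst[0] = (uint8_t) (value & 0xFF);
    dst[1] = (uint8_t) ((value >> 8) & 0xFF);
    dst[2] = (uint8_t) ((value >> 16) & 0xFF);
    dst[3] = (uint8_t) ((value >> 24) & 0xFF);
}

static void put_u16_le(uint8_t *dst, uint16_t value) {
    dst[0] = (uint8_t) (value & 0xFF);
    dst[1] = (uint8_t) ((value >> 8) & 0xFF);
}

// Fills the 44-byte header for single-channel 16-bit PCM audio.
void build_wav_header(uint8_t header[WAV_HEADER_SIZE], int32_t num_samples, int32_t sample_rate) {
    assert(header != NULL && num_samples >= 0 && sample_rate > 0);
    const uint16_t num_channels = 1;
    const uint16_t bits_per_sample = 16;
    const uint16_t block_align = (uint16_t) (num_channels * bits_per_sample / 8);
    const uint32_t data_size = (uint32_t) num_samples * block_align;

    memcpy(header, "RIFF", 4);
    put_u32_le(header + 4, 36 + data_size);   // size of everything after this field
    memcpy(header + 8, "WAVE", 4);
    memcpy(header + 12, "fmt ", 4);
    put_u32_le(header + 16, 16);               // size of the fmt chunk body
    put_u16_le(header + 20, 1);                // audio format 1 = linear PCM
    put_u16_le(header + 22, num_channels);
    put_u32_le(header + 24, (uint32_t) sample_rate);
    put_u32_le(header + 28, (uint32_t) sample_rate * block_align); // bytes per second
    put_u16_le(header + 32, block_align);
    put_u16_le(header + 34, bits_per_sample);
    memcpy(header + 36, "data", 4);
    put_u32_le(header + 40, data_size);
}
```

The samples follow the header in little-endian order, so on a little-endian host the `int16_t` array can be written as-is after the 44 bytes. When only a file is needed, pv_orca_synthesize_to_file() avoids all of this.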
pv_orca_pcm_delete()
Deletes the audio previously generated by the pv_orca_synthesize() function.
Parameters
pcm
int16_t * : The audio generated by pv_orca_synthesize().
pv_orca_word_alignments_delete()
Deletes word alignments returned from Orca synthesize functions.
Parameters
num_alignments
int32_t : Number of alignments.
alignments
pv_orca_word_alignment_t ** : Alignments returned from Orca synthesize functions.
pv_orca_stream_open()
Opens an OrcaStream object to synthesize audio from a stream of text.
Parameters
object
pv_orca_t * : Orca object.
synthesize_params
pv_orca_synthesize_params_t * : Global parameters that give control over the voice generation. See pv_orca_synthesize_params_t for details.
stream
pv_orca_stream_t ** : The OrcaStream object.
Returns
- pv_status_t : Status code.
pv_orca_stream_close()
Closes the OrcaStream object and deletes the resources acquired by it.
Parameters
object
pv_orca_stream_t * : OrcaStream object.
pv_orca_stream_synthesize()
Adds a chunk of text to the OrcaStream object and generates audio if enough text has been added. This function is expected to be called multiple times with consecutive chunks of text from a text stream. The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio. The caller needs to use pv_orca_stream_flush() to generate the audio chunk for the remaining text that has not yet been synthesized. The caller is responsible for deleting the generated audio with pv_orca_pcm_delete().
Parameters
object
pv_orca_stream_t * : The OrcaStream object.
text
const char * : A chunk of text from a text input stream, comprised of valid characters. This is typically a word or a token from an LLM response. For more details on the format, see the documentation of pv_orca_synthesize().
num_samples
int32_t * : The length of the pcm produced, 0 if no audio chunk has been produced.
pcm
int16_t ** : The output audio chunk, NULL if no audio chunk has been produced.
Returns
- pv_status_t : Status code.
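Because a call may or may not produce audio, code that collects the stream output has to cope with empty chunks. The sketch below shows one way to gather chunks into a single growable buffer. `pcm_buffer_t`, `pcm_buffer_append` and `pcm_buffer_free` are hypothetical names, not SDK types or functions, and integer overflow for extremely long audio is not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Growable buffer that collects audio chunks in arrival order.
typedef struct {
    int16_t *samples;
    int32_t num_samples;
    int32_t capacity;
} pcm_buffer_t;

// Appends one chunk. A chunk with `pcm == NULL` or `num_samples == 0`
// (no audio produced yet) is accepted and ignored.
// Returns 0 on success, -1 if memory could not be allocated.
int pcm_buffer_append(pcm_buffer_t *buffer, const int16_t *pcm, int32_t num_samples) {
    assert(buffer != NULL && num_samples >= 0);
    if (pcm == NULL || num_samples == 0) {
        return 0;
    }
    int32_t required = buffer->num_samples + num_samples;
    if (required > buffer->capacity) {
        int32_t new_capacity = (buffer->capacity > 0) ? buffer->capacity : 1024;
        while (new_capacity < required) {
            new_capacity *= 2;
        }
        int16_t *grown = realloc(buffer->samples, (size_t) new_capacity * sizeof(int16_t));
        if (grown == NULL) {
            return -1; // the existing contents stay valid
        }
        buffer->samples = grown;
        buffer->capacity = new_capacity;
    }
    memcpy(buffer->samples + buffer->num_samples, pcm, (size_t) num_samples * sizeof(int16_t));
    buffer->num_samples = required;
    return 0;
}

// Releases the collected audio and resets the buffer to empty.
void pcm_buffer_free(pcm_buffer_t *buffer) {
    assert(buffer != NULL);
    free(buffer->samples);
    buffer->samples = NULL;
    buffer->num_samples = 0;
    buffer->capacity = 0;
}
```

In use, each chunk written by pv_orca_stream_synthesize(), and the final one from pv_orca_stream_flush(), would be appended and then released with pv_orca_pcm_delete(), since the buffer keeps its own copy of the samples.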
pv_orca_stream_flush()
Generates a final audio chunk corresponding to the buffered text added to the OrcaStream object via pv_orca_stream_synthesize(). The caller is responsible for deleting the generated audio with pv_orca_pcm_delete().
Parameters
object
pv_orca_stream_t * : The OrcaStream object.
num_samples
int32_t * : The length of the pcm, 0 if no audio chunk has been produced.
pcm
int16_t ** : The output audio, NULL if no audio chunk has been produced.
Returns
- pv_status_t : Status code.
pv_orca_version()
Getter for version.
Returns
- const char * : Orca version.
pv_orca_synthesize_params_t
Object holding global parameters that give control over the voice generation. This object is an argument to the Orca synthesize functions.
An instance can be created with pv_orca_synthesize_params_init() and must be deleted with pv_orca_synthesize_params_delete().
Use pv_orca_synthesize_params_set_* functions to set a parameter to its desired value.
Use pv_orca_synthesize_params_get_* functions to get the current value of a parameter.
pv_orca_synthesize_params_init()
Creates a Synthesize params object. All parameters are set to their default values.
Parameters
object
pv_orca_synthesize_params_t ** : Constructed instance of Synthesize params.
Returns
- pv_status_t : Status code.
pv_orca_synthesize_params_delete()
Releases resources acquired by Synthesize params.
Parameters
object
pv_orca_synthesize_params_t * : Synthesize params object.
pv_orca_synthesize_params_set_speech_rate()
Setter for the speech rate.
Parameters
object
pv_orca_synthesize_params_t * : Synthesize params object.
speech_rate
float : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
Returns
- pv_status_t : Status code.
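When the rate comes from user input or a configuration file, it can be limited to the documented range before it is passed to the setter. A minimal sketch follows. `clamp_speech_rate` and the two macros are hypothetical, and the reference does not say how the setter reacts to out-of-range values, so clamping beforehand is simply one defensive choice.

```c
#include <assert.h>

#define SPEECH_RATE_MIN 0.7f
#define SPEECH_RATE_MAX 1.3f

// Limits a requested rate to the documented range [0.7, 1.3].
float clamp_speech_rate(float requested) {
    assert(requested == requested); // NaN is a caller error
    if (requested < SPEECH_RATE_MIN) return SPEECH_RATE_MIN;
    if (requested > SPEECH_RATE_MAX) return SPEECH_RATE_MAX;
    return requested;
}
```

The result would then be passed as speech_rate to pv_orca_synthesize_params_set_speech_rate().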
pv_orca_synthesize_params_get_speech_rate()
Getter for the speech rate.
Parameters
object
pv_orca_synthesize_params_t * : Synthesize params object.
speech_rate
float * : Speed of generated speech. For details see the documentation of pv_orca_synthesize_params_set_speech_rate().
Returns
- pv_status_t : Status code.
pv_orca_synthesize_params_set_random_state()
Setter for the random state.
Parameters
object
pv_orca_synthesize_params_t * : Synthesize params object.
random_state
int64_t : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
- pv_status_t : Status code.
pv_orca_synthesize_params_get_random_state()
Getter for the random state.
Parameters
object
pv_orca_synthesize_params_t * : Synthesize params object.
random_state
int64_t * : Random seed for the synthesis process. For details see the documentation of pv_orca_synthesize_params_set_random_state().
Returns
- pv_status_t : Status code.
pv_orca_word_alignment_t
Struct for a synthesized word and its associated metadata.
pv_orca_phoneme_alignment_t
Struct for a synthesized phoneme and its associated metadata.
pv_status_t
Status code enum.
pv_status_to_string()
Parameters
status
int32_t : Status code.
Returns
- const char * : String representation of status code.
pv_get_error_stack()
If a function returns a failure (any pv_status_t other than PV_STATUS_SUCCESS), this function can be called to get a series of error messages related to the failure. This function can be called only once per failure status on another function. The memory for message_stack must be freed using pv_free_error_stack().
Regardless of the return status of this function, if message_stack is not NULL, then message_stack contains valid memory. However, a failure status on this function indicates that future error messages may not be reported.
Parameters
message_stack
const char * * * : Array of messages relating to the failure. Messages are NULL terminated strings. The array and messages must be freed using pv_free_error_stack().
message_stack_depth
int32_t * : The number of messages in the message_stack array.
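Once the message array and its depth have been retrieved, they are typically turned into a single log entry. The sketch below formats the messages into a caller-supplied buffer, one per line. `format_error_stack` is a hypothetical helper, not an SDK function, and it only reads the array; freeing it remains the job of pv_free_error_stack().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

// Joins the messages into `out`, one per line, each prefixed with its index.
// Returns the number of messages that fit in full.
int32_t format_error_stack(char *out, size_t out_size, const char *const *message_stack, int32_t depth) {
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    size_t used = 0;
    int32_t written = 0;
    for (int32_t i = 0; message_stack != NULL && i < depth; i++) {
        int n = snprintf(out + used, out_size - used, "  [%d] %s\n", (int) i, message_stack[i]);
        if (n < 0 || (size_t) n >= out_size - used) {
            out[used] = '\0'; // drop the partially written line
            break;
        }
        used += (size_t) n;
        written++;
    }
    return written;
}
```

The resulting text can be sent to `stderr` or a logger before the array is handed to pv_free_error_stack().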
pv_free_error_stack()
This function frees the memory used by error messages allocated by pv_get_error_stack().
Parameters
message_stack
const char * * : Array of messages relating to the failure.