Orca Streaming Text-to-Speech
.NET API
API Reference for the Orca .NET SDK (NuGet)
namespace: Pv
Orca.Create()
Factory method for Orca Streaming Text-to-Speech engine.
Parameters
accessKey string : AccessKey obtained from Picovoice Console.
modelPath string : Absolute path to the file containing model parameters (.pv). This file determines the voice of the synthesized speech.
Returns
Orca: An instance of the Orca Streaming Text-to-Speech engine.
Throws
Orca
Class for the Orca Streaming Text-to-Speech engine.
Orca is initialized using the Orca.Create() factory method.
Orca.Version
The version string of the Orca library.
Orca.ValidCharacters
The set of valid characters that Orca accepts in the text input to the synthesis methods.
Orca.SampleRate
The audio sample rate of the synthesized speech.
Orca.MaxCharacterLimit
The maximum number of characters allowed in a single synthesis request.
Orca.Dispose()
Releases resources acquired by Orca.
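The factory method, properties, and Dispose() described above can be combined as in the following minimal sketch. It assumes the Orca NuGet package is installed and that the AccessKey and model path are supplied through two environment variables whose names (PICOVOICE_ACCESS_KEY, ORCA_MODEL_PATH) are hypothetical and chosen only for this illustration.

```csharp
using System;
using Pv;

class OrcaInfoExample
{
    static void Main()
    {
        // Hypothetical environment variable names; supply these values
        // in whatever way suits your application.
        string accessKey = Environment.GetEnvironmentVariable("PICOVOICE_ACCESS_KEY");
        string modelPath = Environment.GetEnvironmentVariable("ORCA_MODEL_PATH");

        Orca orca = Orca.Create(accessKey, modelPath);
        try
        {
            Console.WriteLine($"Version: {orca.Version}");
            Console.WriteLine($"Sample rate: {orca.SampleRate}");
            Console.WriteLine($"Max characters per request: {orca.MaxCharacterLimit}");
            Console.WriteLine($"Number of valid characters: {orca.ValidCharacters.Length}");
        }
        finally
        {
            // Release the resources acquired by the engine.
            orca.Dispose();
        }
    }
}
```

The try/finally block ensures Dispose() runs even if one of the property reads throws.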
Orca.Synthesize()
Generates audio from text. The returned audio contains the speech representation of the text.
If you wish to save the synthesized speech to a file, consider
using Orca.SynthesizeToFile().
Parameters
text string : Text to be converted to audio. The maximum number of characters per call is MaxCharacterLimit. Allowed characters can be retrieved from ValidCharacters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
speechRate float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
OrcaAudio: Synthesized audio and word alignment metadata.
Throws
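As an illustration of Orca.Synthesize(), the sketch below synthesizes a sentence containing custom pronunciations and prints the word and phoneme alignments from the returned OrcaAudio. The environment variable names are hypothetical, and the speech rate and seed are arbitrary values within the documented ranges.

```csharp
using System;
using Pv;

class SynthesizeExample
{
    static void Main()
    {
        // Hypothetical environment variable names used only for this sketch.
        string accessKey = Environment.GetEnvironmentVariable("PICOVOICE_ACCESS_KEY");
        string modelPath = Environment.GetEnvironmentVariable("ORCA_MODEL_PATH");

        Orca orca = Orca.Create(accessKey, modelPath);
        try
        {
            // A fixed randomState makes repeated runs produce the same audio.
            OrcaAudio audio = orca.Synthesize(
                "{read|R IY D} this as {read|R EH D}",
                speechRate: 1.1f,
                randomState: 42);

            Console.WriteLine($"Synthesized {audio.Pcm.Length} samples");
            foreach (OrcaWord word in audio.Words)
            {
                Console.WriteLine($"{word.Word}: {word.StartSec:F2}s to {word.EndSec:F2}s");
                foreach (OrcaPhoneme phoneme in word.Phonemes)
                {
                    Console.WriteLine($"  {phoneme.Phoneme}: {phoneme.StartSec:F2}s to {phoneme.EndSec:F2}s");
                }
            }
        }
        catch (OrcaException e)
        {
            Console.Error.WriteLine($"Synthesis failed: {e.Message}");
        }
        finally
        {
            orca.Dispose();
        }
    }
}
```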
Orca.SynthesizeToFile()
Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.
Parameters
text string : Text to be converted to audio. The maximum number of characters per call is MaxCharacterLimit. Allowed characters can be retrieved from ValidCharacters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
outputPath string : Absolute path to save the generated audio as a single-channel 16-bit PCM WAV file.
speechRate float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
OrcaWord[]: Array of synthesized words with their associated metadata.
Throws
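A minimal sketch of Orca.SynthesizeToFile() follows. The output file name and the environment variable names are hypothetical; the output path is built from the system temporary directory so that it is absolute, as the outputPath parameter requires.

```csharp
using System;
using System.IO;
using Pv;

class SynthesizeToFileExample
{
    static void Main()
    {
        // Hypothetical environment variable names used only for this sketch.
        string accessKey = Environment.GetEnvironmentVariable("PICOVOICE_ACCESS_KEY");
        string modelPath = Environment.GetEnvironmentVariable("ORCA_MODEL_PATH");

        // Path.GetTempPath() returns an absolute directory path.
        string outputPath = Path.Combine(Path.GetTempPath(), "orca_output.wav");

        Orca orca = Orca.Create(accessKey, modelPath);
        try
        {
            OrcaWord[] words = orca.SynthesizeToFile(
                "Saving synthesized speech to a file.",
                outputPath,
                speechRate: 1.0f,
                randomState: 42);

            Console.WriteLine($"Wrote {outputPath}");
            foreach (OrcaWord word in words)
            {
                Console.WriteLine($"{word.Word}: {word.StartSec:F2}s to {word.EndSec:F2}s");
            }
        }
        finally
        {
            orca.Dispose();
        }
    }
}
```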
Orca.StreamOpen()
Opens an Orca.OrcaStream object for streaming input text synthesis.
Parameters
speechRate float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
Orca.OrcaStream: An instance of Orca.OrcaStream.
Throws
Orca.OrcaStream
Class representing a stream that synthesizes text as it arrives in chunks.
An Orca.OrcaStream object is created via Orca.StreamOpen().
Orca.OrcaStream.Synthesize()
Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added.
This function is expected to be called multiple times with consecutive chunks of text from a text stream.
The incoming text is buffered as it arrives until there is enough context to convert a chunk of the
buffered text into audio. The caller needs to use Orca.OrcaStream.Flush() to generate the audio chunk
for the remaining text that has not yet been synthesized.
Parameters
text string : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.Synthesize().
Returns
short[]: The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
Throws
Orca.OrcaStream.Flush()
Generates audio for all the buffered text that was added to the Orca.OrcaStream object
via Orca.OrcaStream.Synthesize().
Returns
short[]: The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
Throws
Orca.OrcaStream.Dispose()
Closes the Orca.OrcaStream object and releases resources acquired by it.
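The streaming workflow described above (open a stream, feed text chunks, flush, dispose) might look like the following sketch. The hard-coded chunks stand in for a real text stream such as LLM output, and the environment variable names are hypothetical. Both Synthesize() and Flush() can return null, so each result is checked before it is appended.

```csharp
using System;
using System.Collections.Generic;
using Pv;

class StreamingExample
{
    static void Main()
    {
        // Hypothetical environment variable names used only for this sketch.
        string accessKey = Environment.GetEnvironmentVariable("PICOVOICE_ACCESS_KEY");
        string modelPath = Environment.GetEnvironmentVariable("ORCA_MODEL_PATH");

        // Stand-in for a live text stream (for example, tokens from an LLM).
        string[] chunks = { "Streaming ", "text ", "arrives ", "in ", "small ", "pieces." };
        var pcm = new List<short>();

        Orca orca = Orca.Create(accessKey, modelPath);
        try
        {
            Orca.OrcaStream stream = orca.StreamOpen(speechRate: 1.0f, randomState: 42);
            try
            {
                foreach (string chunk in chunks)
                {
                    // Returns null until enough text has been buffered.
                    short[] part = stream.Synthesize(chunk);
                    if (part != null)
                    {
                        pcm.AddRange(part);
                    }
                }

                // Generate audio for any text still held in the buffer.
                short[] rest = stream.Flush();
                if (rest != null)
                {
                    pcm.AddRange(rest);
                }
            }
            finally
            {
                stream.Dispose();
            }

            double seconds = pcm.Count / (double)orca.SampleRate;
            Console.WriteLine($"Collected {pcm.Count} samples ({seconds:F2} s of audio)");
        }
        finally
        {
            orca.Dispose();
        }
    }
}
```

In a real application each non-null chunk would typically be handed to an audio player as soon as it is returned rather than collected in a list.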
OrcaAudio
Class that contains the audio and word alignments returned by Orca.Synthesize().
Parameters
Pcm short[] : Synthesized speech.
Words OrcaWord[] : Word alignment metadata.
OrcaPhoneme
Class representing a phoneme synthesized by Orca and its associated metadata.
Parameters
Phoneme string : Synthesized phoneme.
StartSec float : Start time of the phoneme in seconds.
EndSec float : End time of the phoneme in seconds.
OrcaWord
Class representing a word synthesized by Orca and its associated metadata.
Parameters
Word string : Synthesized word.
StartSec float : Start time of the word in seconds.
EndSec float : End time of the word in seconds.
Phonemes OrcaPhoneme[] : Synthesized phonemes and their associated metadata.
OrcaException
Error thrown if an error occurs within the Orca Text-to-Speech engine.
Exceptions