Orca Streaming Text-to-Speech
.NET API
API Reference for the Orca .NET SDK (NuGet)
namespace: Pv
Orca.Create()
Factory method for Orca Streaming Text-to-Speech engine.
Parameters
accessKey
string : AccessKey obtained from Picovoice Console.
modelPath
string : Absolute path to the file containing model parameters (.pv). This file determines the voice of the synthesized speech.
Returns
Orca : An instance of the Orca Streaming Text-to-Speech engine.
Throws
Orca
Class for the Orca Streaming Text-to-Speech engine.
Orca is initialized using the Orca.Create() factory method.
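A minimal sketch of the engine's lifecycle, assuming the Orca NuGet package is referenced and exposes the `Pv` namespace as stated above. The AccessKey and model path are hypothetical placeholders, and the class name `CreateOrcaSketch` exists only for illustration.

```csharp
using System;
using Pv;

class CreateOrcaSketch
{
    static void Main()
    {
        // Hypothetical placeholders: substitute your own AccessKey and model file.
        const string accessKey = "${ACCESS_KEY}";
        const string modelPath = "/absolute/path/to/model.pv";

        Orca orca = null;
        try
        {
            // Factory method documented above.
            orca = Orca.Create(accessKey, modelPath);

            Console.WriteLine($"Orca version: {orca.Version}");
            Console.WriteLine($"Sample rate: {orca.SampleRate} Hz");
            Console.WriteLine($"Max characters per request: {orca.MaxCharacterLimit}");
        }
        catch (OrcaException e)
        {
            // Errors raised within the engine surface as OrcaException.
            Console.Error.WriteLine($"Orca error: {e.Message}");
        }
        finally
        {
            // Release native resources even if an error occurred.
            orca?.Dispose();
        }
    }
}
```

The `try`/`finally` form uses only the documented `Dispose()` method, so it does not rely on any assumption about which interfaces the class implements.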
Orca.Version
The version string of the Orca library.
Orca.ValidCharacters
The set of valid characters that Orca accepts in the text input to the synthesis methods.
Orca.SampleRate
The audio sample rate of the synthesized speech.
Orca.MaxCharacterLimit
The maximum number of characters allowed in a single synthesis request.
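The two properties above can be combined into a pre-flight check before a synthesis call. The helper below is a hypothetical sketch, not part of the SDK. It assumes the valid characters can be enumerated as single-character strings and that the limit is compared against `string.Length`; neither detail is stated in this reference. It also does not treat the custom pronunciation syntax specially.

```csharp
using System;
using System.Collections.Generic;

public static class OrcaTextCheck
{
    // Returns null when the text passes both checks, otherwise a short
    // description of the first problem found.
    public static string FindProblem(
        string text,
        IEnumerable<string> validCharacters,
        int maxCharacterLimit)
    {
        if (text.Length > maxCharacterLimit)
        {
            return $"Text has {text.Length} characters; the limit is {maxCharacterLimit}.";
        }

        var allowed = new HashSet<string>(validCharacters);
        foreach (char c in text)
        {
            if (!allowed.Contains(c.ToString()))
            {
                return $"Character '{c}' is not in the valid character set.";
            }
        }

        return null;
    }
}
```

With an engine instance, a call might look like `OrcaTextCheck.FindProblem(text, orca.ValidCharacters, orca.MaxCharacterLimit)`, provided the property types match the assumptions above.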
Orca.Dispose()
Releases resources acquired by Orca.
Orca.Synthesize()
Generates audio from text. The returned audio contains the speech representation of the text.
If you wish to save the synthesized speech to a file, consider using Orca.SynthesizeToFile().
Parameters
text
string : Text to be converted to audio. The maximum number of characters per call is MaxCharacterLimit. Allowed characters can be retrieved from ValidCharacters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
speechRate
float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState
long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
OrcaAudio : Synthesized audio and word alignment metadata.
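A short sketch of a single synthesis request, assuming the parameter names listed above can be passed as named arguments. The AccessKey, model path, and class name are hypothetical placeholders. The input text is the pronunciation example from the parameter description.

```csharp
using System;
using Pv;

class SynthesizeSketch
{
    static void Main()
    {
        // Hypothetical placeholders: substitute your own AccessKey and model file.
        Orca orca = Orca.Create("${ACCESS_KEY}", "/absolute/path/to/model.pv");
        try
        {
            // A fixed randomState makes repeated runs produce the same audio.
            OrcaAudio audio = orca.Synthesize(
                "{read|R IY D} this as {read|R EH D}",
                speechRate: 1.1f,
                randomState: 42);

            Console.WriteLine($"{audio.Pcm.Length} samples at {orca.SampleRate} Hz");

            // Walk the word alignments and the phonemes inside each word.
            foreach (OrcaWord word in audio.Words)
            {
                Console.WriteLine($"{word.Word}: {word.StartSec:F2}s to {word.EndSec:F2}s");
                foreach (OrcaPhoneme phoneme in word.Phonemes)
                {
                    Console.WriteLine($"  {phoneme.Phoneme}: {phoneme.StartSec:F2}s to {phoneme.EndSec:F2}s");
                }
            }
        }
        finally
        {
            orca.Dispose();
        }
    }
}
```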
Throws
Orca.SynthesizeToFile()
Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.
Parameters
text
string : Text to be converted to audio. The maximum number of characters per call is MaxCharacterLimit. Allowed characters can be retrieved from ValidCharacters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
outputPath
string : Absolute path to save the generated audio as a single-channel 16-bit PCM WAV file.
speechRate
float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState
long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
OrcaWord[] : Array of synthesized words with their associated metadata.
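A brief sketch of writing speech straight to a WAV file. Since `outputPath` is documented as an absolute path, the example resolves a relative file name first. The AccessKey, model path, file name, and class name are hypothetical, and the text assumes that plain letters and spaces are valid characters for the chosen model.

```csharp
using System;
using System.IO;
using Pv;

class SynthesizeToFileSketch
{
    static void Main()
    {
        // Hypothetical placeholders: substitute your own AccessKey and model file.
        Orca orca = Orca.Create("${ACCESS_KEY}", "/absolute/path/to/model.pv");
        try
        {
            // Resolve a relative name to the absolute path the API asks for.
            string outputPath = Path.GetFullPath("orca_output.wav");

            OrcaWord[] words = orca.SynthesizeToFile(
                "Hello from Orca",
                outputPath,
                speechRate: 0.9f);

            Console.WriteLine($"Wrote {outputPath} with {words.Length} aligned words");
        }
        finally
        {
            orca.Dispose();
        }
    }
}
```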
Throws
Orca.StreamOpen()
Opens an Orca.OrcaStream object for streaming input text synthesis.
Parameters
speechRate
float? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState
long? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
Orca.OrcaStream : An instance of Orca.OrcaStream.
Throws
Orca.OrcaStream
Class for synthesizing speech from a stream of input text.
An Orca.OrcaStream object is created via Orca.StreamOpen().
Orca.OrcaStream.Synthesize()
Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added.
This function is expected to be called multiple times with consecutive chunks of text from a text stream.
The incoming text is buffered as it arrives until there is enough context to convert a chunk of the
buffered text into audio. The caller needs to use Orca.OrcaStream.Flush() to generate the audio chunk
for the remaining text that has not yet been synthesized.
Parameters
text
string : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.Synthesize().
Returns
short[] : The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
Throws
Orca.OrcaStream.Flush()
Generates audio for all the buffered text that was added to the Orca.OrcaStream object via Orca.OrcaStream.Synthesize().
Returns
short[] : The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
Throws
Orca.OrcaStream.Dispose()
Closes the Orca.OrcaStream object and releases resources acquired by it.
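The streaming flow described above (feed chunks, handle `null` results, then flush) can be sketched without tying the logic to a live engine. The helper below is hypothetical and takes the two stream operations as delegates, assuming `Orca.OrcaStream.Synthesize()` and `Orca.OrcaStream.Flush()` have the signatures documented here. It collects all audio into one array for simplicity.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingSketch
{
    // Feeds each text chunk to `synthesize`, then calls `flush` once at the end.
    // Both delegates may return null when no audio chunk is ready.
    public static short[] CollectPcm(
        IEnumerable<string> textChunks,
        Func<string, short[]> synthesize,
        Func<short[]> flush)
    {
        var pcm = new List<short>();

        foreach (string chunk in textChunks)
        {
            short[] audio = synthesize(chunk);
            if (audio != null)
            {
                pcm.AddRange(audio);
            }
        }

        // Text still buffered inside the stream only becomes audio on flush.
        short[] remaining = flush();
        if (remaining != null)
        {
            pcm.AddRange(remaining);
        }

        return pcm.ToArray();
    }
}
```

With a real stream this might be called as `StreamingSketch.CollectPcm(tokens, stream.Synthesize, stream.Flush)` after `Orca.OrcaStream stream = orca.StreamOpen();`, followed by `stream.Dispose()`. A latency-sensitive application would more likely hand each non-null chunk to an audio player as it arrives instead of collecting everything first.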
OrcaAudio
Class that contains the audio and word alignments returned by Orca.Synthesize().
Parameters
Pcm
short[] : Synthesized speech.
Words
OrcaWord[] : Word alignment metadata.
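As a small worked example, the playback length of the `Pcm` array follows from its sample count and the engine's sample rate. The helper below is a hypothetical sketch that assumes the audio is single-channel, so each array element is one sample.

```csharp
using System;

public static class PcmTiming
{
    // Duration in seconds = number of samples / samples per second.
    public static double DurationSeconds(short[] pcm, int sampleRate)
    {
        if (sampleRate <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(sampleRate));
        }

        return (double)pcm.Length / sampleRate;
    }
}
```

With a synthesis result, this could be called as `PcmTiming.DurationSeconds(audio.Pcm, orca.SampleRate)`.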
OrcaPhoneme
Class representing a phoneme synthesized by Orca and its associated metadata.
Parameters
Phoneme
string : Synthesized phoneme.
StartSec
float : Start time of the phoneme in seconds.
EndSec
float : End time of the phoneme in seconds.
OrcaWord
Class representing a word synthesized by Orca and its associated metadata.
Parameters
Word
string : Synthesized word.
StartSec
float : Start time of the word in seconds.
EndSec
float : End time of the word in seconds.
Phonemes
OrcaPhoneme[] : Synthesized phonemes and their associated metadata.
OrcaException
Exception thrown if an error occurs within the Orca Text-to-Speech engine.
Exceptions