Orca Streaming Text-to-Speech
iOS API
API Reference for the Orca iOS SDK (CocoaPods).
Orca
Class for the Orca Streaming Text-to-Speech engine.
Orca can be initialized using the class constructor. Release its resources with the delete() method when you are done.
Orca.version
The Orca library version string.
Orca.maxCharacterLimit
Maximum number of characters allowed in a single synthesis request.
Orca.validCharacters
Set of characters supported by Orca.
Orca.sampleRate
Audio sample rate of generated audio.
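The limit and the character set above can be used to reject a text before it is sent to the engine. The following is a minimal sketch of such a check, written without the SDK so it can run on its own. `TextCheck` and `checkText` are hypothetical names, and the Swift types chosen for the limit (`Int`) and the character set (`Set<Character>`) are assumptions; convert from whatever types the properties actually expose.

```swift
/// Outcome of checking a text before synthesis (hypothetical type).
enum TextCheck: Equatable {
    case ok
    case tooLong(count: Int, limit: Int)
    case invalidCharacters(Set<Character>)
}

/// Checks `text` against a character limit and a set of allowed characters.
/// `limit` and `allowed` are meant to come from `maxCharacterLimit` and
/// `validCharacters`; their exact Swift types are an assumption here.
func checkText(_ text: String, limit: Int, allowed: Set<Character>) -> TextCheck {
    if text.count > limit {
        return .tooLong(count: text.count, limit: limit)
    }
    let invalid = Set(text).subtracting(allowed)
    return invalid.isEmpty ? .ok : .invalidCharacters(invalid)
}

// Usage with made-up values; a real app would read them from the Orca instance.
let allowed = Set("abcdefghijklmnopqrstuvwxyz ,.")
print(checkText("hello, world.", limit: 20, allowed: allowed))
```

Returning the offending characters, rather than a plain Boolean, makes it easier to show the user what to remove or replace.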
Orca.init()
Initializer for the Orca Streaming Text-to-Speech engine.
Parameters
accessKey
String : AccessKey obtained from Picovoice Console.
modelPath
String : Absolute path to file containing model parameters (.pv).
Returns
Orca : An instance of the Orca Streaming Text-to-Speech engine.
Throws
Parameters
accessKey
String : AccessKey obtained from Picovoice Console.
modelURL
URL : URL to file containing model parameters (.pv).
Returns
Orca : An instance of the Orca Streaming Text-to-Speech engine.
Throws
Orca.delete()
Releases resources acquired by Orca.
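Because resources are released explicitly rather than on deinitialization, it can help to tie the `delete()` call to a scope. The sketch below shows one way to do that with `defer`. `Releasable`, `withResource`, and `FakeEngine` are hypothetical names introduced for illustration; with the SDK available, declaring `extension Orca: Releasable {}` should be enough, assuming `delete()` takes no arguments and returns nothing.

```swift
/// Anything that must be released explicitly (hypothetical protocol).
protocol Releasable {
    func delete()
}

/// Creates a resource, runs `body`, and always calls `delete()` afterwards,
/// whether `body` returns normally or throws.
func withResource<R: Releasable, T>(
    _ make: () throws -> R,
    _ body: (R) throws -> T
) throws -> T {
    let resource = try make()
    defer { resource.delete() }
    return try body(resource)
}

/// Stand-in used only to demonstrate the helper without the SDK.
final class FakeEngine: Releasable {
    private(set) var deleted = false
    func delete() { deleted = true }
}

// Usage: the engine is released as soon as the closure finishes.
let engine = FakeEngine()
let length = try withResource({ engine }) { _ in "hello".count }
```

If the initializer itself throws, nothing is created and `delete()` is never called, which matches the usual expectation for a failed construction.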
Orca.synthesize()
Generates audio from text. The returned audio contains the speech representation of the text.
Parameters
text
String : Text to be converted to audio. The maximum number of characters per call to .synthesize() is .maxCharacterLimit. Allowed characters can be retrieved from .validCharacters. Custom pronunciations can be embedded in the text via the syntax {word|pronunciation}. The pronunciation is expressed in ARPAbet format, e.g.: "I {live|L IH V} in {Sevilla|S EH V IY Y AH}".
speechRate
Double? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
randomState
Int64? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs.
Returns
[Int16], [OrcaWord] : The generated audio, stored as a sequence of 16-bit linearly-encoded integers, and the sequence of synthesized words with their associated metadata.
Throws
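The custom pronunciation syntax and the speech rate range described above are easy to get wrong by hand, so the following sketch builds both programmatically. `withPronunciation` and `clampedSpeechRate` are hypothetical helpers, not part of the SDK; they only prepare the values that would then be passed as the text and speechRate parameters.

```swift
/// Wraps a word in the documented `{word|pronunciation}` syntax.
/// `phonemes` are ARPAbet symbols and are joined with single spaces.
func withPronunciation(_ word: String, phonemes: [String]) -> String {
    return "{\(word)|\(phonemes.joined(separator: " "))}"
}

/// Clamps a requested rate into the documented valid range [0.7, 1.3].
func clampedSpeechRate(_ rate: Double) -> Double {
    return min(max(rate, 0.7), 1.3)
}

// Rebuilds the example sentence from the parameter description.
let live = withPronunciation("live", phonemes: ["L", "IH", "V"])
let sevilla = withPronunciation("Sevilla", phonemes: ["S", "EH", "V", "IY", "Y", "AH"])
let text = "I \(live) in \(sevilla)"
let rate = clampedSpeechRate(2.0)
print(text, rate)
```

Clamping silently changes the requested rate; an app that prefers to surface the problem could instead validate the value and report an error before calling the engine.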
Orca.synthesizeToFile()
Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.
Parameters
text
String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
outputPath
String : Absolute path to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
speechRate
Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
randomState
Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().
Returns
[OrcaWord] : Sequence of synthesized words with their associated metadata.
Throws
Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.
Parameters
text
String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
outputURL
URL : URL to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
speechRate
Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
randomState
Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().
Throws
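For cases where the audio comes from Orca.synthesize() or a stream rather than from Orca.synthesizeToFile(), a mono 16-bit WAV container can be assembled by hand. The sketch below follows the standard 44-byte RIFF/WAVE header layout; `appendLE`, `wavData`, and `durationSeconds` are hypothetical helpers, the `UInt32` sample rate type is an assumption, and the header the SDK itself writes may differ in detail.

```swift
import Foundation

/// Appends an integer to `data` in little-endian byte order.
func appendLE<T: FixedWidthInteger>(_ value: T, to data: inout Data) {
    withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
}

/// Builds a mono, 16-bit linear PCM WAV file in memory.
func wavData(pcm: [Int16], sampleRate: UInt32) -> Data {
    let bytesPerSample: UInt32 = 2
    let dataSize = UInt32(pcm.count) * bytesPerSample
    var data = Data()
    data.append(contentsOf: "RIFF".utf8)
    appendLE(36 + dataSize, to: &data)               // bytes remaining after this field
    data.append(contentsOf: "WAVE".utf8)
    data.append(contentsOf: "fmt ".utf8)
    appendLE(UInt32(16), to: &data)                  // size of the fmt chunk
    appendLE(UInt16(1), to: &data)                   // audio format: linear PCM
    appendLE(UInt16(1), to: &data)                   // channel count: mono
    appendLE(sampleRate, to: &data)
    appendLE(sampleRate * bytesPerSample, to: &data) // byte rate
    appendLE(UInt16(2), to: &data)                   // block align
    appendLE(UInt16(16), to: &data)                  // bits per sample
    data.append(contentsOf: "data".utf8)
    appendLE(dataSize, to: &data)
    for sample in pcm { appendLE(sample, to: &data) }
    return data
}

/// Playback length of a mono PCM buffer in seconds.
func durationSeconds(sampleCount: Int, sampleRate: UInt32) -> Double {
    return Double(sampleCount) / Double(sampleRate)
}

// Usage with made-up samples; real input would be the PCM returned by the engine
// together with the value of Orca.sampleRate.
let samples: [Int16] = [0, 1000, -1000, 0]
let wav = wavData(pcm: samples, sampleRate: 22050)
print(wav.count, durationSeconds(sampleCount: samples.count, sampleRate: 22050))
```

The resulting `Data` can be written to disk with `try wav.write(to: url)`.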
Orca.streamOpen()
Opens an Orca.OrcaStream object for streaming input text synthesis.
Parameters
speechRate
Double? : Speed of speech generated by OrcaStream.synthesize(). For details see the documentation of Orca.synthesize().
randomState
Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().
Returns
Orca.OrcaStream : An instance of Orca.OrcaStream.
Throws
Orca.OrcaStream
Class for synthesizing speech from a stream of input text.
An Orca.OrcaStream object is created via the Orca.streamOpen() method and must be closed with the Orca.OrcaStream.close() method.
Orca.OrcaStream.synthesize()
Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added.
This function is expected to be called multiple times with consecutive chunks of text from a text stream.
The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio. The caller needs to use Orca.OrcaStream.flush() to generate the audio chunk for the remaining text that has not yet been synthesized.
Parameters
text
String : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.synthesize().
Returns
[Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.
Throws
Orca.OrcaStream.flush()
Generates audio for all the buffered text that was added to the Orca.OrcaStream object via Orca.OrcaStream.synthesize().
Returns
[Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.
Throws
Orca.OrcaStream.close()
Closes the Orca.OrcaStream object and releases resources acquired by it.
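The synthesize, flush, and close calls above are meant to be used in that order. The sketch below illustrates the flow against a small protocol so that it runs without the SDK. `TextStream`, `synthesizeAll`, and `FakeStream` are hypothetical names, the `text:` argument label is an assumption, and the fake stream's buffering rule (emit audio once a space arrives) is invented purely for the demonstration.

```swift
/// The part of the documented Orca.OrcaStream surface this sketch relies on.
/// The `text:` argument label is an assumption.
protocol TextStream {
    func synthesize(text: String) throws -> [Int16]?
    func flush() throws -> [Int16]?
    func close()
}

/// Feeds text chunks into a stream, collects the audio that becomes available,
/// flushes the remainder, and closes the stream even if an error is thrown.
func synthesizeAll<T: TextStream>(_ chunks: [String], with stream: T) throws -> [Int16] {
    defer { stream.close() }
    var pcm: [Int16] = []
    for chunk in chunks {
        // nil means the stream is still buffering and has no audio yet.
        if let audio = try stream.synthesize(text: chunk) {
            pcm.append(contentsOf: audio)
        }
    }
    if let tail = try stream.flush() {
        pcm.append(contentsOf: tail)
    }
    return pcm
}

/// Stand-in stream: buffers text and emits one silent sample per buffered
/// character once the buffer contains a space.
final class FakeStream: TextStream {
    private var buffer = ""
    private(set) var closed = false

    func synthesize(text: String) throws -> [Int16]? {
        buffer += text
        guard buffer.contains(" ") else { return nil }
        return drain()
    }

    func flush() throws -> [Int16]? {
        return buffer.isEmpty ? nil : drain()
    }

    func close() { closed = true }

    private func drain() -> [Int16] {
        defer { buffer = "" }
        return [Int16](repeating: 0, count: buffer.count)
    }
}

// Usage: tokens arrive in pieces, as they might from an LLM.
let stream = FakeStream()
let pcm = try synthesizeAll(["Hel", "lo ", "wor", "ld"], with: stream)
print(pcm.count, stream.closed)
```

Accumulating everything into one array keeps the example short; a latency-sensitive app would more likely hand each non-nil chunk to its audio player as soon as it is returned.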
OrcaError
Error thrown if an error occurs within the Orca Streaming Text-to-Speech engine.
Exceptions
OrcaWord
Struct for storing word alignment metadata returned from the Orca engine.
OrcaWord.word
The synthesized word.
OrcaWord.startSec
Start of word in seconds.
OrcaWord.endSec
End of word in seconds.
OrcaWord.phonemeArray
Phoneme metadata as an array of OrcaPhoneme objects.
OrcaPhoneme
Struct for storing word phoneme metadata returned from the Orca engine.
OrcaPhoneme.phoneme
The synthesized phoneme.
OrcaPhoneme.startSec
Start of phoneme in seconds.
OrcaPhoneme.endSec
End of phoneme in seconds.
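The start and end times in the alignment metadata can drive captions or word highlighting during playback. The sketch below shows both on a local stand-in struct, so it runs without the SDK. `WordTiming`, `activeWord`, and `captionLine` are hypothetical names, and using `Double` for the time fields is an assumption; with the SDK, the same logic would read word, startSec, and endSec from OrcaWord values.

```swift
import Foundation

/// Local stand-in mirroring the documented OrcaWord fields used here.
/// The Double type for the time fields is an assumption.
struct WordTiming {
    let word: String
    let startSec: Double
    let endSec: Double
}

/// Returns the word being spoken at `time`, treating each word as the
/// half-open interval [startSec, endSec).
func activeWord(at time: Double, in words: [WordTiming]) -> WordTiming? {
    return words.first { time >= $0.startSec && time < $0.endSec }
}

/// Formats one word as a simple caption line, e.g. for a debug overlay.
func captionLine(_ timing: WordTiming) -> String {
    let start = String(format: "%.2f", timing.startSec)
    let end = String(format: "%.2f", timing.endSec)
    return "\(start)-\(end) \(timing.word)"
}

// Usage with made-up timings.
let words = [
    WordTiming(word: "hello", startSec: 0.0, endSec: 0.5),
    WordTiming(word: "world", startSec: 0.5, endSec: 1.0),
]
for timing in words {
    print(captionLine(timing))
}
```

The half-open interval means that a time exactly on a word boundary belongs to the following word, so adjacent words never both count as active.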