Orca Streaming Text-to-Speech
iOS API

API Reference for the Orca iOS SDK (CocoaPods).

Orca

public class Orca { }

Class for the Orca Streaming Text-to-Speech engine.

Orca is initialized with the class constructor. When you are done with an instance, release its resources by calling the delete() method.
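A minimal sketch of this lifecycle, using only calls documented on this page. The `#if canImport(Orca)` guard, the `fitsInOneRequest` helper, the `TextTooLongError` type, and the `synthesizeOnce` function are assumptions made for illustration and are not part of the SDK. The sketch also assumes that `delete()` does not throw and that counting Swift `Character` values approximates how Orca counts characters.

```swift
import Foundation
#if canImport(Orca)
import Orca
#endif

// Hypothetical error type for this sketch; not part of the SDK.
struct TextTooLongError: Error {}

// Hypothetical helper: true if `text` has at most `limit` characters.
// Assumes Swift's Character count matches how Orca counts characters.
func fitsInOneRequest(_ text: String, limit: Int32) -> Bool {
    return text.count <= Int(limit)
}

#if canImport(Orca)
// Creates an engine, synthesizes one request, and always releases the engine.
func synthesizeOnce(accessKey: String, modelPath: String, text: String) throws -> [Int16] {
    let orca = try Orca(accessKey: accessKey, modelPath: modelPath)
    defer { orca.delete() }  // runs on success and when an error is thrown

    guard fitsInOneRequest(text, limit: orca.maxCharacterLimit) else {
        throw TextTooLongError()
    }
    let (pcm, _) = try orca.synthesize(text: text)
    return pcm
}
#endif
```

Placing `delete()` in a `defer` block is one way to make sure the engine is released on every exit path.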

Orca.getAvailableDevices()

public static func getAvailableDevices() throws -> [String]

Retrieves a list of devices that can be specified when constructing Orca.

Returns

[String] : An array of available devices.

Throws

OrcaError: If an error occurs while retrieving the devices.

Orca.maxCharacterLimit

public var maxCharacterLimit: Int32

Maximum number of characters allowed in a single synthesis request.

Orca.init()

public init(
        accessKey: String,
        modelPath: String,
        device: String? = nil) throws

Initializer for the Orca Streaming Text-to-Speech engine.

Parameters

accessKey String : AccessKey obtained from Picovoice Console.
modelPath String : Absolute path to file containing model parameters (.pv).
device String? : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.

Returns

Orca : An instance of Orca Streaming Text-to-Speech Engine.

Throws

OrcaError

public convenience init(
        accessKey: String,
        modelURL: URL,
        device: String? = nil) throws

Convenience initializer that takes the location of the model file as a URL instead of a path string.

Parameters

accessKey String : AccessKey obtained from Picovoice Console.
modelURL URL : URL to file containing model parameters (.pv).
device String? : String representation of the device (e.g., CPU or GPU) to use. If set to best, the most suitable device is selected automatically. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.

Returns

Orca : An instance of Orca Streaming Text-to-Speech Engine.

Throws

OrcaError
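The device strings accepted by both initializers follow a small grammar (best, gpu, gpu:${GPU_INDEX}, cpu, cpu:${NUM_THREADS}). As an illustration, the sketch below builds those strings from a typed value. The `DeviceChoice` enum is hypothetical and not part of the SDK; it only produces the string forms described in the parameter documentation above.

```swift
// Hypothetical helper type (not part of the Orca SDK) that renders the
// documented device strings.
enum DeviceChoice {
    case best                 // let the engine pick the most suitable device
    case gpu(index: Int?)     // nil selects the first available GPU
    case cpu(threads: Int?)   // nil uses the default number of threads

    var deviceString: String {
        switch self {
        case .best:
            return "best"
        case .gpu(let index):
            if let index = index { return "gpu:\(index)" }
            return "gpu"
        case .cpu(let threads):
            if let threads = threads { return "cpu:\(threads)" }
            return "cpu"
        }
    }
}

let fourThreads = DeviceChoice.cpu(threads: 4).deviceString
// `fourThreads` can be passed as the `device` argument of either initializer.
```

The strings returned by Orca.getAvailableDevices() can be passed as `device` directly, so a wrapper like this is optional.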

Orca.synthesize()

public func synthesize(
  text: String,
  speechRate: Double? = nil,
  randomState: Int64? = nil
) throws -> (pcm: [Int16], wordArray: [OrcaWord])

Generates audio from text. The returned audio contains the speech representation of the text.

Parameters

text String : Text to be converted to audio. The maximum number of characters per call to .synthesize() is .maxCharacterLimit. Allowed characters can be retrieved via .validCharacters. Custom pronunciations can be embedded in the text using the syntax {word|pronunciation}, where the pronunciation is expressed in ARPAbet format, for example: "I {live|L IH V} in {Sevilla|S EH V IY Y AH}".
speechRate Double? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher values produce faster speech and lower values produce slower speech. The default is 1.0.
randomState Int64? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs.

Returns

(pcm: [Int16], wordArray: [OrcaWord]) : A tuple containing the generated audio, stored as a sequence of 16-bit linearly-encoded integers, and the sequence of synthesized words with their associated metadata.

Throws

OrcaError
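The custom pronunciation syntax and the speech rate range described for synthesize() can be prepared with small helpers before the call. The functions `pronounce` and `clampedSpeechRate` below are hypothetical illustrations and not part of the SDK. Clamping is only one possible policy; an application may prefer to reject out-of-range values instead.

```swift
// Hypothetical helper: wraps a word in the {word|pronunciation} syntax,
// joining the ARPAbet phonemes with single spaces.
func pronounce(_ word: String, arpabet phonemes: [String]) -> String {
    return "{\(word)|\(phonemes.joined(separator: " "))}"
}

// Hypothetical helper: clamps a rate into the documented range [0.7, 1.3].
func clampedSpeechRate(_ rate: Double) -> Double {
    return min(max(rate, 0.7), 1.3)
}

let live = pronounce("live", arpabet: ["L", "IH", "V"])
let sevilla = pronounce("Sevilla", arpabet: ["S", "EH", "V", "IY", "Y", "AH"])
let text = "I \(live) in \(sevilla)"
// `text` now matches the example given for the `text` parameter.

let rate = clampedSpeechRate(1.5)  // limited to the upper bound of the range
```

The resulting `text` and `rate` can then be passed as the `text` and `speechRate` arguments of synthesize().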

Orca.synthesizeToFile()

public func synthesizeToFile(
  text: String,
  outputPath: String,
  speechRate: Double? = nil,
  randomState: Int64? = nil
) throws -> [OrcaWord]

Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.

Parameters

text String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
outputPath String : Absolute path to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
speechRate Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

[OrcaWord] : Sequence of synthesized words with their associated metadata.

Throws

OrcaError

public func synthesizeToFile(
  text: String,
  outputURL: URL,
  speechRate: Double? = nil,
  randomState: Int64? = nil
) throws -> [OrcaWord]

Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.

Parameters

text String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
outputURL URL : URL to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
speechRate Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

[OrcaWord] : Sequence of synthesized words with their associated metadata.

Throws

OrcaError

Orca.streamOpen()

public func streamOpen(speechRate: Double? = nil, randomState: Int64? = nil) throws -> OrcaStream

Opens an Orca.OrcaStream object for streaming input text synthesis.

Parameters

speechRate Double? : Speed of speech generated by Orca.OrcaStream.synthesize(). For details see the documentation of Orca.synthesize().
randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

Orca.OrcaStream : An instance of Orca.OrcaStream.

Throws

OrcaError

Orca.OrcaStream

public class OrcaStream { }

Class for synthesizing speech from a stream of input text.

An Orca.OrcaStream object is initialized via the Orca.streamOpen() method and needs to be closed with the Orca.OrcaStream.close() method.

Orca.OrcaStream.synthesize()

public func synthesize(text: String) throws -> [Int16]?

Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added. This function is expected to be called multiple times with consecutive chunks of text from a text stream. The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio. The caller needs to use Orca.OrcaStream.flush() to generate the audio chunk for the remaining text that has not yet been synthesized.

Parameters

text String : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.synthesize().

Returns

[Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.

Throws

OrcaError

Orca.OrcaStream.flush()

public func flush() throws -> [Int16]?

Generates audio for all the buffered text that was added to the Orca.OrcaStream object via Orca.OrcaStream.synthesize().

Returns

[Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.

Throws

OrcaError

Orca.OrcaStream.close()

public func close()

Closes the Orca.OrcaStream object and releases resources acquired by it.
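
The synthesize(), flush(), and close() methods are meant to be used together: feed text chunks, forward any audio that comes back, flush the remainder, then close. The sketch below shows that call order. To stay compilable without the SDK it is written against a hypothetical `TextStreaming` protocol that mirrors the three signatures documented above; the protocol and the `speak` function are assumptions and not part of the SDK.

```swift
// Hypothetical protocol mirroring the documented OrcaStream methods.
// With the SDK available, an empty conformance such as
// `extension Orca.OrcaStream: TextStreaming {}` is assumed to suffice.
protocol TextStreaming {
    func synthesize(text: String) throws -> [Int16]?
    func flush() throws -> [Int16]?
    func close()
}

// Feeds text chunks to the stream in order and hands every produced
// audio chunk to `onAudio`. The stream is closed on every exit path.
func speak(
    tokens: [String],
    through stream: TextStreaming,
    onAudio: ([Int16]) -> Void
) throws {
    defer { stream.close() }

    for token in tokens {
        // nil means the stream is still buffering and has no audio yet.
        if let pcm = try stream.synthesize(text: token) {
            onAudio(pcm)
        }
    }

    // Synthesize whatever text is still buffered.
    if let pcm = try stream.flush() {
        onAudio(pcm)
    }
}
```

In an application, `stream` would be the object returned by Orca.streamOpen(), and `onAudio` would typically queue the PCM chunk for playback.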

OrcaError

public class OrcaError : LocalizedError { }

Base error thrown when a failure occurs within the Orca Streaming Text-to-Speech engine.

Exceptions

public class OrcaMemoryError : OrcaError {}
public class OrcaIOError : OrcaError {}
public class OrcaInvalidArgumentError : OrcaError {}
public class OrcaStopIterationError : OrcaError {}
public class OrcaKeyError : OrcaError {}
public class OrcaInvalidStateError : OrcaError {}
public class OrcaRuntimeError : OrcaError {}
public class OrcaActivationError : OrcaError {}
public class OrcaActivationLimitError : OrcaError {}
public class OrcaActivationThrottledError : OrcaError {}
public class OrcaActivationRefusedError : OrcaError {}