Orca Streaming Text-to-Speech
iOS API

API Reference for the Orca iOS SDK (CocoaPods).


Orca

public class Orca { }

Class for the Orca Streaming Text-to-Speech engine.

Orca can be initialized using the class constructor. When you are done with the instance, release its resources by calling the delete() method.


Orca.version

public static let version: String

The Orca library version string.


Orca.maxCharacterLimit

public var maxCharacterLimit: Int32

Maximum number of characters allowed in a single synthesis request.
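Text longer than this limit has to be split across several synthesis requests. The following is a minimal sketch of one way to do that, assuming that breaking on spaces is acceptable for your text and that the limit is counted in Swift `Character`s (the reference does not say how characters are counted). `splitText` is a hypothetical helper, not part of the SDK.

```swift
/// Hypothetical helper (not part of the Orca SDK): splits `text` into chunks of at
/// most `limit` characters, breaking on spaces. A single word longer than `limit`
/// is hard-split into `limit`-sized pieces.
func splitText(_ text: String, limit: Int) -> [String] {
    precondition(limit > 0, "limit must be positive")
    var chunks: [String] = []
    var current = ""
    for word in text.split(separator: " ") {
        var piece = String(word)
        // Hard-split words that cannot fit in a chunk on their own.
        while piece.count > limit {
            if !current.isEmpty {
                chunks.append(current)
                current = ""
            }
            chunks.append(String(piece.prefix(limit)))
            piece = String(piece.dropFirst(limit))
        }
        if current.isEmpty {
            current = piece
        } else if current.count + 1 + piece.count <= limit {
            // The word fits in the current chunk together with a joining space.
            current += " " + piece
        } else {
            chunks.append(current)
            current = piece
        }
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

With an Orca instance named `orca`, the limit would be passed as `Int(orca.maxCharacterLimit)`, and each returned chunk synthesized in turn.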


Orca.validCharacters

public var validCharacters: [String]

Set of characters supported by Orca.
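Checking input against this set before synthesis lets you report unsupported characters up front. A minimal sketch, assuming each entry of `validCharacters` is a single-character string; `unsupportedCharacters` is a hypothetical helper, not part of the SDK.

```swift
/// Hypothetical helper (not part of the Orca SDK): returns the characters of `text`
/// that are not listed in `valid`, each reported once, in order of first appearance.
func unsupportedCharacters(in text: String, valid: [String]) -> [Character] {
    let allowed = Set(valid)
    var seen = Set<Character>()
    var result: [Character] = []
    for ch in text where !allowed.contains(String(ch)) {
        // `insert` reports whether the character was newly added to `seen`.
        if seen.insert(ch).inserted {
            result.append(ch)
        }
    }
    return result
}
```

An empty result means every character of the text appears in the supplied set.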


Orca.sampleRate

public var sampleRate: Int32

Audio sample rate of generated audio.
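The sample rate is what relates a count of PCM samples to playback time. A minimal sketch, assuming the generated audio is single-channel (the reference states this only for the WAV output); both helpers are hypothetical and not part of the SDK.

```swift
/// Hypothetical helper (not part of the Orca SDK): playback duration in seconds
/// of `sampleCount` mono samples at `sampleRate` Hz.
func durationSeconds(sampleCount: Int, sampleRate: Int32) -> Double {
    precondition(sampleRate > 0, "sampleRate must be positive")
    return Double(sampleCount) / Double(sampleRate)
}

/// Hypothetical helper: rescales 16-bit PCM samples to Float values in [-1.0, 1.0),
/// a range commonly expected by floating-point audio buffers.
func normalized(_ pcm: [Int16]) -> [Float] {
    return pcm.map { Float($0) / 32768.0 }
}
```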


Orca.init()

public init(accessKey: String, modelPath: String) throws

Initializer for the Orca Streaming Text-to-Speech engine.

Parameters

  • accessKey String : AccessKey obtained from Picovoice Console.
  • modelPath String : Absolute path to file containing model parameters (.pv).

Returns

  • Orca : An instance of Orca Streaming Text-to-Speech Engine.

Throws

  • OrcaError

public convenience init(accessKey: String, modelURL: URL) throws

Convenience initializer that takes the model file location as a URL instead of a path.

Parameters

  • accessKey String : AccessKey obtained from Picovoice Console.
  • modelURL URL : URL to file containing model parameters (.pv).

Returns

  • Orca : An instance of Orca Streaming Text-to-Speech Engine.

Throws

  • OrcaError

Orca.delete()

public func delete()

Releases resources acquired by Orca.


Orca.synthesize()

public func synthesize(
text: String,
speechRate: Double? = nil,
randomState: Int64? = nil
) throws -> (pcm: [Int16], wordArray: [OrcaWord])

Generates audio from text. The returned audio contains the speech representation of the text.

Parameters

  • text String : Text to be converted to audio. The maximum number of characters per call to .synthesize() is .maxCharacterLimit. Allowed characters can be retrieved by calling .validCharacters. Custom pronunciations can be embedded in the text via the syntax {word|pronunciation}. The pronunciation is expressed in ARPAbet format, e.g.: "I {live|L IH V} in {Sevilla|S EH V IY Y AH}".
  • speechRate Double? : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
  • randomState Int64? : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs.

Returns

  • (pcm: [Int16], wordArray: [OrcaWord]) : A tuple containing the generated audio, stored as a sequence of 16-bit linearly-encoded integers, and the sequence of synthesized words with their associated metadata.

Throws

  • OrcaError
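The `{word|pronunciation}` syntax and the `speechRate` range described for `synthesize()` can be prepared with small helpers before the call. A minimal sketch; `withPronunciation` and `clampedSpeechRate` are hypothetical names, not part of the SDK, and clamping (rather than rejecting) an out-of-range rate is a design choice of this sketch.

```swift
/// Hypothetical helper (not part of the Orca SDK): wraps `word` in the documented
/// `{word|pronunciation}` syntax, joining ARPAbet phonemes with single spaces.
func withPronunciation(_ word: String, phonemes: [String]) -> String {
    return "{\(word)|\(phonemes.joined(separator: " "))}"
}

/// Hypothetical helper: clamps a requested rate into the documented range [0.7, 1.3].
func clampedSpeechRate(_ rate: Double) -> Double {
    return min(max(rate, 0.7), 1.3)
}

// Builds the example sentence from the parameter description above.
let text = "I \(withPronunciation("live", phonemes: ["L", "IH", "V"])) in "
    + withPronunciation("Sevilla", phonemes: ["S", "EH", "V", "IY", "Y", "AH"])
```

The resulting `text` and a clamped rate would then be passed as the `text` and `speechRate` arguments.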

Orca.synthesizeToFile()

public func synthesizeToFile(
text: String,
outputPath: String,
speechRate: Double? = nil,
randomState: Int64? = nil
) throws -> [OrcaWord]

Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.

Parameters

  • text String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
  • outputPath String : Absolute path to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
  • speechRate Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
  • randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

  • [OrcaWord] : Sequence of synthesized words with their associated metadata.

Throws

  • OrcaError

public func synthesizeToFile(
text: String,
outputURL: URL,
speechRate: Double? = nil,
randomState: Int64? = nil
) throws -> [OrcaWord]

Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.

Parameters

  • text String : Text to be converted to audio. For details see the documentation of Orca.synthesize().
  • outputURL URL : URL to the output audio file. The output file is saved as WAV (.wav) and consists of a single mono channel.
  • speechRate Double? : Speed of generated speech. For details see the documentation of Orca.synthesize().
  • randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

  • [OrcaWord] : Sequence of synthesized words with their associated metadata.

Throws

  • OrcaError

Orca.streamOpen()

public func streamOpen(speechRate: Double? = nil, randomState: Int64? = nil) throws -> OrcaStream

Opens an Orca.OrcaStream object for streaming input text synthesis.

Parameters

  • speechRate Double? : Speed of speech generated by Orca.OrcaStream.synthesize(). For details see the documentation of Orca.synthesize().
  • randomState Int64? : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

  • Orca.OrcaStream : An instance of Orca.OrcaStream.

Throws

  • OrcaError

Orca.OrcaStream

public class OrcaStream { }

Class representing an Orca stream, which synthesizes audio incrementally from streaming input text.

An Orca.OrcaStream object is initialized via the Orca.streamOpen() method and needs to be closed with the Orca.OrcaStream.close() method.


Orca.OrcaStream.synthesize()

public func synthesize(text: String) throws -> [Int16]?

Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added. This function is expected to be called multiple times with consecutive chunks of text from a text stream. The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio. The caller needs to use Orca.OrcaStream.flush() to generate the audio chunk for the remaining text that has not yet been synthesized.

Parameters

  • text String : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.synthesize().

Returns

  • [Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.

Throws

  • OrcaError

Orca.OrcaStream.flush()

public func flush() throws -> [Int16]?

Generates audio for all the buffered text that was added to the Orca.OrcaStream object via Orca.OrcaStream.synthesize().

Returns

  • [Int16]? : The generated audio as a sequence of 16-bit linearly-encoded integers, or nil if no audio chunk has been produced.

Throws

  • OrcaError

Orca.OrcaStream.close()

public func close()

Closes the Orca.OrcaStream object and releases resources acquired by it.
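The synthesize, flush, and close calls described above form a fixed sequence: feed each text chunk, hand on any audio that comes back, flush once the text ends, and close the stream in every case. A minimal sketch of that loop follows. So that it compiles without the SDK, it is written against `TextToSpeechStream`, a stand-in protocol that mirrors the documented method signatures; the protocol and `pump` are hypothetical names, not part of the SDK.

```swift
/// Stand-in protocol (an assumption of this sketch) mirroring the documented
/// signatures of Orca.OrcaStream.synthesize(), flush(), and close().
protocol TextToSpeechStream {
    func synthesize(text: String) throws -> [Int16]?
    func flush() throws -> [Int16]?
    func close()
}

/// Hypothetical helper: feeds `tokens` into `stream` one at a time and passes each
/// produced audio chunk to `onAudio`. After the last token it flushes the remaining
/// buffered text. The stream is closed even if a call throws.
func pump<S: Sequence>(
    tokens: S,
    into stream: TextToSpeechStream,
    onAudio: ([Int16]) -> Void
) throws where S.Element == String {
    defer { stream.close() }
    for token in tokens {
        // nil means the stream is still buffering and has no audio yet.
        if let chunk = try stream.synthesize(text: token) {
            onAudio(chunk)
        }
    }
    if let rest = try stream.flush() {
        onAudio(rest)
    }
}
```

In an app, `onAudio` would typically enqueue the chunk for playback, and the object returned by `Orca.streamOpen()` would be adapted to the stand-in protocol.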


OrcaError

public class OrcaError : LocalizedError { }

Error thrown if an error occurs within the Orca Streaming Text-to-Speech engine.

Exceptions

public class OrcaMemoryError : OrcaError {}
public class OrcaIOError : OrcaError {}
public class OrcaInvalidArgumentError : OrcaError {}
public class OrcaStopIterationError : OrcaError {}
public class OrcaKeyError : OrcaError {}
public class OrcaInvalidStateError : OrcaError {}
public class OrcaRuntimeError : OrcaError {}
public class OrcaActivationError : OrcaError {}
public class OrcaActivationLimitError : OrcaError {}
public class OrcaActivationThrottledError : OrcaError {}
public class OrcaActivationRefusedError : OrcaError {}

OrcaWord

public struct OrcaWord { }

Struct for storing word alignment metadata returned from the Orca engine.


OrcaWord.word

OrcaWord.word: String

The synthesized word.


OrcaWord.startSec

OrcaWord.startSec: Float

Start of word in seconds.


OrcaWord.endSec

OrcaWord.endSec: Float

End of word in seconds.


OrcaWord.phonemeArray

OrcaWord.phonemeArray: [OrcaPhoneme]

Phoneme metadata as an array of OrcaPhoneme objects.
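Word timings can be used, for example, to highlight the word being spoken during playback. A minimal sketch, written against `WordTiming`, a stand-in struct that copies the `word`, `startSec`, and `endSec` fields documented above so the code compiles without the SDK; `WordTiming` and `wordIndex` are hypothetical names, and treating the end time as exclusive is an assumption of this sketch.

```swift
/// Stand-in for the documented OrcaWord fields (an assumption of this sketch).
struct WordTiming {
    let word: String
    let startSec: Float
    let endSec: Float
}

/// Hypothetical helper: index of the word being spoken at playback position `time`
/// (in seconds), or nil if `time` falls in a gap or outside the audio.
func wordIndex(at time: Float, in words: [WordTiming]) -> Int? {
    return words.firstIndex { $0.startSec <= time && time < $0.endSec }
}
```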


OrcaPhoneme

public struct OrcaPhoneme { }

Struct for storing phoneme alignment metadata returned from the Orca engine.


OrcaPhoneme.phoneme

OrcaPhoneme.phoneme: String

The synthesized phoneme.


OrcaPhoneme.startSec

OrcaPhoneme.startSec: Float

Start of phoneme in seconds.


OrcaPhoneme.endSec

OrcaPhoneme.endSec: Float

End of phoneme in seconds.
