
Orca Streaming Text-to-Speech
Python API

API Reference for the Python Orca SDK (PyPI).


pvorca.create()

def create(
    access_key: str,
    model_path: Optional[str] = None,
    library_path: Optional[str] = None) -> Orca

Factory method for Orca Streaming Text-to-Speech engine.

Parameters

  • access_key str : AccessKey obtained from Picovoice Console.
  • model_path Optional[str] : Absolute path to the file containing model parameters (.pv). This file determines the voice of the synthesized speech.
  • library_path Optional[str] : Absolute path to Orca's dynamic library.

Returns

  • Orca : An instance of the Orca Streaming Text-to-Speech engine.

Throws

  • OrcaError

pvorca.Orca

class Orca(object)

Class for the Orca Streaming Text-to-Speech engine. Orca can be initialized either with the module-level create() function or directly with the class __init__() method. When you are done with the engine, release its resources by calling the delete() method.


pvorca.Orca.version

self.version: str

The version string of the Orca library.


pvorca.Orca.valid_characters

self.valid_characters: Set[str]

The set of valid characters that Orca accepts in the text input to the synthesis methods.
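
As a rough illustration of how this property might be used, the sketch below reports which characters of an input string fall outside a given set before the text is sent for synthesis. The helper name find_invalid_characters and the example character set are hypothetical, and the sketch assumes every entry of the set is a single character. It does not account for the custom pronunciation markup described under synthesize().

```python
from typing import List, Set


def find_invalid_characters(text: str, valid_characters: Set[str]) -> List[str]:
    """Return the distinct characters in `text` that are not in `valid_characters`."""
    return sorted(set(text) - set(valid_characters))


# Illustrative character set only; with the SDK you would pass `orca.valid_characters`.
example_valid = set("abcdefghijklmnopqrstuvwxyz ,.!?'")
print(find_invalid_characters("hello, world #1", example_valid))
```

Checking up front lets an application strip or replace unsupported characters instead of handling an error from the engine.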


pvorca.Orca.sample_rate

self.sample_rate: int

The audio sample rate of the synthesized speech.


pvorca.Orca.max_character_limit

self.max_character_limit: int

The maximum number of characters allowed in a single synthesis request.
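
Text longer than this limit has to be split before it is passed to a single synthesis call. The following is a minimal sketch of one way to do that with the standard library; the helper name split_for_synthesis is hypothetical, and splitting at whitespace (rather than at sentence boundaries) is a simplifying assumption that may affect prosody.

```python
import textwrap
from typing import List


def split_for_synthesis(text: str, max_characters: int) -> List[str]:
    """Split `text` at whitespace into pieces of at most `max_characters` characters."""
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    # textwrap.wrap collapses runs of whitespace and breaks words longer than the width.
    return textwrap.wrap(text, width=max_characters)


# With the SDK, `max_characters` would be `orca.max_character_limit`.
print(split_for_synthesis("one two three four", 9))
```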


pvorca.Orca.__init__()

def __init__(
    self,
    access_key: str,
    model_path: str,
    library_path: str) -> Orca

__init__ method for Orca Streaming Text-to-Speech engine.

Parameters

  • access_key str : AccessKey obtained from Picovoice Console.
  • model_path str : Absolute path to the file containing model parameters (.pv). This file determines the voice of the synthesized speech.
  • library_path str : Absolute path to Orca's dynamic library.

Returns

  • Orca : An instance of the Orca Streaming Text-to-Speech engine.

Throws

  • OrcaError

pvorca.Orca.delete()

def delete(self)

Releases resources acquired by Orca.
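
Because delete() must be called even when synthesis fails, one option is to wrap the engine in a context manager. The sketch below is an assumption about how an application might do this, not part of the SDK: orca_session is a hypothetical helper that accepts any zero-argument factory returning an object with a delete() method.

```python
from contextlib import contextmanager


@contextmanager
def orca_session(factory):
    """Yield the engine built by `factory` and always call its delete() afterwards."""
    orca = factory()
    try:
        yield orca
    finally:
        # Runs on normal exit and when the body raises.
        orca.delete()


# With the real SDK this might look like the following (requires a valid AccessKey):
#   import pvorca
#   with orca_session(lambda: pvorca.create(access_key="${ACCESS_KEY}")) as orca:
#       print(orca.version, orca.sample_rate)
```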


pvorca.Orca.synthesize()

def synthesize(
    self,
    text: str,
    speech_rate: Optional[float] = None,
    random_state: Optional[int] = None) -> Tuple[Sequence[int], Sequence[WordAlignment]]

Generates audio from text. The returned audio contains the speech representation of the text.

If you wish to save the synthesized speech to a file, consider using Orca.synthesize_to_file().

Parameters

  • text str : Text to be converted to audio. The maximum number of characters per call is self.max_character_limit. Allowed characters can be retrieved from self.valid_characters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
  • speech_rate Optional[float] : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
  • random_state Optional[int] : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
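
The documented ranges for speech_rate and random_state can be checked before calling into the engine. The helper below is a hypothetical sketch, not an SDK function; the bounds are taken from the parameter descriptions above, and raising ValueError is an arbitrary choice for the illustration.

```python
from typing import Optional

MIN_SPEECH_RATE = 0.7  # lower bound stated in the parameter description
MAX_SPEECH_RATE = 1.3  # upper bound stated in the parameter description


def check_synthesis_args(
        speech_rate: Optional[float] = None,
        random_state: Optional[int] = None) -> None:
    """Raise ValueError if an argument falls outside its documented range."""
    if speech_rate is not None and not MIN_SPEECH_RATE <= speech_rate <= MAX_SPEECH_RATE:
        raise ValueError(
            f"speech_rate must be within [{MIN_SPEECH_RATE}, {MAX_SPEECH_RATE}]")
    if random_state is not None and (
            not isinstance(random_state, int) or random_state < 0):
        raise ValueError("random_state must be a non-negative integer")


check_synthesis_args(speech_rate=1.1, random_state=42)  # passes silently
```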

Returns

  • Tuple[Sequence[int], Sequence[WordAlignment]] : A tuple containing the generated audio as a sequence of 16-bit linearly-encoded integers and a sequence of WordAlignment objects representing the word alignments.

Throws

  • OrcaError
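
When the audio should stay in memory (for example, to return it from a web handler) rather than go to disk, the returned samples can be packed into WAV bytes with the standard library. This is a sketch under the assumption that the samples are signed 16-bit integers, as described above; pcm_to_wav_bytes is a hypothetical helper, and the sample rate would come from Orca.sample_rate.

```python
import io
import struct
import wave
from typing import Sequence


def pcm_to_wav_bytes(pcm: Sequence[int], sample_rate: int) -> bytes:
    """Pack signed 16-bit PCM samples into an in-memory single-channel WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        # Little-endian signed shorts, the sample layout used by WAV PCM.
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buffer.getvalue()


# With the SDK (assumed usage): pcm, alignments = orca.synthesize("Hello")
wav_bytes = pcm_to_wav_bytes([0, 1000, -1000], sample_rate=22050)
```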

pvorca.Orca.synthesize_to_file()

def synthesize_to_file(
    self,
    text: str,
    output_path: str,
    speech_rate: Optional[float] = None,
    random_state: Optional[int] = None) -> Sequence[WordAlignment]

Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.

Parameters

  • text str : Text to be converted to audio. For details see the documentation of Orca.synthesize().
  • output_path str : Absolute path to save the generated audio as a single-channel 16-bit PCM WAV file.
  • speech_rate Optional[float] : Speed of generated speech. For details see the documentation of Orca.synthesize().
  • random_state Optional[int] : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

  • Sequence[WordAlignment] : A sequence of WordAlignment objects representing the word alignments.

Throws

  • OrcaError
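
Since output_path is documented as an absolute path, a relative path can be resolved before the call. The helper below is a hypothetical convenience, not part of the SDK; creating the parent directory is an extra step assumed here for illustration.

```python
from pathlib import Path


def prepare_output_path(path: str) -> str:
    """Return `path` as an absolute path and make sure its parent directory exists."""
    resolved = Path(path).expanduser().resolve()
    resolved.parent.mkdir(parents=True, exist_ok=True)
    return str(resolved)


# With the SDK (assumed usage):
#   orca.synthesize_to_file("Hello", prepare_output_path("out/hello.wav"))
```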

pvorca.Orca.stream_open()

def stream_open(
    self,
    speech_rate: Optional[float] = None,
    random_state: Optional[int] = None) -> Orca.OrcaStream

Opens an Orca.OrcaStream object for streaming input text synthesis.

Parameters

  • speech_rate Optional[float] : Speed of speech generated by OrcaStream.synthesize(). For details see the documentation of Orca.synthesize().
  • random_state Optional[int] : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().

Returns

  • Orca.OrcaStream : An instance of Orca.OrcaStream.

Throws

  • OrcaError

pvorca.Orca.WordAlignment

WordAlignment = namedtuple('Word', ['word', 'start_sec', 'end_sec', 'phonemes'])

Metadata representing the alignment of a word in the synthesized audio.

  • word str : Synthesized word.
  • start_sec float : Start time of the word in seconds.
  • end_sec float : End time of the word in seconds.
  • phonemes List[PhonemeAlignment] : List of phoneme alignments for the word.

pvorca.Orca.PhonemeAlignment

PhonemeAlignment = namedtuple('Phoneme', ['phoneme', 'start_sec', 'end_sec'])

Metadata representing the alignment of a phoneme in the synthesized audio.

  • phoneme str : Synthesized phoneme.
  • start_sec float : Start time of the phoneme in seconds.
  • end_sec float : End time of the phoneme in seconds.
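
To show how the nested alignment data might be consumed, the sketch below formats each word with its time span and phonemes. The two namedtuples are local stand-ins that copy the field names documented above so the example runs without the SDK; format_alignments and the sample timings are hypothetical.

```python
from collections import namedtuple
from typing import List, Sequence

# Stand-ins mirroring the documented field names; the SDK provides its own types.
PhonemeAlignment = namedtuple("Phoneme", ["phoneme", "start_sec", "end_sec"])
WordAlignment = namedtuple("Word", ["word", "start_sec", "end_sec", "phonemes"])


def format_alignments(alignments: Sequence[WordAlignment]) -> List[str]:
    """Render one line per word: time span, the word, and its phonemes."""
    lines = []
    for word in alignments:
        phonemes = " ".join(p.phoneme for p in word.phonemes)
        lines.append(
            f"{word.start_sec:.2f}-{word.end_sec:.2f}s {word.word} [{phonemes}]")
    return lines


# Made-up timings for illustration only.
example = [
    WordAlignment("hi", 0.0, 0.25, [
        PhonemeAlignment("HH", 0.0, 0.1),
        PhonemeAlignment("AY", 0.1, 0.25),
    ]),
]
print("\n".join(format_alignments(example)))
```

The same loop structure applies to the alignments returned by Orca.synthesize() and Orca.synthesize_to_file().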

pvorca.Orca.OrcaStream

class Orca.OrcaStream(object)

Class for handling streaming synthesis of input text. An Orca.OrcaStream object is created via the Orca.stream_open() method and must be closed with the Orca.OrcaStream.close() method.


pvorca.Orca.OrcaStream.synthesize()

def synthesize(
    self,
    text: str) -> Optional[Sequence[int]]

Adds a chunk of text to the Orca.OrcaStream object and generates audio if enough text has been added. This function is expected to be called multiple times with consecutive chunks of text from a text stream. The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio. The caller needs to use Orca.OrcaStream.flush() to generate the audio chunk for the remaining text that has not yet been synthesized.

Parameters

  • text str : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.synthesize().

Returns

  • Optional[Sequence[int]] : The generated audio as a sequence of 16-bit linearly-encoded integers, None if no audio chunk has been produced.

Throws

  • OrcaError

pvorca.Orca.OrcaStream.flush()

def flush(self) -> Optional[Sequence[int]]

Generates audio for all the buffered text that was added to the Orca.OrcaStream object via Orca.OrcaStream.synthesize().

Returns

  • Optional[Sequence[int]] : The generated audio as a sequence of 16-bit linearly-encoded integers, None if no audio chunk has been produced.

Throws

  • OrcaError

pvorca.Orca.OrcaStream.close()

def close(self)

Closes the Orca.OrcaStream object and releases resources acquired by it.
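
The synthesize(), flush(), and close() calls are typically combined into one loop. The generator below is a minimal sketch of that pattern, written against any object exposing those three methods; stream_audio is a hypothetical helper, and closing the stream inside the generator is a design choice made for this example.

```python
from typing import Iterable, Iterator, Sequence


def stream_audio(stream, text_chunks: Iterable[str]) -> Iterator[Sequence[int]]:
    """Feed text chunks to an OrcaStream-like object and yield audio as it is produced."""
    try:
        for chunk in text_chunks:
            pcm = stream.synthesize(chunk)
            if pcm is not None:  # None means more text is needed before audio is ready
                yield pcm
        pcm = stream.flush()  # synthesize whatever text is still buffered
        if pcm is not None:
            yield pcm
    finally:
        stream.close()


# With the SDK (assumed usage):
#   for pcm in stream_audio(orca.stream_open(), llm_tokens):
#       play(pcm)  # `llm_tokens` and `play` are placeholders for application code
```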


pvorca.OrcaError

class OrcaError(Exception)

Base exception raised when an error occurs within the Orca Text-to-Speech engine. The exceptions listed below all derive from it.

Exceptions

class OrcaActivationError(OrcaError)
class OrcaActivationLimitError(OrcaError)
class OrcaActivationRefusedError(OrcaError)
class OrcaActivationThrottledError(OrcaError)
class OrcaIOError(OrcaError)
class OrcaInvalidArgumentError(OrcaError)
class OrcaInvalidStateError(OrcaError)
class OrcaKeyError(OrcaError)
class OrcaMemoryError(OrcaError)
class OrcaRuntimeError(OrcaError)
class OrcaStopIterationError(OrcaError)
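
Because every exception above derives from OrcaError, handlers can test for specific subclasses first and fall back to the base class. The sketch below assumes the listed classes are importable from the pvorca package; if the import fails, it defines local stand-ins with the same names so the example still runs. The helper classify_error and its category strings are hypothetical.

```python
try:
    from pvorca import OrcaActivationLimitError, OrcaError, OrcaInvalidArgumentError
except ImportError:
    # Stand-ins with the documented names and hierarchy, used only without the SDK.
    class OrcaError(Exception):
        pass

    class OrcaActivationLimitError(OrcaError):
        pass

    class OrcaInvalidArgumentError(OrcaError):
        pass


def classify_error(error: Exception) -> str:
    """Map an exception to a coarse category, checking subclasses before the base."""
    if isinstance(error, OrcaActivationLimitError):
        return "activation-limit"
    if isinstance(error, OrcaInvalidArgumentError):
        return "invalid-argument"
    if isinstance(error, OrcaError):
        return "orca"
    return "unexpected"


print(classify_error(OrcaInvalidArgumentError("bad input")))
```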

© 2019-2025 Picovoice Inc.