Orca Streaming Text-to-Speech Node.js API
API Reference for the Orca Node.js SDK (npm)
Orca
Class for the Orca Streaming Text-to-Speech engine.
Orca can be initialized using the class constructor().
When you are done, release the resources with the release() method.
Orca.constructor()
Orca constructor.
Parameters
- accessKey string : AccessKey obtained from Picovoice Console.
- options OrcaOptions : Optional configuration arguments:
  - modelPath string : Path to the file containing model parameters (.pv).
  - libraryPath string : Path to the Orca dynamic library (.node).
Returns
- Orca: An instance of the Orca engine.
Orca.release()
Releases resources acquired by Orca.
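The construct, use, release lifecycle described above can be wrapped so that release() runs even when an error is thrown. The following is a minimal sketch, not part of the SDK: withEngine and the Releasable interface are hypothetical names, and fakeEngine is a stand-in object used only so the example runs without an AccessKey. A real caller would construct Orca inside the create callback.

```typescript
// Structural type: anything exposing release(), as the Orca class documents.
interface Releasable {
  release(): void;
}

// Hypothetical helper (not an SDK function): runs `use`, then always calls release().
function withEngine<E extends Releasable, R>(create: () => E, use: (engine: E) => R): R {
  const engine = create();
  try {
    return use(engine);
  } finally {
    engine.release();
  }
}

// Stand-in engine for illustration only; it records calls instead of doing synthesis.
const calls: string[] = [];
const fakeEngine = {
  version: "stub",
  release: (): void => {
    calls.push("release");
  },
};

const reportedVersion = withEngine(() => fakeEngine, (engine) => engine.version);
```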
Orca.version
Getter for version.
Returns
- string: Current Orca version.
Orca.validCharacters
Getter for the valid characters accepted as input to the synthesize functions.
Returns
- string[]: Valid characters accepted as input to the synthesize functions.
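Input text can be checked against this list before synthesis. The sketch below assumes the list is a plain array of single-character strings, as the return type suggests. findInvalidCharacters is a hypothetical helper, and sampleValid is a made-up list used only for illustration; the real list comes from Orca.validCharacters.

```typescript
// Returns the distinct characters of `text` that are absent from `validCharacters`,
// in order of first appearance.
function findInvalidCharacters(text: string, validCharacters: string[]): string[] {
  const allowed = new Set(validCharacters);
  const invalid = new Set<string>();
  for (const ch of Array.from(text)) {
    if (!allowed.has(ch)) invalid.add(ch);
  }
  return Array.from(invalid);
}

// Made-up character list for illustration (lowercase letters and space).
const sampleValid = "abcdefghijklmnopqrstuvwxyz ".split("");
const rejected = findInvalidCharacters("hello #1 #2", sampleValid);
```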
Orca.sampleRate
Getter for the audio sample rate of the synthesized speech.
Returns
- number: Audio sample rate of the synthesized speech.
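The sample rate converts a sample count into a duration. A minimal sketch, assuming the synthesized audio is single-channel (one sample per frame); pcmDurationSec is a hypothetical helper and 22050 is only an example value, not necessarily what Orca.sampleRate returns.

```typescript
// Duration in seconds of single-channel PCM audio at the given sample rate.
function pcmDurationSec(pcm: Int16Array, sampleRate: number): number {
  if (!(sampleRate > 0)) throw new RangeError("sampleRate must be positive");
  return pcm.length / sampleRate;
}

// 11025 samples at an example rate of 22050 Hz.
const exampleDuration = pcmDurationSec(new Int16Array(11025), 22050);
```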
Orca.maxCharacterLimit
Getter for the maximum number of characters allowed in a single synthesis request.
Returns
- number: Maximum number of characters allowed in a single synthesis request.
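Text longer than this limit has to be split across several synthesis requests. The sketch below shows one possible approach: a hypothetical splitByLimit helper that breaks on whitespace and hard-splits any single word longer than the limit. It measures length with JavaScript string length (UTF-16 code units), which is an assumption about how the limit is counted.

```typescript
// Splits `text` into whitespace-delimited chunks of at most `maxChars` characters.
function splitByLimit(text: string, maxChars: number): string[] {
  if (!Number.isInteger(maxChars) || maxChars <= 0) {
    throw new RangeError("maxChars must be a positive integer");
  }
  const chunks: string[] = [];
  let current = "";
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  for (const word of words) {
    let rest = word;
    // A single word longer than the limit is cut into limit-sized pieces.
    while (rest.length > maxChars) {
      if (current.length > 0) {
        chunks.push(current);
        current = "";
      }
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    const candidate = current.length === 0 ? rest : current + " " + rest;
    if (candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = rest;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

const pieces = splitByLimit("the quick brown fox jumps", 10);
```

Because it splits on every whitespace character, this naive splitter would break custom pronunciation markup that contains spaces, so such text needs separate handling.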
Orca.synthesize()
Generates audio from text. The returned audio contains the speech representation of the text.
If you wish to save the synthesized speech to a file, consider using Orca.synthesizeToFile().
Parameters
- text string : Text to be converted to audio. The maximum number of characters per call is Orca.maxCharacterLimit. Allowed characters can be retrieved from Orca.validCharacters. Custom pronunciations can be embedded in the text via the syntax "{word|pronunciation}". The pronunciation is expressed in ARPAbet phonemes, for example: "{read|R IY D} this as {read|R EH D}".
- synthesizeParams OrcaSynthesizeParams : Optional configuration arguments:
  - speechRate number : Speed of generated speech. Valid values are within [0.7, 1.3]. Higher (lower) values produce faster (slower) speech. The default is 1.0.
  - randomState number : Random seed for the synthesis process. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
Returns
- OrcaSynthesizeResult: An object containing the generated audio as a sequence of 16-bit linearly-encoded integers and an array of OrcaAlignment objects representing the word alignments.
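The custom pronunciation syntax described for the text parameter can be assembled programmatically. A minimal sketch: withPronunciation is a hypothetical helper, not an SDK function, and it does not check that the phonemes are valid ARPAbet symbols.

```typescript
// Builds the "{word|pronunciation}" markup from a word and its phoneme symbols.
function withPronunciation(word: string, phonemes: string[]): string {
  if (word.length === 0 || phonemes.length === 0) {
    throw new Error("word and phonemes must be non-empty");
  }
  return "{" + word + "|" + phonemes.join(" ") + "}";
}

// Reproduces the example from the parameter description above.
const markedUp =
  withPronunciation("read", ["R", "IY", "D"]) +
  " this as " +
  withPronunciation("read", ["R", "EH", "D"]);
```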
Orca.synthesizeToFile()
Generates audio from text and saves it to a WAV file. The file contains the speech representation of the text.
Parameters
- text string : Text to be converted to audio. For details see the documentation of Orca.synthesize().
- outputPath string : Absolute path to save the generated audio as a single-channel 16-bit PCM WAV file.
- synthesizeParams OrcaSynthesizeParams : Optional configuration arguments:
  - speechRate number : Speed of generated speech. For details see the documentation of Orca.synthesize().
  - randomState number : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().
Returns
- OrcaSynthesizeToFileResult: An array of OrcaAlignment objects representing the word alignments.
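For callers who obtain PCM from Orca.synthesize() and want the same single-channel 16-bit PCM WAV layout in memory, the sketch below builds the standard 44-byte WAV header followed by little-endian samples. encodeWav is a hypothetical helper; it illustrates the file format only and makes no claim about how Orca.synthesizeToFile() is implemented. The sample rate passed in the example is arbitrary.

```typescript
// Encodes single-channel 16-bit PCM samples as a WAV byte array.
function encodeWav(pcm: Int16Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = pcm.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // remaining file size after this field
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, 1, true); // number of channels
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < pcm.length; i++) {
    view.setInt16(44 + i * bytesPerSample, pcm[i], true);
  }
  return new Uint8Array(buffer);
}

const wavBytes = encodeWav(new Int16Array([0, 1, -1]), 22050);
```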
Orca.streamOpen()
Opens an OrcaStream object for streaming input text synthesis.
Parameters
- synthesizeParams OrcaSynthesizeParams : Optional configuration arguments:
  - speechRate number : Speed of generated speech. For details see the documentation of Orca.synthesize().
  - randomState number : Random seed for the synthesis process. For details see the documentation of Orca.synthesize().
Returns
- OrcaStream: An instance of OrcaStream.
OrcaStream
Class for handling input text streaming synthesis.
An OrcaStream object is created via the Orca.streamOpen() method and needs to be closed with the OrcaStream.close() method.
OrcaStream.synthesize()
Adds a chunk of text to the OrcaStream object and generates audio if enough text has been added.
This function is expected to be called multiple times with consecutive chunks of text from a text stream.
The incoming text is buffered as it arrives until there is enough context to convert a chunk of the buffered text into audio.
The caller needs to use OrcaStream.flush() to generate the audio chunk for the remaining text that has not yet been synthesized.
Parameters
- text string : A chunk of text (e.g. an LLM token) from a text input stream, comprised of valid characters. For details see the documentation of Orca.synthesize().
Returns
- OrcaStreamSynthesizeResult: The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
OrcaStream.flush()
Generates audio for all buffered text that was added to the OrcaStream object via OrcaStream.synthesize().
Returns
- OrcaStreamSynthesizeResult: The generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
OrcaStream.close()
Closes the OrcaStream object and releases resources acquired by it.
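The three OrcaStream methods are typically used together: synthesize() for each incoming chunk, flush() once the text ends, and close() at the end. The following is a minimal sketch of that loop. OrcaStreamLike is a local structural type that mirrors the documented method shapes, streamText is a hypothetical helper, and makeFakeStream is a stand-in that emits one silent sample per buffered character so the example runs without the engine. A real caller would pass the object returned by Orca.streamOpen().

```typescript
// Local structural type mirroring the documented OrcaStream methods (an assumption).
interface OrcaStreamLike {
  synthesize(text: string): Int16Array | null;
  flush(): Int16Array | null;
  close(): void;
}

// Feeds text chunks to the stream, forwards any audio, flushes, and always closes.
function streamText(
  stream: OrcaStreamLike,
  textChunks: readonly string[],
  onAudio: (pcm: Int16Array) => void
): void {
  try {
    for (const chunk of textChunks) {
      const pcm = stream.synthesize(chunk);
      if (pcm !== null) onAudio(pcm); // null means more context is needed
    }
    const remaining = stream.flush();
    if (remaining !== null) onAudio(remaining);
  } finally {
    stream.close();
  }
}

// Stand-in stream: buffers text and "synthesizes" once 8 or more characters are buffered.
function makeFakeStream(log: string[]): OrcaStreamLike {
  let buffered = "";
  const drain = (): Int16Array | null => {
    if (buffered.length === 0) return null;
    const pcm = new Int16Array(buffered.length);
    buffered = "";
    return pcm;
  };
  return {
    synthesize: (text: string) => {
      buffered += text;
      return buffered.length >= 8 ? drain() : null;
    },
    flush: () => drain(),
    close: () => {
      log.push("close");
    },
  };
}

const streamLog: string[] = [];
const chunkSizes: number[] = [];
streamText(makeFakeStream(streamLog), ["Hel", "lo ", "wor", "ld!"], (pcm) => {
  chunkSizes.push(pcm.length);
});
```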
OrcaOptions
Orca options type.
- modelPath string : Path to the file containing model parameters (.pv).
- libraryPath string : Path to the Orca dynamic library (.node).
OrcaSynthesizeParams
Orca synthesize params type.
- speechRate number : Optional configuration to control the speed of the generated speech. Valid values are within [0.7, 1.3]. A higher value produces speech that is faster, and a lower value produces speech that is slower. The default is 1.0.
- randomState number : Optional configuration to set the random state for sampling during synthesis. This can be used to ensure that the synthesized speech is deterministic across different runs. Valid values are all non-negative integers. If not provided, a random seed will be chosen and the synthesis process will be non-deterministic.
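The documented ranges can be checked before a synthesis call. The sketch below is a hypothetical pre-flight validator, not SDK behavior: SynthesizeParamsLike is a local type mirroring the two documented fields, and how the engine itself reports out-of-range values is not covered here.

```typescript
// Local type mirroring the two documented optional fields (an assumption).
interface SynthesizeParamsLike {
  speechRate?: number;
  randomState?: number;
}

// Returns a list of problems; an empty list means the values are within the documented ranges.
function validateSynthesizeParams(params: SynthesizeParamsLike): string[] {
  const problems: string[] = [];
  const { speechRate, randomState } = params;
  if (speechRate !== undefined && !(speechRate >= 0.7 && speechRate <= 1.3)) {
    problems.push("speechRate must be within [0.7, 1.3]");
  }
  if (randomState !== undefined && !(Number.isInteger(randomState) && randomState >= 0)) {
    problems.push("randomState must be a non-negative integer");
  }
  return problems;
}

const okParams = validateSynthesizeParams({ speechRate: 1.0, randomState: 42 });
const badParams = validateSynthesizeParams({ speechRate: 1.5, randomState: -1 });
```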
OrcaAlignment
Orca word alignment type.
- word string : Synthesized word.
- startSec number : Start time of the word in seconds.
- endSec number : End time of the word in seconds.
- phonemes OrcaPhoneme[] : Orca phonemes.
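Word alignments map playback time to text, for example to highlight the word currently being spoken. A minimal sketch, using only the word, startSec, and endSec fields listed above: WordAlignment is a local type, wordAt is a hypothetical helper, and the sample timings are made up for illustration.

```typescript
// Local type with the subset of documented OrcaAlignment fields used here.
interface WordAlignment {
  word: string;
  startSec: number;
  endSec: number;
}

// Returns the word being spoken at `timeSec`, or null during silence or out of range.
function wordAt(alignments: readonly WordAlignment[], timeSec: number): string | null {
  for (const a of alignments) {
    if (timeSec >= a.startSec && timeSec < a.endSec) return a.word;
  }
  return null;
}

// Made-up alignment data for illustration.
const sampleAlignments: WordAlignment[] = [
  { word: "hello", startSec: 0.0, endSec: 0.4 },
  { word: "world", startSec: 0.5, endSec: 0.9 },
];
const spokenAtHalfSecond = wordAt(sampleAlignments, 0.5);
```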
OrcaPhoneme
Orca phoneme alignment type.
- phoneme string : Synthesized phoneme.
- startSec number : Start time of the phoneme in seconds.
- endSec number : End time of the phoneme in seconds.
OrcaSynthesizeResult
Orca synthesize result type.
- pcm Int16Array : The output audio, represented as a 16-bit linearly-encoded integer array.
- alignments OrcaAlignment[] : Orca alignments.
OrcaSynthesizeToFileResult
Orca synthesize to file result type.
- OrcaAlignment[] : An array of OrcaAlignment objects representing the word alignments.
OrcaStreamSynthesizeResult
OrcaStream synthesize result type.
This value will be either the generated audio as a sequence of 16-bit linearly-encoded integers, or null if no audio chunk has been produced.
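Because each streaming result may be null, code that collects audio needs to skip the empty results. The sketch below joins a sequence of such results into one buffer. concatPcm is a hypothetical helper, and the result type is modeled locally as Int16Array | null, following the description above.

```typescript
// Concatenates the non-null audio chunks into a single Int16Array, preserving order.
function concatPcm(chunks: readonly (Int16Array | null)[]): Int16Array {
  let total = 0;
  for (const chunk of chunks) {
    if (chunk !== null) total += chunk.length;
  }
  const out = new Int16Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    if (chunk === null) continue;
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

const joined = concatPcm([null, new Int16Array([1, 2]), null, new Int16Array([3])]);
```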
Errors
Exceptions thrown if an error occurs within the Orca Text-to-Speech engine.
Exceptions: