Leopard Speech-to-Text
iOS API

API Reference for the Leopard iOS SDK (CocoaPods).

Leopard

public class Leopard { }

Class for the Leopard Speech-to-Text engine. When you are done with an instance, release its resources by calling delete().

Leopard.`init()`

Initializer for the Leopard Speech-to-Text engine. Two overloads are available: one takes the model location as a path (modelPath), the other as a URL (modelURL).

public init(accessKey: String, modelPath: String, enableAutomaticPunctuation: Bool = false, enableDiarization: Bool = false) throws

Parameters

accessKey String : The AccessKey obtained from Picovoice Console.
modelPath String : Absolute path to file containing model parameters (.pv).
enableAutomaticPunctuation Bool : Set to true to enable automatic punctuation insertion.
enableDiarization Bool : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speakerTag to identify unique speakers.

Throws

LeopardError: If an error occurs while creating an instance of Leopard Speech-to-Text engine.

public init(accessKey: String, modelURL: URL, enableAutomaticPunctuation: Bool = false, enableDiarization: Bool = false) throws

Parameters

accessKey String : The AccessKey obtained from Picovoice Console.
modelURL URL : URL to file containing model parameters (.pv).
enableAutomaticPunctuation Bool : Set to true to enable automatic punctuation insertion.
enableDiarization Bool : Set to true to enable speaker diarization, which allows Leopard to differentiate speakers as part of the transcription process. Word metadata will include a speakerTag to identify unique speakers.

Throws

LeopardError: If an error occurs while creating an instance of Leopard Speech-to-Text engine.

Leopard.`delete()`

Releases resources acquired by the Leopard engine.

public func delete()
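A typical lifecycle is to create the engine, transcribe, and release it. The following is a minimal sketch of that pattern, not an official sample. It assumes the SDK module is imported as `Leopard`; the helper name `transcribeFile` and its parameters are hypothetical, and the caller supplies the AccessKey and both paths. The `#if canImport` guard only keeps the snippet compilable where the SDK is not linked.

```swift
import Foundation
#if canImport(Leopard)
import Leopard  // assumption: the CocoaPod exposes a module named Leopard
#endif

/// Hypothetical helper: transcribes one audio file and returns the
/// transcript, or nil if the SDK is unavailable or any step fails.
func transcribeFile(accessKey: String, modelPath: String, audioPath: String) -> String? {
#if canImport(Leopard)
    do {
        let leopard = try Leopard(accessKey: accessKey, modelPath: modelPath)
        // Release engine resources on every exit path, including thrown errors.
        defer { leopard.delete() }
        let result = try leopard.processFile(audioPath: audioPath)
        return result.transcript
    } catch {
        print("Leopard failed: \(error.localizedDescription)")
        return nil
    }
#else
    // The SDK is not linked into this build, so there is nothing to run.
    print("Leopard SDK is not available in this build")
    return nil
#endif
}
```

Placing delete() in a defer block means the engine is released whether processFile() returns or throws.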

Leopard.`process()`

Processes given audio data with the Leopard Speech-to-Text engine.

public func process(pcm: [Int16]) throws -> (transcript: String, words: [LeopardWord])

Parameters

pcm [Int16] : Audio samples to transcribe. The audio must be single-channel, 16-bit linearly-encoded PCM with a sample rate equal to Leopard.sampleRate.

Returns

String, [LeopardWord] : Inferred transcription and sequence of transcribed words with their associated metadata.

Throws

LeopardError: If there is an error while processing the audio data.
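Audio often arrives as normalized floating-point samples or as interleaved stereo, neither of which matches the pcm requirements above. The sketch below shows one way to prepare such data in plain Swift. The function names `toInt16PCM` and `downmixToMono` are hypothetical helpers, not part of the SDK, and the code does not resample: the input is assumed to already be at Leopard.sampleRate.

```swift
/// Converts normalized Float samples (nominally in [-1.0, 1.0])
/// to 16-bit linear PCM. Hypothetical helper, not part of the SDK.
func toInt16PCM(_ samples: [Float]) -> [Int16] {
    return samples.map { (sample: Float) -> Int16 in
        // Clamp first so out-of-range input cannot overflow Int16.
        let clamped = max(-1.0, min(1.0, sample))
        return Int16((clamped * Float(Int16.max)).rounded())
    }
}

/// Averages interleaved stereo samples (L, R, L, R, ...) into one channel.
/// A trailing unpaired sample, if any, is dropped.
func downmixToMono(_ interleaved: [Int16]) -> [Int16] {
    var mono = [Int16]()
    mono.reserveCapacity(interleaved.count / 2)
    var i = 0
    while i + 1 < interleaved.count {
        // Sum in Int32 so the addition cannot overflow.
        let sum = Int32(interleaved[i]) + Int32(interleaved[i + 1])
        mono.append(Int16(sum / 2))
        i += 2
    }
    return mono
}
```

The resulting [Int16] array can then be passed as the pcm argument of process().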

Leopard.`processFile()`

Processes a given audio file with the Leopard Speech-to-Text engine.

public func processFile(audioPath: String) throws -> (transcript: String, words: [LeopardWord])

Parameters

audioPath String : Absolute path to the audio file. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.

Returns

String, [LeopardWord] : Inferred transcription and sequence of transcribed words with their associated metadata.

Throws

LeopardError: If there is an error while processing the audio file.

Leopard.`processFile()`

Processes a given audio file with the Leopard Speech-to-Text engine.

public func processFile(audioURL: URL) throws -> (transcript: String, words: [LeopardWord])

Parameters

audioURL URL : URL of the audio file. The supported formats are: 3gp (AMR), FLAC, MP3, MP4/m4a (AAC), Ogg, WAV and WebM.

Returns

String, [LeopardWord] : Inferred transcription and sequence of transcribed words with their associated metadata.

Throws

LeopardError: If there is an error while processing the audio file.

Leopard.`sampleRate`

public static let sampleRate: UInt32

Audio sample rate accepted by Leopard.

Leopard.`version`

public static let version: String

Current Leopard version.

LeopardError

public class LeopardError : LocalizedError { }

Base error type thrown when a failure occurs within the Leopard Speech-to-Text engine. The following subclasses identify specific failure categories:

public class LeopardMemoryError : LeopardError {}
public class LeopardIOError : LeopardError {}
public class LeopardInvalidArgumentError : LeopardError {}
public class LeopardStopIterationError : LeopardError {}
public class LeopardKeyError : LeopardError {}
public class LeopardInvalidStateError : LeopardError {}
public class LeopardRuntimeError : LeopardError {}
public class LeopardActivationError : LeopardError {}
public class LeopardActivationLimitError : LeopardError {}
public class LeopardActivationThrottledError : LeopardError {}
public class LeopardActivationRefusedError : LeopardError {}
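Because every specific error derives from LeopardError, callers can match the subclasses they care about first and fall back to the base type. The sketch below illustrates that ordering. The function `failureCategory` and its category strings are hypothetical, the module name `Leopard` is an assumption, and the grouping simply follows the class names listed above.

```swift
#if canImport(Leopard)
import Leopard  // assumption: the CocoaPod exposes a module named Leopard
#endif

/// Hypothetical helper: maps an error to a coarse category label.
func failureCategory(_ error: Error) -> String {
#if canImport(Leopard)
    switch error {
    // Specific subclasses must be matched before the LeopardError base class.
    case is LeopardActivationError,
         is LeopardActivationLimitError,
         is LeopardActivationThrottledError,
         is LeopardActivationRefusedError:
        return "activation"
    case is LeopardInvalidArgumentError:
        return "invalid argument"
    case is LeopardIOError:
        return "io"
    case is LeopardError:
        return "leopard"
    default:
        break
    }
#endif
    // Anything that is not a Leopard error (or any error when the SDK is absent).
    return "other"
}
```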

LeopardWord

public struct LeopardWord { }

Struct for storing word metadata returned from the Leopard engine.

LeopardWord.`word`

LeopardWord.word: String

The transcribed word.

LeopardWord.`confidence`

LeopardWord.confidence: Float

Transcription confidence. It is a number within [0, 1].
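One use of the confidence value is to flag words that may need review. The following sketch marks words below a caller-chosen threshold. The `Word` struct is a local stand-in that mirrors two LeopardWord fields so the snippet is self-contained, and `annotate` and the bracket notation are hypothetical.

```swift
/// Local stand-in mirroring the LeopardWord fields used here.
struct Word {
    let word: String
    let confidence: Float
}

/// Joins words into a string, wrapping any word whose confidence is
/// below `threshold` as "[word?]". Hypothetical helper, not part of the SDK.
func annotate(_ words: [Word], threshold: Float) -> String {
    return words
        .map { $0.confidence < threshold ? "[\($0.word)?]" : $0.word }
        .joined(separator: " ")
}
```

A suitable threshold depends on the application and the audio quality, so it is left as a parameter.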

LeopardWord.`startSec`

LeopardWord.startSec: Float

Start of word in seconds.

LeopardWord.`endSec`

LeopardWord.endSec: Float

End of word in seconds.

LeopardWord.`speakerTag`

LeopardWord.speakerTag: Int

Speaker tag is -1 if diarization is not enabled during initialization; otherwise, it's a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers.
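With diarization enabled, consecutive words that share a speakerTag can be merged into speaker turns. The sketch below shows one way to do that. `Word` is a local stand-in mirroring LeopardWord fields so the snippet is self-contained, and `Turn` and `speakerTurns` are hypothetical names.

```swift
/// Local stand-in mirroring the LeopardWord fields used here.
struct Word {
    let word: String
    let startSec: Float
    let endSec: Float
    let speakerTag: Int
}

/// A run of consecutive words attributed to the same speaker tag.
struct Turn {
    let speakerTag: Int
    var text: String
    let startSec: Float
    var endSec: Float
}

/// Groups consecutive words with the same speakerTag into turns.
func speakerTurns(_ words: [Word]) -> [Turn] {
    var turns = [Turn]()
    for w in words {
        if var last = turns.last, last.speakerTag == w.speakerTag {
            // Same speaker as the previous word: extend the current turn.
            last.text += " " + w.word
            last.endSec = w.endSec
            turns[turns.count - 1] = last
        } else {
            // Speaker changed (or first word): start a new turn.
            turns.append(Turn(speakerTag: w.speakerTag, text: w.word,
                              startSec: w.startSec, endSec: w.endSec))
        }
    }
    return turns
}
```

If diarization is disabled, every word carries the tag -1, so the whole transcript collapses into a single turn.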
