Leopard Speech-to-Text
iOS Quick Start

Platforms

iOS (13.0+)

Requirements

Picovoice Account & AccessKey

Sign up or log in to Picovoice Console to get your AccessKey. Keep your AccessKey secret.

Quick Start

Setup

Install Xcode.
Import the Leopard-iOS package into your project.

To import the package using Swift Package Manager (SPM), open your project's Package Dependencies in Xcode and add:

https://github.com/Picovoice/leopard.git

To import it into your iOS project using CocoaPods, add the following line to your Podfile:

pod 'Leopard-iOS'

Then, run the following from the project directory:

pod install

Add the following to the app's Info.plist file to enable recording with an iOS device's microphone:

<key>NSMicrophoneUsageDescription</key>
<string>[Permission explanation]</string>

Model File

Add the Leopard Speech-to-Text model file in Xcode:

Create a model in Picovoice Console or use a default language model.
Add the model as a bundled resource by selecting Build Phases and adding it to the Copy Bundle Resources step.

Usage

Create an instance of Leopard Speech-to-Text:

import Leopard

let accessKey = "${ACCESS_KEY}"
let modelPath = Bundle(for: type(of: self)).path(
        forResource: "${LEOPARD_MODEL_FILE}",
        ofType: "pv")!

do {
    // The initializer can throw, so call it with `try`.
    let leopard = try Leopard(accessKey: accessKey, modelPath: modelPath)
} catch {
    // handle initialization error
}

Alternatively, you can provide modelPath as an absolute path to the model file on device.
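As a minimal sketch of the absolute-path option, the helper below builds a path from a directory and a file name and returns it only if the file exists. The function name `absoluteModelPath` and its parameters are hypothetical and not part of the Leopard SDK; only Foundation calls are used.

```swift
import Foundation

// Hypothetical helper (not part of the Leopard SDK): returns the absolute path
// of `fileName` inside `directory`, or nil when no file exists at that location.
func absoluteModelPath(fileName: String, in directory: URL) -> String? {
    let path = directory.appendingPathComponent(fileName).path
    return FileManager.default.fileExists(atPath: path) ? path : nil
}
```

On a device, `directory` could be, for example, the app's Documents directory obtained from `FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)`, assuming the model file was copied or downloaded there. The returned string can then be passed as modelPath.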

Transcribe an audio file by passing either its absolute path or a URL to the file:

do {
    // path(forResource:ofType:) returns an optional, so unwrap it before use.
    let audioPath = Bundle(for: type(of: self)).path(
      forResource: "${AUDIO_FILE_NAME}",
      ofType: "${AUDIO_FILE_EXTENSION}")!
    // processFile can throw, so call it with `try`.
    let result = try leopard.processFile(audioPath)
    print(result.transcript)
} catch let error as LeopardError {
    // handle error
} catch { }

Release resources explicitly when done with Leopard Speech-to-Text:

leopard.delete()
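One way to make sure resources are released even when transcription throws is to wrap the work in a function that calls delete() in a `defer` block. The sketch below is illustrative only: the `Transcribing` protocol and the `withEngine` function are hypothetical names introduced here, not part of the Leopard SDK. Conforming the real engine to the protocol would need a small extension, which is left out.

```swift
// Hypothetical protocol (not part of the Leopard SDK) capturing only the
// two operations this pattern needs: transcribe a file and release resources.
protocol Transcribing {
    func transcribe(path: String) throws -> String
    func delete()
}

// Runs `body` with the engine and always calls delete() afterwards,
// whether `body` returns normally or throws.
func withEngine<E: Transcribing, T>(_ engine: E, _ body: (E) throws -> T) rethrows -> T {
    defer { engine.delete() }
    return try body(engine)
}
```

With this shape, callers cannot forget the explicit release step, at the cost of creating a new engine for each unit of work.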

Word Metadata

Along with the transcript, Leopard Speech-to-Text returns metadata for each transcribed word. Available metadata items are:

Start Time: Indicates when the word started in the transcribed audio. Value is in seconds.
End Time: Indicates when the word ended in the transcribed audio. Value is in seconds.
Confidence: Leopard Speech-to-Text's confidence that the transcribed word is accurate. It is a number within [0, 1].
Speaker Tag: If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with 0 reserved for unknown speakers. If speaker diarization is not enabled, the value will always be -1.
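The metadata items above can be used, for example, to flag uncertain words or to split a transcript by speaker. The sketch below uses a stand-in struct, `WordMetadata`, whose name and field names are assumptions made for illustration; check the Leopard-iOS API docs for the actual type the SDK returns.

```swift
// Hypothetical stand-in for the SDK's per-word metadata; field names are assumptions.
struct WordMetadata {
    let word: String
    let startSec: Float   // start time in seconds
    let endSec: Float     // end time in seconds
    let confidence: Float // value within [0, 1]
    let speakerTag: Int   // -1 when diarization is disabled, 0 for unknown speakers
}

// Returns the words whose confidence falls below `threshold`.
func lowConfidenceWords(_ words: [WordMetadata], threshold: Float) -> [WordMetadata] {
    return words.filter { $0.confidence < threshold }
}

// Joins words into one string per speaker tag, ordered by each tag's first appearance.
func transcriptBySpeaker(_ words: [WordMetadata]) -> [(speakerTag: Int, text: String)] {
    var order: [Int] = []
    var buckets: [Int: [String]] = [:]
    for item in words {
        if buckets[item.speakerTag] == nil {
            order.append(item.speakerTag)
        }
        buckets[item.speakerTag, default: []].append(item.word)
    }
    return order.map { tag in
        (speakerTag: tag, text: buckets[tag, default: []].joined(separator: " "))
    }
}
```

Grouping by tag only makes sense when speaker diarization is enabled; otherwise every word carries the tag -1 and ends up in a single group.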

Demo

For the Leopard Speech-to-Text iOS SDK, we offer demo applications that demonstrate how to use the Speech-to-Text engine on audio recordings.

Setup

Clone the Leopard Speech-to-Text repository from GitHub using HTTPS:

git clone --recurse-submodules https://github.com/Picovoice/leopard.git

Usage

Replace "${YOUR_ACCESS_KEY_HERE}" in the file ViewController.swift with a valid AccessKey.
Open LeopardDemo.xcodeproj in Xcode.
Go to Product > Scheme and select the scheme for the language you would like to demo (e.g. esDemo -> Spanish Demo, deDemo -> German Demo).
Run the demo with a simulator or connected iOS device.

Resources

Package

Leopard-iOS on CocoaPods

API

Leopard-iOS API Docs

GitHub

Benchmark

Speech-to-Text Benchmark
