picoLLM Inference Engine
iOS API

API Reference for the picoLLM iOS SDK (Cocoapod)


PicoLLM

public class PicoLLM { }

Class for the picoLLM Inference Engine.

Resources should be cleaned up when you are done by calling the delete() function.


PicoLLM.model

public let model: String

Getter for the model's name.

Returns

  • String : Model name.

PicoLLM.contextLength

public let contextLength: Int32

Getter for the model's context length.

Returns

  • Int32 : Context length.

PicoLLM.version

public static let version: String

Current picoLLM version.


PicoLLM.maxTopChoices

public static let maxTopChoices: Int32

Maximum number of top choices for .generate().


PicoLLM.init()

public init(
    accessKey: String,
    modelPath: String,
    device: String = "best:0"
) throws

Initializer for the picoLLM Inference Engine.

Parameters

  • accessKey String : The AccessKey obtained from Picovoice Console.
  • modelPath String : Absolute path to file containing model parameters (.pllm).
  • device String : String representation of the device (e.g., CPU or GPU) to use for inference. If set to best, picoLLM picks the most suitable device. If set to gpu, the engine uses the first available GPU device. To select a specific GPU device, set this argument to gpu:${GPU_INDEX}, where ${GPU_INDEX} is the index of the target GPU. If set to cpu, the engine will run on the CPU with the default number of threads. To specify the number of threads, set this argument to cpu:${NUM_THREADS}, where ${NUM_THREADS} is the desired number of threads.

Throws

  • PicoLLMError: If an error occurs while creating an instance of picoLLM Inference Engine.
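
For reference, a minimal usage sketch is shown below. The AccessKey string and model path are placeholders, not values from this document.

import PicoLLM

do {
    // accessKey and modelPath are placeholders; supply your own values.
    let picollm = try PicoLLM(
        accessKey: "${ACCESS_KEY}",
        modelPath: "${MODEL_FILE_PATH}")

    // ... use the engine ...

    picollm.delete()
} catch {
    print("Failed to initialize picoLLM: \(error)")
}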

PicoLLM.delete()

public func delete()

Releases resources acquired by the picoLLM Inference Engine.

PicoLLM.generate()

public func generate(
    prompt: String,
    completionTokenLimit: Int32? = nil,
    stopPhrases: [String]? = nil,
    seed: Int32? = nil,
    presencePenalty: Float = 0.0,
    frequencyPenalty: Float = 0.0,
    temperature: Float = 0.0,
    topP: Float = 1.0,
    numTopChoices: Int32 = 0,
    streamCallback: ((String) -> Void)? = nil
) throws -> PicoLLMCompletion

Given a text prompt and a set of generation parameters, creates a completion text and relevant metadata.

Parameters

  • prompt String : Text prompt.
  • completionTokenLimit Int32? : Maximum number of tokens in the completion. If the generation process stops due to reaching this limit, the endpoint output argument will be PicoLLMEndpoint.completionTokenLimitReached. Set to nil to impose no limit.
  • stopPhrases [String]? : The generation process stops when it encounters any of these phrases in the completion. The already generated completion, including the encountered stop phrase, will be returned. The endpoint output argument will be PicoLLMEndpoint.stopPhraseEncountered. Set to nil to turn off this feature.
  • seed Int32? : If set to a positive integer value, the internal random number generator uses it as its seed. Seeding enforces deterministic outputs. Set to nil for randomized outputs for a given prompt.
  • presencePenalty Float : If set to a positive value, it penalizes logits already appearing in the partial completion. If set to 0.0, it has no effect.
  • frequencyPenalty Float : If set to a positive floating-point value, it penalizes logits proportional to the frequency of their appearance in the partial completion. If set to 0.0, it has no effect.
  • temperature Float : Sampling temperature. Temperature is a non-negative floating-point value that controls the randomness of the sampler. A higher temperature smooths the sampler's output, increasing the randomness. In contrast, a lower temperature creates a narrower distribution and reduces variability. Setting it to 0 selects the maximum logit during sampling.
  • topP Float : A positive floating-point number within (0, 1]. It restricts the sampler's choices to high-probability logits that form the topP portion of the probability mass. Hence, it avoids randomly selecting unlikely logits. A value of 1.0 enables the sampler to pick any token with non-zero probability, turning off the feature.
  • numTopChoices Int32 : If set to a positive value, picoLLM returns the list of the highest-probability tokens for any generated token. Set to 0 to turn off the feature. The maximum number of top choices is maxTopChoices.
  • streamCallback ((String) -> Void)? : If not set to nil, picoLLM executes this callback every time a new piece of the completion string becomes available.

Returns

  • PicoLLMCompletion : Object containing stats and generated tokens.

Throws

  • PicoLLMError: If there is an error while generating the completion.
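
A minimal sketch of streaming generation, assuming a picollm instance created as shown in the .init() example above; the prompt text and token limit are illustrative.

do {
    let completion = try picollm.generate(
        prompt: "How do I make a smart home?",
        completionTokenLimit: 128,
        streamCallback: { piece in
            // Invoked each time a new piece of the completion is available.
            print(piece, terminator: "")
        })
    print("\nEndpoint: \(completion.endpoint)")
} catch {
    print("Generation failed: \(error)")
}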

PicoLLM.interrupt()

public func interrupt() throws

Interrupts .generate() if generation is in progress. Otherwise, it has no effect.

Throws

  • PicoLLMError: If interrupt fails.
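
A sketch of interrupting an in-flight generation, assuming a picollm instance; running .generate() off the main thread (here via DispatchQueue from Foundation) lets interrupt() be called from elsewhere, e.g., a "Stop" button handler.

import Foundation

// Run generation on a background queue so it does not block the caller.
DispatchQueue.global().async {
    let completion = try? picollm.generate(prompt: "Write a long story.")
    print(completion?.completion ?? "")
}

// Later, from another thread:
try? picollm.interrupt()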

PicoLLM.tokenize()

public func tokenize(
    text: String,
    bos: Bool,
    eos: Bool
) throws -> [Int32]

Tokenizes a given text using the model's tokenizer. This is a low-level function meant for benchmarking and advanced usage. .generate() should be used when possible.

Parameters

  • text String : Text.
  • bos Bool : If set to true, the tokenizer prepends the beginning-of-sentence token to the result.
  • eos Bool : If set to true, the tokenizer appends the end-of-sentence token to the result.

Returns

  • [Int32] : Tokens representing the input text.

Throws

  • PicoLLMError: If there is an error while tokenizing.
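
A short sketch, assuming a picollm instance; the input text is illustrative.

do {
    // Prepend the beginning-of-sentence token; no end-of-sentence token.
    let tokens = try picollm.tokenize(text: "Hello, world!", bos: true, eos: false)
    print("Token count: \(tokens.count)")
} catch {
    print("Tokenization failed: \(error)")
}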

PicoLLM.forward()

public func forward(token: Int32) throws -> [Float]

Performs a single forward pass given a token and returns the logits. This is a low-level function for benchmarking and advanced usage; .generate() should be used when possible.

Parameters

  • token Int32 : Input token.

Returns

  • [Float] : Logits.

Throws

  • PicoLLMError: If there is an error while executing a forward.

PicoLLM.reset()

public func reset() throws

Resets the internal state of the LLM. It should be called in conjunction with .forward() when processing a new sequence of tokens. This is a low-level function for benchmarking and advanced usage; .generate() should be used when possible.

Throws

  • PicoLLMError: If there is an error while resetting.
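
A low-level sketch combining .reset(), .tokenize(), and .forward(), assuming a picollm instance; it feeds a short token sequence manually and inspects the final logits.

do {
    // Start a fresh sequence before feeding tokens manually.
    try picollm.reset()
    let tokens = try picollm.tokenize(text: "Hello", bos: true, eos: false)
    var logits: [Float] = []
    for token in tokens {
        // Each call returns logits over the vocabulary for the next token.
        logits = try picollm.forward(token: token)
    }
    print("Logit count: \(logits.count)")
} catch {
    print("Forward pass failed: \(error)")
}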

PicoLLM.getAvailableDevices()

public static func getAvailableDevices() throws -> [String]

Gets a list of hardware devices that can be specified when calling .init().

Returns

  • [String] : Array of available hardware devices.
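
A brief sketch of enumerating devices before initialization; any of the returned strings can be passed as the device argument of .init().

do {
    let devices = try PicoLLM.getAvailableDevices()
    for device in devices {
        print(device)
    }
} catch {
    print("Failed to list devices: \(error)")
}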

PicoLLMError

public class PicoLLMError : LocalizedError { }

Error thrown if an error occurs within the picoLLM Inference Engine.

public class PicoLLMMemoryError : PicoLLMError {}
public class PicoLLMIOError : PicoLLMError {}
public class PicoLLMInvalidArgumentError : PicoLLMError {}
public class PicoLLMStopIterationError : PicoLLMError {}
public class PicoLLMKeyError : PicoLLMError {}
public class PicoLLMInvalidStateError : PicoLLMError {}
public class PicoLLMRuntimeError : PicoLLMError {}
public class PicoLLMActivationError : PicoLLMError {}
public class PicoLLMActivationLimitError : PicoLLMError {}
public class PicoLLMActivationThrottledError : PicoLLMError {}
public class PicoLLMActivationRefusedError : PicoLLMError {}

PicoLLMUsage

public struct PicoLLMUsage { }

Struct for the number of tokens in the prompt and completion.


PicoLLMUsage.promptTokens

PicoLLMUsage.promptTokens: Int

Number of tokens in the prompt.


PicoLLMUsage.completionTokens

PicoLLMUsage.completionTokens: Int

Number of tokens in the completion.


PicoLLMEndpoint

public enum PicoLLMEndpoint: Codable {
    case endOfSentence
    case completionTokenLimitReached
    case stopPhraseEncountered
}

Enum for the endpoint detection types.
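
As an illustration, the endpoint of a result can be inspected with a switch; completion is assumed to be the value returned by a .generate() call.

switch completion.endpoint {
case .endOfSentence:
    print("The model finished naturally.")
case .completionTokenLimitReached:
    print("completionTokenLimit was reached.")
case .stopPhraseEncountered:
    print("A stop phrase ended the generation.")
}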


PicoLLMToken

public struct PicoLLMToken { }

Struct for a token and its associated log probability.


PicoLLMToken.token

PicoLLMToken.token: String

Token.


PicoLLMToken.logProb

PicoLLMToken.logProb: Float

Log probability.


PicoLLMCompletionToken

public struct PicoLLMCompletionToken { }

Struct for a token within completion and top alternative tokens.


PicoLLMCompletionToken.token

PicoLLMCompletionToken.token: PicoLLMToken

Token. See PicoLLMToken.


PicoLLMCompletionToken.topChoices

PicoLLMCompletionToken.topChoices: [PicoLLMToken]

Top choices. See PicoLLMToken.


PicoLLMCompletion

public struct PicoLLMCompletion { }

Result object containing stats and generated tokens.


PicoLLMCompletion.usage

PicoLLMCompletion.usage: PicoLLMUsage

Usage. See PicoLLMUsage.


PicoLLMCompletion.endpoint

PicoLLMCompletion.endpoint: PicoLLMEndpoint

Endpoint. See PicoLLMEndpoint.


PicoLLMCompletion.completionTokens

PicoLLMCompletion.completionTokens: [PicoLLMCompletionToken]

Completion Tokens. See PicoLLMCompletionToken.


PicoLLMCompletion.completion

PicoLLMCompletion.completion: String

Completion string.
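
A sketch of walking a completion's tokens and their alternatives, assuming .generate() was called with numTopChoices set to a positive value.

for completionToken in completion.completionTokens {
    let token = completionToken.token
    print("token: \(token.token) logProb: \(token.logProb)")
    for choice in completionToken.topChoices {
        print("  alternative: \(choice.token) logProb: \(choice.logProb)")
    }
}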


PicoLLMDialog

public protocol PicoLLMDialog {}

Protocol representing the picoLLM dialog interface.


PicoLLMDialog.init()

init(history: Int32?, system: String?) throws

init method for PicoLLMDialog.

Parameters

  • history Int32? : Number of latest back-and-forth exchanges to include in the prompt. Set to nil to embed the entire dialog in the prompt.
  • system String? : System instruction to embed in the prompt for configuring the model's responses. Set to nil to omit.

Throws

  • PicoLLMError: If an error occurs while creating an instance of PicoLLMDialog.

PicoLLMDialog.addHumanRequest()

func addHumanRequest(content: String) throws

Adds the human's request to the dialog.

Parameters

  • content String : Human's request.

Throws

  • PicoLLMError: If an error occurs while adding the human's request.

PicoLLMDialog.addLLMResponse()

func addLLMResponse(content: String) throws

Adds the LLM's response to the dialog.

Parameters

  • content String : LLM's response.

Throws

  • PicoLLMError: If an error occurs while adding the LLM's response.

PicoLLMDialog.prompt()

func prompt() throws -> String

Creates a prompt string given the parameters passed to the constructor and the dialog's content.

Returns

  • String : Formatted prompt.

Throws

  • PicoLLMError: If an error occurs while creating the prompt.
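
A minimal chat-turn sketch combining a dialog helper with .generate(), assuming a picollm instance; Phi2ChatDialog and the request text are illustrative, and any concrete dialog class conforming to PicoLLMDialog works the same way.

do {
    let dialog = try Phi2ChatDialog(history: nil, system: nil)
    try dialog.addHumanRequest(content: "What is the capital of France?")

    // Format the accumulated dialog into a model-specific prompt.
    let promptString = try dialog.prompt()
    let completion = try picollm.generate(prompt: promptString)

    try dialog.addLLMResponse(content: completion.completion)
    print(completion.completion)
} catch {
    print("Dialog round failed: \(error)")
}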

BasePicoLLMDialog

public class BasePicoLLMDialog { }

BasePicoLLMDialog is a helper class that stores a chat dialog and formats it according to an instruction-tuned LLM's chat template. BasePicoLLMDialog is the base class. Each supported instruction-tuned LLM has an accompanying concrete subclass.


Phi2Dialog

public class Phi2Dialog: BasePicoLLMDialog { }

Dialog helper for phi-2. This is a base class; use one of the mode-specific subclasses (Phi2QADialog or Phi2ChatDialog).


Phi2Dialog.init()

init(
    humanRequestsTag: String,
    llmResponsesTag: String,
    history: Int32?,
    system: String?
) throws

init method for Phi2Dialog.

Parameters

  • humanRequestsTag String : Tag to classify human requests.
  • llmResponsesTag String : Tag to classify LLM responses.
  • history Int32? : Number of latest back-and-forth exchanges to include in the prompt. Set to nil to embed the entire dialog in the prompt.
  • system String? : System instruction to embed in the prompt for configuring the model's responses. Set to nil to omit.

Throws

  • PicoLLMError: If an error occurs while creating an instance of Phi2Dialog.

Phi2QADialog

public class Phi2QADialog: Phi2Dialog { }

Dialog helper for phi-2 qa mode.


Phi2ChatDialog

public class Phi2ChatDialog: Phi2Dialog { }

Dialog helper for phi-2 chat mode.


Phi3ChatDialog

public class Phi3ChatDialog: BasePicoLLMDialog { }

Dialog helper for phi-3.


Phi35ChatDialog

public class Phi35ChatDialog: Phi3ChatDialog { }

Dialog helper for phi-3.5.


MistralChatDialog

public class MistralChatDialog: BasePicoLLMDialog { }

Dialog helper for mistral-7b-instruct-v0.1 and mistral-7b-instruct-v0.2.


MixtralChatDialog

public class MixtralChatDialog: MistralChatDialog { }

Dialog helper for mixtral-8x7b-instruct-v0.1.


Llama2ChatDialog

public class Llama2ChatDialog: BasePicoLLMDialog { }

Dialog helper for llama-2-7b-chat, llama-2-13b-chat, and llama-2-70b-chat.


Llama3ChatDialog

public class Llama3ChatDialog: BasePicoLLMDialog { }

Dialog helper for llama-3-8b-instruct and llama-3-70b-instruct.


Llama32ChatDialog

public class Llama32ChatDialog: Llama3ChatDialog { }

Dialog helper for llama-3.2-1b-instruct and llama-3.2-3b-instruct.


GemmaChatDialog

public class GemmaChatDialog: BasePicoLLMDialog { }

Dialog helper for gemma-2b-it and gemma-7b-it.

