picoLLM Inference Engine
Python Quick Start
Platforms
- Linux (x86_64)
- macOS (x86_64, arm64)
- Windows (x86_64, arm64)
- Raspberry Pi (3, 4, 5)
Requirements
- Picovoice Account & AccessKey
- Python 3.9+
- PIP
Picovoice Account & AccessKey
Sign up for or log in to Picovoice Console to get your AccessKey.
Make sure to keep your AccessKey secret.
Quick Start
Setup
Install Python 3.9 or newer.
Install the picollm Python package using PIP:
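```shell
pip3 install picollm
```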
- Download a picoLLM model file (`.pllm`) from Picovoice Console.
Usage
- Create an instance of the inference engine:
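A minimal sketch; replace `${ACCESS_KEY}` with your AccessKey from Picovoice Console and `${MODEL_PATH}` with the path to the downloaded `.pllm` file:

```python
import picollm

# Create the inference engine with your AccessKey and model file.
pllm = picollm.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_PATH}')
```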
- Generate a prompt completion:
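Replace `${PROMPT}` with a prompt string; the result object carries the generated text in its `completion` field:

```python
# Generate a completion for the given prompt.
res = pllm.generate(prompt='${PROMPT}')
print(res.completion)
```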
- To interrupt completion generation before it has finished:
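For example, from another thread or a signal handler:

```python
# Stop an in-flight generation; the pending `generate` call returns early.
pllm.interrupt()
```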
- When done, be sure to release the resources explicitly:
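```python
# Free native resources held by the engine.
pllm.release()
```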
Vision models
To run a VLM such as qwen3-vl-2b-it:
Replace ${PROMPT} with a text prompt. For the image, you will need the image's height and width in pixels and its raw pixel values in 8-bit RGB format.
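One way to prepare these inputs is with Pillow, as sketched below. Note that the keyword arguments passed to `generate` here (`image`, `image_width`, `image_height`) are assumptions, not the documented signature; consult the picoLLM API reference for the exact way to pass image data:

```python
from PIL import Image  # pip3 install pillow

import picollm

pllm = picollm.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_PATH}')

# Load the image and convert it to 8-bit RGB.
img = Image.open('${IMAGE_PATH}').convert('RGB')
width, height = img.size
pixels = img.tobytes()  # raw pixel values, 3 bytes (R, G, B) per pixel

# NOTE: the image-related parameter names below are hypothetical.
res = pllm.generate(
    prompt='${PROMPT}',
    image=pixels,
    image_width=width,
    image_height=height)
print(res.completion)
```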
OCR models
To run an OCR (Optical Character Recognition) model such as deepseek-ocr-2:
For the image, you will need the image's height and width in pixels and its raw pixel values in 8-bit RGB format.
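A sketch of the input preparation using Pillow. The keyword arguments passed to `generate` (`image`, `image_width`, `image_height`) are assumptions, not the documented signature; check the picoLLM API reference for the exact form:

```python
from PIL import Image  # pip3 install pillow

import picollm

pllm = picollm.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_PATH}')

# Load the document image and convert it to 8-bit RGB.
img = Image.open('${IMAGE_PATH}').convert('RGB')
width, height = img.size
pixels = img.tobytes()  # raw pixel values, 3 bytes (R, G, B) per pixel

# NOTE: the image-related parameter names below are hypothetical.
res = pllm.generate(
    image=pixels,
    image_width=width,
    image_height=height)
print(res.completion)  # recognized text
```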
Embedding models
To run an embedding model such as embeddinggemma-300m:
Replace ${PROMPT} with a text prompt that you want to generate embeddings for.
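A sketch of what this might look like. The `embed` method name is a hypothetical placeholder, not a documented call; consult the picoLLM API reference for the actual embedding entry point:

```python
import picollm

pllm = picollm.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_PATH}')

# NOTE: `embed` is a hypothetical method name used for illustration.
embedding = pllm.embed('${PROMPT}')
print(len(embedding))  # dimensionality of the embedding vector
```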
Demo
The picoLLM Python SDK includes demo applications that show how to generate text from a prompt or in a chat-based environment.
Setup
Install the picollmdemo Python package using PIP:
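```shell
pip3 install picollmdemo
```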
This package installs command-line utilities for the picoLLM Python demos.
Usage
Run the demo by entering the following in the terminal:
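```shell
picollm_demo_completion --access_key ${ACCESS_KEY} --model_path ${MODEL_PATH} --prompt ${PROMPT}
```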
Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console, ${MODEL_PATH} with the path to a model file
downloaded from Picovoice Console, and ${PROMPT} with a prompt string.
To get information about all the available options in the demo, run the following:
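```shell
picollm_demo_completion --help
```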
For more information on our picoLLM demos for Python or to see a chat-based demo, head over to our GitHub repository.