Blog

Power of LLM Quantization: Making Large Language Models Smaller and More Efficient
August 6, 2024 · 2 min read

As the demand for large language models (LLMs) continues to grow, so does the need for efficient and cost-effective deployment solutions....

Future of Generative AI: Small Language Models
August 6, 2024 · 2 min read

Over the years, Large Language Models (LLMs) have dominated the scene. However, a notable shift is underway towards Small Language Models (SLMs)...

Streaming Text-to-Speech in Python
July 8, 2024 · 2 min read

Learn how to convert a stream of text into audio in real-time, enabling AI voice assistants with no audio delay....

Local LLM for Mobile: Run Llama 2 and Llama 3 on iOS
July 2, 2024 · 2 min read

Learn how to run Llama 2 and Llama 3 on iOS with picoLLM Inference engine iOS SDK. Runs locally on an iOS device....

Local LLM for Desktop Applications: Run Llama 2 & Llama 3 in Python
June 24, 2024 · 2 min read

Learn how to run Llama 2 and Llama 3 in Python with the picoLLM Inference Engine Python SDK. Runs locally on Linux, macOS, Windows, and Raspberry Pi....

Local LLM for Web Browsers: Run Llama with JavaScript
June 20, 2024 · 1 min read

Run Llama Locally on Any Browser: GPU-Free Guide with picoLLM JavaScript SDK for Chrome, Edge, Firefox, & Safari...

Local LLM for Windows, Mac, Linux: Run Llama with Node.js
June 20, 2024 · 1 min read

Large Language Models (LLMs), such as Llama 2 and Llama 3, represent significant advancements in artificial intelligence, fundamentally changing...

Local LLM for Mobile: Run Llama 2 and Llama 3 on Android
June 19, 2024 · 2 min read

Learn how to run Llama 2 and Llama 3 on Android with the picoLLM Inference Engine Android SDK. Runs locally on an Android device....

Streaming Text-to-Speech for Low-Latency AI Agents
June 18, 2024 · 2 min read

Dual Streaming Text-to-Speech synthesizes an incoming stream of text into consistent audio in real time, making it ideal for latency-sensitive AI agents....

AI Voice Assistant for iOS Powered by Local LLM
June 7, 2024 · 4 min read

Create an on-device, LLM-powered Voice Assistant for iOS using Picovoice on-device voice AI and picoLLM local LLM platforms....

AI Voice Assistant for Android Powered by Local LLM
June 5, 2024 · 4 min read

Create an on-device, LLM-powered Voice Assistant for Android using Picovoice on-device voice AI and picoLLM local LLM platforms....

How to Run LLMs Locally with Python
June 4, 2024 · 1 min read

Learn how to run LLMs locally using the picoLLM Inference Engine Python SDK. picoLLM performs LLM inference on-device, keeping your data private....

Local LLM-Powered Voice Assistant for Web Browsers
June 4, 2024 · 4 min read

Create a local LLM-powered Voice Assistant for Web Browsers using Picovoice on-device voice AI and picoLLM local LLM platforms....

How to Run a Local LLM using Node.js
May 30, 2024 · 2 min read

Run local LLMs using Node.js with picoLLM, enabling AI assistants to run on-device, on-premises, and in private clouds without sacrificing accuracy....

Cross-Browser Local LLM Inference Using WebAssembly
May 29, 2024 · 3 min read

picoLLM is a cross-browser local LLM inference engine that runs on all major browsers, including Chrome, Safari, Edge, Firefox, and Opera....

How to Run an LLM Locally on Mac
May 29, 2024 · 3 min read

Learn how to run local LLMs on macOS using picoLLM, enabling AI assistants to run on-device, on-premises, and in private clouds without sacrificing accuracy....