Integrate real-time noise suppression into your iOS app with the Koala iOS SDK. On-device noise suppression for real-time communication apps.
Build speech-to-speech translation on Android with ML Kit and Kotlin. Complete guide using Cheetah STT, Google ML Kit Translation, and Orca TTS for on-device voice translation.
Complete guide to building a real-time meeting summarization tool in Python with streaming speech-to-text and AI summaries. Full code included.
Complete guide to building a voice note-taking app in Python with wake word activation, stop phrase control, and on-device transcription. Full code included.
Learn how to play audio in Python with PvSpeaker. Stream PCM audio output for text-to-speech, audio synthesis, and real-time audio playback on Windows, macOS, and Linux.
Learn how to record audio in React Native apps for Android and iOS. Capture PCM microphone input for speech recognition, voice commands, and real-time audio processing.
Learn how to enable automatic punctuation and correct casing in speech-to-text with Python. Get formatted transcripts with periods, commas, and capitalization.
Build a HIPAA-compliant medical voice agent in Python with on-device speech processing. Complete tutorial with wake word detection, real-time STT, and TTS.
Learn how to run LLMs locally in C across Linux, Windows, macOS, and Raspberry Pi with streaming text generation.
Step-by-step guide to adding speaker diarization to OpenAI Whisper STT in C++ using Falcon Speaker Diarization for multi-speaker transcription.
Voice Activity Detection (VAD) is a core building block for speech and audio systems, used to determine when human speech is present in an audio stream.
Learn how to implement real-time noise cancellation in C across Linux, Windows, macOS, and Raspberry Pi.
Learn how to build a local MCP voice assistant using a local LLM to handle function calling, speech-to-text, text-to-speech, and external API integration in this step-by-step MCP tutorial.
Build a banking voice AI agent with custom wake words and voice-activated banking features for secure and compliant financial applications.
Learn how to get word-level confidence scores in Python for speech-to-text. Set word confidence thresholds to improve transcription quality.
Step-by-step tutorial: Build cross-platform speaker recognition in C using Picovoice Eagle. Includes complete code for speaker enrollment & recognition on Linux, Windows, macOS, and Raspberry Pi.