Eliminate background noise in real time while preserving speech. Up to 17.3× more effective than RNNoise at the same compute cost.
Background noise is a solved problem at the OS and hardware level only if you care about one platform, one device, or users who know where the settings are. For developers who need to guarantee audio quality for every user on every platform, background noise remains a significant problem in real-time applications and degrades the user experience.
Koala Noise Suppression gives applications direct control over noise suppression, without depending on the OS, the hardware, the user's settings, or a cloud service. Hardware platforms such as NVIDIA and Apple rely on end-users to find and enable noise cancellation; your application has no control over whether it is active. Cloud APIs such as ElevenLabs and Dolby process pre-recorded files only; they cannot clean live audio as it is captured.
Koala Noise Suppression processes audio on-device in real time, frame by frame, with no file upload and no network round-trip. With minimal compute requirements, it cleans the audio and passes it wherever it needs to go, on embedded, mobile, web (including mobile web), desktop, and server platforms.
Koala Noise Suppression processes audio frames in real time and returns enhanced audio. Drop it between your microphone capture and your audio output or transmission layer, then route the clean audio to meeting participants for higher audio quality or to an ASR model for a lower word error rate (WER). Koala Noise Suppression ships with native SDKs for Python, C, iOS, Android, and Web.
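The frame-by-frame placement described above can be sketched as follows. This is a minimal illustration, not the Koala SDK: `split_into_frames`, `passthrough`, and `enhance_stream` are hypothetical names, the frame length of 256 is an arbitrary example value, and `passthrough` merely stands in for whatever per-frame call the real engine exposes.

```python
from typing import Callable, Iterator, List, Sequence

Frame = List[int]  # one frame of 16-bit PCM samples


def split_into_frames(pcm: Sequence[int], frame_length: int) -> Iterator[Frame]:
    """Yield consecutive fixed-size frames, zero-padding the last one."""
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    for start in range(0, len(pcm), frame_length):
        frame = list(pcm[start:start + frame_length])
        frame.extend([0] * (frame_length - len(frame)))
        yield frame


def passthrough(frame: Frame) -> Frame:
    """Hypothetical placeholder for the noise suppression call.

    It returns the frame unchanged; a real integration would call the
    engine's per-frame processing function here instead.
    """
    return list(frame)


def enhance_stream(
    pcm: Sequence[int],
    frame_length: int,
    process: Callable[[Frame], Frame] = passthrough,
) -> List[int]:
    """Run every frame through `process` and stitch the results together."""
    enhanced: List[int] = []
    for frame in split_into_frames(pcm, frame_length):
        out = process(frame)
        if len(out) != frame_length:
            raise ValueError("processor must return one full frame")
        enhanced.extend(out)
    return enhanced[:len(pcm)]  # drop the zero padding added to the last frame


captured = list(range(1000))           # stand-in for microphone samples
clean = enhance_stream(captured, 256)  # 256 is an illustrative frame length
```

In a live pipeline the loop would consume frames as the microphone produces them rather than from a finished buffer, and `clean` would be handed to the transmission layer or the ASR model. A real engine may also introduce a fixed output delay, which this sketch does not model.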
The open-source Noise Suppression Benchmark shows that, across all noise levels tested, RNNoise reduces STOI distance by only a small fraction, while Koala Noise Suppression cuts it by half or more. In the noisiest conditions tested, Koala is up to 17.3× more effective than RNNoise at restoring speech intelligibility.
The open-source Noise Suppression Benchmark uses the Microsoft DNS Challenge dataset as test data and the STOI (Short-Time Objective Intelligibility) distance to clean speech as its metric. A STOI distance of zero means the output is indistinguishable from clean speech. Tests are run at multiple Signal-to-Noise Ratio (SNR) levels, where SNR measures how loud the speech is relative to the background noise. Lower SNR means noisier conditions.
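To make "testing at a given SNR level" concrete, here is a small sketch of the usual way a noisy test signal is constructed: the noise is scaled so that the speech-to-noise power ratio hits the target before the two are added. This is an assumption about the general technique, not the benchmark's own code; `mean_power` and `mix_at_snr` are hypothetical helpers, and the two sine tones are synthetic stand-ins for real recordings.

```python
import math
from typing import List, Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power of a signal (mean of squared samples)."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(
    speech: Sequence[float], noise: Sequence[float], snr_db: float
) -> List[float]:
    """Add `noise` to `speech`, scaled so the mix has the requested SNR in dB."""
    if len(speech) != len(noise) or len(speech) == 0:
        raise ValueError("speech and noise must be non-empty and equally long")
    speech_power = mean_power(speech)
    noise_power = mean_power(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both be non-silent")
    # Noise power needed for the target ratio, then the amplitude gain to reach it.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a 200 Hz tone as "speech", a 1300 Hz tone as "noise".
sample_rate = 8000
speech = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(sample_rate)]
noise = [0.1 * math.sin(2 * math.pi * 1300 * t / sample_rate) for t in range(sample_rate)]

noisy_0_db = mix_at_snr(speech, noise, snr_db=0.0)    # noise as loud as speech
noisy_10_db = mix_at_snr(speech, noise, snr_db=10.0)  # speech power 10x the noise power
```

A benchmark would then run each mix through the engine under test and compare the output against the clean speech.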
Koala is an enterprise-ready on-device noise suppression engine built for real-time communication applications and voice AI agents. It processes audio locally at minimal compute cost, runs across every platform without cloud dependency, and is private by architecture.
High-quality, effective, and lightweight noise suppression
Noise suppression, also known as noise reduction, noise cancellation, noise removal, speech enhancement, or speech denoising, combines techniques and tools to reduce or altogether remove unwanted sounds in the background while preserving human voice.
Noise suppression and noise cancellation are technically distinct but very similar technologies. That's why they're often used interchangeably in marketing and communications.
Noise cancellation generally refers to a hardware technique based on destructive interference: microphones and speakers generate inverse sound waves that cancel the original ambient noise before it reaches the listener. Noise suppression, on the other hand, reduces unwanted noise components in an audio signal after it has been captured.
Since the user-visible effect is similar and modern systems blend both techniques, they're used interchangeably.
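The destructive interference behind noise cancellation can be illustrated numerically. The sketch below is a toy model under idealized assumptions (a single pure tone, no acoustics, hypothetical helper names `tone` and `peak`): a perfectly inverted wave cancels the ambient sound completely, while the same wave arriving one sample late leaves a residual.

```python
import math
from typing import List, Sequence


def tone(frequency_hz: float, sample_rate: int, num_samples: int) -> List[float]:
    """Generate a unit-amplitude sine wave."""
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(num_samples)
    ]


def peak(samples: Sequence[float]) -> float:
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


ambient = tone(100.0, 8000, 800)      # ambient noise reaching the listener
anti_noise = [-s for s in ambient]    # inverse wave produced by the speaker

# Perfectly aligned: the two waves sum to silence.
residual = [a + b for a, b in zip(ambient, anti_noise)]

# Same anti-noise, but arriving one sample late: cancellation is incomplete.
late_anti_noise = [0.0] + anti_noise[:-1]
late_residual = [a + b for a, b in zip(ambient, late_anti_noise)]

print("aligned residual peak:", peak(residual))
print("late residual peak:", peak(late_residual))
```

The timing sensitivity shown here is one reason this technique lives in hardware close to the ear, whereas noise suppression operates on the captured signal in software.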
Noise suppression and echo cancellation are complementary technologies that are often applied together in voice communication pipelines, but they solve different problems. Noise suppression removes background sounds (fans, traffic, babble) captured by the microphone, while echo cancellation removes the acoustic echo created when speaker output is picked up by the microphone and fed back into the signal.
Noise suppression analyzes audio to distinguish speech from noise, then reduces noise while preserving speech, in five sequential stages, from frequency decomposition to output reconstruction.
Noise suppression algorithms based on traditional signal processing handle simple, stationary noise with minimal resource overhead and latency. Deep learning-powered algorithms handle complex, non-stationary noise with substantially better quality, but at a modest to high additional compute cost and, depending on the model, additional latency. Koala Noise Suppression uses deep learning to handle complex, non-stationary noise, delivering substantially better speech intelligibility than traditional signal processing approaches at comparable compute cost.
Check out the complete noise suppression guide, learn the nuances of speech enhancement, and compare noise suppression alternatives.
Choosing the best noise suppression software depends on priorities such as:
Koala Noise Suppression is the only production-ready high-quality option that runs across platforms and is available in minutes, crossing off all items on the list.
High-quality noise suppression improves overall voice quality by removing distracting background elements. However, when it is poorly chosen or implemented, it can fail to remove the noise, remove parts of speech along with the noise, introduce artifacts or audio glitches, or make voices sound robotic and muffled. Koala Noise Suppression preserves natural speech characteristics and tone, maintains emotional nuances and inflections, and enhances clarity of speech.
Yes, Koala Noise Suppression is specifically designed for real-time applications. Koala enhances speech locally on the device, eliminating the network round-trip without introducing significant compute latency, and maintains natural conversation flow across mobile, web, desktop, and embedded platforms.
Yes. You can use Koala Noise Suppression for offline noise reduction, as well.
Mozilla RNNoise pioneered real-time on-device noise suppression and is still widely deployed. However, it lacks active maintenance and was built before modern deep learning approaches matured. In the open-source noise suppression benchmark using the Microsoft DNS Challenge test set, Koala reduces STOI distance to clean speech 4.3× more than RNNoise on average. In the noisiest condition (0 dB SNR), the gap widens to 17.3×.
Koala Noise Suppression achieves this at virtually identical compute cost: Koala's real-time factor is 0.0126 versus RNNoise's 0.0120, a difference of about 5%, independent of the audio being processed. Significantly better speech intelligibility at the same compute cost makes Koala Noise Suppression a straight upgrade for anyone moving from RNNoise. For any application where audio quality affects user experience, Koala Noise Suppression is the appropriate choice.
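For readers unfamiliar with the metric, the sketch below shows how a real-time factor is commonly computed and what the two quoted values imply. It assumes the standard definition (processing time divided by audio duration), which may differ in detail from how the benchmark measures it; `real_time_factor` and `measure_rtf` are hypothetical helpers, and only the two RTF values come from the text above.

```python
import time
from typing import Callable, List, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent processing / duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    process: Callable[[Sequence[int]], object],
    frames: List[Sequence[int]],
    sample_rate: int,
) -> float:
    """Time `process` over all frames and return the resulting RTF."""
    audio_seconds = sum(len(f) for f in frames) / sample_rate
    start = time.perf_counter()
    for frame in frames:
        process(frame)
    return real_time_factor(time.perf_counter() - start, audio_seconds)


# Values quoted in the text above.
koala_rtf = 0.0126
rnnoise_rtf = 0.0120

# An RTF of 0.0126 corresponds to about 12.6 ms of processing per second of audio.
koala_ms_per_audio_second = koala_rtf * 1000

# Gap between the two engines, relative to RNNoise.
relative_gap = (koala_rtf - rnnoise_rtf) / rnnoise_rtf

print(f"{koala_ms_per_audio_second:.1f} ms per second of audio")
print(f"relative gap: {relative_gap:.1%}")
```

Any RTF well below 1.0 leaves headroom for the rest of the audio pipeline; `measure_rtf` can be pointed at any per-frame function to obtain a comparable number on your own hardware.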
ElevenLabs Audio Isolation is a cloud API for post-processing recorded audio. Despite the word "stream" appearing in the endpoint name, it requires a complete audio file to be uploaded. It does not process live microphone audio as it is being captured. For real-time use cases such as voice calls, video conferencing, live streaming, and voice AI agents, ElevenLabs Audio Isolation cannot be used.
Koala Noise Suppression processes audio on-device frame by frame with no file upload, no cloud round-trip, and no network dependency. It works in real time at the point of audio capture across every platform, making it a fit for both real-time and post-production voice isolation.
Both the Koala Noise Suppression and Krisp Noise Suppression SDKs support real-time streaming noise suppression, and both run on-device. The key differences are access, platform breadth, and deployment model. Krisp's platform support is narrower than Koala's. The Krisp SDK supports servers (Linux and Windows), desktop (Windows, macOS, and Linux), mobile (iOS and Android), and desktop browsers (Chrome, Firefox, and Edge). Krisp does not list support for Safari, mobile browsers, or embedded systems. Koala has no equivalent restriction: it supports all major browsers on both mobile and desktop, as well as iOS, Android, desktop, server, and embedded systems.
Both Adobe Podcast Enhance and Dolby offer noise suppression as cloud post-processing APIs: audio files must be uploaded to their cloud, processed there, and then returned as a cleaned result. Neither supports real-time streaming from a live microphone. For communication applications, voice agents, or any use case where audio needs to be cleaned as it is captured, neither is applicable. Both are suitable for post-production use cases like podcast editing and content creation, but not for real-time developer applications.
Unlike Adobe Podcast Enhance and Dolby, Koala Noise Suppression supports both real-time streaming and offline post-processing on-device, covering both use cases in a single SDK.
OS-level noise suppression, such as Windows Speech Enhancement, macOS Mic Mode, and iOS Voice Isolation, is controlled by the end user, not the application. End users must find and enable it themselves, and many don't. Its availability and behavior also vary by OS version, device, and hardware configuration. Third-party applications have no programmatic control over whether it is active.
Koala Noise Suppression gives applications direct control over noise suppression, so product teams can apply it in the audio pipeline, consistently, for every user on every device, regardless of their OS version, settings, or hardware. This is the difference between depending on users to configure their environment and guaranteeing audio quality at the application level.
Yes, Koala Noise Suppression supports 8 kHz telephony applications. You can reach out to your Picovoice contact for more information.
Yes, Koala Noise Suppression can be used in voice AI agents and other voice AI pipelines to improve speech quality and the accuracy of downstream engines.
If you're not sure how to use Koala Noise Suppression as a preprocessing step for other voice AI engines — ASR, wake word detection, voice commands — contact sales to get the Picovoice technical team to review your code. You can also work with Picovoice researchers on a custom configuration for your specific acoustic environment through a Non-Recurring Engineering (NRE) engagement.
Koala Noise Suppression is trained on diverse stationary and non-stationary noise conditions, including babble noise, keyboard typing, HVAC and air conditioning, traffic, background music, and so on. For specialised acoustic environments — industrial machinery, specific noise profiles, or unique acoustic conditions — custom model training is available for Enterprise Plan customers via Picovoice Consulting.
SNR, Signal-to-Noise Ratio, is the foundational measure of acoustic conditions in a voice AI deployment. In simple terms, signal-to-noise ratio is the ratio of the power of a signal (meaningful input) to the power of background noise (meaningless or unwanted input).
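Written out, the definition above takes the following standard form. The symbols P_signal and P_noise (signal and noise power) are introduced here for illustration, and the decibel form is the one implied by figures such as "0 dB SNR".

```latex
\mathrm{SNR} = \frac{P_{\text{signal}}}{P_{\text{noise}}},
\qquad
\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10}\!\left(\frac{P_{\text{signal}}}{P_{\text{noise}}}\right)

% Worked examples:
%   P_signal = P_noise        =>  SNR_dB = 10 log10(1)   = 0 dB
%   P_signal = 100 * P_noise  =>  SNR_dB = 10 log10(100) = 20 dB
```

At 0 dB the background noise carries as much power as the speech itself, which is why it counts as a very noisy condition.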
Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to enhance speech quality. Enterprise customers get dedicated support specific to their applications from Picovoice Product & Engineering teams. Reach out to your Picovoice contact or contact sales to discuss support options.