Koala: Noise Suppression for Developers

Noise Suppression technology mutes unwanted background noise, providing clear audio for communication. Noise Suppression, or noise cancellation, as it is widely known, has become more popular with remote work. Given that popularity, one may expect a mature market with no room for new players. Yet the Picovoice team decided to build another Noise Suppression engine, and you may ask why.

Things you should know about Noise Suppression Engines

A good Noise Suppression engine should:

run on the edge for minimal latency. Latency over 200 ms erodes the sense of presence, and people start talking over each other.
increase intelligibility. Like any product, it should create value, i.e., enhance speech.
be platform-agnostic. It should offer unified experiences across platforms, whether on mobile, web, desktop, or all.
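
To make the latency criterion concrete, here is a minimal sketch of a latency budget in Python. The frame size, sample rate, processing time, and network round trip are hypothetical numbers chosen for illustration, not measurements of any engine. Only the 200 ms threshold comes from the list above.

```python
LATENCY_BUDGET_MS = 200.0  # threshold quoted in the list above


def frame_latency_ms(frame_length: int, sample_rate: int) -> float:
    """Time spent buffering one frame of audio before it can be processed."""
    return 1000.0 * frame_length / sample_rate


def total_latency_ms(frame_length: int, sample_rate: int,
                     processing_ms: float,
                     network_round_trip_ms: float = 0.0) -> float:
    """Buffering time plus processing time plus any network round trip."""
    return (frame_latency_ms(frame_length, sample_rate)
            + processing_ms
            + network_round_trip_ms)


# Hypothetical setup: 256-sample frames at 16 kHz, 2 ms of compute per frame.
on_device = total_latency_ms(256, 16000, processing_ms=2.0)

# Same setup, but with a hypothetical 250 ms round trip to a cloud service.
cloud = total_latency_ms(256, 16000, processing_ms=2.0,
                         network_round_trip_ms=250.0)

for name, latency in (("on-device", on_device), ("cloud", cloud)):
    verdict = "within" if latency <= LATENCY_BUDGET_MS else "over"
    print(f"{name}: {latency:.1f} ms ({verdict} the {LATENCY_BUDGET_MS:.0f} ms budget)")
```

With numbers like these, the buffering and compute cost of a small frame is a fraction of the budget, while a single network round trip can exceed it on its own.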

Why another Noise Suppression engine?

We could not find anything readily available that meets all three criteria above.

Small and efficient traditional DSP models that can run on the device with minimal latency return subpar quality. They especially struggle with irregular non-stationary noises. Typical deep learning models that offer higher speech intelligibility have large model sizes with high power and compute requirements. Thus, running them in real time and across platforms is not feasible. Due to network latency, running large Noise Suppression models in the cloud is not viable either.

The trade-off between latency and quality makes Noise Suppression a challenging problem. Building any technology is more attainable when the scope is smaller. Thus, the production-grade solutions on the market are either platform-dependent or not readily available. That leaves developers with open-source options, which in practice means only Mozilla RNNoise. It is efficient and was great when first released in 2017. However, it is no longer state-of-the-art.

Introducing Koala Noise Suppression

Koala Noise Suppression is powered by deep learning and is efficient, achieving high speech intelligibility with minimal latency. Like all Picovoice engines, Koala Noise Suppression:

processes voice data locally (private and reliable)
runs across platforms (including desktop, mobile, web, and embedded)
is ready to be deployed (in minutes with intuitive SDKs)
is available to all developers (with the Free Plan, no strings attached)

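import pvkoala

# access_key is your Picovoice AccessKey
koala = pvkoala.create(access_key)

while True:
    # get_next_audio_frame() stands in for your own audio source
    enhanced_audio = koala.process(get_next_audio_frame())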
Build with Python
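
The loop above processes one frame at a time. As a rough sketch of what feeding such an engine from a longer recording might look like, the helper below splits a buffer into fixed-size frames and zero-pads the last one. It assumes the engine object exposes a `frame_length` attribute and a `process` method that accepts exactly one frame. Check those names against the SDK documentation before relying on them. `PassThroughEngine` and `enhance` are hypothetical names introduced here so the example runs without an SDK or an AccessKey.

```python
from typing import List


class PassThroughEngine:
    """Hypothetical stand-in for a real engine: returns each frame unchanged."""

    frame_length = 256  # assumed frame size, for illustration only

    def process(self, pcm: List[int]) -> List[int]:
        if len(pcm) != self.frame_length:
            raise ValueError(f"expected {self.frame_length} samples, got {len(pcm)}")
        return list(pcm)


def enhance(engine, pcm: List[int]) -> List[int]:
    """Run a whole buffer through a frame-based engine, one frame at a time."""
    n = engine.frame_length
    enhanced: List[int] = []
    for start in range(0, len(pcm), n):
        frame = pcm[start:start + n]
        # The last frame may be short; pad it with silence to the full length.
        frame = frame + [0] * (n - len(frame))
        enhanced.extend(engine.process(frame))
    # Drop the padding so the output has as many samples as the input.
    return enhanced[:len(pcm)]


samples = list(range(600))  # 600 samples: two full frames plus a partial one
enhanced = enhance(PassThroughEngine(), samples)
print(len(enhanced))
```

A real engine may delay its output by some number of samples, in which case the enhanced signal would need to be shifted before comparing it sample by sample with the input. The pass-through stand-in has no such delay.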

How to measure quality

To measure quality, the Picovoice team developed an open-source benchmark that compares the speech intelligibility of Koala Noise Suppression and Mozilla RNNoise. In that benchmark, Koala Noise Suppression is four times more effective than RNNoise. Listen to the difference between RNNoise and Koala Noise Suppression.
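
The general shape of such a comparison is to score the noisy input and the enhanced output against a clean reference and look at the difference. The sketch below does this with signal-to-noise ratio (SNR), which is a deliberately simple stand-in and not the intelligibility metric used by the benchmark. The signals are synthetic, and the "enhanced" signal is hypothetical: it simply halves the noise rather than coming from any real engine.

```python
import math
import random


def snr_db(clean, estimate):
    """SNR of `estimate` in dB, treating its deviation from `clean` as noise."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((e - c) ** 2 for c, e in zip(clean, estimate))
    if noise_energy == 0:
        return math.inf
    return 10.0 * math.log10(signal_energy / noise_energy)


rng = random.Random(0)  # fixed seed so the run is repeatable
sample_rate = 16000

# Synthetic "speech": a 440 Hz tone. Synthetic noise: uniform random samples.
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.uniform(-0.5, 0.5) for _ in clean]

noisy = [c + n for c, n in zip(clean, noise)]
# Hypothetical enhancer output: the same signal with the noise amplitude halved.
enhanced = [c + 0.5 * n for c, n in zip(clean, noise)]

improvement_db = snr_db(clean, enhanced) - snr_db(clean, noisy)
print(f"SNR improvement: {improvement_db:.2f} dB")
```

Halving the noise amplitude cuts the noise energy to a quarter, so the improvement is 10·log10(4), or about 6.02 dB. SNR says nothing about how intelligible the speech is, which is why a benchmark focused on intelligibility would use a different score in place of `snr_db`.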

Sounds too good to be true? See for yourself. The web demo runs within your web browser in real time.

What’s next?

Your feedback is an essential part of the process. Please create a GitHub issue to share it. If you enjoy building with Koala Noise Suppression, you can give it a star to help fellow developers find it quickly and to help Picovoice democratize voice AI.