Voice Search

Add voice search to mobile applications, websites, and devices.

Find keywords and phrases in audio and video streams.

Analyze and utilize voice data to offer better products and stay compliant

In the age of Netflix, TikTok, YouTube, Twitch, Zoom and podcasts, more devices and more mobile and web applications are generating data. UpKeep predicts that 175 ZB of new data will be generated by 2025. More than 80% of it will be unstructured, including audio and video. The need to make audio and video content searchable, just as Google Search did for websites, is growing. Enabling keyword search in audio for monitoring, compliance, and further analysis helps enterprises reduce their risks. However, given the scale of the data, it is neither easy nor affordable for everyone.

The standard approach to Voice Search has two steps: first, convert voice to text with a speech-to-text engine; then, perform a text-based keyword search. However, speech-to-text engines struggle with proper names such as brands and people, industry-specific jargon, and homophones. Some speech-to-text solutions, like Leopard Speech-to-Text, allow customization to a certain degree. Even when tuning is not required, transcribing voice in the cloud has inherent costs. These costs can be a show-stopper even for large enterprises with millions of hours of voice data. Not anymore! On-device voice recognition enables enterprises to analyze audio and video files at a small fraction of cloud API costs.
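The two-step approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Word` type, the `find_phrase` helper and the sample transcript are hypothetical and stand in for the output of whichever speech-to-text engine performs step one. The sketch covers step two, a text-based phrase search that maps each match back to a time range in the audio.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One transcribed word with its position in the audio (hypothetical type)."""
    text: str
    start_sec: float
    end_sec: float


def _normalize(token: str) -> str:
    # Lowercase and drop punctuation so "Support." matches "support".
    return re.sub(r"[^\w']", "", token.lower())


def find_phrase(words: List[Word], phrase: str) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) for every occurrence of `phrase`."""
    target = [t for t in (_normalize(p) for p in phrase.split()) if t]
    if not target:
        return []
    tokens = [_normalize(w.text) for w in words]
    n = len(target)
    hits = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            hits.append((words[i].start_sec, words[i + n - 1].end_sec))
    return hits


# Step 1 (assumed): a speech-to-text engine produced this timestamped transcript.
transcript = [
    Word("Thanks", 0.0, 0.4), Word("for", 0.4, 0.5), Word("calling", 0.5, 0.9),
    Word("Acme", 0.9, 1.3), Word("support.", 1.3, 1.9),
]

# Step 2: text-based keyword search over the transcript.
print(find_phrase(transcript, "acme support"))
```

Note that the search can only find what the transcript contains. If the engine mis-transcribes a brand name or a homophone in step one, step two will miss it, which is the weakness described above.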

Each day someone is in danger in unmonitored areas. They're attacked, threatened or experience a medical emergency and need someone to hear the call for help. With HALO, using Picovoice technology to recognize keywords, security personnel can now respond to the call.
David Antar
President, IPVideo
Improve user experience with the most accurate engine

Open-source, open-data NLU benchmark results show that Picovoice Rhino outperforms alternative NLU engines such as Amazon Lex, Google Dialogflow, IBM Watson and Microsoft LUIS. Rhino Speech-to-Intent fuses speech-to-text (STT) and natural language understanding (NLU) to give you all you need to add voice commands.

Focus on understanding customers, not complex cloud billing

Picovoice speech-to-text and voice search engines are at least 10x more affordable than Amazon, Google, IBM and Microsoft. Enterprises avoid surprise cloud bills with simple and predictable pricing.

One size does not fit all

Every business and use case has different requirements. Customize Picovoice engines or use them together to find spoken keywords in audio files or real-time conversations.

Social Media Listening

Today, people talk about brands, products and news on podcasts, TikTok, YouTube, Twitch, Snapchat, Instagram, and Facebook videos more than they write about them. Detect keywords in real-time conversations or audio files with Porcupine Wake Word. Find keywords and phrases in audio and video files by indexing them with Octopus Speech-to-Index. Once Octopus has indexed the voice data, you can search the same file for an unlimited number of spoken words. Agencies and marketing departments can track everything said about their ads, products, or brand ambassadors, even historically.

Indexing and Archiving

Audio and video content is irreplaceable in some industries, such as media and entertainment. Voice search makes rich audio and video content searchable, which improves both user experience and accessibility. For example, video streaming platforms and audiobook or podcast publishers can add voice search to find phrases such as “may the force be with you” or “there is no place like home”. Users can see the list of files that contain these phrases and jump to the exact moments. Check out Octopus Speech-to-Index to let users search audio and video files by keyword and phrase instead of searching text-based descriptions.
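The "index once, search many times" idea behind listing files and exact moments can be illustrated with a toy inverted index. This is a text-based stand-in under stated assumptions, not a description of how Octopus Speech-to-Index works internally: the file names, timestamps, and the `build_index` and `search_phrase` helpers are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# file name -> list of (word, start_sec); hypothetical sample data
Library = Dict[str, List[Tuple[str, float]]]


def build_index(library: Library) -> Dict[str, List[Tuple[str, int]]]:
    """Map each word to the (file, position) pairs where it occurs."""
    index = defaultdict(list)
    for file_name, words in library.items():
        for pos, (word, _start) in enumerate(words):
            index[word.lower()].append((file_name, pos))
    return index


def search_phrase(index, library: Library, phrase: str) -> List[Tuple[str, float]]:
    """Return (file, start_sec) for every occurrence of `phrase`."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    # Only positions where the first term occurs can start a match.
    for file_name, pos in index.get(terms[0], []):
        words = library[file_name]
        window = [w.lower() for w, _ in words[pos:pos + len(terms)]]
        if window == terms:
            hits.append((file_name, words[pos][1]))
    return hits


library = {
    "episode_12.mp3": [("may", 61.2), ("the", 61.5), ("force", 61.7),
                       ("be", 62.1), ("with", 62.3), ("you", 62.6)],
    "trailer.mp4": [("the", 3.0), ("force", 3.2), ("awakens", 3.6)],
}

index = build_index(library)  # built once
print(search_phrase(index, library, "may the force be with you"))
print(search_phrase(index, library, "the force"))  # any number of later queries
```

The index is built once per library, and each later query only inspects the positions where its first word occurs, so adding more search phrases does not require reprocessing the audio.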

Governance & Compliance

Enterprises monitor or record some conversations to improve user experience and comply with regulations. Voice Search helps them achieve these goals. For example, a gaming company can use Voice Search to monitor profanity, and a call center can use it to train new agents. Porcupine Wake Word detects keywords and phrases of interest. Octopus Speech-to-Index can search for mentions of competitors or profanity. In cases like call centers, where storing large audio files for multiple years is costly, Leopard Speech-to-Text and Octopus Speech-to-Index make long-term retention affordable by converting voice files to text or to a search index. Cobra Voice Activity Detection makes transcription more affordable for files with gaps between conversations. Cobra detects the parts that contain speech so that speech-to-text engines process only those parts instead of full-length files.
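The gating step described above can be sketched as follows, assuming a voice activity detector that yields one voice probability per fixed-length audio frame. The `speech_segments` helper, the threshold, the gap tolerance and the sample probabilities are hypothetical values chosen for illustration. The function merges voiced frames into time ranges, so only those ranges need to be sent to a speech-to-text engine.

```python
from typing import List, Tuple


def speech_segments(
    voice_probs: List[float],
    frame_sec: float,
    threshold: float = 0.5,   # assumed cut-off for "this frame contains voice"
    min_gap_frames: int = 2,  # silences up to this length stay inside a segment
) -> List[Tuple[float, float]]:
    """Turn per-frame voice probabilities into (start_sec, end_sec) segments."""
    segments = []
    start = None   # index of the first voiced frame in the open segment
    silence = 0    # consecutive unvoiced frames since the last voiced one
    for i, prob in enumerate(voice_probs):
        if prob >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence > min_gap_frames:
                end = i - silence + 1  # frame right after the last voiced one
                segments.append((start * frame_sec, end * frame_sec))
                start, silence = None, 0
    if start is not None:
        end = len(voice_probs) - silence
        segments.append((start * frame_sec, end * frame_sec))
    return segments


# Hypothetical detector output: 12 frames of 0.5 s each (6 s of audio).
probs = [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1, 0.1, 0.7, 0.9, 0.0]
segments = speech_segments(probs, frame_sec=0.5)
speech_sec = sum(end - start for start, end in segments)
print(segments, speech_sec)
```

In this made-up example only 3 of the 6 seconds contain speech, so transcription would run on half of the file. The saving on real recordings depends on how much silence they contain.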
