When it comes to accuracy, make a decision based on data, not prerecorded demos. We created an open-source accuracy benchmark showing that Octopus Speech-to-Index is considerably more accurate than Google Speech-to-Text.
Process voice data on your premises without sending it to a third-party cloud. Do not risk sensitive information such as call centre recordings with personal information or meeting transcriptions with trade secrets. After clicking on the mic icon or uploading an audio file, turn off your Wi-Fi. The demo will still work!
Speech-to-Index is a technique that makes audio automatically searchable, allowing quick searches and rapid access to audio content. Audio indexing, speech indexing, acoustic indexing, and acoustic searching are other terms used in industry and academia. Since Octopus indexes speech directly, we coined the term Speech-to-Index.
Octopus was born out of market demand. Customers, especially in media & entertainment, were asking for something like Google but for audio and video content. Octopus indexes speech data the way Google indexes websites.
In short, Octopus Speech-to-Index is built for Voice Search, whereas Speech-to-Text is a workaround that enables search by converting voice to text.
Voice Search has long been a topic of interest as a way to make audio content searchable and discoverable. Given the maturity of text indexing algorithms, converting voice to text and searching the transcription is one of the first approaches that comes to mind. However, Speech-to-Text has several limitations: out-of-vocabulary words, long-tail proper names, and competing hypotheses (homophones). We say this despite offering Speech-to-Text ourselves because we value transparency. Google also admits these limitations: “one of the most difficult challenges of automated speech-to-text is correctly identifying these proper nouns, known as named entities.” Google has been working on this for a while by trying to improve automatic speech recognition (ASR). Despite the limitations, Google and other voice technology providers try to fit speech-to-text to every problem. We tried to understand the need first and built a product accordingly.
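To make the out-of-vocabulary limitation concrete, here is a minimal sketch of the transcribe-then-search workaround. It is not Octopus or any vendor's implementation: the transcript is made-up data standing in for Speech-to-Text output, and `build_index` and `search` are hypothetical helpers written for this illustration. The assumption is that the speaker said "Picovoice" but the engine, lacking that name in its vocabulary, emitted the similar-sounding "pick a voice".

```python
from collections import defaultdict

# Hypothetical Speech-to-Text output as (word, start time in seconds) pairs.
# Assumption for illustration: the speaker said "Picovoice", but the engine
# did not know the name and produced "pick a voice" instead.
transcript = [
    ("welcome", 0.0), ("to", 0.4), ("the", 0.5),
    ("pick", 0.7), ("a", 0.9), ("voice", 1.0),
    ("podcast", 1.5),
]

def build_index(words):
    """Build a plain inverted index: word -> list of (position, start time)."""
    index = defaultdict(list)
    for position, (word, start_sec) in enumerate(words):
        index[word.lower()].append((position, start_sec))
    return index

def search(index, phrase):
    """Return start times where the words of `phrase` occur consecutively."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    for position, start_sec in index.get(terms[0], []):
        # Every following term must appear at the next position in sequence.
        if all(
            any(p == position + offset for p, _ in index.get(term, []))
            for offset, term in enumerate(terms[1:], start=1)
        ):
            hits.append(start_sec)
    return hits

index = build_index(transcript)
print(search(index, "podcast"))    # [1.5]
print(search(index, "Picovoice"))  # []
```

The in-vocabulary word is found, while the proper name returns nothing because the text index only ever sees what the transcription step produced. Any search over the transcript inherits the transcription errors.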
Enterprises use voice search to find proper nouns: location, brand, product, company, and celebrity names. It is also used for archiving, offering media and entertainment content, and social media listening. In each case, the need is to find proper names. However, it is almost impossible to foresee every search term when tuning speech-to-text, and re-transcribing already transcribed recordings with a newly tuned speech-to-text model is costly. Acknowledging these speech-to-text limitations, we built an alternative engine for the Voice Search use case.
Octopus Speech-to-Index offers multilingual support for English, French, German, Italian, Japanese, Korean, Portuguese, Spanish, and more.
Contact Picovoice Sales and tell us about the opportunity, including the use case, requirements and project details.