Find spoken words within audio files
Scalable, efficient and production-ready acoustic indexing
When it comes to accuracy, make a decision based on data, not prerecorded demos. We have created an open-source accuracy benchmark to show that Octopus Speech-to-Index achieves much higher accuracy than Google Speech-to-Text.
Process voice data on your premises without sending it to a third-party cloud. Do not put sensitive information at risk, such as call centre recordings containing personal information or meeting transcriptions containing trade secrets. To see for yourself, click the mic icon or upload an audio file, then turn off your Wi-Fi. The demo will still work!
Start indexing your voice data with the production-ready Octopus Speech-to-Index and the SDK of your choice. Add only a few lines of code and make your voice data searchable.
o = pvoctopus.create(access_key)
metadata = o.index_audio_file(path)
matches = o.search(metadata, phrases)
Build with Python
Octopus o = new Octopus.Builder().setAccessKey(accessKey).build(appContext);
OctopusMetadata metadata = o.indexAudioFile(path);
HashMap<String, OctopusMatch> matches = o.search(metadata, phrases);
Build with Android
let o = Octopus(accessKey: accessKey)
let metadata = o.indexAudioFile(path: path)
let matches = o.search(metadata: metadata, phrases: phrases)
Build with iOS
pv_octopus_t *octopus = NULL;
pv_octopus_init(access_key, model_path, &octopus);
pv_octopus_index_file(octopus, audio_path, &indices, &num_indices_bytes);
pv_octopus_search(octopus, indices, num_indices_bytes, phrase, &matches, &num_matches);
Build with C
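The three-call Python pattern above can be wrapped into a small helper that also reports where each phrase was found. This is a minimal sketch, not the official sample: it assumes that search results map each phrase to a list of matches exposing start_sec, end_sec, and probability fields, and that the engine is released with delete(). The function names summarize_matches and find_phrases are hypothetical.

```python
def summarize_matches(matches):
    """Turn {phrase: [match, ...]} into sorted, printable report lines.

    Assumes each match exposes start_sec, end_sec and probability attributes.
    """
    lines = []
    for phrase in sorted(matches):
        for m in sorted(matches[phrase], key=lambda m: m.start_sec):
            lines.append(
                f"{phrase}: {m.start_sec:.2f}s-{m.end_sec:.2f}s "
                f"(probability {m.probability:.2f})"
            )
    return lines


def find_phrases(access_key, audio_path, phrases):
    """Index one audio file and search it for the given phrases."""
    # Imported here so summarize_matches can be used without the SDK installed.
    import pvoctopus

    octopus = pvoctopus.create(access_key)
    try:
        metadata = octopus.index_audio_file(audio_path)
        return octopus.search(metadata, phrases)
    finally:
        # Assumed cleanup call that releases the engine's native resources.
        octopus.delete()
```

A typical call would be summarize_matches(find_phrases(access_key, path, phrases)), printing one line per occurrence.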
Make all your voice data searchable, wherever it lives. Run the efficient and scalable Octopus Speech-to-Index anywhere.
Search By Voice: Add voice for truly hands-free search experiences on websites, mobile applications, and devices.
Voice Search: Add voice search to mobile applications, websites, and devices. Find keywords and phrases in audio, video, and streams.
Speech Analytics: Transform customer and employee experiences with speech analytics and intelligence tools powered by the only end-to-end Voice AI platform.
Voice Command: Add voice commands to devices and mobile or web applications to elevate the user experience.
Speech-to-Index is a technique that makes audio automatically searchable. Audio indexing, speech indexing, acoustic indexing, and acoustic searching are other terms used in the industry and academia. It allows quick searches and rapid access to audio content. Since Octopus Speech-to-Index directly indexes speech, we coined the term Speech-to-Index.
Octopus was born out of market demand. Customers, especially in media and entertainment, were asking for something like Google but for audio and video content. Octopus indexes speech data the way Google indexes websites.
In short, Octopus Speech-to-Index is built for Voice Search, whereas Speech-to-Text is a workaround that enables search by first converting voice to text.
Voice Search has long been of interest as a way to make audio content searchable and discoverable. Given the maturity of text indexing algorithms, converting voice to text and searching the transcription is one of the first things that comes to mind. However, Speech-to-Text has several limitations: out-of-vocabulary words, long-tail proper names, and competing hypotheses (homophones). We say this despite our own Speech-to-Text offerings because we value transparency. Google also admits these limitations: “one of the most difficult challenges of automated speech-to-text is correctly identifying these proper nouns, known as named entities.” Google has been working on it for a while by trying to improve ASR. Despite the limitations, Google and other voice technology providers try to fit speech-to-text to every problem. We tried to understand the need first and built a product accordingly.
Enterprises use voice search to find proper nouns: location, brand, product, company, and celebrity names. It is also used for archiving, offering media and entertainment content, and social media listening. In each case, the need is to find proper names. However, it is almost impossible to foresee every search when tuning speech-to-text, and re-transcribing already transcribed recordings with a further-tuned speech-to-text model is costly. Acknowledging these speech-to-text limitations, we built an alternative engine for the Voice Search use case.
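The point about unforeseen searches can be illustrated with an index-once, search-many pattern: recordings are indexed a single time, and phrases nobody anticipated are later searched against the stored indexes without processing the audio again. The sketch below is an assumption-laden illustration rather than the product's prescribed workflow. The helper search_archive is hypothetical, and it only assumes an engine object whose search(metadata, phrases) call returns a mapping from phrase to a list of matches, as in the Python snippet earlier on this page.

```python
def search_archive(engine, index_by_file, phrases):
    """Search previously built indexes for new phrases.

    engine        -- any object with search(metadata, phrases) -> {phrase: [matches]}
    index_by_file -- {file name: metadata produced earlier by indexing that file}

    Returns {file name: {phrase: [matches]}}, keeping only phrases that were found.
    """
    hits = {}
    for name, metadata in index_by_file.items():
        results = engine.search(metadata, phrases)
        # Drop phrases with no occurrences so the report stays compact.
        found = {phrase: m for phrase, m in results.items() if m}
        if found:
            hits[name] = found
    return hits
```

With the Python SDK, engine would be the object returned by pvoctopus.create(access_key), and index_by_file could be built once as {path: engine.index_audio_file(path) for path in paths}. New proper names can then be searched at any time by calling search_archive again with a different phrase list.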
Voice Search is used for archiving, making media and entertainment content more discoverable, and social media listening. There are also applications in governance and compliance. Don’t forget to check out use case pages such as Voice Search and Speech Analytics to learn more.
Octopus Speech-to-Index offers multilingual support for English, French, German, Italian, Japanese, Korean, Portuguese, Spanish, and more.
Contact Picovoice Sales and tell us about the opportunity, including the use case, requirements and project details.
Picovoice docs, blogs, Medium posts, and GitHub are great resources to learn about voice recognition, Picovoice engines, and how to start building voice search products. Picovoice also offers GitHub community support to all Free Tier users.
Version changes appear in the Picovoice Newsletter, LinkedIn, and Twitter. Subscribing to GitHub is the best way to get notified of the patch releases. If you enjoy building with Octopus, don’t forget to give it a star when you’re on GitHub!