Find spoken words within audio files
Scalable, efficient and production-ready acoustic indexing
When it comes to accuracy, make a decision based on data, not prerecorded demos. We have created an open-source accuracy benchmark to show that Octopus Speech-to-Index achieves much higher accuracy than Google Speech-to-Text.
Process voice data on your premises without sending it to a third-party cloud. Do not risk sensitive information such as call centre recordings with personal information or meeting transcriptions with trade secrets. After clicking on the mic icon or uploading an audio file, turn off your Wi-Fi. The demo will still work!
Start indexing your voice data with the production-ready Octopus Speech-to-Index and the SDK of your choice. Add only a few lines of code and make your voice data searchable.
Build with Python

    o = pvoctopus.create(access_key)
    metadata = o.index_audio_file(path)
    matches = o.search(metadata, phrases)

Build with Android

    Octopus o = new Octopus.Builder().setAccessKey(accessKey).build(appContext);
    OctopusMetadata metadata = o.indexAudioFile(path);
    HashMap<String, OctopusMatch[]> matches = o.search(metadata, phrases);

Build with iOS

    let o = Octopus(accessKey: accessKey)
    let metadata = o.indexAudioFile(path: path)
    let matches = o.search(metadata: metadata, phrases: phrases)

Build with JavaScript

    const octopus = await OctopusWorkerFactory.create(
      accessKey,
      octopusIndexCallback,
      octopusSearchCallback
    );
    octopus.postMessage({ command: "index", input: audioSignal });
    octopus.postMessage({
      command: "search",
      metadata: octopusMetadata,
      searchPhrase: searchText,
    });

Build with C

    pv_octopus_t *octopus = NULL;
    pv_octopus_init(access_key, model_path, &octopus);
    pv_octopus_index_file(octopus, audio_path, &indices, &num_indices_bytes);
    pv_octopus_search(octopus, indices, num_indices_bytes, phrase, &matches, &num_matches);
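Once a search returns, the matches usually need to be filtered and presented. The sketch below is one possible way to do that in Python. It assumes that the search result maps each phrase to a list of matches carrying a start time, an end time, and a probability; the `Match` record, its field names, the `format_matches` helper, and the sample values are all illustrative stand-ins rather than the SDK's own types, so check the SDK documentation for the actual shape.

```python
from collections import namedtuple

# Hypothetical stand-in for a search match; the field names are an assumption.
Match = namedtuple("Match", ["start_sec", "end_sec", "probability"])


def format_matches(matches, min_probability=0.5):
    """Return one readable line per match, dropping low-probability hits.

    Phrases keep their original order; hits within a phrase are sorted by start time.
    """
    lines = []
    for phrase, hits in matches.items():
        for hit in sorted(hits, key=lambda h: h.start_sec):
            if hit.probability >= min_probability:
                lines.append(
                    f"{phrase}: {hit.start_sec:.2f}s-{hit.end_sec:.2f}s "
                    f"(p={hit.probability:.2f})"
                )
    return lines


# Made-up sample data standing in for a real search result.
sample = {
    "picovoice": [Match(12.4, 13.1, 0.93), Match(3.0, 3.6, 0.41)],
    "octopus": [Match(7.25, 7.9, 0.88)],
}

for line in format_matches(sample):
    print(line)
```

The probability threshold is a design choice: a higher value trades recall for fewer false hits, so it is worth tuning against your own recordings.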
Make all your voice data searchable without worrying about where it is stored. Run efficient and scalable Octopus Speech-to-Index anywhere.
English
German (Deutsch)
Spanish (Español)
French (Français)
Italian (Italiano)
Japanese (日本語)
Korean (한국어)
Portuguese (Português)
Mandarin (普通话)
Dutch (Nederlands)
Russian (Русский)
Hindi (हिन्दी)
Polish (Język polski)
Vietnamese (Tiếng Việt)
Swedish (Svenska)
Arabic (اَلْعَرَبِيَّةُ)
Search By Voice: Add voice for truly hands-free search experiences on websites, mobile applications, and devices.
Voice Search: Add voice search to mobile applications, websites, and devices. Find keywords and phrases in audio, video, and streams.
Speech Analytics: Transform customer and employee experience with speech analytics and intelligence tools powered by the only end-to-end Voice AI platform.
Voice Command: Add voice commands to devices and mobile or web applications to elevate the user experience.
Speech-to-Index is a technique that makes audio automatically searchable. Audio indexing, speech indexing, acoustic indexing, and acoustic searching are other terms used in industry and academia. It allows quick searches and rapid access to audio content. Since Octopus directly indexes speech, we coined the term Speech-to-Index.
Octopus was born out of market demand. Customers, especially in media and entertainment, were asking for something like Google but for audio and video content. Octopus indexes speech data the way Google indexes websites.
In short, Octopus Speech-to-Index is built for Voice Search, whereas Speech-to-Text is a workaround that enables search by first converting voice to text.
Voice Search has long been a topic of interest as a way to make audio content searchable and discoverable. Given the maturity of text indexing algorithms, converting voice to text and searching the transcription is one of the first things that comes to mind. However, Speech-to-Text has several limitations: out-of-vocabulary words, long-tail proper names, and competing hypotheses (homophones). We say this despite our own Speech-to-Text offerings because we value transparency. Google also admits these limitations: “one of the most difficult challenges of automated speech-to-text is correctly identifying these proper nouns, known as named entities.” Google has been working on the problem for a while by trying to improve ASR. Despite the limitations, Google and other voice technology providers try to fit speech-to-text to every problem. We tried to understand the need first and built a product accordingly.
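The homophone limitation can be illustrated with a toy Python example. This is only a sketch of the general idea that matching on sounds can succeed where matching on transcribed words fails; it is not a description of how Octopus works internally. The pronunciation table, the transcript, and both search helpers are hypothetical and written by hand for this illustration.

```python
# Hypothetical, hand-written pronunciations (ARPAbet-like), for illustration only.
PRONUNCIATIONS = {
    "we": ["W", "IY"],
    "bought": ["B", "AO", "T"],
    "drives": ["D", "R", "AY", "V", "Z"],
    "from": ["F", "R", "AH", "M"],
    "sea": ["S", "IY"],
    "gate": ["G", "EY", "T"],
    "seagate": ["S", "IY", "G", "EY", "T"],
}


def to_phonemes(text):
    """Flatten a phrase into one phoneme sequence using the toy table above."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes


def text_search(transcript, query):
    """Word-level search over a transcript, as a text index would do."""
    return query.lower() in transcript.lower().split()


def phonetic_search(transcript, query):
    """Search for the query's phoneme sequence anywhere in the transcript's phonemes."""
    haystack = to_phonemes(transcript)
    needle = to_phonemes(query)
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# Imagine a recognizer wrote the brand name as two common words.
transcript = "we bought drives from sea gate"

print(text_search(transcript, "Seagate"))      # False
print(phonetic_search(transcript, "Seagate"))  # True
```

Note that plain phoneme-sequence matching ignores word boundaries, so it can also produce false positives; a real system would need scoring on top of it.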
Enterprises use voice search to find proper nouns: location, brand, product, company, and celebrity names. It is also used for archiving, offering media and entertainment content, and social media listening. In each case, the need is to find proper names. However, it is almost impossible to foresee every search query when tuning speech-to-text, and re-transcribing already transcribed recordings with a specially tuned speech-to-text model is costly. Acknowledging these speech-to-text limitations, we built an alternative engine for the Voice Search use case.
Voice Search is for archiving, making media and entertainment content more discoverable, and social media listening. There are some applications in governance and compliance. Don’t forget to check out use case pages such as Voice Search and Speech Analytics to learn more.
Octopus Speech-to-Index offers multilingual support for English, French, German, Italian, Japanese, Korean, Portuguese, Spanish, and more.
Contact Picovoice Sales and tell us about the opportunity, including the use case, requirements and project details.
Picovoice docs, blogs, Medium posts, and GitHub are great resources to learn about voice recognition, Picovoice engines, and how to start building voice search products. Picovoice also offers GitHub community support to all Free Tier users.
Version changes appear in the Picovoice Newsletter, LinkedIn, and Twitter. Subscribing to GitHub is the best way to get notified of the patch releases. If you enjoy building with Octopus, don’t forget to give it a star when you’re on GitHub!