Octopus Speech-to-Index

Make audio and video archives searchable and discoverable.

Phonetic-based keyword search engine for audio streams, enabling search in massive libraries in seconds

What is Octopus Speech-to-Index?

Octopus Speech-to-Index is a search engine that indexes speech directly, without converting it into text, enabling keyword search within audio and video files.

Octopus Speech-to-Index finds any keyword, including proper names or slang, without knowing the exact spelling, removing the limitations of automated transcription solutions.

Find what matters, even without the exact spelling

Build with Python
    o = pvoctopus.create(access_key)
    metadata =
    matches = o.search(

Build with Android
    Octopus o = new Octopus.Builder()
    OctopusMetadata metadata =
    HashMap<String, OctopusMatch[]> matches = o.search(

Build with iOS
    let o = Octopus(accessKey: accessKey)
    let metadata = o.indexAudioFile(path: path)
    let matches = o.search(metadata: metadata, phrases: phrases)

Build with JavaScript
    const octopus = await OctopusWorkerFactory
    command: "index",
    input: audioSignal,
    command: "search",
    metadata: octopusMetadata,
    searchPhrase: searchText,

Build with C
    pv_octopus_t *octopus = NULL;
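
The snippets here share a two-step shape: index an audio file once, then search the resulting metadata for phrases. Below is a minimal Python sketch of that flow. It assumes the engine object exposes index_audio_file(path) and search(metadata, phrases), and that search returns a dictionary mapping each phrase to matches carrying start_sec, end_sec, and probability fields. Those names are assumptions made for illustration; check the SDK documentation for the exact signatures. The helper find_phrases is hypothetical and not part of any SDK.

```python
from typing import Any, List, Sequence


def find_phrases(engine: Any, audio_path: str, phrases: Sequence[str]) -> List[str]:
    """Index one audio file, search it, and return human-readable hits.

    `engine` is any object with index_audio_file(path) and
    search(metadata, phrases); both method names are assumptions.
    """
    # Step 1: build the searchable index. No transcript is produced.
    metadata = engine.index_audio_file(audio_path)

    # Step 2: query the index. Assumed to return {phrase: [match, ...]}.
    matches = engine.search(metadata, list(phrases))

    lines = []
    for phrase, hits in matches.items():
        for hit in hits:
            lines.append(
                f"{phrase!r} at {hit.start_sec:.2f}s-{hit.end_sec:.2f}s "
                f"(p={hit.probability:.2f})"
            )
    return lines
```

With the Python SDK installed, the engine would presumably come from pvoctopus.create(access_key), as in the Python snippet. Keeping the engine as a parameter also makes the helper easy to exercise with a stand-in object during testing.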

Why Octopus Speech-to-Index?

Enterprises use automated transcription to find keywords and phrases in their audio and video libraries, even though it was not built for this purpose. Automated transcription struggles with homophones and cannot transcribe words that are missing from its dictionary.

Octopus Speech-to-Index uses acoustic-based search, which finds keywords with much higher accuracy than a generic transcription engine.

Discover your audio and video libraries!

Monetize your content, monitor conversations, or ensure compliance without the limitations of automatic transcription.

Octopus Speech-to-Index

  • Accurate keyword search
  • 50x faster processing

Automatic Transcription APIs

  • Accurate generic transcription
  • Large and bulky models
  • Third-party data sharing

Beyond speech-to-text accuracy

Accurate: Backed by an open-source benchmark

Make four times fewer errors than Google Speech-to-Text. The open-source benchmark shows that Octopus Speech-to-Index is the right tool for keyword search and outperforms transcription-based workarounds.

50x faster processing

Lightning-fast indexing across platforms

Make audio and video files searchable in seconds. Compared to Mozilla DeepSpeech, Octopus Speech-to-Index processes voice data 50 times faster while returning ten times more accurate results.

Fully private with on-device processing

Stay compliant with GDPR, CCPA, HIPAA, and more!

Protect sensitive information, such as call center recordings with personal data or legal depositions with confidential information. Automated transcription APIs send voice data to a third-party cloud for processing, while Octopus Speech-to-Index processes voice data on-device.

Get started with Octopus Speech-to-Index

The best way to learn about Octopus Speech-to-Index is to use it!

Start Now
Forever Free Plan
  • Intuitive SDKs
  • Resource-efficient
  • Unlimited Search
  • English, French, German, Italian, Japanese, Korean, Portuguese, and Spanish

Learn more about Octopus Speech-to-Index

What is Speech-to-Index?

Speech-to-Index, also known as audio indexing, speech indexing, and acoustic indexing, is a technique that makes audio automatically searchable and discoverable. Because it performs searches based on phonetics, it's also known as phonetic search, phonetic-based search, and acoustic search. It allows quick searches and rapid access to audio content. Picovoice built Octopus Speech-to-Index in response to market demand. It indexes even massive audio and media libraries, much as Google indexes websites, and returns keyword search results.
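
The index-once, search-many pattern described here can be sketched in Python as a small catalog that maps each file to its index metadata. This is an illustration under stated assumptions, not Picovoice's implementation: build_catalog and search_catalog are hypothetical helpers, and the engine is assumed to expose index_audio_file(path) and search(metadata, phrases), with matches carrying start_sec and probability fields.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_catalog(engine: Any, audio_paths: Iterable[str]) -> Dict[str, Any]:
    """Index every file once and keep the resulting metadata by path."""
    return {path: engine.index_audio_file(path) for path in audio_paths}


def search_catalog(
    engine: Any, catalog: Dict[str, Any], phrase: str
) -> List[Tuple[str, float, float]]:
    """Return (path, start_sec, probability) hits, most confident first."""
    hits = []
    for path, metadata in catalog.items():
        # Assumed return shape: {phrase: [match, ...]}.
        for match in engine.search(metadata, [phrase]).get(phrase, []):
            hits.append((path, match.start_sec, match.probability))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

Indexing is the expensive step, so the catalog is built once; each later query only walks the stored metadata.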

Why is Octopus Speech-to-Index better than using speech-to-text to find keywords and phrases in media content?

Octopus Speech-to-Index is built for finding keywords and phrases, whereas speech-to-text is for generic transcription. Given the maturity of text indexing algorithms, transcribing voice to text and then searching the text seems like a good workaround to many. However, speech-to-text has trouble correctly identifying proper nouns and homophones (Katia Leighton vs. Katja Layton, or fair vs. fare). Acknowledging these limitations, Picovoice built an acoustic-based phonetic search engine, Octopus Speech-to-Index, dedicated to finding keywords and phrases in audio libraries with high accuracy and speed.
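
To see why spelling matters for text search but not for phonetic search, consider the toy Python comparison below. The pronunciation table is hand-written for illustration and covers only four words; it is not how Octopus Speech-to-Index represents audio internally, and same_spelling and same_sound are hypothetical helpers.

```python
# Hand-written, ARPAbet-style pronunciations (illustrative values only).
TOY_PRONUNCIATIONS = {
    "fair": ("F", "EH", "R"),
    "fare": ("F", "EH", "R"),
    "leighton": ("L", "EY", "T", "AH", "N"),
    "layton": ("L", "EY", "T", "AH", "N"),
}


def same_spelling(a: str, b: str) -> bool:
    """Text search view: two words match only if spelled the same."""
    return a.lower() == b.lower()


def same_sound(a: str, b: str) -> bool:
    """Phonetic view: two words match if their phoneme sequences agree."""
    sounds_a = TOY_PRONUNCIATIONS.get(a.lower())
    sounds_b = TOY_PRONUNCIATIONS.get(b.lower())
    # Words missing from the toy table never match.
    return sounds_a is not None and sounds_a == sounds_b
```

Here same_spelling("Leighton", "Layton") is False while same_sound("Leighton", "Layton") is True, which is the gap a transcript-based search falls into when the transcriber picks the other spelling.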

What can I build with Octopus Speech-to-Index?

Octopus Speech-to-Index enables many use cases, including media asset management, legal e-discovery, dialogue search, and social media listening.

Which platforms does Octopus Speech-to-Index support?

Which languages does Octopus Speech-to-Index support?

Octopus Speech-to-Index offers multilingual support for English, French, German, Italian, Japanese, Korean, Portuguese, Spanish, and more.

What should I do if I need support for other languages?

Contact Picovoice Sales and tell us about the opportunity, including the use case, requirements, and project details.

How do I get technical support for Octopus Speech-to-Index?

Picovoice docs, blog, Medium posts, and GitHub are great resources to learn about voice AI, Picovoice technology, and how to enhance speech quality. Picovoice also offers GitHub community support to all Free Plan users.

How can I get informed about the updates and upgrades?

Version changes are announced on channels including LinkedIn and Twitter. Subscribing to GitHub is the best way to get notified of patch releases. If you enjoy building with Octopus Speech-to-Index, show it by giving it a GitHub star!