Recent advances in speech recognition have created new opportunities in Sales and Customer Support. New startups such as Gong and Chorus were born, and large enterprises invested in building speech analytics solutions. Most enterprises buy cloud-based real-time speech-to-text from a third party, generally Big Tech, and focus on building the best analytics software to convert data into insights. However, cloud-based speech-to-text is not a good fit for Sales and Support Enablement solutions. This article showcases five Sales and Support Enablement use cases, each powered by a different voice AI solution.

“Real” Real-time Coaching with Local Speech-to-Text

Latency is critical for Real-Time Coaching solutions because the value of feedback decreases with every second. An agent shouldn’t receive feedback with a 15-second delay. At that point, there might not be much they can do. Most Real-Time Coaching software on the market first records speech data. Then it sends the data to the cloud for transcription and natural language processing. Finally, it displays feedback on the agent’s screen. Processing speech in the cloud adds network latency, which makes response time unreliable. With on-prem deployment, enterprises can control response time. That control is almost impossible to achieve when working with third-party cloud providers.

Processing speech data locally on agents’ devices eliminates the reliability and latency risks, offering “real” real-time coaching.

Searchable audio archive with Speech-to-Index

In Customer Support, one may need to find all the recordings that mention a person, “Cathie,” an organization, “The Asia Foundation,” or a product, “Qasqai.” It’s a well-known challenge in the industry that generic speech-to-text models struggle with proper nouns and competing hypotheses, such as homophones. An ASR can transcribe “Cathie” as “Cathy” and “The Asia Foundation” as “the age of foundations.” Fine-tuning ASR models addresses this challenge to a certain degree, but not completely. It’s hard for machines to tell Cathie from Cathy, or Brendan, Brenden and Brendon from one another.

Speech-to-Index is a technology built specifically to make audio files searchable and discoverable. Because it is designed for search rather than transcription, it can achieve higher search accuracy than speech-to-text.
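The homophone problem can be made concrete with a small text-level sketch. The example below is not how Speech-to-Index works internally; it only illustrates why exact keyword search over transcripts misses “Cathie” when the ASR wrote “Cathy,” and how a phonetic key (here the classic Soundex code) groups such spellings together. The file names and transcripts are made up for illustration.

```python
# Letter-to-digit table of the classic (American) Soundex code.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "bfpv", "2": "cgjkqsxz", "3": "dt",
        "4": "l", "5": "mn", "6": "r",
    }.items()
    for letter in letters
}


def soundex(word: str) -> str:
    """Return the 4-character Soundex key of a word ("" if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(c, "")  # vowels map to "" and reset `prev`
        if code and code != prev:
            key += code
        prev = code
    return (key + "000")[:4]


def exact_search(query: str, transcripts: dict) -> list:
    """Files whose transcript contains the query as an exact word."""
    q = query.lower()
    return sorted(f for f, text in transcripts.items() if q in text.split())


def phonetic_search(query: str, transcripts: dict) -> list:
    """Files whose transcript contains a word that sounds like the query."""
    target = soundex(query)
    return sorted(
        f for f, text in transcripts.items()
        if any(soundex(w) == target for w in text.split())
    )


# Hypothetical ASR output: the names were spelled differently from the query.
transcripts = {
    "call_001.wav": "thanks cathy i will send the invoice today",
    "call_002.wav": "brendon asked about the renewal",
    "call_003.wav": "the order shipped yesterday",
}

print(exact_search("Cathie", transcripts))     # []
print(phonetic_search("Cathie", transcripts))  # ['call_001.wav']
```

A text-level phonetic key is a crude workaround: it still depends on the transcript being roughly right and cannot recover “The Asia Foundation” from “the age of foundations.”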

Content Monitoring and Moderation with Keyword Spotting

In Sales and Support, Content Monitoring and Moderation should be real-time. For social media platforms hosting user-generated content, intervening a day or two later might be acceptable. Customer Support is different: in the U.S., 52% of consumers get so frustrated that they swear or even cry when contacting Customer Support. Timely action can easily differentiate an enterprise from its competitors. One solution is to use streaming speech-to-text and send the transcribed text for NLP processing every 30-45 seconds. However, in Sales and Support, one should intervene immediately when the conversation gets heated, not at the end of the next batch.

Content moderation for sales and support enablement is a time-sensitive matter. Keyword spotting allows real-time detection when action is needed.
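The cost of batching can be shown with simple arithmetic. The sketch below compares when a flagged word becomes visible under the two approaches. It is a simplification: a real keyword spotter works on audio frames, whereas this stand-in uses timestamped words, and the watchlist, timestamps and 30-second window are assumptions chosen to match the interval mentioned above.

```python
import math

# Hypothetical watchlist of words that should trigger an intervention.
WATCHLIST = {"cancel", "lawsuit", "unacceptable"}


def batch_detection_time(keyword_time_s: float, window_s: float = 30.0) -> float:
    """Time at which a word is seen if text is analyzed only at the end of each window."""
    return math.ceil(keyword_time_s / window_s) * window_s


def spot_keywords(events, watchlist=WATCHLIST):
    """Yield (timestamp, word) as soon as a watched word arrives in the stream."""
    for timestamp_s, word in events:
        if word.lower().strip(".,!?") in watchlist:
            yield timestamp_s, word


# Timestamped words standing in for a live call (seconds since call start).
events = [(29.0, "this"), (30.2, "is"), (31.0, "unacceptable!")]

for timestamp_s, word in spot_keywords(events):
    seen_by_batch_at = batch_detection_time(timestamp_s)
    print(f"{word!r} spoken at {timestamp_s}s, "
          f"batch pipeline sees it at {seen_by_batch_at}s")
```

In this made-up call, the word spoken at 31 seconds is not analyzed by the batch pipeline until the 60-second mark, before any cloud round trip is counted.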

Voice-activated shortcuts with Speech-to-Intent

Sales and Support employees should have access to relevant information while helping customers. Voice-activated shortcuts allow employees to find information fast. For example, when a customer asks a question about the refund policy, the employee can respond by saying: “I’m looking for the refund policy.” While the utterance is a natural part of the conversation, it can also bring the refund policy up on the employee’s screen.

Employees save time and keep the focus on customers with voice-activated shortcuts instead of typing to retrieve information.
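A voice-activated shortcut boils down to mapping an utterance to an intent and a slot, then dispatching on the result. The sketch below is only an illustration under assumptions: a regular expression over text stands in for the Speech-to-Intent engine (which would infer the intent from audio), and the trigger phrases, the `openDocument` intent name and the knowledge-base paths are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical knowledge-base shortcuts; the paths are placeholders.
SHORTCUTS = {
    "refund policy": "kb/refund-policy",
    "shipping times": "kb/shipping-times",
    "warranty terms": "kb/warranty-terms",
}

# Phrases an employee can say naturally while talking to the customer.
TRIGGER = re.compile(
    r"\b(?:looking for|pull up|show me) the (?P<topic>[a-z ]+)",
    re.IGNORECASE,
)


def infer_intent(utterance: str) -> Optional[dict]:
    """Text stand-in for an intent engine: return an intent with slots, or None."""
    match = TRIGGER.search(utterance)
    if match is None:
        return None
    spoken = match.group("topic").strip().lower()
    for topic in SHORTCUTS:
        if spoken.startswith(topic):
            return {"intent": "openDocument", "slots": {"topic": topic}}
    return None


def handle(utterance: str) -> Optional[str]:
    """Return the document to display, or None if the utterance is not a shortcut."""
    inference = infer_intent(utterance)
    if inference is None:
        return None
    return SHORTCUTS[inference["slots"]["topic"]]


print(handle("I'm looking for the refund policy."))  # kb/refund-policy
print(handle("How was your weekend?"))               # None
```

Restricting the shortcut to a small, fixed set of intents and topics is the design choice that matters here: ordinary conversation that matches no trigger is simply ignored.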

Power of Silence in Sales with Voice Activity Detection

Remember when an agent told you to hold on a second and it took ages, or when you talked to an agent who didn’t give you time to think or respond? Silence is a double-edged sword in human-to-human interaction. Active conversations are good for engagement. However, silence is also powerful, as it gives the other party time to think, process and respond. In sales and support, it is recommended to give the other party up to 10 seconds before breaking the silence. Voice Activity Detection can track the quiet moments in conversations. Moreover, when Voice Activity Detection is part of Real-Time Coaching software, a timer can start right after the agent shares an offer or recommendation.

Silence in sales and support should be long enough to give customers the opportunity to respond and short enough to keep the conversation active.
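The timer described above can be sketched in a few lines. The example assumes a Voice Activity Detection engine that emits one voice probability per audio frame; the frame length and the 0.5 threshold are illustrative assumptions, while the 10-second limit comes from the recommendation above.

```python
from dataclasses import dataclass


@dataclass
class SilenceTimer:
    """Counts consecutive silent frames reported by a VAD engine."""

    frame_s: float = 0.03         # assumed frame length in seconds
    threshold: float = 0.5        # assumed voice-probability cutoff
    max_silence_s: float = 10.0   # limit taken from the recommendation above
    silent_frames: int = 0

    @property
    def silence_s(self) -> float:
        """Length of the current silence in seconds."""
        return self.silent_frames * self.frame_s

    def reset(self) -> None:
        """Call right after the agent shares an offer or recommendation."""
        self.silent_frames = 0

    def update(self, voice_probability: float) -> bool:
        """Feed one frame; return True once the silence has reached the limit."""
        if voice_probability >= self.threshold:
            self.silent_frames = 0   # someone is speaking again
        else:
            self.silent_frames += 1
        return self.silence_s >= self.max_silence_s


# Usage with coarse half-second frames so the numbers are easy to follow.
timer = SilenceTimer(frame_s=0.5)
timer.reset()  # the agent has just finished presenting an offer
for _ in range(20):  # 20 silent frames of 0.5 s each = 10 s
    limit_reached = timer.update(voice_probability=0.1)
print(limit_reached)  # True: time to prompt the agent to speak
```

Counting frames rather than summing fractions of a second keeps the elapsed time free of accumulated rounding error.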