A Voice User Interface, Voice UI, or VUI for short, is a user interface that facilitates human-computer interaction through spoken language.

Voice User Interfaces use voice AI technology to let users operate devices, software, and systems through spoken instructions, bridging the gap between human communication patterns and digital interactions. A Voice User Interface complements or replaces other interaction methods, such as clicking, typing, or swiping.

Voicebots, popularized by smart speakers, are a typical example of a Voice User Interface.

How Voice User Interfaces Work

Voice UIs are powered by algorithms that process spoken language. A product with a VUI, such as a smart speaker, relies on voice AI engines that process speech behind the scenes to extract its meaning. Processing and understanding spoken language may sound simple, but the technology behind it is not. The choice of technologies, how they are combined, and how they are implemented can significantly affect performance and the user experience. A Voice User Interface may use Keyword Spotting, Speaker Recognition, Speech-to-Intent, Speech-to-Text, Natural Language Processing, Voice Activity Detection, Speech Enhancement, and Large Language Models.
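To make the chaining of these components more concrete, here is a minimal sketch of two stages, a wake word gate followed by intent extraction. It is a toy under loud assumptions: it works on text transcripts instead of audio, the wake word "computer" and the regular-expression intent patterns are hypothetical, and a real VUI would use trained Keyword Spotting and Speech-to-Intent (or Speech-to-Text plus Natural Language Processing) engines in their place.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical wake word, chosen only for this illustration.
WAKE_WORD = "computer"

# Hypothetical intent patterns; a real engine would use a trained model,
# not regular expressions over text.
INTENT_PATTERNS = {
    "turn_on_lights": re.compile(r"\bturn on the (?P<room>\w+) lights?\b"),
    "set_timer": re.compile(r"\bset a timer for (?P<minutes>\d+) minutes?\b"),
}


@dataclass
class Intent:
    name: str
    slots: dict = field(default_factory=dict)


def strip_wake_word(utterance: str) -> Optional[str]:
    """Stand-in for Keyword Spotting: return the text after the wake word,
    or None if the wake word was not said."""
    text = re.sub(r"[^\w\s]", "", utterance.lower())
    tokens = text.split()
    if WAKE_WORD not in tokens:
        return None
    idx = tokens.index(WAKE_WORD)
    return " ".join(tokens[idx + 1:])


def infer_intent(command: str) -> Optional[Intent]:
    """Stand-in for Speech-to-Intent: map a command to an intent and slots."""
    for name, pattern in INTENT_PATTERNS.items():
        match = pattern.search(command)
        if match:
            return Intent(name, match.groupdict())
    return None


def handle_utterance(utterance: str) -> Optional[Intent]:
    """Run the two stages in order; ignore speech without the wake word."""
    command = strip_wake_word(utterance)
    if command is None:
        return None
    return infer_intent(command)
```

For example, `handle_utterance("Computer, turn on the kitchen lights")` yields an intent named `turn_on_lights` with the slot `room` set to `kitchen`, while the same sentence without the wake word is ignored. Gating on the wake word first mirrors why the component choices matter: the later, heavier stages only run when the user is actually addressing the device.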

Examples of Voice User Interfaces

Voice UIs have found their place in our homes through smart speakers and on our phones. Famous Voice UIs, such as Alexa and Siri, are known by their wake words, which highlights the importance of choosing the right wake word. Interactive Voice Response (IVR) systems are another typical Voice User Interface example. An IVR allows users to navigate menus and obtain information by speaking. Other Voice User Interface examples include in-car entertainment systems, voice picking, voice inspection, and automated drive-thrus.

Embracing the Most Intuitive User Interaction

As conversation design expert Erica Hall states, voice has been central to how humans interact with each other for hundreds of thousands, if not millions, of years. We learn to speak before we learn to read and write. Voice UIs tap into this inherent ability. They bridge the gap between humans and technology, making devices feel more like companions than machines.

