Five Questions with Conversation Design Expert Erika Hall

May 17, 2022

Turn your design into a voice product with Picovoice’s Self-Service Console.

At Picovoice, we know that building voice features that perform in the real world is hard. We help by building best-in-class engines that run across platforms and teach machines to understand humans when they speak, so organizations can focus on the user experience and iterate on their products continuously. We emphasize including users in the development process, because a product’s success is determined by its adoption, regardless of the modality or how complex the voice technology is.


Erika Hall addresses this in the very first pages of her book Conversational Design:

Conversational design is truly human-centered design, every step of the way. There is no next big thing, only the next step in an unfolding story of how people use technology to be more themselves.

Erika Hall is the co-founder of Mule Design, a strategic design consultancy in San Francisco, and the author of Conversational Design and Just Enough Research. Technologist and designer John Maeda recognizes Erika as one of the few humanist technologists working in Silicon Valley. She has been working as a designer for over two decades, witnessing recent advances in technology and engaging with many everyday brands. We started with the “what” and “why” of voice user interfaces and then talked about the “how”.

What is a voice user interface (VUI) and why is it important?

[Q1] How did voice interfaces capture your interest? Why is voice important?

I’ve always been interested in the human side of human-computer interaction, particularly the importance of natural language in interfaces. Even in a windows-based GUI, the words are the most important part. (Look how many of the apps are menus.) But when we talk about “design,” visual design is often the first thing that comes to mind.


If you consider how humans interact with each other, voice has been central for hundreds of thousands, if not millions, of years. So the conversation is close to the original interface. But because digital systems are not actually human beings, they lack the sensitivity to context and cognitive style that makes conversation work.

A user naturally expects a system that talks like a person to behave like a person. The chasm between this expectation and reality is huge. And unlike other interfaces, it’s a lot harder to provide cues to the limitations. This makes voice interfaces really challenging. The opportunity is for systems to be more accessible and human-centered, but getting there is much harder than it might seem initially.


[Q2] One of the pitfalls that you mention in your book is the device-centered perspective. Given your book was published in 2018, how do you see the business questions evolving?

[Note: Asking “how do we get our customers to make more purchases using the Amazon Echo” instead of “how can we make the presence of an Amazon Echo in the home provide more value to both Amazon and the customer?” is an example of a device-centered question that Erika Hall shares in her book.]

I think assumptions are evolving more than the questions. The basic questions of “What interface is actually the most human-centered (for a given human in a given place and time)?” and “How do we use technology to deliver both customer and business value?” are still valid and still asked too infrequently.

In 2018, there was a lot of hope and hype. Now we’ve had a few more years of living with voice technology and we’re finding that getting it right is hard. It’s always tempting to start from ever-progressing technical capabilities, rather than stubborn human nature. Now that we are a few years down the line, it is possible to examine the track record of successes and failures and see where the introduction of voice interfaces has made a real difference, and proceed more in that direction.

How to design voice user interfaces

[Q3] The lack of context awareness is another issue you bring up in your book. Could you elaborate on that?

What each of us needs and does is totally dependent on what’s going on around us at a given time. Think about going shopping in a store versus online. In the best-case scenario, a salesperson might observe from context cues that you look like you need help, or you know exactly what you want, or that you’re texting, or carrying a sleeping baby. They can adjust their behaviour and how they communicate with you based on their observations. They might gesture, whisper, point, or raise their voice. An e-commerce site or app can’t do that.

A critical category of context for voice interactions is shared spaces: Who else might be speaking or listening at the same time besides the anticipated user?

For example, think of calling a bank’s customer service line. A context-unaware assumption is that because a customer is calling instead of using a mobile app, the customer wants to use voice for all parts of the interaction. But what if the customer is on public transportation on the way to the airport and wants to quickly check a balance or make a payment in a public place? They probably won’t want to speak their account number or personally identifying information aloud. However, a customer who is in the middle of cooking dinner might want to check on their account in a completely hands-free manner.

A voice interface that works well when one person is alone at home in private might be terrible when there are two children and two adults in the same space, as in many households during the pandemic.

Bad assumptions make worse interactions. And the only way you can design a system that works in all likely contexts is to understand as much about the messy lived experiences of your potential users as possible.


[Q4] When you develop a graphical user interface, say a website, it’s accessible from every browser in various form factors, and if you want to change the colour, it’s easy. Building voice user interfaces is different. Most vendors do not offer cross-platform support or easy, iterative tools. What’s the impact of this lack of support on you as a design strategist and on your clients?

I think it’s actually too easy to make graphical changes to the surface of an interface. Often the level of visual polish can lead designers and decision-makers to think that the concepts underneath are stronger than they are. And the concepts underneath are the most important part.

Read Picovoice's tips for user experience design to learn more about iterative voice product development.

Designing and building are two different things, and a lot of the most important work of designing can be accomplished with very unsophisticated tools. Talking things through, sketching, role-playing, and making rough prototypes to test concepts. Things like that.

If you are very thoughtful about your research and design process—including being very clear on your organization’s capacity and goals—by the time you get to building out your system, you will be more able to iterate in a strategic, proactive, and efficient manner.

The cliché that “change is the only constant” is half true. It is very useful to go through the exercise of identifying which aspects of your system are likely to remain the same for a given time period while creating space for continuous listening, learning, adaptation, and growth. Finding this flexible middle ground is the essence of system design. There is no silver bullet or substitute for diligent research and cross-discipline collaboration.


[Q5] Lastly, what would be your advice for organizations interested in adding voice to their products?

Interested in more interviews? Read Stanford Computer Science Professor Monica Lam’s views on voice AI.

Look at the system holistically to understand how, why, and in what contexts voice interactions will actually make products easier to use and more valuable to both the customers and the business. Because you’re looking at a significant investment and a workflow change... Think across devices and modes, because that’s what humans do. And hire some poets and playwrights.
