The once-silent relationship between humans and machines has been revolutionized by the introduction of voice devices. From smart speakers to voice assistants like Siri, Alexa, and Bixby, people are now speaking to their devices and getting things done.
People no longer need to touch their smartphones or computers. All it takes is a simple “Hey, Siri,” and voila! An entire world of information opens up.
As a UI designer, the key question is: how do you tap into this new era of voice-enabled devices and create a user interface that feels natural?
Well, we’ll explore some of the key considerations to keep in mind when designing for voice, including the importance of using natural language and integrating voice commands into your interfaces.
We’ll also examine the key trends shaping this exciting new space, from adaptive voice interfaces to visual chatbots.
So, if you’re ready to join the next frontier of user interfaces, we suggest you keep reading!
What’s a Voice User Interface (VUI)?
At its core, a VUI is simply a system that allows users to interact with their devices (such as smart speakers, digital assistants, and smartwatches) using voice commands.
The first VUI — then called interactive voice response (IVR) — was developed in the 80s and saw widespread adoption in the 90s and early 2000s as phone systems migrated from analogue to digital platforms.
It was common in customer service, where callers would dial into a company’s automated system and interact with it using voice commands.
By recognizing common words and phrases, these systems could route callers to recorded responses or connect them to a live agent to handle the more complex requests.
Today, VUIs have evolved significantly, thanks in part to advancements in natural language processing (NLP) and artificial intelligence.
With VUIs, users can now speak naturally with their devices to get things done, whether that means checking the weather, playing music, or ordering groceries.
Everyday use cases for VUIs include online shopping, banking and investment, travel, and gaming.
As technology continues to evolve, there’s no doubt that VUIs will play a central role in the future of user interfaces.
Well-known VUIs include Siri, Alexa, Cortana, Bixby, and Google Assistant (the successor to Google Now), along with the smart speakers they power, such as the Amazon Echo.
Why Is VUI a Growing Trend?
One of the reasons VUI became such a popular trend is that it represents a natural evolution from keyboards and touchscreens.
With touch-based interfaces, users are forced to interact with their devices through small taps and swipes, which can prove cumbersome and frustrating.
However, with voice-enabled interfaces, users can speak their commands and get things done quickly and easily.
Another plausible explanation is the prevalent use of smart speakers, which have made it easier for people to integrate voice commands into their daily lives.
These devices offer a hands-free, frictionless experience that allows users to accomplish tasks quickly and easily without switching between different screens or devices.
More specifically, interacting with voice interfaces offers a few advantages over traditional user interfaces, including:
Speed: Think about how frustrating it can be to type out long commands on a keyboard or small screen. With voice, users can speak their commands and get things done almost instantly.
Research suggests that speaking can be several times faster than typing on a smartphone.
Ease: With voice interfaces, users don’t have to worry about learning a new set of commands or navigating multiple screens. Instead, they can speak naturally, as if conversing with another person.
A good VUI design should be intuitive, with clear and consistent voice prompts that make it easy for users to figure out how to use the system.
Safe Multi-tasking: When driving, cooking, or doing other activities that require your hands, voice commands can be a lifesaver.
They offer a hands-free experience that allows users to accomplish tasks without worrying about fumbling with their devices or getting distracted.
Accessible Design: Voice interfaces are also ideal for users with visual or motor impairments, as they offer a more accessible and inclusive user experience.
In other words, voice interfaces can help level the playing field, enabling more people to get things done without relying on traditional input methods.
A More Human-like Interaction: Finally, one of the biggest advantages of VUIs is that they allow users to interact with technology in a human-like manner.
Compared to traditional input methods, which often feel cold and detached, voice interfaces tend to feel more familiar and friendly, making users more comfortable using them.
Voice User Interface (VUI) Design: Factors to Consider
When designing a VUI, there are several key factors to consider:
From Visual to Verbal: One of the biggest differences between VUIs and traditional user interfaces is that input is primarily verbal rather than visual.
With traditional interfaces, users are forced to learn and remember a set of commands that they must enter through text or graphical elements.
With VUIs, on the other hand, users simply speak their commands and the system understands and executes them. The user journey will likely differ, relying on vocal commands and feedback instead of aesthetic elements and visual cues.
More Conversational: Another key difference is that VUIs tend to be more conversational and human-like in their interaction style.
Unlike traditional interfaces, which often seem sterile and impersonal, voice interfaces allow users to interact with the system in an informal, human-like way.
The earlier, simpler version, IVR (Interactive Voice Response), worked only through selections and required users to remember a set of pre-determined commands.
However, VUI technology takes things a step further by using natural language processing and machine learning to understand and respond to more complex queries.
The technology doesn’t just take voice commands and give back a response but also processes speech, learns from past interactions and user preferences, and adjusts its responses accordingly.
That way, the system can understand and respond to a broader range of queries, offering a more personalized and natural user experience.
That’s what makes smart speakers like the Amazon Echo and Google Home “smart.” They incorporate artificial intelligence (AI) and natural language processing (NLP) to understand and respond to more complex commands, even going as far as considering things like context, user history, and user preferences.
What Did You Say? Given that the input is primarily verbal, we suggest you consider the different dialects and accents your users may have and how your voice commands might be interpreted differently based on those accents.
For example, say your VUI is meant to communicate in UK English, but your user speaks with a strong Australian accent.
In that case, your system might not understand the user’s commands correctly, or it might not respond in a way that feels natural to the user.
The dialect difference doesn’t even have to be that extreme.
Even slight differences like West Coast vs. East Coast American accents are enough to throw off a VUI.
When designing for voice, the key is to create dialogues that are conversational and feel natural.
You want to avoid over-formal speech and word choice, instead using more casual language that seems natural and fits the context of the conversation.
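One practical way to soften dialect differences is to normalize region-specific vocabulary before your system tries to match an intent. The sketch below is a hypothetical, illustrative example (the synonym list and function names are my own, not from any particular VUI platform); real systems would rely on the speech recognizer's locale support on top of something like this.

```python
# Hypothetical sketch: map region-specific phrases onto one canonical
# vocabulary before intent matching. The synonym list is illustrative,
# not exhaustive.
REGIONAL_SYNONYMS = {
    "petrol station": "gas station",
    "chemist": "pharmacy",
    "takeaway": "takeout",
}

def normalize_utterance(utterance: str) -> str:
    """Lowercase the utterance and swap regional variants for canonical terms."""
    text = utterance.lower().strip()
    for variant, canonical in REGIONAL_SYNONYMS.items():
        text = text.replace(variant, canonical)
    return text

print(normalize_utterance("Find the nearest petrol station"))
# → "find the nearest gas station"
```

A lookup table like this only covers vocabulary, not accent or pronunciation, which is why locale-aware speech recognition still matters upstream.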
Data Privacy and Security: Security concerns are critical when designing VUIs, as these systems are always on and constantly collecting data.
They’re always taking in information through their microphones or their built-in cameras and transmitting it back to the cloud for processing.
That might raise some privacy concerns, especially since the information is often sent to a remote server rather than a local device.
When designing your VUI, you want to install security measures that ensure users’ data is always protected and kept private.
The VUI technology must have a secure end-to-end encryption protocol to prevent unauthorized user data access.
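As one small illustration of protecting data in transit, the sketch below signs an audio payload on the device so the server can detect tampering. This is a hypothetical example using Python's standard `hmac` module; it shows integrity checking only, and a real deployment would pair it with TLS or full end-to-end encryption, with proper key provisioning (the hard-coded key here is a placeholder).

```python
import hashlib
import hmac

# Placeholder key: real devices would receive a per-device secret at
# provisioning time, never a hard-coded constant.
SHARED_KEY = b"device-provisioned-secret"

def sign_payload(audio_bytes: bytes) -> bytes:
    """Produce an HMAC-SHA256 tag for an outgoing audio chunk."""
    return hmac.new(SHARED_KEY, audio_bytes, hashlib.sha256).digest()

def verify_payload(audio_bytes: bytes, signature: bytes) -> bool:
    """Server-side check that the chunk wasn't altered in transit."""
    expected = hmac.new(SHARED_KEY, audio_bytes, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)

chunk = b"\x00\x01fake-audio-frame"
tag = sign_payload(chunk)
assert verify_payload(chunk, tag)
assert not verify_payload(chunk + b"tampered", tag)
```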
Speech is Resource-intensive: Similar to AI and NLP, speech recognition is a resource-intensive technology requiring powerful cloud servers.
First, it calls for increased computational power to process the vast amounts of data collected by VUIs.
Then, it requires a lot of energy and robust power sources to keep everything running smoothly.
You also have to factor in the complexities of programming and managing the technology, not to forget costs and storage space.
All these factors mean that, when designing the UI for a VUI, you must consider how much processing power your system will require and, most importantly, how you will manage it.
VUIs are all the rage right now as more and more users look to make their online experiences hands-free.
Despite the challenges associated with designing for voice, many see the technology as a vital component of the future of user experience design.
To stay competitive in this ever-changing landscape, you must keep up with the latest VUI trends and best practices for designing for voice.
How to Design for the Age of Voice
As a designer, how do you embrace this new medium and create a voice user interface (VUI) that genuinely fits the needs of your users?
Step #1: Start by Experimenting with Voice
The idea is to make your website more conversational, and there’s more than one way to do it.
You can start by trying out different voice-driven features, such as chatbots, voice commands, or natural language processing (NLP)-powered search.
Think about the voice-based interactions you want your users to have, and explore how you’ll use voice technology to make those interactions as seamless and intuitive as possible.
For instance, instead of embedding voice technology into your website, you might want to create a separate Amazon Alexa Skill for devices like the Amazon Echo or Google Home.
Then, you can experiment with different types of voice interactions to see what works best for your users and iterate on that based on their feedback.
Companies like Capital One, for example, have built robust VUI skills that let users access their banking information and even make payments using just their voice. Users don’t have to open their laptops or pull out their phones to access these features — all they have to do is speak.
To determine what kind of voice interface is right for your users, you must work with them on customer journey maps that identify their key interactions with your product.
These journey maps will help you better understand your users’ needs and how voice fits into their overall experience.
That will also help you determine what features to prioritize when designing an actual VUI. For example, if security is a key concern for your users, you’ll want to ensure that robust security measures are in place when designing the VUI.
Similarly, suppose your VUI requires a lot of computational power. In that case, you might need to consider how much processing power your system will need — and what kind of infrastructure you’ll need to support it.
Say your customers are always asking questions about a brand and its services. You can create a FAQ page supported by bots that can respond to voice queries and offer information on the spot.
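A voice-enabled FAQ like this can be sketched as a simple keyword matcher. The example below is a hypothetical, minimal illustration (the questions, answers, and function names are invented for this sketch); a production bot would use NLP intent classification rather than raw keyword checks.

```python
# Hypothetical sketch: a keyword-matched FAQ responder for voice queries.
# The keywords and answers are illustrative stand-ins for a real brand's FAQ.
FAQ = {
    ("hours", "open"): "We're open from 9am to 6pm, Monday to Saturday.",
    ("shipping", "deliver"): "Standard shipping takes 3 to 5 business days.",
}

def answer_query(query: str) -> str:
    """Return the first FAQ answer whose keywords appear in the query."""
    words = query.lower()
    for keywords, answer in FAQ.items():
        if any(keyword in words for keyword in keywords):
            return answer
    # Fallback: don't leave the user at a dead end.
    return "I'm not sure about that. Let me connect you to support."

print(answer_query("What time are you open?"))
# → "We're open from 9am to 6pm, Monday to Saturday."
```

Even this toy version shows the key design decision: every query gets a response, including the ones the bot can't answer.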
Step #2: Examine the Anatomy of Voice Commands
Before creating a dialog flow for your VUI, you must understand how voice commands work. Make sure you understand the objective of the interaction, as well as the expected user responses and the valid values for each response.
A voice command consists of three crucial elements for designers:
- Intent: This describes what the user wants to accomplish. It represents the subject and the context of the voice command.
A high utility intent involves a request that requires a direct answer, such as “What time is it?”
A mid-utility intent is less direct, such as “What’s the weather like?”
Low utility intents involve more abstract concepts or questions, such as “How can I be more creative?”
Low utility requests are hard to decipher because they often involve complex concepts and open-ended questions.
One way to deal with low utility requests is to gather contextual data that helps narrow down the user’s intent.
For example, if you have a voice-based interface for a health app, you might ask the user questions about their medical history or symptoms they’re experiencing to narrow down their intent.
- Utterances: This describes the exact words and phrases used by a user to issue an intent.
For example, if the user asks, “What time is it?” or “What’s the weather like?” these can both be considered utterances.
These utterances usually involve a combination of commands, nouns, and adjectives to get a direct answer from your system.
As a designer, you must consider all the possible utterances for each intent and create a dialog flow that can respond to multiple variations.
For instance, this might mean creating a “weather” intent that responds to questions like “What’s the weather?” or “Is it sunny today?”.
Or, you might handle more specific utterances, such as “What’s the weather in Seattle?” or “What’s the temperature like today?”.
- Optional Variables: These are variables that might appear in an utterance but are not crucial to the intent.
For example, let’s say you’re designing a voice interface for a music-listening app.
You may want to create intents for controlling playback, such as “Play” or “Pause.”
But some of your users may also want to be more specific, such as “Skip this track” or “Play my favourite song.”
In these cases, you can include an optional variable that allows users to specify how they want a command executed.
In the case of a five-star hotel in Las Vegas, the descriptor “five stars” might be considered optional since it’s not necessary to describe the hotel for a user to book a room.
So, you should design voice commands that can include this variable but don’t require it as a base requirement for action.
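The three elements above can be sketched in code. The example below is a hypothetical, simplified parser for the music-app scenario (the pattern table and function names are invented for illustration): each regular expression represents an intent, alternative phrasings cover different utterances, and the named group captures an optional variable that refines the command without being required.

```python
import re

# Hypothetical sketch: intents as patterns, with "track" as an optional
# variable. Real VUIs use NLP models, not regexes, but the anatomy is the same.
INTENT_PATTERNS = {
    "play":  re.compile(r"^play(?: (?P<track>.+))?$"),   # track is optional
    "pause": re.compile(r"^pause$"),
    "skip":  re.compile(r"^skip(?: this)? track$"),      # two utterance variants
}

def parse_command(utterance: str):
    """Return (intent, variables) for an utterance, or (None, {}) if unmatched."""
    text = utterance.lower().strip()
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.match(text)
        if match:
            variables = {k: v for k, v in match.groupdict().items() if v}
            return intent, variables
    return None, {}

print(parse_command("Play my favourite song"))
# → ('play', {'track': 'my favourite song'})
print(parse_command("Skip this track"))
# → ('skip', {})
```

Note how “Play” alone and “Play my favourite song” both resolve to the same intent; the optional variable is extra detail, not a base requirement for action.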
Step #3: Learn How to Prototype with Dialog Flows
Once you understand the anatomy of a voice command, it’s time to start prototyping interactions.
Prototyping your voice user interface can help you identify potential design flaws and create a more engaging UX.
You can use several tools to prototype dialogue flows for your VUI, such as Dialogflow and Voiceflow.
- Dialogflow: This is one of the most popular prototyping tools for designing VUIs.
It uses natural language processing algorithms to understand user requests and create a matching response.
This way, you can design highly personalized dialogue flows that respond instantly to all types of user input.
One feature that makes Dialogflow especially useful is its “Fulfillment” feature, which allows you to incorporate real-time data into your voice commands.
For example, suppose you’re designing a voice interface for airline passengers. In that case, you can use Fulfillment to pull up flight status information or weather updates in real-time as users are speaking.
- Voiceflow: This tool uses a visual interface that lets you create dialogue flows like a flowchart, which makes it easier to design complex interactions with multiple paths.
Voiceflow is also integrated with popular AI platforms, such as Amazon Alexa and Google Assistant, which makes it easy to test your prototypes against real devices.
Your dialogue flow should outline the following:
- Keywords that lead to the interaction: To design a voice UI, you will need to understand the keywords that lead users to interact with your system. These might include words like “play,” “pause,” or “skip,” used to control playback in a music-listening app. Other keywords may include more specific commands, such as “What’s the weather today?”
- Branches that Represent Different Paths: Next, you will need to create branches in your dialogue flow that represent different potential paths. These might include alternative ways of executing an action, such as “Go back to my favourite song” or “Skip this track.” Each path should lead to a specific outcome or response from the system.
- Conditional statements: You want to include conditional statements in your dialogue flow that alter the path depending on the user’s input. For example, if a user asks, “Play my favourite song,” you might want to check their playlist status before responding with a track name and starting playback.
- Example dialogs for the user and system: Your dialogue flow should include example dialogs for both the user and system. You want to ensure that the response makes sense to both parties and is easy for users to understand. For example, the system might respond, “Okay, playing your favourite song now” or “I’m sorry, that track is not in your playlist.”
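The branches, conditional statements, and example dialogs outlined above can be sketched together. This is a hypothetical, minimal flow for the music example (the playlist contents and response wording are illustrative stand-ins):

```python
# Hypothetical sketch: one conditional branch in a dialogue flow.
# The playlist is an illustrative stand-in for real user data.
PLAYLIST = {"favourite": "Bohemian Rhapsody"}

def handle_play_request(requested: str) -> str:
    """Check playlist state before responding, as the conditional step requires."""
    track = PLAYLIST.get(requested)
    if track:
        # Success branch: confirm and name the track.
        return f"Okay, playing {track} now."
    # Failure branch: a clear, human-readable refusal.
    return "I'm sorry, that track is not in your playlist."

print(handle_play_request("favourite"))
# → "Okay, playing Bohemian Rhapsody now."
print(handle_play_request("workout mix"))
# → "I'm sorry, that track is not in your playlist."
```

Each branch ends in an example dialog a user can actually understand, which is exactly what the outline above asks a flow to specify.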
Remember, the most crucial part of designing a voice UI is to create a natural and engaging experience that responds instantly to users’ requests. Using the right prototyping tools and following these design best practices, you can create compelling voice UIs that delight your users.
A Few Pointers on How to Use Voice UI Design Tools
Voice UI design is an emerging field that requires a combination of creativity, technical know-how, and user experience expertise. To successfully create engaging voice UIs, you will need to use the right prototyping tools and follow a few design practices.
As you design your voice UI, be sure to observe a few things:
Always Confirm When a Task is Completed: With ecommerce and other transactional tasks, it is important to confirm when a task has been completed successfully. For example, if you are designing a voice system for ordering food online, you may want to confirm that the order was received and processed by the restaurant before moving on to the next interaction.
Create a Strong Strategy for Errors and Exceptions: While most voice UIs function reasonably smoothly, it is important to plan for exceptions and errors. For example, suppose a user makes an incorrect request, or the system encounters an error in processing their input. In that case, you will need to design appropriate responses that can help them troubleshoot the problem.
For example, if a user asks to play a song that’s not in their playlist, you might want to offer suggestions for correcting the issue and moving on.
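Two of the pointers above, confirming completed tasks and planning for errors, can be sketched as simple handlers. The example below is hypothetical (the wording, order IDs, and retry threshold are illustrative choices, not from any real system):

```python
# Hypothetical sketch: explicit confirmation for transactional tasks, plus an
# error strategy that reprompts briefly and then escalates instead of looping.
MAX_RETRIES = 2  # illustrative threshold

def confirm_order(order_id: str, restaurant_accepted: bool) -> str:
    """Confirm success explicitly, or acknowledge failure and offer a retry."""
    if restaurant_accepted:
        return f"Your order {order_id} was received and is being prepared."
    return "Sorry, the restaurant couldn't confirm your order. Want to try again?"

def handle_recognition_failure(retry_count: int) -> str:
    """Reprompt a couple of times, then hand off rather than dead-ending."""
    if retry_count < MAX_RETRIES:
        return "Sorry, I didn't catch that. Could you rephrase your request?"
    return "I'm still having trouble. Let me connect you to a human agent."
```

The design choice worth noting is the escalation path: a voice UI that reprompts forever is more frustrating than one that admits defeat and hands off.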
Make Your Voice UI Responsive and Interactive: The key to creating engaging voice UIs is ensuring that your system responds instantly to user requests. That means using natural language processing algorithms and intuitive commands to create a fast and seamless user experience.
Add an Extra Layer of Personality and Emotion: Add an extra layer of personality and emotion to your voice UI designs using conversational cues, animated responses, and subtle humour. That should help create a more natural and engaging experience for users.
Add an Extra Layer of Security: Finally, be sure to add an extra layer of security and privacy to your voice UI designs. That can include speaker-recognition systems that limit access to sensitive information, or user-customized authentication procedures.
You also want to incorporate biometric security features such as voice recognition or retina scans to help ensure that only authorized users can access your voice UI.
For flows that require payments, you want to use an online payment gateway, like Stripe or Braintree, that is well-equipped to handle secure transactions initiated by voice commands.