Incorporating the latest speech tech into your UX

Q Manning is CEO of Rocksauce Studios, which crafts custom mobile apps for all platforms. You can find him hanging out in the Rocksauce creative loft, drinking coffee, or singing karaoke.

(Image Credit: iStockPhoto/Wachiwit)

Nobody used keyboards in the sci-fi of our childhoods. Whether it was the control system of starships or the hub of a utopian world, every interaction was based on human speech. Opening the pod doors or jettisoning the trash only required a simple command, and many of those systems replied in kind.

Now we’re closing in on that reality. Siri was the first seismic shift in the field, but companies like Google and Amazon have gone much further. Apple has even announced a massive rollout of voice-activated apps into the App Store. With high rollers like Uber, Runkeeper, and Skype all taking up the mantle of voice recognition, this tech is no longer a niche development — it is swiftly transforming into a necessity for app developers hoping to keep up with the competition.

Alexa Upped Our Speech Technology Game

With Amazon Echo, a device without screens or digital input mechanisms, we need only say, “Alexa,” and our wish is her command.

Alexa started out similarly to Siri. She could accomplish a limited set of small tasks when prompted by a human user. But Amazon opened the platform to developers around the world, and the Echo device’s capabilities grew.

It wasn’t just one company building every capability from the ground up; instead, Amazon tapped into the wider developer community, which contributed a fast-growing catalogue of voice skills to the platform.
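The skill model described above can be sketched, in heavily simplified form, as a registry that maps named intents to third-party handler functions. Everything below is hypothetical illustration, not the real Alexa Skills Kit API; the intent names, slot keys, and responses are invented for the example.

```python
# Minimal sketch of how a voice platform might route a recognised
# intent to a third-party handler. All names here are hypothetical,
# not the real Alexa Skills Kit API.

def weather_handler(slots):
    """Hypothetical handler for a weather request."""
    return f"Checking the weather for {slots.get('city', 'your area')}."

def timer_handler(slots):
    """Hypothetical handler for setting a timer."""
    return f"Timer set for {slots.get('minutes', '5')} minutes."

# Third-party developers register handlers against intent names,
# which is what lets the platform's capabilities grow without
# one company building every feature itself.
SKILL_REGISTRY = {
    "WeatherIntent": weather_handler,
    "TimerIntent": timer_handler,
}

def dispatch(intent_name, slots):
    """Route a recognised intent to its handler, with a spoken fallback."""
    handler = SKILL_REGISTRY.get(intent_name)
    if handler is None:
        return "Sorry, I don't know how to do that yet."
    return handler(slots)
```

The key design point is the spoken fallback: a voice-only device has no error dialog to show, so even an unrecognised command must get an audible reply.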

And it isn’t just Amazon that’s making waves with this technology. Google is helping redefine human voice detection, too.

Google’s Voice Access Breakthrough

By analysing dialects, accents, sentence structure, and vocal inflexion, Google is working toward a more precise understanding of human commands. This research will allow programs to differentiate between when a user is asking a question versus making a statement.
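To see why question-versus-statement detection is hard, consider what a deliberately naive heuristic looks like. The sketch below is only an illustration of the problem; real systems rely on prosody, inflexion, and trained models rather than a keyword list.

```python
# A deliberately naive sketch of question-versus-statement detection.
# Real systems analyse prosody and use trained models; this keyword
# heuristic only illustrates the problem such research targets.

QUESTION_WORDS = {"who", "what", "when", "where", "why", "how",
                  "is", "are", "do", "does", "can", "could", "will"}

def utterance_type(text):
    """Classify an utterance as 'question' or 'statement' (very roughly)."""
    words = text.lower().strip("?!. ").split()
    if not words:
        return "statement"
    if text.strip().endswith("?") or words[0] in QUESTION_WORDS:
        return "question"
    return "statement"
```

A heuristic like this falls apart immediately on indirect questions ("Tell me if it will rain") and rising-intonation statements, which is exactly the gap that analysing vocal inflexion is meant to close.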

This is a huge step in the right direction. Commercial speech recognition has improved by 30 percent over the past few years, but breaking the accent barrier will unleash a new wave of improvements.

The current incarnation of Google’s Voice Access already gives users the ability to control their phones with words instead of actions. But once Google’s research comes to fruition, the real work of tying it into voice and intent recognition will begin.

How Designers Can Incorporate Speech Technology Into UX

UX designers need to start considering the consequences of these developments. On-screen displays today function side by side with limited voice recognition, but as networking begins to grow and integrate across multiple IoT devices, users will need a simple, speech-based UX.

So how can developers ensure their apps’ UX takes full advantage of this technology’s potential? There are three key ideas to adopt:

1. Consider the total experience. UX today is a primarily visual experience, but with the incorporation of speech technology it will become an aural experience, too. Developers need to adjust their approaches accordingly. They can’t simply focus on laying out links and buttons; they need to think about the entire journey someone takes when interacting with the software.

2. Provide audible cues. There’s nothing more frustrating for a user than confusion, and often voice-based systems leave users dumbfounded over whether their voice commands were recognised in the first place. Don’t fall into this trap. Provide audible cues to users so they know their commands were registered and understood.

3. Provide visual cues. Sometimes users won’t want or understand an audible cue – they might be shouting into their phones in a busy bar, or they might be whispering in a library. When working with visuals as well as audio, visual cues of understanding are very important, especially when there’s a series of questions to be answered. Users need to know that the first entry has been understood and that the system is basing subsequent questions on that first interaction.
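The second and third tips above can be sketched as a single dialog step that always pairs an audible cue with a visual one, so the user gets confirmation whichever channel they can attend to. The cue payloads below are placeholders invented for the example; a real app would feed them to its platform's text-to-speech and UI layers.

```python
# Sketch of one dialog step that confirms recognition with both an
# audible and a visual cue, per the tips above. The cue payloads are
# placeholders; a real app would call its platform's TTS/UI APIs.

def dialog_step(command, recognised):
    """Return the paired cues a voice UX should emit for one command."""
    if recognised:
        return {
            "audible": f"Got it: {command}.",
            "visual": {"icon": "check", "text": command},
        }
    # Never leave the user dumbfounded: say AND show that the
    # command wasn't understood, and invite a retry.
    return {
        "audible": "Sorry, I didn't catch that. Could you repeat it?",
        "visual": {"icon": "retry", "text": "Command not recognised"},
    }
```

Pairing the cues matters most in multi-step dialogs: the visual trail shows the user which earlier answers the system is building on, even when they can't hear the audible confirmation in a noisy bar or won't play it in a quiet library.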

The latest breakthroughs in speech technology have the potential to make our sci-fi childhoods a reality. And developers have a big role to play in unleashing our inner geeks. We can make people’s lives easier and their day-to-day tasks faster. Just don’t go build a HAL 9000.

Do you have any other tips for incorporating speech tech into a UX? Let us know in the comments.
