The Sound of the Future: The Coming Age of Voice Technology

Why voice technology is the next big thing in technology, as big as mobile a decade ago and the internet in the late 90s, fundamentally altering the way companies do business.

Voice is the next technology—remarkably similar in potential impact to the internet and mobile computing—poised to change the way the world works. Tobias Dengel is in the vanguard of this breakthrough, understanding the deep, wide-ranging implications voice will have for every industry. In The Sound of the Future, he connects the dots about this emerging paradigm to vividly illustrate how business leaders can stay ahead of the game, rather than scrambling to catch up, as voice technology gradually reveals its power, creating a host of new winners and losers.

Using fascinating, colorful stories, Dengel explains how the “voice-first” experience is becoming part of the global technology mainstream, exploring the ways voice will do a better job of serving basic human needs such as safety, speed, accuracy, convenience, and fun, as well as making it possible for hundreds of millions of people around the planet to participate more fully and productively in today’s high-tech world by making interactions with technology virtually effortless.

A pervasive technology like the internet and mobile, voice, with applications in marketing, sales, service, manufacturing, and logistics, will change the way we work at every level and every function, driving down costs, boosting productivity, and enabling the creation of entirely new business models.

This is not simply about Siri and Alexa. They are the tantalizing but incomplete precursors of the ultimate interface that will make technology easier, faster, more accurate, and more human.

PLEASE NOTE: When you purchase this title, the accompanying PDF will be available in your Audible Library along with the audio.

Audible Audio

Published October 10, 2023

23 people are currently reading
164 people want to read

About the author

Tobias Dengel

1 book, 4 followers

Ratings & Reviews

Community Reviews

5 stars: 21 (33%)
4 stars: 21 (33%)
3 stars: 9 (14%)
2 stars: 10 (15%)
1 star: 2 (3%)
Displaying 1 - 22 of 22 reviews
Profile Image for Alana.
136 reviews, 4 followers
March 4, 2024
I was so excited by the title of this book because I see a lot of potential with voice technology, so I was disappointed to find out this book was merely a startup pitch to use voice tech for the sake of making profits, be it through better customer service, increasing worker efficiency, or making more enticing and effective advertisements.

If the book had a little more science or history, it could have been a book marking the evolution of voice tech. However, the scope of history it covers is brief and the majority of the book is speculation and market prediction. I think the only compelling use case in this book was safety, but also at the expense of pooh-poohing violations of privacy. The other good point is that moving towards a multimodal interface (i.e., being able to use voice and touch/visual input) would make many applications and communication/information tools better.

Voice tech could be so much more, such as giving those who can't speak a voice closer to those who can use their own (look up "augmentative and alternative communication"), but the author can only imagine a use case for voice tech that helps those who already have a voice. If technology focused only in this direction, it would widen the opportunity gaps between verbal communicators and AAC users. In contrast, I wish more imagination and innovation would be invested in giving people voices, such as more training so that AI can predict or catch brain signals or subtle movements to give non-speaking individuals a voice that speaks naturally and is accessed easily.
8 reviews, 3 followers
November 13, 2023
Engaging, readable, and persuasive book explaining the market forces, and the basic intuition, that indicate the digital world will be profoundly reshaped by voice technology in the years ahead. Provides compelling data, research, and clear next steps for companies looking to remain ahead of the curve of consumer demand for outstanding voice technology, including AI integration and its implications.
Profile Image for Josh Callahan.
60 reviews, 1 follower
May 19, 2024
This book was much too basic to be useful for somebody employed in the conversational AI tech sphere. The author’s main argument is that voice is the next major evolution in human-computer interfaces. He painstakingly lists out every industry and situation in which voice tech could be applied: hospitals, banks, restaurants, homes, factories, etc. He casually dismisses conversational AI as having limited uses and instead describes a future where voice controls are bounded like an IVR. He totally missed the mark on the reasoning potential of LLMs and his book has quickly become dated in one short year. He conveniently suggests that businesses hire out their voice needs to consultants (😉 he runs a consulting firm). There were a few interesting pieces of information scattered throughout but overall it focused too much on the business cases and wasn’t technical whatsoever.
Profile Image for Kathy Reid.
24 reviews, 4 followers
January 4, 2024
The summary

Written for a business audience, this book has two distinct sections. The first provides a gentle, integrated primer on voice technologies, such as automatic speech recognition (ASR), speech to text (STT), text to speech (TTS) or voice cloning, and natural language processing (NLP), and links these to the human needs fulfilled by voice technology. The second is essentially an extended pitch deck. Unabashedly techno-optimist in outlook, it seeks to grow the market for voice technologies by encouraging the reader to examine their own organisation’s operations for voice technology use cases, and provides a detailed guide to the user research and interface design steps needed to implement a voice technology program.

This is unsurprising, given Tobias Dengel is the CEO of WillowTree, an AI and digital product consulting company recently acquired by TELUS International for USD 1.2 billion – which focuses on gathering training data for AI applications. His expertise in human-computer interaction (HCI) and user-centred design (UCD) is evident in the first half of the book, where voice technologies are continually grounded in user tasks and experiences. In the second, his experience is shown in the methods advocated for exploring voice use cases, with a focus on HCD methods such as journey mapping. Co-author Karl Weber is an editor; his collaboration with Dengel makes the text readily approachable and succinct; terms unfamiliar to the lay reader are well described, and the use of acronyms is minimal.

The book draws heavily on examples from industry to highlight key claims; however, some of these are now dated. Stanford Open Voice Assistant Lab (OVAL)’s Almond assistant was re-named Genie in 2021, but has not had any active development for over two years, and the research group has pivoted to working primarily in the large language model (LLM) space. The Open Voice Network’s initiatives on trustworthy voice assistants have now been folded into the umbrella of the Linux Foundation. This is perhaps unavoidable in such a fast-moving space.

Part One – Aligning the use of voice technology to the human need for communication

Each chapter in the first half of the book details a particular human need that is met by voice technology.

The Prologue paints a picture of the transformative power of voice tech, showing how it was used to help those physically impaired to be able to communicate again – using speech – the most natural form of communication.

The Introduction makes a bolder claim – that voice is a technological revolution – akin to the internet or to the mobile phone: nascent, latent, reaching a tipping point of “ubiquity and popularity” that we should all be prepared for lest it catch us unawares. While acknowledging that voice tech is currently limited in application, and harbours a panoply of challenges, the authors hand-wave these away, pointing to the rapid advances being made across the vibrant voice tech ecosystem – inhabited by companies such as ReadSpeaker, SoundHound, Cerence and others. The sizeable investments made in voice are given as evidence for the technological revolution, but differentiated from over-hyped failures such as blockchain and the metaverse in that voice “fulfills basic human needs”, which are articulated in subsequent chapters.

Speed makes the case for “even marginal improvements in speed/efficiency” when designing user interfaces, highlighting examples such as search engines and online shopping websites to reinforce the point that speaking to machines is often quicker than typing to them. It imagines a world where the keyboard is eschewed in favour of the microphone as the primary mode of data input, because this is faster – and time is money. The physical toll of such a change – can you imagine speaking for the same amount of time you type? – is left unexamined. I wonder what Mica Endsley or other human factors scholars would make of this claim.

The next chapter demonstrates how voice technology meets the need of Safety – by being available to assist when the user is physically incapacitated. There is a claim made in this chapter that was particularly contentious: that having a voice assistant in the cockpit would “prevent crashes and save lives”. While plane <-> tower communication is definitely a contributory factor to many incidents, there is no discussion here of the complexity introduced by voice assistants. Imagine, for example, the utterance “engine one out!” being mis-transcribed as “engine won naught!”. Sure, the language model can be weighted for cockpit utterances, but mis-transcription is still rife, even in state of the art systems (Whisper, for example, has a 9.3% Word Error Rate as tested on Common Voice 15).
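
For readers unfamiliar with the metric, the following is a minimal sketch of how Word Error Rate is commonly computed: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recogniser's output, divided by the number of reference words. It is an illustration only and does not come from the book or the review; the function name `word_error_rate` and the lowercase, whitespace-based tokenisation are assumptions, and real evaluations normally apply more careful text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# The cockpit example: two of the three reference words are substituted.
wer = word_error_rate("engine one out", "engine won naught")
print(f"WER = {wer:.2f}")
```

Under this definition the mis-transcription in the cockpit example scores a WER of two errors over three words, which shows how a short safety-critical utterance can be badly mangled even when a system's average error rate looks low.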

Knowledge makes the case for voice technology as an interface to the world’s information. Rather than having data at your fingertips, it’s now available on the tip of your tongue – overcoming the limitations of screen real estate. Dengel and Weber also make the case here for voice where users are not computer literate: you don’t have to know how to use a computer to ask a question of a voice assistant. What is not well explained here is that access to knowledge is mediated through millions of APIs – and to curate or synthesise them requires additional capabilities. The potential for commercialisation to skew results in a particular way (such as booking sites preferencing those providers that pay them the most) is left unaddressed. This chapter also touches on voice technology as one of many anticipatory systems – having predictive capabilities through audio feature detection to infer an event is about to happen, and respond. What isn’t covered is the downside of this form of machine surveillance, covered well by researchers Joel Stern, Sean Dockray and James Parker in their Machine Listening: Exposed collaboration.

In the chapter on Inclusion, the authors make the case for voice technology building “a more inclusive society”, pointing to advancements in screen readers, speech to text and smart hearing aids as mechanisms that help in “…liberating and empowering individuals who have too long been excluded from mainstream society…”. The challenge of machine translation for the world’s 7100 spoken languages is also addressed, and inequities in the availability of tooling for under-resourced languages and the existing Anglo-centrism of the tech sector are, quite rightly, highlighted. Kathleen Siminyu’s work with Common Voice’s East Africa project, which is providing speech data and tools for the Kiswahili project, gets a mention, which delighted me; however, when I chatted with her, she was unaware of being featured. Absent was any argument for addressing the lack of investment in low-yield languages – languages whose speakers are not “profitable”. This is likely to remain the purview of NGOs and governments for the foreseeable future, lamentably.

Engagement makes the case for voice technology making life “more creative, entertaining and enjoyable”, using radio and television as previously emerging technologies that were fun to use, which drove adoption. Dengel and Weber speculate about what might happen to voice actors in a time of synthesised voices, seeing both the economic reality of the cost of live narration, and, counter-intuitively, the increasing value of human voices in a soundscape saturated by synthetic speech. They go on to link voice tech to the metaverse and to virtual reality, showing how it is a necessary building block in “multi-modal” experiences. Again, there was no concomitant discussion of the ethics of synthetic speech – and importantly, how “synthetification” – the growing movement to synthetic media – shapes power relations, labour relations and who profits.

The chapter on Transformation ties voice technology to “fundamental changes to business models”, through mechanisms such as voice identification through biometrics, and the aggregation of services to provide a streamlined, personal offering. It covers the move from click-through rate (CTR) in screen advertising to say-through rate (STR) for voice-enabled advertising; again however, it does not explore the ethical or societal issues such changes might bring. I’m reminded here of Joseph Turow's excellent The Voice Catchers: How Marketers Listen In to Exploit Your Feelings, Your Privacy, and Your Wallet – and how voice is being used as a mechanism to target advertising. The chapter goes one step further, exploring the use of vocal biomarkers in health – but again, without the attendant discussion of unintended consequences. Who stands to benefit if a disease can be diagnosed simply through speaking?

Part Two – A program of work for implementing voice technology use cases within the enterprise

Part Two of The Sound of the Future moves from explicating use cases for voice technology to encouraging the reader to implement them, with attendant advice on strategies for doing so.

The chapter on Falling Barriers traces the recent history of voice assistants like Siri and Alexa, positing that what people really want is something more akin to an “all-purpose valet”. This leads into a discussion on technology breakthroughs, and the factors which incentivise them, and uses the COVID-19 pandemic as a case in point – where hands-free, remote interaction provided by voice-enabled devices helped practitioners avoid infection. Here, I would have enjoyed more grounding on the various innovation theories; however, this book is clearly aimed at a business, rather than academic, audience. The chapter goes on to outline the key layers of the voice technology stack, such as automatic speech recognition (ASR), natural language processing (NLP) and conversational AI, providing a precis of the current state of the art of each, and remaining barriers. The paradigm of “multi-modal interaction” is then introduced, situating voice technologies alongside haptics and visual interfaces as a constellation of interfaces that collectively are shifting how we sense and respond to our cyber-physical world. User trust in voice technologies is then introduced as another barrier which must be overcome to ease widespread adoption, in particular citing the Trustmark Initiative from the Open Voice Network as a signal that this barrier is falling. The chapter concludes with an overview of how Dengel sees trajectories of development in voice technology, from automation to business process redesign, to transformation of business models.

Making voice an integral part of your existing business systems encourages the reader to “seize the opportunities” voice technologies present, by first identifying places where voice technology could be integrated into existing business systems. The authors provide a helpful list of six principles for assessing whether an interaction is well suited to voice integration, and go on to use examples from industry to highlight how these principles are applied.

In the Training voice tools to understand your world chapter, the authors cover a problem that has long faced voice technology practitioners – the domain specific nature of spoken language. The utterance (spoken phrase) “twelve fifty” has very different meanings in different contexts – it could mean twelve pounds fifty, 12.50pm, 1250g and so on. The advice here is for organisations to identify the “friction points” their customers face, using tools such as journey mapping to better understand those contexts. The chapter goes on to advocate for prototyping of voice technology tools, using UX methods to elicit feedback to guide iterative development, and ensure that the intent of the user – the task the user wants to perform – is matched by the system. The concepts of error flow handling and conversational repair mechanisms are covered here too – essentially serving as a primer on voice user interface design.
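
The “twelve fifty” ambiguity can be made concrete with a small sketch. This is an illustration under stated assumptions, not something taken from the book: the function name `interpret`, the three domain labels (`"price"`, `"time"`, `"weight"`) and the two-word vocabulary are all hypothetical, and a real system would infer the domain from the dialogue context rather than receive it as an argument.

```python
# Tiny vocabulary, just enough for the illustration (hypothetical).
NUMBER_WORDS = {"twelve": 12, "fifty": 50}


def interpret(utterance: str, domain: str) -> str:
    """Resolve a two-number utterance according to the domain it was spoken in.

    Assumption: the caller already knows the domain; the labels are invented
    for this sketch.
    """
    first, second = (NUMBER_WORDS[word] for word in utterance.lower().split())
    if domain == "price":
        return f"{first}.{second:02d} GBP"   # twelve pounds fifty
    if domain == "time":
        return f"{first}:{second:02d}"       # ten to one
    if domain == "weight":
        return f"{first * 100 + second} g"   # one thousand two hundred and fifty grams
    raise ValueError(f"unknown domain: {domain!r}")


for domain in ("price", "time", "weight"):
    print(domain, "->", interpret("twelve fifty", domain))
```

The same transcript yields three different values, which is why the recognised words alone are not enough: the system also needs the context that journey mapping and user research are meant to surface.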

Designing and redesigning the multimodal user experience makes the case for voice technologies as part of an omni-channel digital user experience, highlighting voice’s place in an overall brand experience. It discusses how voice can be used to augment and reinforce other digital channels, such as text-based chatbots or graphical user interfaces. Thankfully, there is little hype about the metaverse – which – given its current white elephant status in industry – would detract from the argument that voice technology on its own is transformative. The argument here is that its transformative power emerges in concert with other technology. The chapter includes advice on how to plan an iterative voice UX (VUX) experience design process, and, also pleasingly, highlights the need for inter-disciplinary teams and executive support.

The concluding paragraphs reiterate the argument that “successful new technology is about meeting basic human needs”, and that to be successful, companies must adopt voice – or face defeat in the marketplace.

The verdict

This book is helpful for businesses who are making their first forays into voice assistants, voice user experience (VUX) or conversational AI, in particular those coming to it from a product management or business analysis background. The use cases for voice are expansively surveyed, and applicable to many industries. However, the technical detail is too light for those needing a deeper guide to the pitfalls of voice technologies, such as accent bias in speech recognition, ambiguous named entity recognition in natural language processing or the privacy dangers of voice cloning.

Moreover, while Dengel and Weber correctly identify that many threads of innovation underpin the current state of the art in voice technology – hardware improvements, advances in deep learning and neural networks, and the availability of more speech data upon which machine learning models may be trained, they gloss over the many challenges in the space – the trust people need to have in assistants, the poor performance of voice technology for accented or disordered speech, the privacy and ethical challenges of requiring user data to be effective, and above all, the question of who profits from speech data gathered from individual people.

Beckoning to previous technological path dependencies, they hold that

“… this is the story of any new technical wave. It takes years for entrepreneurs, designers, and engineers to shift their thinking to take full advantage of any new technological paradigm.” - Introduction

When I think of the precursors to today’s voice and speech technologies – from Audrey and the Shoebox, to Harpy, to Tangora, to Dragon NaturallySpeaking, even as far back as Christian Gottlieb Kratzenstein’s work on synthesising human speech with the “vowel organ” – I can’t help but wonder: are voice technologies really a new technical wave? And in taking full advantage of this new technological paradigm, who is it that is taken advantage of?

If voice is the sound of the future, then we must have other conversations about what that future sounds like – and whose voices are heard.
567 reviews, 15 followers
October 9, 2023
I am not a technology geek nor avid fan by any stretch of the imagination, so I read THE SOUND OF THE FUTURE by Tobias Dengel with wide open curiosity and a willingness to learn what is possibly coming next for our society. I did learn about the potential of voice technology to enable the formerly voiceless, the swift ways technology can serve based on voice, but given the current developments in AI and machine learning, I remain unconvinced that the mindset and character of those shaping and developing the capabilities will indeed make it a better and more level playing field for everyone. While this book presents compelling cases, many other qualified and experienced voices are necessary as well to realize where we are heading. Interesting and well-expressed, but more views would be helpful. I received a copy of this book and these opinions are my own, unbiased thoughts.
Profile Image for Pranav Dheram.
4 reviews
January 17, 2024
The book - at its heart, a business book (at times, even an overt pitch for the author’s voice company) - succeeds in building enthusiasm for a future with voice tech, leaving any deliberation upon this future to the margins. If the future truly is voice, perhaps it is important to discuss whose voice will be heard and whose drowned out? Still, for those intrigued by the potential of voice tech, it stands as a recommended read.

Divided into 2 parts, it first explores the impact of voice technology in multiple facets of our life, some familiar like entertainment with Alexa, others educative like improving speed and safety in a fire station. Blending the achievements of today with the possibilities of tomorrow (although with no academic discussion of the complexities surrounding them), it stands as a comprehensive survey of the tech, presented with interesting anecdotes. It is easy to pigeonhole our exposure to the field we work in and this broadened exposure does help borrow ideas which translate well across domains. The same technology which allows us access to the weather forecast can help ‘a pilot relay commands to a system’ without taking hands or mind off an important maneuver. The same technology which relays the news on Siri can read a kid to sleep. For this, I find it a helpful read. Yet, the narrative occasionally blurs the lines between voice tech and the metaverse, necessitating a clearer distinction between their contributions - perhaps a separate section on the metaverse and how voice tech can enable it.

Having hopefully awed you with the potential of voice technology, the second part turns to adoption of this tech and practical considerations in the process. Drawing from the author’s own experience empowering smaller companies to be voice-first, he discusses how to assess the suitability of an interaction to voice integration, challenges during adoption such as working with limited data and the shift towards a multi-modal paradigm. The acceptance of this future’s inevitability does not, however, come with a critical examination of its challenges, e.g. tightened GDPR policies, increased voice fraud and, most important for me, ethical concerns. True, the risk of delayed adoption is defeat, but what is the risk of reactionary, partially informed adoption? What happens to thousands of call center employees, or to employees in a Pizza Hut, when voice-enabled tables can take your orders, or when the magical experience offered hoodwinks any blatant privacy concerns? The book could have been more insightful with a nuanced discussion on this topic. Companies being empowered to adopt voice tech could benefit from similarly being equipped to think about the larger social footprint of this adoption.
Profile Image for Matthew Gibb.
151 reviews, 3 followers
March 2, 2025
This book is not for most people. The first couple of chapters were engaging about the emerging capability of voice to make the world safer, but I'm convinced voice tech is far from perfected. It's useful in a car when you need directions or traffic info. The whole second half of the book is directed towards business owners and encourages them to cut staff and use voice, but it hasn't been perfected. It says bounded models, which are capable of responding to the most common questions, are available. Banks use this voice tech, but waste a user's time with complex menus if they're not simply asking for a current balance. Employ humans and value their brains. The real world is real and not this abstract. Perhaps people should return to living in it. My attention flagged halfway through. I would like to have heard more realistic examples of how natural speech was going to supplant digital interfaces, but now I doubt its ability. In addition, Tobias talks about the uncanny valley where voice becomes so good that humans can't tell it's a machine. If people feel creeped out by realistic voices, it's because they're organic beings in a real world. They aren't bits of insignificant data to be served ads in the hopes of getting yet another subscription as a service or sales conversion.
1 review
September 26, 2025
Language is the ultimate API for humans, and voice is the most natural interface we have.

That’s why Dengel’s book feels more relevant than ever. With the rise of large language models, machines can finally understand and converse with us in ways that feel human. The implications are enormous: any industry where “people talk to people” today, from healthcare to customer service to education, is poised for transformation!

What I particularly value in this book is its breadth of examples and use cases. It doesn’t just speculate about the future of voice AI; it grounds the vision in concrete applications that are already emerging. Dengel also traces the history of voice interfaces, showing how far we’ve come from the frustrations of early assistants like Siri or Alexa.

This is more than a technology book. It’s a guide to understanding where human–machine communication is headed. For anyone looking to grasp the trajectory of voice AI and prepare for its impact, it’s a must-read.
1 review, 8 followers
October 30, 2023
Siri and Alexa were pretty underwhelming. I was skeptical about how voice-enabled experiences were game-changing enough to warrant the kind of investment needed to re-tool my apps with the next big thing. If you're like me and you just haven't bought into the Voice Hype - I'd recommend giving this book a read. Tobias and his fellow authors do a fantastic job of walking through how Voice-first experiences are supercharged by AI. The power of LLMs and the speed of Voice make a compelling combination. If you're at all asking yourself if Voice is part of your app's future or what that might look like - can't recommend this book enough!
1 review, 1 follower
October 30, 2023
Straight to the point, practical, well-researched book on the emergence of voice-first experiences. Paints a compelling picture of the future - given we speak faster than we type, and with continued advances in conversational AI, the digital experiences that surround us will evolve to be increasingly voice-driven. The book has examples from across industries, and lays it all out in a way that's fun to read. Includes some 'behind the scenes' stories from the sale of Siri to Apple, which I hadn't seen before.
Profile Image for Christianna.
71 reviews
April 30, 2024
Reminiscent of my 70-year-old father explaining why his voice to text function saves him so much more time than texting and then proceeding to tell me all the cool articles he’s read about voice tech on Yahoo News. Oh, and look at these cool apps I downloaded!

This is an explanation of the current voice technology we have and use all the time. Yes, we know about Siri. . .

Well written and interesting but nothing new or innovative if you’ve been in the world for the last five years.
1 review
October 30, 2023
Great insights into the rapidly emerging area of AI and technology! Dengel provides clear examples of how these emerging technologies are a critical element of any digital product's roadmap. Every product manager's guide to updating their digital products to make the most of this new capability to keep products relevant and competitive in the new digital landscape.
Profile Image for Hannah.
5 reviews
October 30, 2023
Really fascinating look at voice technology. It felt more pragmatic, approachable, and actionable in comparison to other tech/business books I’ve read. It was most interesting to see real applications of how voice technology can, and has, improved the lives of those with disabilities. Accessibility is critical for our society.
Profile Image for Tim Morrissey.
61 reviews, 2 followers
June 29, 2024
I’m into voice technology, so I was excited when I saw this book in the library. I’ve been disappointed with voice tech thus far. This book is very product and business focused. More about why technology has been limited thus far would have rounded it out more. I’m still very excited for better voice tech. And multimodal voice tech.
1,400 reviews
July 2, 2024
"The Coming Age of Voice Technology" and "The Sound of the Future" are on the book. But it's difficult (at first) to figure out which is the real name of the book.

We know that 'technology' can change lots of things. In a way, the book offers a new way to understand our world. Dengel and Weber give us a way to work through a very different "future."
1,831 reviews, 21 followers
June 14, 2023
Excellent overview of the current and coming voice tech. Get it quickly since, like all tech-focused non-fiction, it will be outdated relatively quickly.

I really appreciate the free copy for review!!
Profile Image for Kylie Whitman.
1 review, 1 follower
October 30, 2023
This book was such an insightful read on the ways that voice can and will transform all of our digital experiences in the future. I appreciated the real-life examples and the story telling throughout the book that brought the examples to life. Would recommend to all digital product managers!
1 review
October 30, 2023
Unlike most "business books," Dengel backs up his theory with practical examples. Touching on real problems across many industries, this book is a must-read for leaders who are interested in voice but aren't sure how to apply it to their product/business.
4 reviews, 1 follower
October 26, 2023
Great insights into the coming convergence of GenAI and voice tech. Very readable, with great examples and key takeaways for any business leader…
1 review
May 29, 2025
Impactful storytelling of why and how going back to a voice-led engagement is the future.
Profile Image for Heather Higgins.
47 reviews
January 5, 2025
This was an exciting listen. What a fascinating analysis of the #voicetech trend that is underway and the huge potential of where it will head in the coming years. Dozens of case studies brought the story to life. The book is also loaded with lots of in-depth advice and lessons learned for anyone interested in undertaking a #voicetech project. Truly a generous approach by the author, Tobias Dengel of WillowTree.