A Voice-Activated Future: Q&A with Simon Breakwell
"Consumer voice-activated products will continue to expand and grow both in usage and utility. However, only once there is connectivity between devices and real cross platform use cases will voice really take off."
Whether it is through Expedia, Hotels.com, or HomeAway, Simon Breakwell has always been interested in how technology can improve life for consumers. In 1996, Simon became a founding member of Expedia within the Microsoft mothership, and in 2000, he returned to the U.K. to head Expedia’s European operations. His initial introduction to TCV was back in 2000, when the firm provided Expedia with its first investment as a public company.
Based on his experience working in the travel industry and as a Venture Partner at TCV, Simon expects that voice technology is poised to have a big impact. Within the next few years, he anticipates that voice search—along with machine learning, artificial intelligence, and chatbots—will transform the way people book travel, along with how they do just about everything else.
We recently sat down with Simon to discuss the future of voice technology, including:
- Whether consumers are ready to make the leap to voice-activated products
- Why interoperability, context, and AI are essential to drive mass adoption
- Who can be a key player in the voice-activated landscape
TCV: Manufacturers are betting consumers want voice-activated appliances, thermostats, and lawn sprinklers. Alexa and voice apps are certainly picking up more skills, but are consumers ready to make the leap?
Breakwell: Voice today is where e-books were before mass adoption. When Amazon shipped the first Kindle, it was a bit of a peculiarity. People who love being at the leading edge bought it, but it didn’t really go mainstream until three or four years later. I think that is where voice is today.
People will use voice instinctively once it becomes sophisticated and smart, but even though it is more accurate now than it once was, it is still fairly annoying to use. That’s because it is hard to build software that mimics and understands all the nuances and inferences of people’s conversations. Increasingly, discrete statements can be understood, but local accents and fast speakers are still a struggle. Most importantly, it remains hard to link up multiple statements and actually have a conversation with a device. The technology has come leaps and bounds over the past two years, but conversational understanding seems a few years off.
TCV: What do you think are natural areas of adoption?
Breakwell: I would say the home, cars, and transportation are rich seams since they represent very large marketplaces with multiple use cases.
TCV: You have talked before about how interoperability is essential for voice-activated products and services to really take off and hit the mainstream. Why does interoperability matter?
Breakwell: Interoperability matters because that is how we live and interact. We move from location to location, and carry our thoughts and ideas between those locations. We would expect any voice product to move with us, seamlessly passing information and the context of that information between locations.
However, the real jump in utility comes when you can string the context and interoperability together.
For example, if I am talking about a holiday at home, when I get in the car, I should be able to say, “What is the price of that holiday?” and get an accurate answer. Equally, as a conversation or thought deepens and changes through multiple answers and questions, the technology needs to understand and develop with the conversation.
TCV: What about authentication and security when it comes to voice?
Breakwell: My understanding is that your voice sound is as unique as fingerprints or retinas. When I call up my bank, they don’t ask for any passwords because it is all done by voice recognition. Once your voice becomes your authenticator, all products and services you speak to should recognize you instantly. “Logging in” becomes a thing of the past.
There are difficulties, though. Let’s say you are talking to Alexa about your bank accounts and payments, as you would with someone on the phone, and there are a few glitches that detract from a smooth experience. Behaviorally, people have a really low tolerance for failure. Unless the implementation is really smooth, rather than offering improved customer service, companies might be doing the opposite.
TCV: At what point will voice recognition technology be good enough to realize this vision? Is it the 95.5% comprehension rate? Or must it surpass the human comprehension standard? What is the threshold?
Breakwell: I think there are three components to success. First, you need recognition accuracy. In the vast majority of cases, the technology needs to understand your phrasing. Second, you need context. The technology needs to develop its understanding of statements and conversations as those conversations develop, and to hop easily between them. Third, you need platforms to speak to each other so you can move around and have conversations with technology. The first two problems are huge technical issues. The last is technical, but it is also a political issue between the major platforms and will probably take as long to solve as the first two.
How far away is this? I am not sure whether we are 2-5 years away or 10-15 years. As with most technology, I think we overestimate the short-term impact but underestimate the long-term impact and the time it takes for development.
TCV: What is your take on the business side of voice-activated products and services? Amazon has aggressively moved into the voice-response space, along with other deep-pocketed players like Apple and Microsoft. Where do you see opportunities here for startups to compete?
Breakwell: If you are talking about the big voice platforms, like Alexa, Google Home, or Microsoft’s Cortana, I think it is difficult, but not impossible, for startups to break into that space. Those products need a massive amount of data processing, huge engineering capabilities, and an already extensive search database, and that plays to the big companies. It’s very difficult to build a voice on-ramp to the internet without those three attributes, and there aren’t many smaller companies that have them.
TCV: Given that the companies behind the big voice platforms are competitors, do you think that represents an obstacle to interoperability? How long before platforms will be able to speak to one another?
Breakwell: I suspect it will be some time before platforms will be able to speak to each other. If you look at messaging, which has been around for a while, there is little interoperability between platforms at the moment, and I think voice might take a similar path in the short term. If and when a standard emerges for platforms to speak to each other, then we will start to see some interoperability.
I am an optimist, and the use cases are so powerful that we might see the platforms speaking to each other sooner than we anticipate. As for the shape of the ecosystem, it is very early to make predictions. Between platforms, there will need to be authentication layers and security layers, as well as standards that connect it all. I am certain that, just like we have seen with the smartphone, we will see a whole suite of new services from voice technology that we couldn’t even imagine.
TCV: Search is another area that the big players are grappling with. How will voice affect search ad business models?
Breakwell: Voice could alter where power lies among the big players. For example, Amazon is a vast retailer, and increasingly, a provider of data services. It is also getting into media with Prime. But outside of its own considerable retail platform, you wouldn’t call Amazon an on-ramp for search on the internet—Google is doing that. However, by introducing Alexa, it could become a search on-ramp to the internet. Facebook also is building out a suite of products and services to strengthen its messaging operation, and through that, people will soon start to search and ask for services through Facebook Messenger.
As for advertising, existing search platforms like Google will have to rethink their advertising model for voice. Customers talking to their Google Home device probably only want one answer, not 3-5 advertising spots. Perhaps that one answer is a very qualified answer and Google can monetize that better than keywords. Who knows?
There are issues of brand power too. If you are speaking to Alexa about the cheapest flights to London, you might care less where the price comes from as long as you get the cheapest flight. Perhaps customers will see their relationship more with Alexa than with Expedia or Priceline, both of which are investing in voice technology to ensure they stay close to the customer and minimize intermediation.
TCV: Final question. What are you most excited about when it comes to “voice” technology?
Breakwell: The interesting thing is that we have no idea how this technology will transform our world. At some point, there will be mass adoption. At that point, we will see another raft of new products and services, many of which we cannot conceive of right now. That is the exciting thing, and that is the challenge for firms like TCV — to spot these great businesses and help them flourish and develop.
***
The views and opinions expressed in this Q&A are those of the interviewee and do not necessarily reflect those of TCV or its personnel. Venture Partners and Executive Advisors are typically independent consultants who are not employees of TCV but have a strategic relationship with TCV and/or provide valuable advice or services to TCV and/or its portfolio companies. For additional important information regarding this post, please see “Informational Purposes Only” under the Terms of Use section of TCV’s website, available at https://www.tcv.com/terms-of-use/.