The Limiting Factor Of Voice And Dictation Adoption
Harry Stebbings published a podcast with David Beisel this week in which they discussed the importance of voice. David says, “Voice is the most natural user interface possible.” I think the biggest challenge for voice is the skepticism and cynicism engendered by a decade or two of poor experiences. It’s no longer the technology.
One of my partners recently switched from an iPhone to an Essential phone and was stunned by the accuracy of the voice interface Google offers. Voice has become far more sophisticated than just a few years ago, and we are quickly moving past the stage of toy applications.
I’ve written before about how each of my emails is dictated. This blog post is dictated and edited entirely by voice. If I make a mistake, I can say “insert semicolon after dictated” and the computer will insert a “;” after the word “dictated”.
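To make the editing command concrete, here is a minimal sketch of how a phrase like “insert semicolon after dictated” could be turned into a text edit. It is not how any particular dictation product works. The function name `apply_insert_command`, the `PUNCTUATION` table, and the choice to edit the most recent occurrence of the word are all assumptions made for illustration.

```python
import re

# Assumed mapping from spoken punctuation names to characters.
PUNCTUATION = {"semicolon": ";", "comma": ",", "period": ".", "colon": ":"}


def apply_insert_command(text: str, command: str) -> str:
    """Apply a command such as 'insert semicolon after dictated' to text."""
    match = re.fullmatch(r"insert (\w+) after (\w+)", command.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized command: {command!r}")
    name, word = match.groups()
    if name not in PUNCTUATION:
        raise ValueError(f"unknown punctuation name: {name!r}")

    # Find whole-word matches, ignoring case, and edit the last one
    # (assumption: the speaker means the most recently dictated occurrence).
    hits = list(re.finditer(rf"\b{re.escape(word)}\b", text, flags=re.IGNORECASE))
    if not hits:
        return text  # nothing to edit
    end = hits[-1].end()
    return text[:end] + PUNCTUATION[name] + text[end:]
```

Called with the sentence “This post is dictated and edited by voice” and the command above, the function places the semicolon directly after “dictated” and leaves the rest of the sentence untouched.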
If I’m curious about cash conversion cycle, I can stop in the middle of the sentence and say “search Google for cash conversion cycle”. My Mac will launch Chrome and issue the query in Google. When I’m done, I say “close window”, which closes Chrome, and “switch to Typora”, which is the name of my favorite text editor for writing blogs. Then I can continue speaking and drafting this post.
When I speak the command “publish blog post,” the computer executes a shell script which does four things. First, a program verifies the syntax of the new blog post. Second, a script resizes the images I use in that post to the optimal size and format for the website. Third, Hugo, the static blog engine I use, compiles the website. Fourth, it calls the Amazon Web Services command line interface to synchronize the local website I just compiled to S3. That happens in less than four seconds, in the background, all from a voice command.
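The four steps can be sketched as a small pipeline. This is an illustration under stated assumptions, not the actual script, and it is written in Python rather than shell. The checker and resizer script names (`check_post.py`, `resize_images.py`), the post path, and the bucket name are hypothetical. The `hugo` command and `aws s3 sync` are the real CLI entry points, and `public/` is Hugo's default output directory.

```python
import subprocess
from pathlib import Path


def publish_commands(post: Path, bucket: str) -> list[list[str]]:
    """Return the four publish steps as argument lists, in order."""
    return [
        # 1. Verify the syntax of the new post (hypothetical checker script).
        ["python3", "check_post.py", str(post)],
        # 2. Resize the post's images (hypothetical resizer script).
        ["python3", "resize_images.py", str(post)],
        # 3. Compile the site with Hugo; output goes to ./public by default.
        ["hugo"],
        # 4. Synchronize the compiled site to S3 with the AWS CLI.
        ["aws", "s3", "sync", "public/", f"s3://{bucket}"],
    ]


def publish(post: Path, bucket: str, run=subprocess.run) -> None:
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in publish_commands(post, bucket):
        run(cmd, check=True)
```

A voice command would then only need to trigger something like `publish(Path("content/post/voice.md"), "example-bucket")`. Passing `run` as a parameter keeps the pipeline testable without touching Hugo or AWS.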
If I’m researching a company, I can say “research Looker”. The computer will open up four tabs in Chrome - RelateIQ, Crunchbase, Mattermark, LinkedIn - and issue the query in each of those services so that when they have all loaded, I’m looking at the Looker page in each of those services.
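A command like “research Looker” boils down to filling one query into several search URLs and opening each in a tab. The sketch below shows that idea. The URL templates are placeholders on `.example` domains, not the services' real search endpoints, and the function names are hypothetical. `webbrowser.open_new_tab` from the Python standard library opens a URL in the default browser, which may or may not be Chrome.

```python
import webbrowser
from urllib.parse import quote_plus

# Placeholder search-URL templates (assumptions); replace each with the
# service's real search URL before using this.
SEARCH_TEMPLATES = {
    "RelateIQ": "https://relateiq.example/search?q={query}",
    "Crunchbase": "https://crunchbase.example/search?q={query}",
    "Mattermark": "https://mattermark.example/search?q={query}",
    "LinkedIn": "https://linkedin.example/search?q={query}",
}


def research_urls(company: str) -> list[str]:
    """Build one search URL per service for the given company name."""
    query = quote_plus(company)  # URL-encode spaces and punctuation
    return [template.format(query=query) for template in SEARCH_TEMPLATES.values()]


def research(company: str, open_tab=webbrowser.open_new_tab) -> None:
    """Open one browser tab per service."""
    for url in research_urls(company):
        open_tab(url)
```

With this in place, the voice layer only has to map the spoken phrase “research Looker” to the call `research("Looker")`.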
Voice assists with small things too. I can move windows around my desktop by saying “top left” or “bottom right”, and the windows will resize. “Play music” starts Spotify. “Next song” changes the music. I can say “Gmail” and my email will load. “Open” opens the first email. “Reply” begins a reply. “Send to Asana” forwards the email to Asana and then archives that message.
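The window commands are simple geometry once the phrase is recognized. As a minimal sketch, assuming each phrase names a quarter of the screen and that the coordinate origin is the top-left corner (both assumptions, as is the function name `quadrant_frame`), a phrase can be mapped to a window frame like this:

```python
def quadrant_frame(phrase: str, screen_w: int, screen_h: int) -> tuple[int, int, int, int]:
    """Map a phrase like 'top left' to an (x, y, width, height) frame.

    Assumes the origin is the top-left corner of the screen and that each
    phrase selects one quarter of it.
    """
    words = phrase.lower().split()
    if len(words) != 2 or words[0] not in ("top", "bottom") or words[1] not in ("left", "right"):
        raise ValueError(f"unrecognized position: {phrase!r}")
    vertical, horizontal = words

    half_w, half_h = screen_w // 2, screen_h // 2
    x = 0 if horizontal == "left" else half_w
    y = 0 if vertical == "top" else half_h
    return (x, y, half_w, half_h)
```

The resulting frame would then be handed to whatever window-management tool the dictation software drives, which is outside the scope of this sketch.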
I read about an engineer who configured his voice dictation software to allow him to write code entirely by speaking to his computer. Imagine typing C++ code by speaking it. For someone who is native in the language, it’s far more natural, and up to three times quicker than typing. Plus no more RSI issues.
We’re simply not that far away from incredibly sophisticated applications and uses of voice technology when interacting with computers. We are seeing voice become a common use case within the home with Alexa and Google Home. Or in the car, by dictating text messages that you are going to be late or asking for navigation instructions.
That familiarity and those consumer use cases will dissolve the skepticism users have accumulated over years of disappointing experiences. Voice will come to the office in a very big way and enable far more sophisticated and complex use cases than we have seen in the past, from data input to sophisticated analysis.
CMO turned Industry Analyst | Helping B2B Software companies grow
7 years ago: Voice is indeed on the verge of (re)becoming very big!
Co-Founder at NeuroNav
7 years ago: While I think that is definitely a hurdle, I think a much bigger impediment is the discoverability and recall of all those commands you've just rattled off. Even if you overcome the past experience piece, many people will be staring at a little white tube saying "what am I supposed to do with this?" and "what was that command again?" What I did find striking about your setup, and haven't seen much of elsewhere, is a mixed visual and voice interface that allows for the information density of a screen with the added control of voice commands. How much faster would the adoption and proliferation of virtual assistants be if they weren't so heavily siloed into voice-only interaction models? How much faster could we work if we could continue working on our current screen while calling out commands to execute in the background? (I imagine working in Sketch designing an interface and calling out "add 'find hero image' to my to-do list": I can get it down and forget about it without interrupting my workflow.)
Senior Rust developer.
7 years ago: I think that open-plan office spaces will hinder the adoption of voice control.
Business Growth Specialist | Business Community Leader| Business Connector
7 years ago: Interesting post. Thanks.