The 2 things holding AI assistants back (and it's not tech)

We’ve long had a vision for what AI assistants could be capable of: HAL, the Computer from Star Trek, KITT from Knight Rider, Holly from Red Dwarf, JARVIS from Iron Man, Samantha from Her and so on.

In real life, there have been various attempts at creating this all-encompassing assistant: an AI concierge that’s at your beck and call, accessible from any device, and that can deliver on anything you ask of it.

Big tech attempts to create AI assistants

We know about the big tech companies and their efforts to deliver this: Apple’s Siri, Amazon’s Alexa, Google’s Assistant, Samsung’s Bixby. Each has its own challenges.

  • Apple doesn’t allow much third-party developer capability, meaning its functionality is limited to what Apple decides it should do.
  • Amazon does have a third-party ecosystem but hasn't been able to crack discoverability or monetisation, or keep pace with the functionality developers need to deliver robust experiences. A walled-garden approach isn’t going to work because it takes too much management and effort. You’ll never move quickly enough and you’re constantly conflicted between your incentives and the needs of third parties.
  • Google made an attempt to do the same as Amazon with Conversational Actions, realised how hard it is and how much effort it requires to create and sustain a developer community, and so has now cancelled the program and is pushing for Google Assistant to be a launchpad into existing apps and functionality. In other words: a conversational assistant on the front end, a GUI behind it, and existing services doing the work.
  • Samsung could have nailed it with Bixby, created by the same team that built the first Siri (Apple acquired Siri, and Samsung acquired Viv, both built by Adam Cheyer and Dag Kittlaus), but it didn’t have the adoption and user base to make it work.

What’s interesting is that the big assistants don’t fail on the AI side. Most of them understand almost everything you say today. They fail on the fulfilment side (aside from Samsung, which doesn't have a huge user base for Bixby despite its huge device presence).

Issue no. 1: Fulfilment

They can’t do everything you’d like them to do because they don’t have access to the services they need. How can you ask Alexa to book you a hotel room for Thursday if Booking.com doesn’t have a skill? How can you ask Siri to add cucumber to your shopping list if Sainsbury’s can’t play Siri’s game?

Cue the smaller players

Maybe a smaller, more focused player could deliver on this? If so, you’d look at SoundHound AI, Mycroft AI Inc, and other smaller players like Magic, Omega or Velocity Black. Perhaps they could offer the kind of functionality that the big guns can’t?

In some cases, they can. Mycroft has a whole community of developers building out capabilities for the open source assistant. SoundHound has a whole bunch of domains that it can handle really well. Its language model for finding a place to eat is second to none.

The challenge for those companies again isn’t on the AI side. They can all understand what you say, some even better than the big guns. The challenge for them is on the distribution and fulfilment side.

Issue no. 2: Distribution

They don’t have the reach of Amazon, Apple and Google, and so they can’t possibly become the go-to assistant for the masses.

The devices we have to hand mainly run Apple or Google software (iOS or Android) or Amazon’s (Alexa). Those smaller players can’t compete for distribution because each of those devices has a gatekeeping assistant already.

And so, if you don’t have the distribution, you don’t have the users, and if you don’t have the users, you can’t encourage third parties to develop the capabilities you need to become an all-encompassing, fully capable assistant.

Perhaps the new EU legislation will change this and force the big guns into allowing access to other assistants on their devices, but you’d still have the fulfilment issue.

Solving fulfilment

You could consider building all of the fulfilment services yourself, but that’s practically impossible. Just look at this simple use case from Velocity Black’s website:

A guy asks Siri to ask Velocity to book him a gym class tonight. Siri responds with “There’s a new HIIT class. Your first one is complimentary for Velocity members”.

To deliver the above, all Velocity needs is a partnership with one big gym, like a David Lloyd or Nuffield, and a promo to throw in when someone asks for gym classes. But that’s not what the guy asked for. He asked it to book him a class.

To fulfil that properly, Velocity needs to:

  1. know which gym the guy is a member of,
  2. interrogate the classes on offer on that evening,
  3. poll an API to see whether there’s any availability,
  4. suggest the type of class, time and availability back to the user (ideally, it’d be a class that Velocity knows the guy enjoys through knowledge of previous bookings),
  5. have it confirmed by the user,
  6. book it,
  7. confirm the booking.

This whole journey is small fry. Not very complex, really. Even if the time didn't work out and you had to go back to the class search step, it's not the end of the world.
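To show just how small that journey is, here's a minimal sketch of the seven steps in Python. Everything in it is an assumption made up for illustration: `FakeGymApi`, `GymClass`, `book_gym_class` and the `confirm` callback are hypothetical names, and the gym "API" is an in-memory stand-in, not something any gym or Velocity is known to offer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GymClass:
    name: str
    start: str        # start time this evening, e.g. "19:00"
    spaces_left: int


@dataclass
class FakeGymApi:
    """Hypothetical in-memory stand-in for a gym's booking API."""
    classes: list[GymClass]
    bookings: list[tuple[str, str]] = field(default_factory=list)

    def classes_tonight(self) -> list[GymClass]:
        return list(self.classes)

    def has_space(self, gym_class: GymClass) -> bool:
        return gym_class.spaces_left > 0

    def book(self, member: str, gym_class: GymClass) -> str:
        gym_class.spaces_left -= 1
        self.bookings.append((member, gym_class.name))
        return f"{gym_class.name} at {gym_class.start} booked for {member}"


def book_gym_class(member, memberships, apis, favourites, confirm):
    """Walk the seven steps. Returns a confirmation message, or None if nothing was booked."""
    api = apis[memberships[member]]                # 1. which gym is the user a member of?
    tonight = api.classes_tonight()                # 2. classes on offer this evening
    available = [c for c in tonight if api.has_space(c)]  # 3. check availability
    # 4. suggest classes the user has enjoyed before first (stable sort keeps the rest in order)
    available.sort(key=lambda c: c.name not in favourites)
    for option in available:
        if confirm(option):                        # 5. user confirms, or we try the next option
            return api.book(member, option)        # 6. book it, 7. confirm the booking
    return None


api = FakeGymApi([
    GymClass("Spin", "18:00", 0),
    GymClass("Yoga", "19:00", 3),
    GymClass("HIIT", "20:00", 5),
])
# The user likes HIIT but says no to the 20:00 slot, so the flow falls back to the next class.
message = book_gym_class(
    "alex", {"alex": "gym_a"}, {"gym_a": api}, {"HIIT"},
    confirm=lambda c: c.start != "20:00",
)
print(message)
```

The logic fits in a few dozen lines. The hard part is that the three calls on `FakeGymApi` have no real-world equivalent to point at.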

Fulfilment challenges

However, this journey can’t be delivered today. None of these gyms have open APIs that would enable a third party to make a booking. You’d have to do it on their website directly.

Now, Velocity could use RPA or humans on the backend to fulfil this, but that'd take forever. The guy may as well just do it himself.

This is the problem that all assistants face. Even the big ones that have the distribution don’t have the capability to fulfil the long tail of requests that would make them indispensable.

This is why Magic, SoundHound, Velocity Black and Omega all pivoted to serve the enterprise market. It's the only way to create revenue at the minute.

What about domain-specific assistants?

So then you’re left with trying to focus on specific domains: individual companies trying to crack those long-tail use cases. Think Jetson’s food ordering use case, Flip’s taxi booking capability, or Kasisto’s KAI banking assistant.

But then, you have neither the distribution (aside from direct access via an enterprise white-labelling the solution, which isn’t part of the all-encompassing assistant vision) nor the breadth of capability required for a personal assistant. You’re again stuck with going to the enterprise for revenue and hoping that, in the long term, you can secure a distribution position with one of the big assistants, provided they can find a way to surface your capability to users.

The vision is still alive, it's just a lot harder than some make it out to be

Don’t get me wrong, that vision is still there. One assistant (or one overarching assistant that integrates with thousands of other narrow assistants) that can fulfil your every desire. Something that knows what you need before you know you need it and can be proactive. And, to get there, it’s no longer a question of AI technology on the language front. We have that. NLP is so good now that we can understand what you say, no problem.

Many of the big assistants have the distribution as well, and reach millions of people every day (or are within earshot of millions).

The biggest problem is in fulfilling requests, in integrating other assistants that can fulfil requests, or in waiting for the rest of the world's businesses to create infrastructure that'll make their data available via APIs to be consumed by AI front ends.

Not long now then, ey?

---------------------------------------------------------------------------------------------------------------

About Kane Simms

Kane Simms is the front door to the world of AI-powered customer experience, helping business leaders and teams understand why voice, conversational AI and NLP technologies are revolutionising customer experience and business transformation.

He's a Harvard Business Review-published thought-leader and a top 'voice AI influencer' (Voicebot and SoundHound), who helps executives formulate the future of customer experience strategies, and guides teams in designing, building and implementing revolutionary products and services built on emerging AI and NLP technologies.

Sam Nanji

Digital Transformation & Customer Experience Leader | Founder | NED | Voice Skills | KM Expert

2 years ago

This is a problem that we always hit whenever we have a general as opposed to specific self-service situation. And I think it's a problem which is not worth solving, because it's too hard and, well, because we can make use of assistants now without needing to. Instead, focus on the specific use case and federate whenever it makes sense. Sometimes you don't need to know an answer, you just have to know someone (or something) that does, or you need a mechanism to find out who knows the answer and who can fulfil your need. And that opens another area, which is how do you know who knows and how do you know if they can be trusted to fulfil your need. And that speaks to confidence and trust. Perhaps what's needed is a smart assistant answer broker, one that understands security and context.

Ryan Hollander

Senior Software Architect & Developer | LLM & GPT-3 & GPT-4 & ChatGPT & Llama & Claude, Applications, Integrations / Chatbots & NLP & NLU & Voice Input & Text to Speech / Web & Cloud Specialist / AWS / DevOps / DevEx

2 years ago

Great insight. I'm giving you bonus points for using Holly as your cover image. You've hit the nail on the head: coming up with use cases and getting the input parsed is pretty well-trodden. Fulfilling the action is often a whole different ball of wax. You see this from answering questions to taking any actions that are not tied into smart home or playing media. Another challenge is context for those actions, which also plays into sensory inputs and awareness to generate context. Often we have to manually configure that context or elicit it through questions/disambiguation techniques, further complicating the implementation and increasing the chance of failure. I think a third challenge we don't talk about enough is that it's not good enough to have an idea. A good implementation often requires a deep understanding of the domain and problems you are solving for, and earning that takes time. We've had to live with these assistants for a while before we understood what we really might want them to do and what they might be capable of. I think we are just starting to reach some critical mass in this area as adoption has spread.

PolyAI is solving the fulfillment problem for us. It’s exciting technology. Game changer.

Colleen Fahey

Author & US Managing Director, Sixième Son, Audio Branding & Sound Design, CHIEF member

2 years ago

This reminds me of how my hopes were raised by my first dazzling encounter with Alexa. I didn't know I could ask for jokes and figured I'd have to buy something. When the voice asked me what I wanted, my eye lit on my book, and that's what I requested. Boom. Amazon transaction complete, I received my book within three days. Astounding experience – unfortunately, never replicated. (Though, I didn't actually need the book, as I had a case full from our publisher)

Andrew Francis

To write software that users will adore.

2 years ago

Kane Simms totally agree about fulfillment (I've argued that in the past). Web services (and supporting "infrastructure") still aren't as ubiquitous as web sites. This limits the scope of voice and its cultural and economic impact. Distribution is also a problem. However, I think the solution will be to abandon the smart speaker in its current form. Myself, the only practical way for me to stay in the Google space is to use Dialogflow ES, control my own back-end/hosting, get some form of enterprise pricing, albeit for a single person, and access it via the web, telephony or custom hardware.
