More than words: Interfacing with the Agentic Internet

Text has long been the primary mode of interaction between humans and computers. From the command line to the modern search bar, text-based input has offered a level of clarity and permanence crucial in both our personal and professional lives. It provides a reliable record of interactions, allowing for careful review and consideration. Voice interaction, in contrast, has had a more turbulent history, often failing to live up to its initial promise. We've probably all experienced at least one moment where we've questioned the "intelligence" of our voice assistants and ended up slinging insults at them.

However, advancements in AI are changing that, fostering more natural, nuanced, and contextually aware voice interactions. This progress raises the question: will voice eventually supersede text?

Spoiler alert: the answer is likely no.

Text's inherent precision and the tangible record it provides will ensure its continued relevance. The more pertinent question is how these two modalities, along with others, will coexist as AI becomes increasingly integrated into our lives.

So, what do I mean when I talk about text and voice interactions, and what’s this "agentic internet" thing? Let's break it down.

The Enduring Power of Text

We often take text for granted, but its power is undeniable. Text is the backbone of legal contracts, historical archives, and scientific research. Why? Because it offers clarity, a lasting record, and an asynchronous nature that allows for thoughtful composition and response. Whether it's confirming a high-value transaction, reviewing a detailed report, or revisiting a critical decision-making process, text offers a level of control and a verifiable history that other forms of interaction struggle to match.

Imagine trying to negotiate a complex business deal entirely through voice memos. Nightmare. The potential for misinterpretation is enormous. With text, you can meticulously craft your message, review it before sending, and have a permanent record of the entire conversation. Text is also searchable: if you need to recall the specific wording of a clause in a contract signed six months ago, you can search the whole document in a jiffy.

Text is also the perfect medium for contemplation. You can draft an email, leave it, think about it, edit it, and send it when you're ready. It allows for a level of precision and intentionality that's crucial in many aspects of our lives. It is how we communicate complex ideas, convey nuanced arguments, and document the world around us.

The Strengths of Voice

Voice interaction excels in specific contexts, particularly where hands-free operation is beneficial or when immediacy is paramount. Tasks like navigating while driving, setting reminders in a busy environment, or executing simple commands are streamlined through voice. It can also facilitate a more natural and fluid way of exploring ideas. When you're bouncing ideas around with colleagues, or even just thinking out loud, talking allows for an unimpeded flow of thoughts. Using a voice or live mode with an AI assistant, you can quickly capture ideas as they come, without the interruption of having to stop and type them out.

However, when tasks increase in complexity or require meticulous accuracy, voice can still fall short. Even today, with all the advancements, it still requires a certain amount of patience. Often you go unheard or your accent proves unintelligible, and you end up speaking to your voice assistant as if to a small child or a puppy, repeating yourself and enunciating very. very. clearly.

The Agentic Internet: When Your Agents Start Talking to Each Other

Our New Minds, New Markets research suggests that the emergence of an "agentic internet" is inevitable. A key milestone in its development is the arrival of AI agents capable of interacting with computers, navigating websites, and operating software in response to natural language queries. Early examples are already appearing, such as Anthropic's "Computer Use," OpenAI's "Operator," and Google's anticipated advancements with Gemini.

These developments, however, represent only the beginning. We're transitioning from a time of static interfaces to one of dynamic, adaptive entities designed to interact with both users and other agents across networks. This will redefine how systems communicate and collaborate, paving the way for a more interconnected and intelligent digital ecosystem.

Websites are so last year: Enter the World of Autonomous Agents

But what does this mean for the interfaces we use? Broadly, the traditional boundaries between users, interfaces, and underlying systems will become increasingly blurred. For example, when managing your finances, instead of logging into separate banking, investment, and budgeting apps, you might interact with a single financial AI agent that provides a holistic view of your finances, suggests investment strategies, automatically pays bills, and negotiates better rates with service providers.

The methods we use to interact, be it text, voice, or something yet to be conceived, will remain important as distinct interfaces, but they will connect to increasingly autonomous systems. The very nature of those systems, and what constitutes, for example, a "website," will be transformed. This does not diminish the importance of user-facing interfaces. Instead, it underscores the need for multimodal systems that effortlessly integrate text, voice, and visual input.

I've long been passionate about augmented and virtual reality, as they hold the promise of transforming how we experience digital content. However, widespread adoption has remained elusive. Reliable and humanlike voice interaction could be the catalyst that finally propels these technologies into the mainstream. By eliminating the friction associated with cumbersome hardware and complex input methods, natural language commands could make AR and VR experiences more intuitive and accessible. The upcoming Android XR operating system, with its emphasis on voice control through Gemini, represents a significant step in this direction.

Simultaneously, our research shows that the smartphone will continue to play a central role as a primary interaction hub. Its longevity has proven its value and trustworthiness. New AI-powered systems must integrate with these familiar devices, complementing rather than supplanting them.

Designing for a Headless, Multimodal Reality

As AI agents become autonomous digital counterparts, designing for headless systems, with no single interaction mode or UI in mind, becomes increasingly critical. These systems must be capable of processing and responding to multimodal input while coordinating actions between human users and other AI agents. Success will hinge on their ability to manage this complexity and deliver seamless, adaptive experiences that feel natural and intuitive.
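To make the idea of a headless design a little more concrete, here is a minimal Python sketch. It assumes that every input, whether typed, spoken, or sent by another agent, is first normalized into one modality-neutral request before any logic runs. Every name in it (`Request`, `from_text`, `from_voice`, `HeadlessAgent`, the `"agent:budget"` sender label) is hypothetical and chosen purely for illustration, and the keyword matching is a stand-in for whatever intent recognition a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict

FALLBACK = "Sorry, I can't help with that yet."


@dataclass(frozen=True)
class Request:
    """A modality-neutral request, whoever or whatever sent it."""
    sender: str    # hypothetical labels, e.g. "user" or "agent:budget"
    modality: str  # "text", "voice", ...
    content: str   # normalized plain-text content


def from_text(sender: str, typed: str) -> Request:
    """Wrap typed input, trimming surrounding whitespace."""
    return Request(sender, "text", typed.strip())


def from_voice(sender: str, transcript: str) -> Request:
    """Wrap a transcript; assumes speech-to-text already ran upstream."""
    return Request(sender, "voice", " ".join(transcript.lower().split()))


class HeadlessAgent:
    """Core logic with no UI: it only ever sees Request objects."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Request], str]] = {}

    def register(self, keyword: str, handler: Callable[[Request], str]) -> None:
        self._handlers[keyword.lower()] = handler

    def handle(self, request: Request) -> str:
        # Naive keyword matching, used here only as a placeholder.
        text = request.content.lower()
        for keyword, handler in self._handlers.items():
            if keyword in text:
                return handler(request)
        return FALLBACK


agent = HeadlessAgent()
agent.register("balance", lambda r: f"Balance requested by {r.sender} via {r.modality}")

# The same core answers a typed question from a person...
print(agent.handle(from_text("user", "What is my balance?")))
# ...and a spoken-style request relayed by another agent.
print(agent.handle(from_voice("agent:budget", "  Check   BALANCE please ")))
```

The point of the sketch is the separation: adding a new modality means adding another small adapter like `from_voice`, while the agent's core logic stays untouched.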

While the roles of voice, visual interfaces, and agent-driven interactions will undoubtedly expand, text's inherent precision and adaptability will ensure its continued relevance. The future of human-computer interaction lies not in selecting a single dominant mode but in empowering users to transition fluidly between them. In an increasingly agentic world, multimodality is the key to unlocking the full potential of AI and making it truly accessible and useful for everyone.

More articles by Duncan Roberts

  • Digital Fatigue: The Dawn of Agent-Driven Commerce

    Let's be honest: the internet, as we've known it, is pretty much on life support. Long gone are the early days.
  • Think Small, Dream Big

    Generative AI is having a bit of a blockchain moment. Remember when everyone was hyping up blockchain's potential, only…
  • Tasty Pizza Glue: The new world of generative search

    Back in October 2023, I wrote about the impending shift from traditional search to a new, generative AI-fuelled era of…
  • From Search to Request: Transitioning to a new consumer landscape

    When search engines first became mainstream, they quickly established themselves as the primary gateway to the…
  • Two Futures, One Path

    Remember when web3 was the future? Fast forward a bit, and now it's all about generative AI. As for what comes…
  • Hype!

    HYPE! It seems to be the fuel of our current era. When it comes to technology, the speed of progress and level of…
  • I promise I'm Human

    I promise I’m a human. But should you trust me? In the past readers might scoff at the idea that a computer could…
