The Present and Future of AI Agents
This article is based on an online conversation between Daniel Huggins of Text Alchemy, Naman Garg of MultiOn, and Jeremy Kirshbaum of Handshake. Although the article is based on and quotes from the conversation, the opinions expressed are those of Jeremy Kirshbaum. Some speaker quotes have been very lightly edited for readability.
You can find the original conversation here: https://youtu.be/Cz7fItT-c2M.
Part 1: Navigating the Frontier of Autonomous Assistants
In a nondescript office park on the outskirts of Silicon Valley, a group of engineers and researchers are quietly toiling away on what they believe to be the next great leap forward in artificial intelligence. They are the team behind MultiOn, a startup that is developing autonomous AI agents capable of navigating the web, gathering information, and completing complex tasks without human intervention. These agents, the company believes, represent the future of human-computer interaction - a world in which intelligent machines work alongside humans as trusted partners and collaborators.
Agents go beyond normal chatbots. These AI systems can reach out to external services, databases, or even other AIs to gather information or perform tasks. While a chatbot like ChatGPT can engage in impressively fluent conversation, it is ultimately a passive tool, waiting for a human to provide prompts and direction. An autonomous agent, by contrast, can take initiative, making decisions and taking actions based on its own understanding of a user's goals and preferences.
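To make the distinction concrete, consider a minimal sketch of an agent loop, in which a model repeatedly proposes the next action and a runtime executes it, feeding observations back in until the goal is met. Everything here - the Tool interface, the planNextStep stub, the example tool names - is an illustrative placeholder, not MultiOn's actual architecture.

```typescript
// A minimal agent loop: the model proposes actions, the runtime executes
// them, and observations feed back in until the goal is met. All names
// here are illustrative placeholders, not any vendor's real API.

interface Tool {
  name: string;
  run(input: string): Promise<string>;
}

interface Step {
  tool: string;   // which tool the agent wants to invoke
  input: string;  // the argument it wants to pass
  done: boolean;  // true when the agent believes the goal is achieved
}

// Stand-in for an LLM call that decides the next action from the
// transcript so far. A real system would prompt a model here.
async function planNextStep(goal: string, history: string[]): Promise<Step> {
  if (history.length === 0) {
    return { tool: "web_search", input: goal, done: false };
  }
  return { tool: "", input: "", done: true }; // toy: stop after one step
}

async function runAgent(goal: string, tools: Tool[]): Promise<string[]> {
  const history: string[] = [];
  for (let i = 0; i < 10; i++) {            // hard cap on iterations
    const step = await planNextStep(goal, history);
    if (step.done) break;                    // agent decides it is finished
    const tool = tools.find((t) => t.name === step.tool);
    if (!tool) break;                        // unknown tool: bail out safely
    const observation = await tool.run(step.input);
    history.push(`${step.tool}(${step.input}) -> ${observation}`);
  }
  return history;
}
```

A chatbot stops at the point where this loop begins: it produces text and waits. The loop is what turns a model's output into actions in the world.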
"What we are trying to do is create a new kind of agent that can work autonomously," explains Naman Garg, a founding engineer at MultiOn. "Our agents are designed to be the new interface to the digital world. We've taken language to action, enabling our agents to understand users' natural queries and autonomously achieve tasks."
In a demonstration, Garg shows how one of MultiOn's agents can book a flight from start to finish based on a simple natural language request. The agent navigates to a travel website, enters the relevant details, compares options, and selects a flight based on the user's preferences - all without any further human input. It's a task that would take a human several minutes to complete, but the agent can do it in a matter of seconds.
But developing agents that can reliably perform such tasks is no simple feat. An agent must be able to break down a high-level request into a series of discrete steps, navigate the complexities of user interfaces designed for humans, and deal with the many potential exceptions and edge cases that can arise. It's a challenge that requires not just advanced natural language processing, computer vision, and planning, but also a deep understanding of how humans think and work.
"An agent needs to have a lot of components to behave effectively," says Garg. "One main thing it needs is memory. It needs to understand short-term memory, like chat history or session history, and long-term memory, where it can recall previous tasks it completed."
Memory is just one of the many capabilities that MultiOn and other companies in the emerging field of autonomous agents are working to develop. Agents will also need to be able to engage in complex planning, breaking down high-level goals into actionable steps. They'll need to be able to learn from past experiences and adapt to new situations. And perhaps most importantly, they'll need to be able to explain their reasoning and decision-making to humans in order to build trust and accountability.
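Garg doesn't describe MultiOn's memory internals, but the two tiers he names might be sketched roughly as follows. The keyword-overlap recall here is a toy stand-in for the embedding-based retrieval a production agent would more likely use.

```typescript
// Two memory tiers, as Garg describes: a short-term session buffer and a
// long-term store of completed tasks. Keyword-overlap retrieval is a toy
// stand-in for the vector-similarity search a real agent would use.

interface CompletedTask {
  goal: string;
  outcome: string;
}

class AgentMemory {
  private sessionHistory: string[] = [];   // short-term: this conversation
  private longTerm: CompletedTask[] = [];  // long-term: past tasks

  remember(message: string): void {
    this.sessionHistory.push(message);
  }

  recordTask(goal: string, outcome: string): void {
    this.longTerm.push({ goal, outcome });
  }

  // Recall past tasks whose goals share words with the new goal.
  recallSimilar(goal: string, limit = 3): CompletedTask[] {
    const words = new Set(goal.toLowerCase().split(/\s+/));
    return this.longTerm
      .map((t) => ({
        task: t,
        score: t.goal.toLowerCase().split(/\s+/)
          .filter((w) => words.has(w)).length,
      }))
      .filter((s) => s.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((s) => s.task);
  }

  recentContext(limit = 10): string[] {
    return this.sessionHistory.slice(-limit);
  }
}
```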
"We think agents are the present, something you can interact with currently," says Garg. "But currently, they can't always do things correctly and often require textual input, but agents of the future will be fully autonomous, used widely in the next five years."
But even as companies like MultiOn race to bring autonomous agents to market, there are many open questions and challenges that will need to be addressed. How can we ensure that these agents are safe and reliable, and that they don't cause unintended harm? How can we align their goals and values with those of their human users? And how will the rise of autonomous agents change the nature of work and the economy more broadly?
"The complexity of these projects will be unprecedented," says Daniel Huggins, co-founder of Bristol-based TextAlchemy. "We're going to see agents taking on incredibly complicated tasks, requiring entirely new tools to manage these projects."
Huggins envisions a future in which autonomous agents become an integral part of the software development process itself, helping to manage the complexity of large-scale projects and even writing code themselves. But he acknowledges that getting there will require not just technical advances, but also new ways of thinking about software engineering and project management.
"We'll need new classes of tools relying on AI to handle these incredibly complex projects," he says. Retort.js, which his company is building, is attempting to serve these kinds of projects.
For businesses and organizations looking to harness the power of autonomous agents, the path forward is still uncertain. While the potential benefits are enormous - from increased efficiency and productivity to entirely new products and services - so too are the risks and challenges.
"Prompting is going to become increasingly important for designers and creators, but it will be much more complex," says Jeremy Kirshbaum, founder of Handshake, a company that helps businesses integrate generative AI into their operations. "There's a set of emerging tools to bridge the gap between natural language experts and software developers."
Kirshbaum believes that success in the age of autonomous agents will require a new kind of collaboration between technical experts and domain specialists, as well as a willingness to experiment and iterate rapidly. He also stresses the importance of thinking deeply about the ethical implications of these technologies from the outset, rather than waiting until problems arise.
"We're still figuring out what we mean by 'agents,'" he says. "For instance, there's rigorous research showing that under certain conditions, multi-agent collaboration can produce better outcomes."
As the world watches the rapid advancement of AI with a mixture of excitement and trepidation, the pioneers of autonomous agents remain focused on the task at hand - building machines that can think, learn, and act in ways that were once the sole province of human intelligence.
"The potential impact of agents is significant," says Garg. "Agents are more than ChatGPT, which is limited to text generation or simple tasks. Agents can take actions and complete tasks autonomously."
Part 2: The Challenges of Engineering Reliable Autonomous Assistants
As the potential of these agents becomes increasingly clear, so too do the challenges of engineering them in a way that is safe, reliable, and aligned with human values. It's a challenge that keeps many in the AI community up at night, grappling with complex questions of ethics, accountability, and control.
"The main thing about creating agents is ensuring a secure workflow where only the necessary information is used, without unnecessary data collection," says Garg. "Privacy concerns are a key focus for us, with secure vaults for storing user information."
Privacy is just one of the many thorny issues that companies like MultiOn are grappling with as they work to bring autonomous agents to market. There are also questions of bias and fairness, as these agents learn from and make decisions based on vast troves of human-generated data that may contain implicit or explicit biases. There are questions of transparency and explainability, as humans seek to understand how these complex systems arrive at their conclusions and recommendations. And there are questions of safety and robustness, as engineers work to ensure that these agents can operate reliably in a wide range of real-world situations.
"It's a tough issue. Humans have biases, and so will models," says Georgia Iacovou, Content Director at Handshake, and a tech policy expert who has been studying the societal implications of AI.
One approach that some in the AI community are exploring is the use of formal verification techniques to prove that an agent will behave as intended under a wide range of conditions. This involves breaking down the agent's decision-making process into a series of logical statements that can be analyzed and verified using specialized software tools.
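As a toy illustration of the idea, an agent policy can be modeled as a small state machine whose reachable states are enumerated exhaustively, asserting a safety property on each - bounded model checking in miniature. Real verification tools work over far richer logics and vastly larger state spaces; the sketch below only shows the flavor.

```typescript
// Bounded model checking in miniature: enumerate every state a toy agent
// policy can reach and assert a safety property on each. Real verifiers
// (model checkers, SMT solvers) scale this idea to richer logics.

interface AgentState {
  spent: number;        // total money the agent has committed
  confirmed: boolean;   // has the user approved the next purchase?
}

// The transitions the toy policy allows from a given state: the agent may
// ask for confirmation, or buy a $50 item if confirmed and within budget.
function successors(s: AgentState, budget: number): AgentState[] {
  const next: AgentState[] = [{ ...s, confirmed: true }];
  if (s.confirmed && s.spent + 50 <= budget) {
    next.push({ spent: s.spent + 50, confirmed: false });
  }
  return next;
}

// Safety property: total spend never exceeds the budget.
function safe(s: AgentState, budget: number): boolean {
  return s.spent <= budget;
}

function verify(budget: number): boolean {
  const seen = new Set<string>();
  const frontier: AgentState[] = [{ spent: 0, confirmed: false }];
  while (frontier.length > 0) {
    const s = frontier.pop()!;
    const key = JSON.stringify(s);
    if (seen.has(key)) continue;
    seen.add(key);
    if (!safe(s, budget)) return false;   // counterexample found
    frontier.push(...successors(s, budget));
  }
  return true;                            // every reachable state is safe
}

console.log(verify(100)); // true: this policy cannot overspend
```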
"Designing AI will become unbelievably complex," says Huggins. "Large software projects already have millions of moving parts. Agents will take this complexity to an entirely new level."
But while formal verification holds promise for ensuring the safety and reliability of autonomous agents in certain narrow domains, it remains an incredibly challenging and time-consuming process that is not yet practical for the kinds of large-scale, open-ended systems that companies like MultiOn are building. In the near term, many experts believe that the key to building trust in these systems will lie in developing robust testing and monitoring frameworks that can detect and correct errors and anomalies in real time.
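One common pattern for such monitoring - sketched below with hypothetical names, not any particular vendor's framework - is to interpose a monitor between the agent and the outside world that vetoes any proposed action violating a declared policy.

```typescript
// A runtime monitor that sits between the agent and the world: every
// proposed action is checked against declared policies before it runs,
// and violations are logged rather than executed. Illustrative only.

interface Action {
  kind: string;               // e.g. "email", "purchase", "navigate"
  payload: Record<string, unknown>;
}

type Policy = (a: Action) => string | null; // null = allowed, else reason

class ActionMonitor {
  private violations: string[] = [];

  constructor(private policies: Policy[]) {}

  check(action: Action): boolean {
    for (const policy of this.policies) {
      const reason = policy(action);
      if (reason !== null) {
        this.violations.push(`${action.kind} blocked: ${reason}`);
        return false;         // veto: the action never reaches the world
      }
    }
    return true;
  }

  auditLog(): string[] {
    return [...this.violations];
  }
}

// Example policies: cap spending, forbid emailing unknown recipients.
const monitor = new ActionMonitor([
  (a) =>
    a.kind === "purchase" && (a.payload.amount as number) > 100
      ? "exceeds $100 limit"
      : null,
  (a) =>
    a.kind === "email" && !String(a.payload.to).endsWith("@example.com")
      ? "recipient not on allowlist"
      : null,
]);

monitor.check({ kind: "purchase", payload: { amount: 250 } }); // blocked
console.log(monitor.auditLog()); // ["purchase blocked: exceeds $100 limit"]
```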
Kirshbaum says, "Even when using a few closed LLMs like Claude or ChatGPT, robust orchestration tools are crucial for effective use."
Kirshbaum believes that the key to building reliable autonomous agents will lie in developing a new generation of tools and platforms that can help manage the complexity of these systems at every stage of the development process, from initial design and testing to deployment and ongoing monitoring. He envisions a future in which teams of engineers, domain experts, and even the agents themselves work together in real-time to identify and resolve issues as they arise.
But even as the AI community works to develop new tools and frameworks for building reliable autonomous agents, there is a growing recognition that these systems will never be perfect. Like any complex technology, they will always be subject to errors, biases, and unintended consequences. The key, many believe, will be to develop a culture of transparency, accountability, and continuous improvement that can help mitigate these risks over time.
For Garg and the team at MultiOn, the ultimate goal is to build agents that can be trusted to operate autonomously in the real world, taking on complex tasks and making decisions that have real consequences for the people and businesses that rely on them. It's a daunting challenge, but one they believe is essential to unlocking the full potential of AI to transform our lives and our world.
For now, the most effective agents will be ones that frequently bring a human into the loop. Garg says, "We like to think agents are pretty smart. Once you ask it to create a plan and check back with you after each step, it can confirm whether you want to proceed. For example, when it needs a payment, it can check back with you if the details are correct or if you want to do something else."
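That check-back pattern amounts to a plan that pauses at sensitive steps. In the sketch below, askUser is a hypothetical stand-in for whatever confirmation interface a real product would provide.

```typescript
// Human-in-the-loop execution: steps flagged as sensitive (payments,
// irreversible actions) pause and wait for explicit approval before
// running. `askUser` stands in for a real confirmation UI.

interface PlanStep {
  description: string;
  sensitive: boolean;            // requires confirmation before running
  run(): Promise<void>;
}

async function askUser(question: string): Promise<boolean> {
  console.log(`[confirm?] ${question}`);
  return true; // stub: a real app would await actual user input
}

async function executePlan(steps: PlanStep[]): Promise<void> {
  for (const step of steps) {
    if (step.sensitive) {
      const ok = await askUser(`About to: ${step.description}. Proceed?`);
      if (!ok) {
        console.log(`Skipped: ${step.description}`);
        continue;                // or abort, depending on the task
      }
    }
    await step.run();
  }
}

// Example: selecting the flight is routine; paying checks back with the user.
executePlan([
  { description: "select flight", sensitive: false, run: async () => {} },
  { description: "pay $312 with saved card", sensitive: true, run: async () => {} },
]);
```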
Part 3: Shaping the Future of Human-AI Interaction
As autonomous agents become more sophisticated and capable, they may reshape not just the way we interact with machines, but the very fabric of our economy and society. From healthcare and education to finance and transportation, there is hardly a domain of human activity that will not be touched by this technology in the years and decades to come.
"The question is, how can we make agents safer and more secure?" says Garg. "How do we ensure they don't perform random actions, empty bank accounts, or make unwanted purchases? We're focused on building safe and secure agents that enhance productivity and benefit humanity."
One of the key challenges in building safe and reliable autonomous agents is ensuring that they are aligned with human values and goals. This is not a trivial problem, as the values and goals of humans are often complex, context-dependent, and sometimes even contradictory. Ensuring that an agent acts in accordance with these values requires not just advanced technical capabilities, but also a deep understanding of human psychology, culture, and ethics.
"My work intersects with this area through volunteer efforts with AI and Faith, a group of computer scientists and faith leaders discussing AI's ethical implications," says Kirshbaum. "I'm not personally religious, but I find the discussion fascinating because it involves experts in ethics and morality."
Kirshbaum believes that building trust in autonomous agents will require engaging with a wide range of stakeholders, including not just technical experts and policymakers, but also community leaders, ethicists, and the general public. He argues that the development of these technologies must be guided by a commitment to transparency, accountability, and democratic oversight, rather than being driven solely by the interests of tech companies and investors.
"Over 80% of the world identifies with a religious faith, and this number is growing," he says. "For many, what constitutes hallucination might differ based on their beliefs. It's crucial to consider these perspectives when developing AI systems."
As autonomous agents become more integrated into our daily lives, they will also raise important questions about privacy, security, and the ownership and control of personal data. Many experts worry that the rise of these agents could lead to a further concentration of power in the hands of a few large tech companies, exacerbating existing inequalities and undermining democratic values.
"It's a tough issue. Humans have biases, and so will models," says Iacovou. "Addressing these challenges requires technical solutions, policy interventions, and a fundamental rethinking of AI development and deployment."
Iacovou argues that addressing these challenges will require not just technical solutions, but also policy interventions and a fundamental rethinking of the way we approach the development and deployment of AI systems. She believes that we need to move beyond the current model of AI development, which is often driven by the interests of a few large tech companies, and towards a more decentralized and democratic approach that puts the needs and values of communities and individuals at the center.
"Having a website was complex in the '90s, but now it's about design and user experience," she says. "Similarly, chatbots are evolving. Conversations can be life-changing or boring; the quality of interaction matters."
Agents may even begin to move beyond efficiency and productivity into the world of art.
"For instance, the other day I was using MultiOn to send a poem to my friend," says Kirshbaum. "I like poetry a lot. I asked the agent to find an article about sustainability, then use ChatGPT to write a poem based on it, and finally email the poem to my friend. It handled all of that perfectly."
Kirshbaum's anecdote encapsulates the exciting potential of autonomous agents - their ability to weave together multiple AI capabilities to accomplish complex, creative tasks that blur the lines between human and machine intelligence. As these agents continue to evolve and become more integrated into our daily lives, the way they communicate and interact with us will be crucial. The chatbot, once a standalone tool, will soon become just one module in a larger ecosystem of AI agents working together to understand and serve human needs in ever more sophisticated ways. The future of human-AI interaction is not just about building smarter, more capable agents - it's about designing a new kind of relationship between humans and machines.