- Editorial: AI agents are on the rise
- News from the usual suspects: Anthropic, Cohere, OpenAI, etc.
- The freshest AI&ML research papers from the week of Apr 1 – Apr 7
AI agents are on the rise, though the concept itself isn’t novel. Let’s first review the established categorization of AI agents to understand how they are currently used, and then establish why they are becoming even more popular now.
- Simple Reflex Agents: imagine asking a chatbot, “I need help with my order.” The agent searches for specific keywords (‘help’) it has been programmed to recognize and provides a reply.
- Model-Based Reflex Agents: consider Apple’s Siri, Amazon’s Alexa, and smart-home systems, which maintain an internal model of user preferences, interests, and behaviors based on past interactions.
- Goal-Based Agents: a chess-playing program decides moves based on a strategy to win the game.
- Utility-Based Agents: a finance app advising on investments for maximum return.
- Learning Agents: a prime example is AlphaGo/AlphaZero, which learned Go and chess through self-play and reinforcement learning, becoming extraordinarily skilled without human-coded rules. Another example is non-player characters (NPCs) in a video game that adapt and learn from player actions, creating a more dynamic and challenging gameplay experience.
- Hybrids of the agents mentioned above also exist. For example, an autonomous vehicle uses sensors (simple reflex), maps (model-based), destination goals (goal-based), safety and efficiency criteria (utility-based), and adapts to driving conditions and learns from user feedback (learning agent).
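To make the first two categories concrete, here is a minimal sketch in Python. All class names, rules, and responses are illustrative inventions, not any production system: the point is only that a simple reflex agent maps inputs directly to responses, while a model-based agent also maintains internal state across interactions.

```python
class SimpleReflexAgent:
    """Maps keywords directly to canned replies; keeps no memory of past turns."""

    RULES = {
        "help": "Connecting you to support...",
        "refund": "Opening a refund request...",
    }

    def respond(self, message: str) -> str:
        # A pure condition-action lookup: first matching keyword wins.
        for keyword, reply in self.RULES.items():
            if keyword in message.lower():
                return reply
        return "Sorry, I didn't understand that."


class ModelBasedReflexAgent(SimpleReflexAgent):
    """Adds an internal model of the user, updated from each interaction."""

    def __init__(self) -> None:
        self.user_model: dict[str, str] = {}  # state inferred from past messages

    def respond(self, message: str) -> str:
        self.user_model["last_message"] = message  # update internal state first
        return super().respond(message)            # then act as a reflex agent
```

A hybrid agent, as in the autonomous-vehicle example above, would layer goal selection and utility scoring on top of this same state-keeping skeleton.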
So, if we are utilizing AI agents so widely already, why is everybody doubling down on them now?
Advancements in NLP, boosted by powerful foundation models and increased computational power, have transformed AI agents. Combined with multimodal capabilities and better function calling, these advances are pushing agents toward ‘hyperagent’ status, much as hyperlinks once turned text into hypertext. All of this has resulted in extended research, novel infrastructure, and many more attempts at practical implementation.
Just last week, a few releases highlighted the interest in developing the AI agent ecosystem:
- The Octopus v2 model, developed by Stanford University researchers, exemplifies the progress made in on-device language models. With its impressive performance and efficiency, Octopus v2 demonstrates the feasibility of deploying AI agents on edge devices, opening up possibilities for privacy-conscious and cost-effective solutions.
It also caught our attention how many coding AI agents were released last week:
- Cohere AI’s C4AI Command R+, a 104B-parameter AI model, sets new standards for coding AI agents with advanced capabilities like code rewrites, snippet generation, and multi-step tool use, including Retrieval-Augmented Generation (RAG). Designed specifically for developers, it supports ten languages. It’s released under a CC-BY-NC license, with the weights accessible for research use.
- Anthropic launched function calling for Claude 3. The model can select the right coding tool from among hundreds, using external tools via APIs for complex tasks and calculations. It relies on detailed JSON tool descriptions for accurate selection, uses ‘chain of thought’ reasoning for transparent decision-making, and can handle complex multi-tool scenarios, enhancing its utility for developers.
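To illustrate what a JSON tool description looks like in practice, here is a hedged sketch. The field names below follow the general shape of JSON-Schema-based tool definitions, but the specific tool (`run_unit_tests`), its fields, and the `dispatch` helper are hypothetical examples, not Anthropic’s exact API; consult the provider’s documentation for the real schema.

```python
# A hypothetical tool description: name, human-readable purpose, and a
# JSON-Schema-style declaration of the arguments the model may fill in.
run_tests_tool = {
    "name": "run_unit_tests",
    "description": "Run the project's unit tests and return pass/fail counts.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Directory containing the tests."},
        },
        "required": ["path"],
    },
}


def dispatch(tool_call: dict, registry: dict) -> str:
    """Route a model-produced tool call to the matching local function."""
    fn = registry[tool_call["name"]]
    return fn(**tool_call["input"])


# Illustrative local implementation the tool call would be routed to.
registry = {"run_unit_tests": lambda path: f"ran tests in {path}: 12 passed"}
result = dispatch({"name": "run_unit_tests", "input": {"path": "tests/"}}, registry)
```

The detailed `description` fields are what let the model pick the right tool from hundreds; the schema constrains the arguments it is allowed to produce.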
- Replit introduces AI tools integrated with their IDE, focusing on building Large Language Models (LLMs) for code repair. By training models with a mix of source code and relevant natural languages, they aim to create Replit-native models for more powerful developer tools.
- CodiumAI introduces Codiumate, an AI coding agent enhancing developer productivity by assisting in task planning and code completion. Codiumate streamlines the coding process with plan-aware auto-completion and quality tips, resulting in a significant boost in efficiency and a reduction in code errors.
- The SWE-agent, developed by researchers at Princeton University, transforms language models like GPT-4 into software engineering agents capable of addressing bugs and issues in real GitHub repositories.
Setting a new benchmark, the SWE-agent successfully resolves 12.29% of issues on the complete SWE-bench test set. This advance is made possible by Agent-Computer Interfaces (ACI), which streamline how the language model interacts with, analyzes, and modifies code within repositories.
AIDE, an AI-driven data science agent, achieves human-level performance in Kaggle competitions, outshining half of human competitors autonomously. It surpasses both traditional AutoML systems and ChatGPT (even with human help), excelling in over 60 challenges without any human input. AIDE operates through an iterative, feedback-driven approach, closely mimicking human data scientists’ strategies but with greater efficiency and cost-effectiveness.
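The iterative, feedback-driven loop that agents like AIDE follow can be sketched generically. This is not AIDE’s actual implementation; `propose_solution` and `score` are stand-ins for an LLM call and a validation metric, and the loop shape is our simplified reading of the propose-evaluate-refine pattern.

```python
def iterate(propose_solution, score, rounds: int = 5):
    """Propose-evaluate-refine loop: each round's score feeds the next proposal."""
    best, best_score = None, float("-inf")
    feedback = ""
    for _ in range(rounds):
        candidate = propose_solution(feedback)       # e.g. new modeling code from an LLM
        s = score(candidate)                         # e.g. validation accuracy
        feedback = f"last candidate scored {s:.3f}"  # becomes context for the next round
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score
```

A human data scientist runs essentially the same loop by hand; the agent’s advantage is cheap, fast iterations.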
And don’t forget the recent release of a specialized agent OS: AIOS, an LLM agent operating system designed at Rutgers University. It optimizes LLM integration by enhancing scheduling, resource allocation, and context maintenance for diverse agents. AIOS includes modules for agent scheduling, memory and storage management, and access control, significantly improving agent performance and efficiency. Open-sourced for broad access, AIOS represents a pivotal step towards a more cohesive and powerful ecosystem for LLM-based agents.
This evolution of AI agents opens up exciting possibilities for human-machine collaboration and the development of truly intelligent systems that can assist us in a wide range of tasks and domains.
Enjoyed This Story?
We write a weekly analysis of the AI world in the Turing Post newsletter. Subscribe for free and receive a free AI essential kit:
- Microsoft and Quantinuum have achieved a notable quantum computing milestone by developing four highly reliable logical qubits from a configuration of 30 physical qubits. This advancement has led to an 800-fold improvement in the logical error rate, a critical step towards the realization of dependable quantum computing systems. Source: Microsoft Azure Quantum Blog.
- DALL-E 3 integration with ChatGPT introduces an enhanced image editing capability, expanding the utility and creative potential of AI in generating and modifying visual content. (Didn’t really work for me so far.)
- Stability AI releases Stable Audio 2.0, advancing generative AI audio with text-to-audio and audio-to-audio capabilities, allowing generation of tracks up to 3 minutes long. It focuses on musicality, practical application, and copyright respect, using licensed AudioSparx data.
- Gretel AI has introduced the largest open-source Text-to-SQL dataset, hosted on Hugging Face, to enhance AI research and model training efficiency. This dataset, comprising 105,851 records across 100 domains, aims to democratize access to data insights and facilitate the development of AI applications capable of intuitive database interactions.
Last week, a few exciting research papers were published. We categorize them for your convenience.