Frontiers of Software Architecture: Breaking Down NVIDIA’s Software AI Innovation - Voyager
In late May, a team at NVIDIA published a paper (and source code) titled “VOYAGER: An Open-Ended Embodied Agent with Large Language Models.” Voyager is an AI model designed to continuously explore the world of Minecraft, evolve its own software code, and improve its performance over time through the development of “skills” informed by observations it makes about its environment and feedback it gathers by testing those skills.
Minecraft is a popular sandbox video game that allows players to build and explore virtual worlds using different types of blocks. It offers multiple game modes, including survival mode where players must gather resources to build the world and maintain health, and creative mode where players have unlimited resources.
Voyager shares some similarities with agents that operate in a traditional reinforcement learning software paradigm, as it is able to perform actions within an environment and refine its skills based on feedback from the environment state. However, it also introduces major conceptual innovations to software architecture: (1) Voyager can evolve its own code to create a diverse set of actions, facilitated by its skill library where it stores and retrieves complex behaviors, leading to an unbounded action space. (2) Instead of relying on a reward function to guide its actions, Voyager uses a separate agent called a CurriculumAgent that provides an automatic curriculum of tasks with the goal of helping Voyager to “become the best Minecraft player in the world”. (3) Voyager leverages Large Language Models (in this case primarily GPT-4) as a long-term knowledge store and to generate tasks and code, while also maintaining its own state, including a skills library, records of successfully completed and failed tasks, and recent knowledge acquired.
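To make the skill library idea in point (1) more concrete, here is a minimal sketch of storing skills and retrieving the most relevant one for a new task. It is an illustration under stated assumptions, not Voyager's implementation: the `SkillLibrary` class, its methods, and the toy bag-of-words `embed` function are all hypothetical stand-ins for a real embedding model and vector datastore.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class SkillLibrary:
    """Hypothetical skill store: name -> (description, source code)."""
    skills: dict = field(default_factory=dict)

    def add(self, name: str, description: str, code: str) -> None:
        self.skills[name] = (description, code)

    def retrieve(self, query: str, k: int = 1) -> list:
        """Return the names of the k skills whose descriptions best match the query."""
        q = embed(query)
        ranked = sorted(
            self.skills,
            key=lambda name: cosine(q, embed(self.skills[name][0])),
            reverse=True,
        )
        return ranked[:k]


library = SkillLibrary()
library.add("mineWood", "mine a wood log from a tree", "async function mineWood(bot) { /* ... */ }")
library.add("craftTable", "craft a crafting table from planks", "async function craftTable(bot) { /* ... */ }")
best = library.retrieve("mine wood log")
```

Because each stored skill can be reused when writing the next one, the set of available actions grows as the agent plays rather than being fixed in advance.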
Voyager's approach illustrates how software can be designed to use a large language model (LLM) in diverse ways, extending beyond simple chatbot and content-summarization use cases to include task generation, code generation, and state maintenance.
Breaking Down Voyager
High Level Architecture
At a high level the Voyager system interacts with a variety of remote dependencies, files in the local filesystem and vector datastores.
Voyager Software Components
Software Overview
Voyager is composed of multiple components, most of which are helper agents of different types. Each agent gets its instructions from pre-authored prompt templates (see Prompt Templates below) that use OpenAI’s large language models (LLMs) to perform a knowledge-related task. Voyager then chains the output of those tasks with instructions encoded using traditional programming methods. The recently popularized Python LangChain framework is used to manage prompts and LLM usage.
For instance, Voyager’s ActionAgent has a prompt template that instructs the OpenAI large language model to build new JavaScript skills given a certain context. Voyager then follows static Python instructions to connect to the Minecraft server and try the newly created JavaScript-based skill.
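The major agents that Voyager defines and relies on are the CurriculumAgent, the ActionAgent, the CriticAgent, and the SkillsManager. Each has its own Prompt Templates, discussed later in this article.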
If the task is successfully completed, the JavaScript code is added to Voyager’s skills repository by the SkillsManager. At the end of every step, the loop starts again with the CurriculumAgent selecting a new proposed task for Voyager to pursue.
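The loop described here can be sketched in a few lines of Python. This is a simplified illustration, not Voyager's source: the function names (`run_loop`, `propose_task`, `write_skill`, `execute`, `judge`) and the stub behavior are assumptions made so the example runs without an LLM or a Minecraft server, and details such as retrying a failed task are left out.

```python
def run_loop(propose_task, write_skill, execute, judge, max_steps):
    """Run up to max_steps rounds of propose -> write -> execute -> judge."""
    completed, failed, skills = [], [], {}
    for _ in range(max_steps):
        task = propose_task(completed, failed)      # curriculum role
        if task is None:                            # nothing left to propose
            break
        code = write_skill(task, skills)            # action role
        observation = execute(code)                 # try the skill in the environment
        if judge(task, observation):                # critic role
            skills[task] = code                     # skills manager role
            completed.append(task)
        else:
            failed.append(task)
    return completed, failed, skills


# Hypothetical stubs standing in for the LLM-backed agents and the game server.
BACKLOG = ["mine wood", "craft table", "fight zombie"]


def propose_task(completed, failed):
    remaining = [t for t in BACKLOG if t not in completed and t not in failed]
    return remaining[0] if remaining else None


def write_skill(task, skills):
    return f"async function {task.replace(' ', '_')}(bot) {{ /* ... */ }}"


def execute(code):
    # Pretend that any skill involving a zombie fails in the environment.
    return {"error": "zombie" in code}


def judge(task, observation):
    return not observation["error"]


completed, failed, skills = run_loop(propose_task, write_skill, execute, judge, max_steps=5)
```

Passing the four roles in as plain callables keeps the loop itself free of any LLM or game dependency, which is one way to test the control flow in isolation.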
Detailed software component and sequence diagrams below provide further detail on how Voyager’s software is designed.
Voyager Components
Voyager Main Loop
Prompt Templates
The agents that Voyager depends on are programmed as much in plain-English Prompt Templates as they are in Python. Think of Prompt Templates as providing scripts for each agent to act out a certain responsibility. While agents have a certain amount of freedom to carry out their work, Prompt Templates can provide detailed instructions, guardrails, and strict structure for what they produce. It is worth reviewing each Prompt Template in further detail to get a better sense of agent behavior:
CurriculumAgent Prompts:
SkillsManager Prompts:
ActionAgent Prompts: Action and Response Prompt Templates
CriticAgent Prompts:
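The actual prompt text is not reproduced here. As a purely hypothetical illustration of how a Prompt Template can combine plain-English instructions with a strict output structure, the sketch below defines a critic-style template using Python's standard `string.Template` (rather than LangChain) and a parser that rejects any response that does not follow the requested format. The template wording, the JSON fields `success` and `critique`, and the function names are all assumptions made for the example.

```python
import json
from string import Template

# Hypothetical template text; not Voyager's real critic prompt.
CRITIC_TEMPLATE = Template(
    "You are an assistant that judges whether a Minecraft task was completed.\n"
    "Task: $task\n"
    "Inventory after the attempt: $inventory\n"
    'Respond only with JSON of the form {"success": true or false, "critique": "<advice>"}.'
)


def render_prompt(task: str, inventory: dict) -> str:
    """Fill the template's placeholders with the current context."""
    return CRITIC_TEMPLATE.substitute(task=task, inventory=json.dumps(inventory))


def parse_verdict(response: str) -> tuple:
    """Enforce the structure the template asked for; raise ValueError otherwise."""
    data = json.loads(response)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or not isinstance(data.get("success"), bool):
        raise ValueError("response must be a JSON object with a boolean 'success' field")
    return data["success"], str(data.get("critique", ""))


prompt = render_prompt("Mine 1 wood log", {"oak_log": 1})
verdict = parse_verdict('{"success": true, "critique": ""}')
```

The template supplies the guardrails in prose, while the parser turns the requested structure into something ordinary Python code can check and act on.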
Summary
Voyager's approach demonstrates how LLMs can be integrated into a system to drive dynamic behavior and continuous learning. The use of pre-authored prompt templates and the Python LangChain framework to manage interactions with the LLMs shows how traditional programming methods can be combined with AI to create more flexible and adaptive systems.
The agent-based architecture of Voyager, with specialized agents responsible for different aspects of the learning process, presents a modular approach that could be applied in other contexts. This modularity allows for the separation of concerns, making the system more manageable and scalable. Each agent can be developed, tested, and improved independently, and new agents can be added as needed to extend the system's capabilities.
In terms of broader applications, this approach could be used in any situation where a system needs to learn and adapt over time based on its interactions with a complex environment. This could include other game environments, virtual simulations, or real-world applications such as autonomous vehicles or robotics. The ability to generate and refine code, propose tasks, and assess outcomes could be particularly useful in situations where the system needs to operate autonomously for extended periods, continually improving its performance without human intervention.
Senior Director @ Northwestern Mutual | Technology Strategy
1 year ago
Thanks for sharing this. Is this conceptually the same as how AlphaGo was designed by DeepMind? That is a board game vs. an open-ended world, so the "rules" the prompt templates leverage against the LLM are quite different. Very interesting to think how "self-learning" software might be applied to consumer-facing services/products (not games).