Introducing The Context and Outputs Library (COOL) Framework For AI Agents:
Data driven decision making with the help of AI Agents (Midjourney)

Executive Summary:

Large Language Models (LLMs) like GPT (OpenAI), Gemini (Google), and Claude (Anthropic) are becoming increasingly powerful and capable.

Research suggests these models are converging towards a shared understanding of the world.

Context and prompt engineering will be essential for AI Agents to deliver the best possible responses.

However, this creates new challenges:

  • How to manage and maintain vast amounts of context data.
  • How to keep track of the AI Agents your organization is using.
  • How to efficiently track and manage the AI Agents' outputs.

Doing this well is critical to ensure accuracy, build user trust, enhance decision-making, enable continuous learning, and comply with regulatory and ethical standards.

The COOL Framework, inspired by the widely used IT Infrastructure Library (ITIL) Framework, is a comprehensive set of principles, processes, and practices designed to track and manage:

  • Your AI Agents,
  • The context you provide them so they can perform efficiently, and
  • The outputs they generate for your organization.

The COntext and Output Library (COOL) Framework wheel - Franck Boullier

COOL's key objectives include:

  • Enhancing AI-generated content quality and relevance.
  • Streamlining the management of context data, AI Agents, and outputs generated by these AI Agents.
  • Providing a structured approach to AI data governance and compliance.
  • Promoting continuous improvement and optimization of AI applications.

Adopting COOL can significantly improve how organizations harness AI technology and AI Agents, ensuring AI developments align with business goals and contribute meaningfully to decision-making processes.

From Attention To Context:

The 11-page Attention Is All You Need paper, written in 2017, kick-started the Generative AI revolution that gave us ChatGPT, Gemini, Claude, Midjourney, and many more Gen AI-based tools.

The paper introduced the Transformer Architecture and the concept of Attention.

Attention allows AI models to focus on and weigh the importance of what the AI knows (the input) to generate the best possible response (the output) based on the description of the task to perform (the prompt).
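To make the idea concrete, here is a minimal sketch of scaled dot-product attention, the core operation introduced in the Transformer paper. The toy values and array sizes are illustrative only:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weigh every input token by its relevance to each query token."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Softmax over each row turns scores into attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Three tokens with 4-dimensional embeddings (random toy data).
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))

output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.shape)  # (3, 3): how much each token attends to every other token
print(output.shape)   # (3, 4): a context-weighted representation per token
```

Each row of `weights` is exactly the "focus" the paragraph above describes: a distribution over the input that says which parts matter most for generating the next part of the output.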

In the research paper The Platonic Representation Hypothesis - 13 May 2024, the authors argue that AI models are converging:

Neural networks, trained with different objectives on different data and modalities, are converging to a shared statistical model of reality in their representation spaces.

If true, this finding has major implications for the future of LLMs: they will all have, eventually, more or less the same "model of reality".

With the continuing trend of models scaling up, (...) model alignment will increase over time – we might expect that the next generation of bigger, better models will be even more aligned with each other.

For AI Agents, that means that the quality of the information you provide (the input) and how well you describe the task to perform (the prompts) will be THE key differentiators between AI Agents tomorrow, no longer the LLMs they are based on.

What Is Context, And Why Does It Matter?

The concept of Context (or Context Window) in Large Language Models (LLMs) today derives from the Attention mechanism we mentioned earlier.

Context refers to the surrounding information that helps the model understand the meaning of words, images, and any other types of content that form the input they receive from the users.

By combining the provided context with the user's inputs and previous interactions, an LLM like OpenAI's GPT, Google's Gemini, or Anthropic's Claude can generate coherent, relevant responses even in complex interactions with the user.

A human manager explaining a task to an AI Agent (Midjourney)

In essence, AI Agents are like interns - they NEED context to give the best possible responses.

If you have ever built an OpenAI GPT (a type of AI Agent), you know that the quality of the responses you'll get from your GPT is directly related to the quality of the prompt AND the quality of the context (the additional information) that you have provided to your AI Agent:

  • Details about your company's products.
  • An explanation of what the AI Agent should do.
  • Examples of the questions the AI Agent will have to answer and what a "good" answer would look like.

This information allows the AI agent to "narrow down" the scope of the conversation.

It also gives the AI Agent useful information and data it can use when it interacts with users.

Techniques such as Retrieval Augmented Generation (RAG) automatically add additional context to whatever the user is asking. This helps improve the quality of the AI Agent's response, reduces the risks of hallucinations, and is one of the most efficient techniques to ensure that the AI Agent stays on topic.
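A minimal sketch of the RAG idea follows. Real systems use embedding similarity and a vector store; here a naive word-overlap ranking stands in for retrieval, and the document snippets are invented for illustration:

```python
def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (a stand-in for
    embedding similarity in a production RAG pipeline)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Prepend the retrieved context so the model answers from it,
    which keeps the agent on topic and reduces hallucinations."""
    context_block = "\n".join(f"- {c}" for c in retrieve(query, documents))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context_block}\n\nQuestion: {query}")

docs = [
    "Our premium plan includes 24/7 phone support.",
    "Refunds are processed within 14 business days.",
    "The company was founded in 2012 in Singapore.",
]
print(build_prompt("How fast are refunds processed?", docs))
```

The assembled prompt, not the user's raw question, is what gets sent to the LLM: the model's answer is grounded in the retrieved snippets rather than in whatever its training data happens to contain.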

Context Windows Are Getting Larger:

OpenAI's ChatGPT was launched in November 2022 with a context window of 4,096 tokens, the equivalent of 3,000 to 4,000 words or about 5 to 6 pages of text.

The context had to stay small to avoid hitting the context window limit.

The size of the context that you can pass to the model became one of the key differentiators between LLMs: more context leads to better responses.

  • March 2023: Anthropic's Claude launches with a context window of 9,000 tokens.
  • May 2023: Claude's context window increases to 100K tokens.
  • November 2023: Claude 2.1 ships with a context window of 200K tokens.
  • As of June 2024: Google Gemini 1.5 Pro has an industry-leading context window of up to 1 million tokens of text, which equates to about 750,000 words - approximately 1,500 pages of text, 30,000 lines of code, or an hour of video.
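The token-to-word conversions used above follow common rules of thumb (roughly 0.75 English words per token, about 500 words per page); the exact ratio varies by tokenizer and text. A quick back-of-the-envelope calculator:

```python
WORDS_PER_TOKEN = 0.75  # common rule of thumb for English text
WORDS_PER_PAGE = 500    # rough single-spaced page

def context_window_capacity(tokens):
    """Convert a context window size in tokens to approximate words and pages."""
    words = tokens * WORDS_PER_TOKEN
    pages = words / WORDS_PER_PAGE
    return words, pages

for tokens in (4_096, 200_000, 1_000_000):
    words, pages = context_window_capacity(tokens)
    print(f"{tokens:>9,} tokens = about {words:>9,.0f} words, or {pages:>5,.0f} pages")
```

Running this reproduces the figures in the timeline: ChatGPT's original 4,096 tokens come to roughly 3,000 words (about 6 pages), while Gemini 1.5 Pro's 1 million tokens come to roughly 750,000 words, or 1,500 pages.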

Researchers are working on the concept of "infinite context", a context window that would equip an AI Agent with the capability to process and utilize ALL the context that you could provide:

  • What your company does, what your strategy is, what your products are.
  • All the past interactions you had with every user.
  • All the issues you ever had and all the solutions and answers you've provided to solve these.

We may soon be able to create AI Agents with a perfect memory of everything they were exposed to.

But how do you efficiently format, update, manage, and access this context information?

An AI Context and Governance control room (ChatGPT)


We Need A Framework To Manage Context:

In the article Moving Past Gen AI Honeymoon Phase (May 2024), McKinsey highlights:

  • The importance of data readiness.
  • The need to invest in data foundation.
  • The need for ongoing updates and maintenance efforts as you collect more data.

Context WILL change over time as you get more data and the business environment evolves.

Getting this right requires significant human oversight from people with relevant expertise.

Managing And Monitoring AI Outputs:

Outputs are the content generated by AI Agents based on:

  • The model they use and the training data the model has been exposed to.
  • The context given.
  • The task they have to perform.
  • The questions they need to answer.

Outputs generated by AI Agents are the tangible results of the AI Agent's work.

If context is key to making sure that an AI Agent has all the information it needs to perform effectively, the AI Agent's outputs must also be tracked and managed efficiently.

In an article from August 2020, I highlighted the 5 reasons why you need explainable AI. More recently, the AI Verify Foundation, in collaboration with Singapore's IMDA (Infocomm Media Development Authority), proposed a comprehensive Model AI Governance Framework for Generative AI (June 2024).

Today more than ever, you need to be able to track, explain, and audit the outputs generated by AI and AI Agents.

Be mindful of AI Generated Errors! (Shutterstock - Andrew Rybalko)

Ensuring Accuracy and Relevance:

AI Agents can sometimes generate outputs that are inaccurate or irrelevant.

By tracking and managing these outputs, organizations can quickly identify and correct any discrepancies, ensuring the information provided is both accurate and pertinent to the user's needs.

User Trust and Brand Reputation:

The quality and reliability of AI Agent outputs directly impact user trust and the overall reputation of your brand.

By ensuring that outputs are consistently high-quality, relevant, and free from errors, organizations can build and maintain trust, which is crucial for long-term success.

Enhancing Decision-Making Processes:

In many cases, the outputs of AI Agents are used to inform decision-making processes.

Efficient tracking and management ensure that these decisions are based on the most accurate and up-to-date information, leading to better outcomes.

Continuous Learning and Improvement:

By monitoring the outputs of AI Agents, organizations can gather valuable data on their performance, which can be used to train the models further or improve the AI Agent itself.

This continuous learning loop will improve your AI Agents over time, making sure that they are adapting to new data and becoming more effective in their tasks.

Identify And Prevent "AI Hallucination":

Large Language Models can sometimes generate plausible but entirely fabricated information: the infamous "hallucinations."

Monitoring outputs is crucial for identifying and mitigating these occurrences and preventing the creation of false or misleading information.

Intellectual Property and Data Security:

Outputs generated by AI Agents may contain sensitive or proprietary information.

Managing AI Agents' outputs helps protect intellectual property and adhere to data security protocols, preventing unauthorized access or misuse of sensitive data.

Regulatory Compliance and Ethical Considerations:

In many industries, regulatory compliance mandates strict oversight of automated decision-making processes.

Most AI regulations that are being drafted around the world are focusing on the need for AI Agents to be auditable and explainable.

Tracking outputs is essential for demonstrating compliance with regulations. Moreover, it'll help your organization follow ethical standards, ensuring that AI Agents do not generate biased, discriminatory, or harmful content.

The Egg Theory of AI Agents:

In The Egg Theory of AI Agents - 30 May 2024 Rex Woodbury reminds us of the importance of making sure that the user can contribute to and work with AI Agents instead of being passive consumers of the outputs generated.

A human cook and a robot baking a cake together (Midjourney)

The egg theory is a consumer psychology concept that explains why people are more likely to use a product if they feel like they have contributed to it in some way:

When instant cake mixes came out, they sold poorly. Making a cake was too quick and simple. People felt guilty about not contributing to the baking. So companies started requiring you to add an egg, which made people feel like they contributed. Sales soared. It turns out that there's such a thing as too easy.

Rex uses the following example to illustrate his point:

I might say to an AI agent, “Book me a flight to Paris on July 3rd.” It would be uncomfortable to remove me entirely from the workflow. I might feel lazy, guilty, even nervous about it. Did the bot book the right flight? Does it know I prefer morning flights, hate red eyes, and am a loyal Delta SkyMiles member?

The Context and Outputs Library (COOL) Framework:

Introducing the Context and Outputs Library (COOL) Framework.

The COOL framework is designed to help organizations manage:

  • The context AI Agents use.
  • The virtual teams of AI Agents they have created or are using.
  • The outputs generated by AI Agents.

The COOL framework is a set of principles, processes, and practices to manage the documents, data, and information (the artifacts) that are:

  • Used as context by AI Agents.
  • Relevant to describe what an AI Agent does and how it's supposed to perform.
  • Generated as outputs by AI Agents within an organization.

The COOL framework covers the full lifecycle of these artifacts.
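One way to picture that lifecycle is as a small artifact registry. The sketch below is purely illustrative: the stage names, artifact kinds, and fields are my own assumptions, not part of any published COOL specification:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class Stage(Enum):
    DRAFT = "draft"
    ACTIVE = "active"      # currently served to AI Agents as context
    ARCHIVED = "archived"
    DISPOSED = "disposed"

@dataclass
class Artifact:
    """One COOL artifact: context data, an agent description, or an output."""
    artifact_id: str
    kind: str              # "context" | "agent_spec" | "output" (assumed taxonomy)
    owner: str             # the human accountable for the artifact
    stage: Stage = Stage.DRAFT
    history: list = field(default_factory=list)

    def transition(self, new_stage: Stage, reason: str):
        """Record every lifecycle change, keeping the artifact auditable."""
        self.history.append(
            (datetime.now(timezone.utc), self.stage, new_stage, reason))
        self.stage = new_stage

doc = Artifact("ctx-001", "context", owner="data-team")
doc.transition(Stage.ACTIVE, "approved after review")
doc.transition(Stage.ARCHIVED, "superseded by ctx-002")
print(doc.stage, len(doc.history))  # Stage.ARCHIVED 2
```

The point of the `history` field is the auditability theme that runs through the rest of this article: every artifact carries a record of who owned it, when its status changed, and why.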

If this sounds familiar, it's because the COOL framework is directly inspired by the IT Infrastructure Library (ITIL) Framework. ITIL is a widely adopted framework for IT service management.

Let's explore the parallels.

COOL and ITIL Share Common Principles:

ITIL emphasizes viewing IT as a service provider focused on customer needs, managing expectations, and continuously improving. It aims to align IT with the business, optimize costs, and enhance quality across strategy, design, transition, operations, and improvement.

Similarly, COOL positions AI Agents as a service to the organization.

It focuses on:

  • Understanding the needs of human stakeholders,
  • Managing expectations around AI Agents' capabilities, and
  • Continuously improving the quality and relevance of AI-generated content.

The objectives of the COOL framework are to:

  • Align AI development with business goals,
  • Optimize the costs of training and inference, and
  • Prescribe practices across the AI lifecycle from strategy to deployment to optimization.

COOL Principles:

ITIL best practices are guided by 7 key principles:

  1. Focus on value
  2. Start where you are
  3. Progress iteratively with feedback
  4. Collaborate and promote visibility
  5. Think and work holistically
  6. Keep it simple and practical
  7. Optimize and automate

The COntext and Output Library (COOL) Framework wheel - Franck Boullier

These principles map remarkably well to the world of AI Agents:

  1. Focus on value: Focus AI Agent development on delivering business value.
  2. Start where you are: Leverage existing data, models, and outputs.
  3. Progress iteratively with feedback: Improve AI Agents iteratively based on user feedback.
  4. Collaborate and promote visibility: Collaborate across teams and make AI agents transparent. Implement explainable AI Agents to enhance trust in the outputs.
  5. Think and work holistically: Consider the full lifecycle of the context used and the outputs generated by AI Agents.
  6. Keep it simple and practical: Avoid over-engineering systems, minimize complexity in managing AI-generated documents and outputs, and focus instead on feasible approaches that deliver clear benefits to the organization.
  7. Optimize and automate: Continuously optimize and automate AI Agent-based workflows. This could involve using AI Agents to improve efficiency and categorize, tag, and index content.

Implementing The COOL Framework:

Putting the COOL framework into practice involves steps that mirror an ITIL implementation:

  1. Assess current capabilities: Evaluate current practices for managing AI Agents, AI Agent context information, and generated content.
  2. Align strategy: Align the AI strategy with business goals so that it provides tangible value.
  3. Define roles: Define clear roles for humans and AI Agents in managing AI contexts and outputs.
  4. Standardize processes: Implement standardized processes for the creation, management, and disposal of AI Agent context and AI Agent-generated content.
  5. Measure performance: Define KPIs and monitor AI Agents' performance.
  6. Train teams: Train teams on COOL principles and practices.
  7. Improve continuously: Establish a culture of continuous AI Agent improvement.
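The KPI step above can start very simply: log every reviewed output and compute the share that meets each quality criterion. The agent names, review log, and metrics below are hypothetical examples, not prescribed by the framework:

```python
from statistics import mean

# Hypothetical review log: each AI Agent output scored by a human reviewer.
reviews = [
    {"agent": "support-bot", "accurate": True,  "on_topic": True},
    {"agent": "support-bot", "accurate": False, "on_topic": True},
    {"agent": "sales-bot",   "accurate": True,  "on_topic": False},
    {"agent": "support-bot", "accurate": True,  "on_topic": True},
]

def kpi(agent, metric):
    """Share of an agent's reviewed outputs meeting a quality criterion."""
    scores = [r[metric] for r in reviews if r["agent"] == agent]
    return mean(scores) if scores else None

print(f"support-bot accuracy: {kpi('support-bot', 'accurate'):.0%}")
print(f"sales-bot on-topic rate: {kpi('sales-bot', 'on_topic'):.0%}")
```

Even a lightweight log like this closes the continuous-improvement loop: a dip in an agent's accuracy or on-topic rate is the signal to revisit its context, its prompt, or the agent itself.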

Wrapping It Up:

By adopting the COOL framework and its ITIL-inspired principles, organizations can:

  • Improve the quality, coherence, and relevance of AI-generated content.
  • Enhance the experience of human stakeholders.
  • Improve auditability and explainability of AI-generated content.
  • Optimize resources invested in AI.
  • Ultimately better align AI development efforts with overarching business objectives.

Borrowing from Rex Woodbury's article again:

The best companies will be savvy in how they embed human decision-making into workflows, rather than removing the need for human input altogether. (...) Winning products (...) will be those that offer a bridge from the world of human work to the world of software work, making us feel comfortable and in control along the ride.

The COOL framework can help build that bridge.

Your Turn!

This article is just a starting point.

  • How are you managing context and outputs in your AI agent projects?
  • How do you keep track of the AI Agents that have been deployed in your organization?
  • What challenges are you facing?
  • Do you think the COOL framework can help you address these challenges?

Share your insights and experiences in the comments below.

I'm eager to hear your thoughts!


Stefano Ceravolo

Senior Manager at Neoris || INSEAD MBA || Co-founder BolzanoSlushD

4 months ago

Cool summary Franck Boullier! I wonder how close the "infinite context" you mention is to "common sense" and "tacit knowledge", and how they overlap. From what I have seen, it is somewhat easy (kind of) to convert explicit knowledge into something actionable via GenAI. Depending on the final user's maturity, this can be well documented. However, to capture everything else you really have to build the monitoring and auditing layer you mention on top, and it is challenging to capture every possible scenario. That is, until we have a general model (very close to the idea of AGI) able to do this without explicit instruction, but I have no clue how close that could be.

Anne LEHMAN

SID Accredited Director - Director Strategy & Innovation | Business Development | Entrepreneur | Solution Finder > 20yr+ Exp

4 months ago

Very insightful. Interesting approach to structure the Data and AI governance in any structure.

Vijay Gunti

Building Generative AI , Single and Multiple Agents for Enterprises | Mentor | Agentic AI expert | SAP BTP &AI| Advisor | Gen AI Lead/Architect | SAP Business AI | SAP Joule

4 months ago

It would be prudent to explore how generative AI can enhance decision-making processes in sectors with stringent compliance and governance requirements.

Paolo Cervini

Walk the Talk - Generative AI - Thinkers50 Radar - VP, Co-lead of The Management Lab by Capgemini Invent

4 months ago

Very comprehensive and insightful. The exponential growth of context window sizes is very interesting. As for AI Agents, unfortunately I no longer have access to ChatGPT Plus, so I can't assess the latest developments on the GPT Builder. But the direction is clearly the one you described.
