Charting the Path to AGI

Hi Everyone,

This is a guest post by Abhinav Upadhyay, who has an incredible Substack called Confessions of a Code Addict. His paper summaries shine a light of clarity on the difficult topics he chooses.

So I asked him to tackle the most difficult topic of our time: AGI. As fate would have it, Google recently decided to publish a rather lucid paper on exactly this.

Read the Paper

  • Favor to ask: kindly give this article a "like" to remind LinkedIn's algorithm to share it with other curious readers.



By Abhinav Upadhyay, Hyderabad, India, December 2023.

Charting the Path to AGI: DeepMind's Levels and Risks Framework


A year has passed since the release of ChatGPT, and AI has progressed rapidly in this short amount of time. This progress has also triggered discussions about artificial general intelligence (AGI). Some people believe that ChatGPT has shown sparks of AGI, while others believe that state-of-the-art large language models (LLMs) are already AGI. And some scientists believe that we are not yet at the level of AGI but are accelerating towards it at a rapid pace.

When we talk about AGI, it also brings up the debate around the risks it poses to society. Part of the industry is even asking for regulations to control AI research and development in order to mitigate the potential risks. However, everyone in this debate has their own definition of AGI, which makes the discussion very subjective.

Before we talk about regulations, it’s important that scientists come up with an objective definition of AGI, design a benchmark to test for AGI, and create a framework to assess the associated risks based on the capabilities of these models.

Until now, very little progress had been made on this front. Recently, however, a team at DeepMind released a paper that may lay the foundation for defining AGI and its associated risks. The paper outlines a gradual pathway toward AGI, going from “no AI” to “artificial superintelligence”, along with a framework for assessing the risks associated with each level of AGI.

This paper might be the start of an important discussion around formulating the risks of AGI. This article will unpack the paper for you and highlight its key insights. So, without wasting any more words, let’s dive in.


For those looking to stay on top of the latest developments in AI policy discussions, consider subscribing to “AI Policy Perspectives”, a Substack curated by a team of researchers at DeepMind (Harry Law and friends). They publish a comprehensive monthly report highlighting the most significant news and updates around AI policy.

Six Key Principles for Defining Levels of AGI

Scientists and philosophers have long thought about AGI, and it is important to consider how they have approached it in the past. The authors analyze nine definitions of AGI proposed between 1950 and 2023 [Turing 1950, Searle 1980, Legg 2008, Shanahan 2015, OpenAI 2018, Marcus 2022, Suleyman and Bhaskar 2023, Norvig et al. 2023] and distill six key principles that should form the basis of a framework for defining AGI and its risks. These six principles are as follows:

  • Focus on capabilities, not processes: Many AGI definitions focus on the mechanisms behind AGI, such as sentience, consciousness, or human-like thinking processes. Instead, the focus needs to be on the AI model’s ability to perform tasks, irrespective of the underlying processes that may drive it.
  • Focus on generality and performance: While there is an obvious focus on generality when defining AGI, it needs to be accompanied by a focus on performance. An AI can be deemed general only if it also performs well on a benchmark composed of a diverse array of tasks.
  • Focus on cognitive and metacognitive tasks: The benchmark for testing AGI systems needs to include both cognitive and metacognitive tasks. Cognitive tasks are non-physical tasks, i.e., tasks that do not require a robotic embodiment, while metacognitive tasks measure the AI system’s ability to learn. The authors believe that the ability to perform physical tasks increases a system’s generality, but it should not be a requirement for being defined as an AGI.
  • Focus on potential, not deployment: The emphasis should be on an AI system’s potential for achieving a goal, as opposed to its actual deployment in the real world. This matters because real-world deployment can be time-consuming, ridden with regulatory hurdles, and risky. For instance, demonstrating that an AI has the potential to substitute labor is sufficient, rather than requiring actual labor substitution in real-world scenarios.
  • Focus on ecological validity: The tasks used for benchmarking AGI systems should also have real-world value, i.e., they should be meaningfully useful to society. Otherwise, there are many tasks that are easy to automate and quantify but carry no real-world relevance.
  • Focus on the path to AGI: Finally, instead of thinking of AGI as an end goal, we need to focus on the path to getting there. The road to AGI will run through small, incremental improvements, not a sudden discovery in a lab. A well-defined gradual pathway leading to AGI will help the discussion around policy and regulation of these systems.

Six Levels of AGI

Using these six key principles, the authors delineate six gradual levels of AGI. These levels are defined along two dimensions: ‘performance’ and ‘generality’.

Here, performance refers to how well an AI system can perform a task in comparison to a skilled human, while generality is concerned with the breadth of tasks that the AI can perform. For instance, is it skilled only in a narrow domain such as protein folding, or is it capable of performing well on tasks from a wide array of domains?

The following table shows these six levels of AGI as defined in the paper. The two columns, Narrow and General, specify the generality of the AI, while the six rows define the six levels of AGI with gradually increasing capabilities.

Table 1 from the paper, showing the levels of AGI with examples.

Although these leveled definitions are straightforward, a few important points are worth highlighting:

  • To certify an AI model at a certain level, the model needs to perform well on most (not all) of the tasks at that level. For example, to be certified as Competent, an AI should perform as well as the median skilled human on most of the tasks at that level (see the sketch after this list).
  • In reality, the performance of AI systems on these benchmarks is going to be very uneven. For instance, a system may perform well on some tasks at the “Competent” or “Expert” level and yet be certified as “Emerging”, because its performance on most tasks is at the “Emerging” level.
  • The order in which these systems acquire skills can have safety implications. For instance, acquiring expertise in chemical engineering before learning about ethics can be dangerous.
  • Also, progression from one level to the next may not be linear; it may accelerate. For instance, once an AI system acquires the ability to learn new tasks, it may progress through the levels much faster than anticipated.
  • Finally, even if an AI system is capable of performing at a certain level as per this rubric, it may not achieve that level of performance when deployed, due to limitations of the environment or the interface. For instance, even though the DALL-E 2 model is better at drawing than most humans, it is categorized as an Expert-level narrow AI system rather than Virtuoso or higher, because its prompt-based interface limits the quality of output users can obtain in practice.
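
To make this rubric concrete, here is a minimal Python sketch of how level certification might work, assuming each task score is the percentile of skilled humans that the system outperforms. The 50th/90th/99th percentile thresholds follow the paper’s Table 1; treating “most (not all) tasks” as a strict majority is my own simplifying assumption, since the paper does not fix an exact fraction.

```python
# A minimal sketch of the leveling rubric, not the paper's official procedure.
# Thresholds: Competent = median (50th percentile) skilled human,
# Expert = 90th, Virtuoso = 99th, Superhuman = outperforms all humans.
# Treating "most (not all) tasks" as a strict majority is an assumption.

LEVELS = [
    ("Superhuman", 100.0),  # outperforms 100% of humans
    ("Virtuoso", 99.0),     # at least 99th percentile of skilled adults
    ("Expert", 90.0),       # at least 90th percentile
    ("Competent", 50.0),    # at least the median skilled human
    ("Emerging", 0.0),      # comparable to or better than an unskilled human
]

def certify_level(task_percentiles: dict[str, float]) -> str:
    """Return the highest level whose threshold the system clears
    on a majority of benchmark tasks."""
    n = len(task_percentiles)
    for level, threshold in LEVELS:
        passed = sum(1 for p in task_percentiles.values() if p >= threshold)
        if passed > n / 2:
            return level
    return "No AI"

# Uneven performance in practice: Expert-level scores on two tasks, yet
# the system is certified "Emerging" because most task scores sit low.
scores = {"coding": 95, "math": 92, "planning": 30, "social": 20, "vision": 10}
print(certify_level(scores))  # -> Emerging
```

The example illustrates the uneven-performance point above: a couple of Expert-level scores do not lift the overall certification when the majority of tasks remain at the lowest level.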

Defining a Benchmark for AGI

The six levels of AGI defined in the paper give us a pathway for classifying the progress of AI systems. However, the paper does not specify a benchmark that should be used to test these systems and certify them as belonging to one of these levels.

The tasks in such a benchmark need to be diverse, challenging, and relevant to real-world use cases. Designing it is a serious undertaking that needs to incorporate multiple perspectives. The benchmark needs to measure both cognitive and metacognitive abilities, and include tasks from diverse areas such as mathematical and logical reasoning, linguistics, coding, spatial reasoning, social intelligence, the ability to learn new skills, and creativity.

Exhaustively enumerating tasks for such a benchmark is a monumental undertaking, and impossible to get right on the first attempt. Additionally, the benchmark needs to be a living, breathing piece of work that can be updated with new tasks as we learn more about AI systems and their capabilities. For these reasons, the authors leave the definition of a representative benchmark out of the paper, though they note that it is an important goal for the AI community to strive for.
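
Since the paper deliberately leaves the benchmark undefined, the following is only a speculative sketch of what a “living” benchmark could look like in code: a versioned registry of tasks tagged with the properties discussed above (domain, cognitive vs. metacognitive, real-world value). All class and field names here are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical shape for a "living" AGI benchmark: a task registry
# that grows as our understanding of AI capabilities improves.

@dataclass
class Task:
    name: str
    domain: str             # e.g. "logical reasoning", "coding", "creativity"
    kind: str               # "cognitive" or "metacognitive" (measures learning)
    real_world_value: bool  # ecological validity: is the task useful to society?

@dataclass
class Benchmark:
    version: int = 0
    tasks: list[Task] = field(default_factory=list)

    def add_task(self, task: Task) -> None:
        # Adding tasks bumps the version, so results are always
        # reported against a specific snapshot of the benchmark.
        self.tasks.append(task)
        self.version += 1

bench = Benchmark()
bench.add_task(Task("theorem proving", "logical reasoning", "cognitive", True))
bench.add_task(Task("learn a new game from its rules", "skill acquisition",
                    "metacognitive", True))
```

Versioning the registry matters: if the task set evolves, a certification only makes sense relative to the benchmark snapshot it was measured against.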

A Framework for Assessing Risks of AGI

Defining levels of AGI is only half the job; it also needs to be accompanied by a framework for assessing the risks associated with each of those levels. To define this framework, the authors introduce the concept of autonomy.

The autonomy of an AI system depends on its capabilities and on the environment in which it operates, where the environment means the interface that enables human-AI interaction. The authors introduce six levels of autonomy that are directly correlated with the levels of AGI: progressing through the levels of AGI unlocks higher levels of autonomy in the model. Because of this, the interface design of these systems is going to play a crucial role in the safe deployment of AGI in the real world.

The following table from the paper shows these six levels of autonomy and examples of some of the associated risks:


Table 2 from the paper showing autonomy levels of AI and associated risks

I would highlight a couple of points about this framework:

  • Each level of AGI opens up a new set of risks. However, it may also mitigate some of the risks from the previous levels. For instance, an “Expert” AGI might introduce risks of economic disruption and job replacement, while reducing risks associated with “Emerging” and “Competent” AGI, such as executing tasks incorrectly.
  • The paper lists the six levels of autonomy along with concrete examples of the associated risks. But these are just a few examples, not an exhaustive list. The interplay between human-AI interaction and the capabilities of the AI models will determine the exact set of risks we are dealing with (see the sketch below). Still, a framework like this makes the discussion on AGI more constructive, and it can help industry and governments design the course of policy around AI safety.
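
As a rough illustration, the sketch below pairs each autonomy level with its human-AI interaction paradigm and a single example risk. The paradigm names (“AI as a tool” through “AI as an agent”) follow the paper’s Table 2 as I understand it; the example risks and the capability-to-autonomy mapping are illustrative picks drawn from this article and the paper, not an authoritative reading.

```python
# Illustrative sketch of the autonomy framework, not an official mapping.
# Each entry shows one example risk; the paper's Table 2 lists several.

AUTONOMY_LEVELS = [
    # (level, interaction paradigm,       example risk introduced)
    (0, "No AI: human does everything",   "none (status quo)"),
    (1, "AI as a tool",                   "de-skilling of human operators"),
    (2, "AI as a consultant",             "over-trust in AI advice"),
    (3, "AI as a collaborator",           "anthropomorphization of the AI"),
    (4, "AI as an expert",                "economic disruption, job replacement"),
    (5, "AI as an agent",                 "misalignment, concentration of power"),
]

def allowed_autonomy(agi_level: int, interface_supports: int) -> int:
    """The capability level sets the ceiling on autonomy; the deployed
    interface decides how much of that ceiling is actually exposed."""
    return min(agi_level, interface_supports, 5)

# E.g., an Expert-level system (4) deployed behind a consultant-style
# interface (2) operates at autonomy level 2.
print(allowed_autonomy(4, 2))  # -> 2
```

The `allowed_autonomy` helper captures the article’s point that interface design gates risk: a highly capable model behind a constrained interface never exercises its full autonomy ceiling.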

Conclusion

The rate at which AI technologies are advancing demands a cohesive framework for recognizing AGI and assessing its potential impact. The DeepMind paper sets a foundation by charting a structured trajectory with clear AGI levels. What remains critical is the development of detailed benchmarks that will reliably measure AI capabilities across these levels. These benchmarks require input from a broad spectrum of fields to ensure they encompass the necessary complexities of real-world tasks.

Moreover, a nuanced risk assessment framework is essential for the development of advanced AI systems. Each incremental step of AI advancement brings with it distinctive challenges and risks that must be anticipated and managed. A well-constructed framework will dispel misconceptions and mitigate the rush to implement potentially stifling regulations that could hinder progress in AI research.

The AI community needs to extend the dialog started by the paper, filling in gaps with in-depth analyses and tools for testing AGI systems. Only through such concerted efforts will we be able to align the march toward AGI with prudent oversight and ethical considerations, ensuring that the transition to higher levels of intelligence is both beneficial and secure.

Many thanks to the authors of this paper from Google DeepMind: Meredith Ringel Morris, Jascha Sohl-Dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, and Shane Legg. The paper was published on November 4th, 2023.

About Me

Abhinav is a seasoned software engineer with over a decade of industry experience in roles ranging from DevOps and backend engineering to ML. He is an explorer who likes to break things open to understand how they work from the inside. This passion for learning has led him to write, sharing an insider’s perspective with his audience.

On his Substack, “Confessions of a Code Addict”, he writes about a myriad of topics, including AI, programming languages, compilers, and databases. His in-depth explorations and insights offer readers a unique understanding of these subjects from a practitioner's viewpoint, making complex concepts accessible and engaging.

This article is a guest post on A.I. Supremacy, my main newsletter related to A.I. and its impact.

