Towards General AI

Recent advances in Large Language Models (LLMs) have demonstrated an ability to reason. Is this the basis for General AI? Artificial General Intelligence (AGI) is also known as human-level AI or general intelligent action. AGI is able to learn tasks in one domain and apply them in another; in other words, AGI is AI able to perform tasks in much the same way a human would.

In 2020 we predicted that AGI would be reached by 2040; we have now moved that forecast forward to 2030.

Nvidia CEO Jensen Huang says AI will be ‘fairly competitive’ with humans in 5 years

The road is still foggy (and perhaps we are being optimistic), but some signs are showing us the way.

Human Brain vs AI Evolution - Simplified!


Human vs AI

Let's retrace the road. A human being is a collection of complex capabilities:

  • the ability to remember,
  • the ability to perceive the environment and interact with it,
  • the ability to learn,
  • the ability to generalize and to reason in abstract and symbolic ways,
  • self-awareness,
  • and, last but not least, the ability to judge one's own actions ethically.

To emulate these functions, several strategies have been developed: in the 1950s we convinced ourselves that we understood the mechanisms of vision, and we created the perceptron (1958) and, later, convolutional filters. In the 1980s we defined decision trees to learn from data. Only thanks to growing computational capacity, however, did we obtain the first convincing results (LeNet and CNNs) around 2010.
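To make that starting point concrete, here is a minimal sketch of Rosenblatt's perceptron: a single neuron that learns a linear decision boundary by nudging its weights whenever it misclassifies an example. The toy AND dataset and the hyperparameters are illustrative choices, not from the original work:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Perceptron learning rule: X is (n_samples, n_features), y in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Update the weights only when the current linear rule is wrong.
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Toy usage: the linearly separable AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))  # [-1. -1. -1.  1.]
```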

Deep Learning and the Transformer architecture (2017) led us to Large Language Models (LLMs, 2020): models with over 1 billion parameters.

Large Language Model

The ability to reason, considered an emergent ability, is predominantly seen in models with over 100 billion parameters. While opinions differ on whether LLMs truly possess reasoning abilities, it is becoming evident that LLMs can be equipped to deal with complex problems in a way similar to humans.

Indeed, through sophisticated prompting strategies (e.g. Chain-of-Thought prompting, Tree of Thoughts, ...) or symbolic modules, it is possible to accomplish complex tasks.
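As an illustration, here is a minimal sketch of zero-shot Chain-of-Thought prompting with the OpenAI Python client: appending "Let's think step by step" elicits intermediate reasoning before the final answer. The model name and the example question are assumptions; any capable chat model would do:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Zero-shot Chain-of-Thought: ask the model to reason step by step
# before committing to an answer.
response = client.chat.completions.create(
    model="gpt-4",  # illustrative choice
    messages=[{"role": "user", "content": question + "\nLet's think step by step."}],
)
print(response.choices[0].message.content)
```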

So, what is the next step to achieve the AGI?

Multimodality

The secret ingredient seems (obviously) to be multimodality: the ability to process text, image, audio and video content in an integrated manner. It is not just about adding new input types; each modality enriches the others and improves the model as a whole.

Gemini (Google), Ferret (Apple), GPT-4+ (OpenAI) and other open-source tools (e.g. Hugging Face, LangChain, ...) are going in this direction.
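On the open-source side, a minimal sketch with the Hugging Face transformers pipeline shows how little code a single modality hop (image to text) now takes; the checkpoint and the file name are illustrative:

```python
from transformers import pipeline

# Image-to-text pipeline: the model "looks" at an image and describes it.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

result = captioner("photo.jpg")  # hypothetical local file; a URL also works
print(result[0]["generated_text"])
```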

Moreover, cross-modal alignment is another interesting tool: Contrastive Language-Image Pretraining (CLIP) uses contrastive learning on image-caption pairs to align the image and text modalities; Contrastive Language-Audio Pretraining (CLAP) aligns audio and text in the same way.
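Here is a minimal sketch of that alignment in practice, using the public CLIP checkpoint on Hugging Face: the model embeds an image and several candidate captions into a shared space and scores how well each caption matches. The image path and captions are illustrative:

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # hypothetical local image
captions = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image holds the image-text similarity for each caption;
# softmax turns the scores into matching probabilities.
probs = outputs.logits_per_image.softmax(dim=1)
print(probs)
```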

One More Modality

There is, however, a fourth element to integrate: not only vision, text and audio, but also the ability to act on and interact with an environment.

Techniques such as reinforcement learning, IoT, robotics, or even simple APIs can give AI the ability to take this further step.
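As a minimal sketch of this perceive-act loop, here is the canonical interaction cycle from Gymnasium (the maintained successor of OpenAI Gym); the random policy is a placeholder where an RL agent or an LLM-driven planner would sit:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)

for _ in range(100):
    # Placeholder policy: sample a random action. A learned agent would
    # choose an action based on the current observation instead.
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```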


Super AI

Artificial Super Intelligence (ASI) is defined as a form of AI capable of surpassing human intelligence. ASI is expected by 2050, but first we should fix some issues such as ethics and self-awareness (and maybe leverage quantum computing).


Conclusion

So, will 2024 be the year we find the road to AGI?

Matteo Dariol

AI enthusiast, ML practitioner, IIoT and cloud architect, technology leader, intrapreneur | Shaping the 4th industrial revolution

1y

Very insightful. Looking at the roadmap above, would it be a problem if an AGI acquires self-awareness before acquiring a sense of ethics?
