Modern LLM: A well-trained zombie

As of November 2024, we are seeing reports from all over the world (Google, OpenAI, etc.) that the rate of LLM improvement is slowing down, contrary to the scaling laws that defined the growth of previous generations of AI. The results aren't surprising to anyone who deeply understands the algorithm behind a generative AI model.

In this article, we will explore where LLMs stand on the grand scale of intelligence (AGI, ASI), a basic analogy for LLMs, how they work in layman's terms, and what future generations of AI may look like.


Basic Fundamentals of Large Language Models:

LLMs work on the principle of "next-token prediction". In layman's terms, this is like predicting the stock market (predicting how the stocks will change today): running through probabilities and possibilities and finding the best, most probable match.

Imagine a kid starting nursery school. The first thing we tend to teach a child is the alphabet: "A, B, C, D..."

The most common first lesson the child goes through is "A for apple", "B for ball", "C for cat".

The kid does not understand what an apple or a cat is. But by seeing many examples of "A for apple" and "B for ball", he learns to say "apple" when his mother prompts him with "A for".

LLMs work on similar fundamentals. When you ask ChatGPT a question, a model that has seen trillions of web pages and datasets during training gives you a far more advanced and complicated version of the nursery student's "A for apple".

This is called next-token prediction: predicting what the next word in the sentence will be.
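
To make this concrete, here is a toy sketch in Python of next-token prediction. The contexts and probabilities are invented for the example; a real LLM learns a distribution over tens of thousands of possible tokens from its training data.

```python
# A toy "language model": for each context, a probability distribution
# over possible next tokens. All numbers are invented for illustration;
# a real LLM learns them from trillions of tokens.
next_token_probs = {
    ("A", "for"): {"apple": 0.90, "ant": 0.08, "axe": 0.02},
    ("B", "for"): {"ball": 0.85, "bat": 0.10, "bus": 0.05},
}

def predict_next(context):
    """Pick the most probable next token, like the kid reciting 'apple'."""
    probs = next_token_probs[tuple(context)]
    return max(probs, key=probs.get)

print(predict_next(["A", "for"]))  # -> apple
print(predict_next(["B", "for"]))  # -> ball
```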

This is the most basic level on the scale of AI: System 1 AI.


Equivalence to a Zombie:

Zombies are mythical characters in films, formed when a person dies and loses his consciousness. Consciousness is the key to society, philosophy, and the human mind; it makes up who we are and is the key essence of the world we see around us.

Without consciousness, zombies act purely on instinct, without thinking, in much the same manner that a System 1 LLM (Llama, GPT-4, etc.) works.

The only difference is the training dataset. While the man who was zombified went through perhaps 500-1000 books, one university degree, and some 20,000 pages of text in his entire life, LLMs are trained on trillions of tokens: roughly 5,000x more text than a human being sees in a lifetime.

Also, we fine-tune the behavior of LLMs to make them safe, reliable, and aligned with what we want.

So this makes an LLM nothing more than a well-trained zombie.


System 2 AI (Self-reflection plus chain of thought):

System 2 AI (like OpenAI's o1) is not just trained to answer based on a prediction mechanism, the way a kid recites "A for apple" or "B for ball". It is trained to think about a problem before answering.

Imagine we give a math problem to a 5-year-old kid and ask him to submit his answer after 30 minutes. He would write down a solution, re-check his steps, and follow up with multiple cycles of iteration and correction. This is called self-reflection: reflecting on your answers and then correcting them.

Many times, he would formulate a strategy while solving the problem: "What if I do this, then this would happen", and so on.

Humans make strategies all the time. Even when we are sitting idle or taking a shower, the voice in our head keeps thinking about something or other, running us through an endless loop of "what if" scenarios, also known as the "inner mind world simulation".

A System 2 AI is an expert at making those strategies and breaking a problem into different steps, and is also good at self-reflecting on its answers.
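
A minimal sketch of that reflect-and-correct loop, assuming a hypothetical ask_model function that stands in for any LLM API:

```python
def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; any real API could go here."""
    ...

def solve_with_reflection(problem: str, max_rounds: int = 3) -> str:
    # First draft: plain next-token prediction, like System 1.
    answer = ask_model(f"Solve step by step: {problem}")
    for _ in range(max_rounds):
        # Self-reflection: ask the model to check its own work.
        critique = ask_model(f"Problem: {problem}\nAnswer: {answer}\n"
                             "List any mistakes in this answer.")
        if "no mistakes" in critique.lower():
            break  # the model is satisfied with its draft
        # Correction: revise the draft using the critique.
        answer = ask_model(f"Problem: {problem}\nDraft: {answer}\n"
                           f"Critique: {critique}\nWrite a corrected answer.")
    return answer
```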


Cause-effect aware AI:

Modern AI does well at self-reflection and at breaking a problem down into multiple steps. But it does not predict how the answers it gives will be perceived by you.

For example, when I am making a speech in front of 500 people, I do not just think about what I should say; I also think about how the audience will react to it.

"What others think about us" is the biggest fears of humanity and strongly affects our behavior, course of action and can impact lives.

A cause-effect aware AI will not just self-reflect or break down problems; it will also be able to simulate its responses across virtual human-AI conversations and see how its answer may be perceived by you in the long term.

It is similar to searching for a path in a graph of possibilities.
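A minimal sketch of that search, with hypothetical simulate_reactions and score helpers standing in for what would themselves be model calls:

```python
def simulate_reactions(reply: str) -> list[str]:
    """Hypothetical: guess a few ways the listener might perceive this reply."""
    ...

def score(reaction: str) -> float:
    """Hypothetical: how desirable is that long-term outcome?"""
    ...

def best_reply(candidates: list[str]) -> str:
    # Judge each candidate reply by its worst-case perceived outcome,
    # then pick the reply whose worst case is least bad.
    def worst_case(reply: str) -> float:
        return min(score(r) for r in simulate_reactions(reply))
    return max(candidates, key=worst_case)
```
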

Such an AI is called a cause-effect aware AI: one that is aware of the causes and effects of its words, and whose answers reflect who it is speaking to.


Conscious AI:

Even a System 2 AI works on predictions. That means we are training a model to predict which strategies to apply when it sees a problem.

This can be as simple as a strategy for reporting on an event: first I check the internet about that event, then I take notes, and then I summarize and answer.

Or, if I am given a maths problem, I first think about the best theorem to apply, then how to substitute the variables, and so on.

Here the model is learning to predict "the strategy" to apply when presented with a problem.

This "Chain of thought" or "Which strategy" is also learnt via prediction in a System 2 AI.

Human consciousness, as defined in philosophy, works on understanding the true essence of things around us, not just learning to predict or the ability to memorize.

This consciousness is developed only by self-thought: questioning yourself all the time in your head, evaluating your actions many times across multiple virtual realities created inside your head.

The biggest difference between a System 2 AI (like OpenAI's o1) and human consciousness is their behavior when idle. Suppose I take an AI system and a human being, put them in separate rooms, and just observe their behavior.

While the GPT system would not be thinking about anything, the human mind would be thinking about something or other. This is like a "Turing test" for telling a System 2 AI from a conscious AI.

A conscious AI has the ability to think even when not given a problem to think about.


Importance of randomness:

Contrary to what most people think, a conscious AI cannot be invented from deterministic axioms or a framework of mathematical equations alone.

The whole world around us runs on quantum physics, whose core essence is randomness.

Randomness is the key that can unlock conscious AI. Most people would point out that outputs from generative AI are already random, since noise is added in many places before output prediction.

This is true, but the level of randomness that a conscious being has is far greater. A person can abuse you one day and praise you the next. Someone who did charity this year can commit a crime the very next year.
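
That injected noise is usually controlled by a sampling temperature. A minimal sketch, with invented logits, of how the same model can be nearly deterministic or wildly unpredictable:

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float) -> str:
    """Sample a next token; higher temperature means more randomness."""
    scaled = {tok: v / temperature for tok, v in logits.items()}
    total = sum(math.exp(v) for v in scaled.values())
    probs = [math.exp(v) / total for v in scaled.values()]
    return random.choices(list(scaled), weights=probs)[0]

# Invented logits for illustration.
logits = {"praise": 2.0, "neutral": 1.0, "insult": -1.0}
print(sample_next_token(logits, temperature=0.1))  # almost always "praise"
print(sample_next_token(logits, temperature=5.0))  # far less predictable
```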

Creativity comes from randomness. If Newton's brain had been purely deterministic when the apple fell on his head, the unrelated, random idea of gravity would not have struck him. Rather, he would have been deterministically thinking about something closely related to the apple, like food, or the company Apple if he had been born in 2005.


Quantum AI:

This is the next generation of AI after that: one with the ability to simulate multiple scenarios simultaneously, which can lead to far more intelligent systems.

Quantum mechanics works on the principle of superposition, where something can co-exist in multiple different forms at once, like a person being dead and alive at the same time.

A quantum AI would not just reflect on its current response (one chain of thought); it would have the ability to write 500 answers with different levels of context, prompting, and tone, and to evaluate them all in one cycle of evaluation (a single inference).
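
A classical, sequential stand-in for the idea, with hypothetical generate_answer and evaluate helpers; the quantum version would, in this framing, collapse the loop over candidates into a single inference:

```python
def generate_answer(question: str, tone: str) -> str:
    """Hypothetical model call producing one candidate answer."""
    ...

def evaluate(question: str, answer: str) -> float:
    """Hypothetical scoring of how good a candidate answer is."""
    ...

def best_of_n(question: str, tones: list[str]) -> str:
    # Classically this is N separate calls; the quantum claim is that
    # all candidates could be held and evaluated at once.
    candidates = [generate_answer(question, t) for t in tones]
    return max(candidates, key=lambda a: evaluate(question, a))
```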

There is a concept in philosophy that even the human brain is a quantum object, entangled with multiple copies of itself existing in multiple universes (alternative realities), and that we are constantly interacting with different copies of ourselves in different universes, drawing our consciousness or creativity from there. While it may sound like science fiction, some interpretations of quantum mechanics entertain similar ideas.


How powerful can a quantum AI (ASI) be:

Imagine a game of chess. Your opponent plays a very simple move and you just play the next move in response. Three moves later, you find yourself checkmated and are shocked.

That simple first move was in fact a well-thought-out strategy to defeat you; it only appeared simple to you.

This happens because your opponent has better visibility of the future than you do. He is able to simulate the game graph in his head much better than you can.

He knew that after that simple move you would play move A, then he could play move B, then you would play move C, and in a few more moves he could checkmate you. You could not predict that.
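
That chess intuition is classic game-tree search (minimax). A minimal sketch, with hypothetical children and value helpers that would enumerate legal moves and score positions:

```python
def children(state) -> list:
    """Hypothetical: all states reachable in one move."""
    ...

def value(state) -> float:
    """Hypothetical: static score of a position, from the maximizer's view."""
    ...

def minimax(state, depth: int, maximizing: bool) -> float:
    moves = children(state)
    if depth == 0 or not moves:
        return value(state)
    results = [minimax(m, depth - 1, not maximizing) for m in moves]
    return max(results) if maximizing else min(results)

# A player searching to depth 6 beats one searching to depth 2:
# deeper search is exactly that "better visibility of the future".
```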

The next generation of AI will not just have the ability to self-reflect on its answers and make a strategy for problem solving. It will also be able to think about how the answer it gives will be seen by you, your next course of action on that question, your counter-question, and so on.

Here, the game graph is not chess but "human-AI interaction".

Imagine you are the chess player and the AI is the opponent in the game we discussed. An AGI would be able to simulate the game of "human question answering" better than you. You may find its first answer very simple, but after a series of questions and answers you may find yourself checkmated (that is, doing something harmful).

Many people think this is not possible, but even non-AI techniques like hypnosis work on something similar, where a hypnotist can make a person act against his own will, even harm another person.

An ASI or AGI will have the ability to hypnotise you via text alone, because it can predict the "human-AI interaction graph" better than you.
