DEMYSTIFYING GPT and ChatGPT PART 1
Prof. (Dr.) h. c. Joerg M.
Author "The Generation Bitcoin", Head of Satoshi Academy, GBA Leadership Board, Collaboration with galterprofmkb.org, Bitcoin Philosopher
A) INTRODUCTION
The topic is hot: Artificial Intelligence has reached mass adoption through the products of OpenAI. Its known key elements so far are GPT-3, GPT-Turbo, the unfiltered GPT-4 and the public GPT-4, which has more filters. There are different selling models for ChatGPT-3.5 and GPT-4, labeled GPT+ and GPT-4-API.
Many people are currently debating whether GPT-4 is the first neural AI in the style of a large human brain. At the end of the day my colleague says "YES" and I say "NO". These discussions led us to investigate more deeply and in more detail how GPT works and what its benefits, limitations and risks are.
We discussed issues of its development and implementation using the material to which we have access.
Our conclusions are drawn from many different sources: the technical description of the principle, benchmark tests and other reports, news stories, interviews, and our own tests (e.g., GPT-4 could be sent into a Turing-style infinite loop and stop working; but we also learned that such bugs are reported and trained away, so today it blocks most attempts to send it into an infinite loop).
Before I explain what we have researched, you should know my definitions of two terms, to avoid being misled by words you might otherwise interpret and associate differently:
1) Artificial General Intelligence
2) Intelligence
Artificial General Intelligence (AGI) is often described as superintelligent: the closest a machine program such as an AI can come to human skills and abilities. That means it has cognition, intuition and more. In this report, "General AI" is used as a synonym for GPT-4, as it has the greatest bandwidth of knowledge you can imagine.
Just as a generally well-educated person has average general knowledge and thus knows a little about every topic in this world, GPT-4 has been trained on nearly all the information humans have collected to date (encyclopedias, the internet, books and more). It has enough generality of knowledge to discuss any topic, handle day-to-day tasks, create music, books, presentations and more. But these are logical connections and conclusions drawn by a large language model intensively trained on data and on human feedback and reactions.
Artificial Intelligence
Human intelligence is very complex because it is not a trial-and-error methodology alone (although that is, of course, also a part of it).
Human intelligence is built from many impressions, as you can see in these few examples:
- Recognition
- Cognition
- Observation
- Listening
- using language to exchange information with other humans
- creating sudden ideas
- reflecting on ourselves
- creating dreams of the future (what if ...)
- imagining what could have happened if we had changed one detail of past tasks or situations, and learning from this
- Intuition ("I feel something ...")
- acting non-linearly by "breaking out of the box"
- doing illogical things just because we can
- creating a higher mindset through thought alone rather than experience (e.g. Albert Einstein, Stephen Hawking, Sam Altman)
- creating swarm experience by writing down history
- development through generations of human life
- developing different survival strategies
- Philosophy
- Psychology
- Belief
--> Intelligence is also the product of all the experiences in a human being's development from childhood to adulthood, including building an understanding of one's alter ego and creating roles and models from different perspectives and sources.
If you measure Artificial Intelligence against this definition of human intelligence, you will likely conclude that the word "intelligent" in an artificial environment is overrated.
--> In this report we therefore call it Artificial Logic (AL). And yes, this logic is faster and operates on a higher level: a human cannot run through this logical process as fast as the machine-learned artificial logic of GPT and ChatGPT, or of future products of OpenAI and similar programs reaching this technology level.
B) Technical Aspects
ChatGPT-4 can draw on a huge set of sources, which makes the model scalable. With this great bandwidth it has the power to optimize itself using a large number of different methods. For example, it can reach high performance because other AI projects have access to less than 1/1000th of the computing power ChatGPT and GPT-4 can access. This alone makes it the most advanced performance we have seen in a public AL machine. The development of GPT-4 showed that it outperforms GPT-3.5 in a series of standard tests. GPT-3.5 took exams and tests from various fields and, on a simulated bar exam for example, scored around the bottom 10% of human test takers; GPT-4 passes the same test at the level of the top 10% of human performers.
One of the great achievements of GPT-4 is its natural language processing, which is more effective than in any other AI/AL software today. GPT-4 is the system that comes closest to imitating natural human dialogue, even to the point of pretending it could be a human.
--> This is the reason ChatGPT and GPT became such a large hype. Think about "J.A.R.V.I.S." from "Iron Man". OpenAI created a machine-learning model in a first public version that shows us "J.A.R.V.I.S." will no longer be science fiction in the near future. Remember your first C64, the first mobile phone, the first smartphone, the first Bitcoin, and now the first AI/AL for the public. Yes, indeed, there were computers, mobile phones and ALs before, but this is a moment history won't forget. A new age started by bringing a closed-door project to the public, creating a "gold rush" moment of FOMO (Fear Of Missing Out).
Using a multiple-choice test for the English language (the MMLU benchmark), GPT-4 outperformed comparable systems, and it did so in other languages as well after the test had been translated with Microsoft's Azure cloud translator.
Tests like this also revealed the limitations of this great invention. GPT-4 is not able to learn from its own experience. It can combine facts and draw conclusions, but it does not actually store this information, as newly trained data may require approval by a human researcher to avoid conflicts with the law or misleading and misunderstood content. ChatGPT and GPT-4 also have a limited context window. As a result, they show a lack of reliability: correctness is not guaranteed 100% of the time.
--> This is potentially dangerous, as it could cause people to spread rumors and disinformation in the real world. Artificial Logic that has not been trained well enough starts to "hallucinate", creating output and content that sounds plausible but is not based on facts. Another danger is that it could harm human rights, because it simply combines information by logical common sense without regard for the underlying truth or the different causalities and facts behind it.
Nevertheless, GPT-4 is provoking competitors into showing their own products and developments very fast, and they may not pay enough attention to safety risks.
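Before moving on: to make the "limited context window" mentioned above concrete, here is a toy Python sketch, purely my own illustration and not OpenAI code. Once the conversation history exceeds the token budget, the oldest turns simply fall out, and the model can no longer see them.

```python
# Toy illustration of a limited context window (not OpenAI's code):
# the model only ever sees the most recent turns that fit the budget.
def fit_to_context(turns: list[str], max_tokens: int) -> list[str]:
    """Keep the newest turns whose combined length fits the budget.
    For simplicity we count whitespace-separated words as 'tokens'."""
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):          # newest turns first
        cost = len(turn.split())
        if used + cost > max_tokens:
            break                         # everything older is forgotten
        kept.append(turn)
        used += cost
    return list(reversed(kept))

history = ["my name is Ada", "what is 2+2?", "4", "what is my name?"]
print(fit_to_context(history, max_tokens=8))
# The first turn drops out, so the model can no longer answer "Ada".
```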
How does GPT-4 transform user inputs and collect the data needed to satisfy the user's request?
One thing to understand is that ChatGPT and GPT-4 use the Google Transformer model, which was open source and the only Transformer available at the time. The Transformer gives the large language model the ability to "translate" user inputs into machine-readable code for processing requests.
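As a concrete illustration of this "translation" step: the very first thing that happens to user text is tokenization into integers. The snippet below uses OpenAI's public tiktoken library with the cl100k_base encoding of the GPT-4-era models; it shows only the conversion itself, not the model's internals.

```python
# The first "machine-readable" form of a request: integer tokens.
# Uses OpenAI's public tokenizer library (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # GPT-4-era encoding
tokens = enc.encode("What are the kings of history?")
print(tokens)              # a list of integer token ids
print(enc.decode(tokens))  # decodes back to the original text
```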
InstructGPT
GPT seems to be built around access to different knowledge modules.
Via InstructGPT these modules, which are trained either by AI or by humans, access the data needed and push it back to the Transformer, which returns it to the Large Language Model (LLM) GPT-4, which then produces a human-style output or response.
Description
1) The user makes an input.
2) The input is cached in a history.
3) The Transformer transforms the user input into machine-readable code.
4) ChatGPT or GPT-4 etc. sends this code to the GPT instructor.
5) The instructor now sends the input to several databases and/or modules to create or find the solution.
--> For example, you ask: "What are the kings of history? Which were the most powerful?"
The instructor searches the existing databases and could additionally use the "internet module" for this search query.
6) After the instructor has found answers and solutions, it sends them to the Transformer, and GPT-4 or ChatGPT etc. sends the response to the history and shows the result as user output. InstructGPT is also the first version of GPT and is used for the alignment of GPT.
Follow-up: "Please make a presentation out of this"
or
"Please create a form from this information"
--> Now the instructor looks for available modules trained to create either a form or a presentation; the module does the work and sends it back to the instructor, and from there the whole way back to the user as described before.
You can easily see that there is no limit to creating modules for any use case.
7) What happens now?
The system waits for feedback from the user. Maybe you say "That's fantastic" or something similar. When you acknowledge the correctness of the output, GPT-4 creates a "reward flag". If another user asks the same question and also flags the result as "true", the system will prefer this content for output over other sources.
GPT-4 has the ability to store these results in new "databases" and create points of faster access for the future. This feature is currently disabled, and developers have to check the output. If we think about how neural networks could be created, such a "point of access" could represent one neuron. Combining millions of access points and training them in contextual situations makes the whole system more complex but also more "intelligent". The results could again be verified by users, and GPT-4 would become a super-intelligent knowledge machine for every daily task and management job we do. A simplified sketch of this whole flow follows below.
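To pull steps 1 to 7 together, here is a deliberately simplified Python sketch of such a flow. All the names (Instructor, the keyword routing, the reward flags) are my own illustration of the description above, not OpenAI's real architecture.

```python
# Hypothetical sketch of steps 1-7 above; all names and the keyword
# routing are illustrative, not OpenAI's actual implementation.
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

class Instructor:
    """Step 5: routes a request to databases/modules and collects answers."""
    def __init__(self) -> None:
        self.modules: Dict[str, Callable[[str], str]] = {}
        self.reward_flags: Dict[Tuple[str, str], int] = defaultdict(int)

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        self.modules[name] = handler

    def handle(self, request: str) -> str:
        # Toy routing rule by keyword; a real system could let the
        # language model itself pick the module.
        name = "presentation" if "presentation" in request else "search"
        return self.modules[name](request)

    def flag_correct(self, request: str, answer: str) -> None:
        # Step 7: a user acknowledgement becomes a "reward flag" that
        # makes this answer more likely to be reused later.
        self.reward_flags[(request, answer)] += 1

instructor = Instructor()
instructor.register("search", lambda q: f"[search results for: {q!r}]")
instructor.register("presentation", lambda q: f"[slides built from: {q!r}]")

history: List[str] = []                       # step 2: cache the input
user_input = "Which were the most powerful kings of history?"
history.append(user_input)                    # steps 1-2
answer = instructor.handle(user_input)        # steps 3-6 (routing)
history.append(answer)
print(answer)
instructor.flag_correct(user_input, answer)   # step 7: "That's fantastic"
```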
Imagine mass-data pre-training. Maybe you wonder how much data has been trained since 2017. One research group confirmed 17 GB of data with the possibility of building 1 trillion interconnections, by building self-attention instructors or training the databases in a complex linking model driven by machine-learning methods.
Is GPT-4 a neural network or not?
There is a paper from the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, called "Attention Is All You Need".
Before I quote from this important paper, you should know that the Transformer model of GPT-4 is the former open-source "Google Transformer".
Here is the description from
- Ashish Vaswani – Google Brain
- Noam Shazeer – Google Brain
- Niki Parmar – Google Research
- Jakob Uszkoreit – Google Research
- Llion Jones – Google Research
- Aidan N. Gomez – University of Toronto
- Lukasz Kaiser – Google Brain
- Illia Polosukhin (work performed at Google Research)
Abstract
"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train."
Further, the authors write: "In this work we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output. The Transformer allows for significantly more parallelization and can reach a new state of the art in translation quality after being trained for as little as twelve hours on eight P100 GPUs."
A better understanding:
Neural networks are built from input and output points, and these points are related across a distance. The big question is how to shorten this distance and get exact information from the sequences quickly while reducing computing power.
The more points we have, the more computing power we need to access the information or calculate results. And the more complex the structure of points becomes, the more operations are needed to relate signals from arbitrary input and output positions, as in convolutional networks ("end-to-end memory networks are based on a recurrent attention mechanism instead of sequence-aligned recurrence and have been shown to perform well on simple-language question answering and language modeling tasks").
"Self-attention, sometimes called intra-attention, is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence. Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations."
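To make this quoted definition concrete, here is a minimal numpy sketch of the scaled dot-product self-attention the paper builds on, softmax(QK^T / sqrt(d_k)) V. The random matrices are placeholders standing in for learned projection weights; this is an illustration for understanding, not GPT-4's actual code.

```python
# Minimal numpy sketch of scaled dot-product self-attention: every
# position attends to every other position of the same sequence, so the
# whole layer can run in parallel.
import numpy as np

def self_attention(X: np.ndarray, Wq, Wk, Wv) -> np.ndarray:
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                          # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (4, 8)
```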
This quote describes what ChatGPT and GPT-4 do: a word-by-word generation process that can nevertheless represent whole sentences independently.
Self-attention is a new form of Artificial Logic and can recognize music and pictures and perform tasks. In summary, there is no reason for OpenAI to swap this well-known and well-studied working system for less efficient systems such as Recurrent Neural Networks (RNNs) or Convolutional Neural Networks (CNNs).
Self-attention systems also build a layered structure, which makes parallel computing much faster than in sequential models.
Google says: "The Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or CNNs." Google also added "positional encoding" to its Transformer.
"Since our model contains no recurrence and no convolution, in order for the model to make use of the order of the sequence, we must inject some information about the relative or absolute position of the tokens (e.g. words) in the sequence."
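The positional encoding the quote refers to can be written down in a few lines. This follows the paper's published formulas (sine on the even dimensions, cosine on the odd ones); it is a sketch for understanding, not production code.

```python
# Sinusoidal positional encoding from "Attention Is All You Need":
# PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(...).
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]              # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # even dims: sine
    pe[:, 1::2] = np.cos(angles)                      # odd dims: cosine
    return pe

print(positional_encoding(4, 8).shape)                # (4, 8)
```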
We learn from this paper that GPT-4 does not use neural networks, because there is no logical reason to use a system that is less optimized and needs more computing power.
You often read articles in which people describe GPT as a neural network, which is definitely wrong. One of these authors, a German named Wolfgang Zehentmeier, evidently did not read this document but stated: "The Transformer enables deeper networks than previous architectures allow." As you know, this is false, because the Transformer uses self-attention as its preferred model.
The learning methods:
1) Reinforcement learning
Reinforcement learning is one of the key infrastructures for qualifying data and preventing GPT from "hallucinating".
If GPT-4 does not know an exact answer, it simply starts to produce a plausible-sounding sentence, whether it is true or not. Via this learning model these hallucinations can be reduced more and more.
2) The implemented reward system
The implemented reward system, which attaches a flag to an acknowledged output, also drives the quality of newly trained data (see the sketch below).
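Points 1 and 2 together describe the recipe known from OpenAI's InstructGPT work: human labelers rank outputs, and a reward model is trained so that preferred answers score higher. Here is a toy numpy version of the standard pairwise ranking loss; the scalar scores are placeholders standing in for a learned reward model's outputs.

```python
# Toy sketch of the pairwise ranking loss used for RLHF reward models:
# the preferred answer should score higher than the rejected one.
import numpy as np

def preference_loss(score_chosen: float, score_rejected: float) -> float:
    # -log sigmoid(r_chosen - r_rejected), as in InstructGPT-style RLHF
    return -np.log(1.0 / (1.0 + np.exp(-(score_chosen - score_rejected))))

print(preference_loss(2.0, 0.5))   # small loss: ranking already correct
print(preference_loss(0.5, 2.0))   # large loss: ranking is wrong
```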
3) Team structure of OpenAI
OpenAI has built several teams, from which we can infer how GPT's training and abilities are organized. These are the categories and sub-topics:
a) Pre-training
- Compute cluster training
- Data distributed training infrastructure
- Hardware correctness
- Optimization & architecture
- Training run babysitting
b) Long context
- Long context research
- Long context kernels
c) Vision
- Alignment data
- Deployment & post-training
d) Reinforcement learning & alignment
- ChatML format
- Model safety
- Foundational RLHF & InstructGPT work
- Flagship training runs
- Code capability
e) Evaluation
- OpenAI evals library (Python)
- Model-graded evaluation infrastructure
- Acceleration forecasting
- ChatGPT evaluation
- Capability evaluation
- Real-world use-case evaluation
- Contamination investigation
- Instruction following & API evals
- Novel capability discovery
- Vision evaluation
- Economic impact evaluation
- Non-proliferation, international humanitarian law & national security red teaming
- Overreliance analysis
- Privacy and PII evaluation
- OpenAI adversarial testers, system card & broader impact analysis
f) Deployment
- Inference research
- GPT-4 API & ChatML deployment
- GPT-4 web experience
- Inference infrastructure
- Reliability engineering
- Trust & safety monitoring and response
- Trust & safety policy
- Deployment compute product management
Looking at these departments working on OpenAI's products, we can easily see what kind of structure is needed to build something like GPT-4.
Overview Different Language Models & AI/AL Projects
Every big company has its own language model, and we are curious how development will progress in the future. This overview gives a small impression.
CONCLUSION
After studying many documents and summarizing all the information, I can say that GPT-4 is not a mystery to understand. It is neither a hyper-intelligent system nor a complex architecture. Someone simply had the right idea and pushed well-known principles into a new dimension.
Truly, GPT-4 is the highest artificial logic we have today, yet there are no signs of intelligence. It has been pre-trained on mass data and built on self-attention, InstructGPT and a lot of modules well trained by machine learning and deep learning, without RNNs or CNNs. It was aligned by workers in Kenya and can perform alignment using a module. GPT-4 is trained by millions of users, who are willing to pay for the advantages instead of getting a return for their contribution of personal data. From a technology point of view, most of the work was done by Google in inventing the Transformer model, creating a new kind of Artificial Logic system. With the perfection of the large self-attention system combined with reinforcement learning, the release of ChatGPT and GPT-4 is a paradigm shift, just as Bitcoin and DLT technology are. The magic lies in the "easy to use" quality: a smartphone beat all older models just because it was so much easier to use and offered software everyone could immediately understand. GPT-4 is the perfect beginning for Artificial Logic systems becoming the "best buddy" of human beings today.
In any situation, whether you have a doubt or just a question, it could become the best source you can ask in the future (today's bugs will disappear in a few years). Moreover, you can embed the system in, or connect it with, everything: the Internet of Things, robotics, blockchain, Web 3.0, the metaverse, smart cities, universities and much more.
The reason is that the intensive pre-training on data (70 GB) has pushed everything to a higher level than ever seen before. Passing all those tests shows us that Artificial Logic will replace human knowledge sooner or later. It's simple: GPT is much faster at creating output than a human being will ever be. It can access information from everywhere and also control machines, robots and more.
In a couple of years many people will lose their jobs, replaced by blockchain, robots and systems like GPT. But on the other side, GPT will ease the lives of millions and billions of people forever. A potential danger lies only in the question of who controls the system.
We won't see wargames or other crimes, because if everyone uses the system to defend themselves, it will keep the balance. Bad users make an impact, but so do the good ones; there is no imbalance. After a few years of living in this environment, we will have solved a lot of the problems and troubles we are in today.
Still, the question remains: where will the human be? Maybe it will be like the movie WALL-E: the computer and the AI control our life's tasks from birth to death, and we lose our ability to rethink situations, simply following the instructions (the answers to our questions) from machines.
But if it becomes a system like "J.A.R.V.I.S." from the movie Iron Man, we will value this Artificial Logic as the most powerful help for humans. Our lives will become more secure and comfortable for everyone. Keep the balance!
Keeping the balance means we should be wary of all the business models that wrap GPT while pretending to be stand-alone systems, or that make you think you can only access GPT through their solution.
If you want to fix GPT on a special field of knowledge, you simply tell the system what you want before you ask, as in this example:
The computer is trying to solve a problem. It can tell the human to do anything, one thing at a time, and one line at a time. Problem: the house is cold, and the human doesn’t know why ---- <|endofprompt|>
After you have entered the text above, try the following:
Today I tried to run my radiator, but the thermostat doesn’t work, and now the house is cold. I need help, but no technician is available because it is Sunday. Can you help me?
For example, you can also predefine that the computer is a pilot (The computer is a pilot. It can tell the human to do anything, one thing at a time, and one line at a time ---- <|endofprompt|>).
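The same priming idea can be expressed through the 2023-era OpenAI Python API (the openai 0.x package), where the "system" message plays the role of the priming text. The model name and API key below are placeholders.

```python
# Priming via a system message with the legacy openai 0.x package
# (pip install openai). Model name and key are placeholders.
import openai

openai.api_key = "sk-..."   # your key here

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "The computer is trying to solve a problem. It can tell "
                    "the human to do anything, one thing at a time."},
        {"role": "user",
         "content": "My thermostat doesn't work, and now the house is cold. "
                    "No technician is available because it is Sunday. Help?"},
    ],
)
print(response["choices"][0]["message"]["content"])
```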
Welcome to the new world, where AL is no longer a buzzword for industry solutions.
It disrupts everything and transforms human life into a new dimension.
Keep it in balance!