GPT-3 is not That Smart. With a Reason

GPT-2 was a great success. OpenAI initially didn't want to publish the largest and mightiest version, the one with 1.5B parameters, claiming that they were afraid it would be misused for unethical purposes. Later, they stated that they hadn't found any evidence of such misuse.

All of this is legit, considering the volume of false "news" generated using it. And the truth is that it can be very successful at producing fake news and stories. I tried this myself, played with it for a while, and was, well, at least impressed.

Then GPT-3 came, with more than a hundred times more parameters. And how did people react? It appeared that people's expectations grew at least with O(N), where N is the number of parameters. Such a disappointment in people's ability to figure out what is going on inside that black box called GPT-3.

There is absolutely no doubt that GPT-3 is a remarkable technical, engineering, and scientific achievement. Still, there is one large BUT here: the technology that is used. No matter how significant the number of parameters is, the technology that drives those parameters is all that counts. And the truth is that this technology is nothing more than highly sophisticated statistics applied over the input sequence of words to produce the most probable next few (up to a thousand) words. The parameters for those statistics are obtained by exposing GPT-3, as a tabula rasa, to the texts its creators revealed to it during the training process.
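
To make this concrete, here is a minimal sketch of what that "sophisticated statistics" boils down to: an autoregressive loop that repeatedly asks the model for a probability distribution over the next token and samples from it. The `model` callable here is a hypothetical stand-in for the actual transformer, which is vastly more complex, but the loop is the essence of how text comes out of GPT-2/3.

```python
import numpy as np

def generate(model, prompt_ids, n_tokens, temperature=1.0):
    """Sketch of autoregressive text generation.

    `model(ids)` is assumed to return one score (logit) per word in the
    vocabulary for the next position; real GPTs differ in many details,
    but the overall loop is the same.
    """
    ids = list(prompt_ids)
    for _ in range(n_tokens):
        logits = np.asarray(model(ids), dtype=float)
        probs = np.exp((logits - logits.max()) / temperature)
        probs /= probs.sum()                             # softmax: scores -> probabilities
        next_id = np.random.choice(len(probs), p=probs)  # draw the "most probable-ish" next word
        ids.append(next_id)                              # feed it back in and continue
    return ids
```

There is no understanding anywhere in that loop: just a probability table consulted over and over.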

Therefore, ladies and gentlemen, the truth is that GPT-3 doesn't understand what it is talking about. Again, it is all about producing the most probable continuation of the given sequence of words. And the important thing here is that it takes words simply as numbers and generates the following numbers in return. Simply put: there is not a single clue of understanding in it. It's all statistics applied over a massive amount of data, very quickly, and that's it.
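
And "words as numbers" is meant literally. Here is a toy illustration with a made-up whole-word vocabulary (real GPTs use byte-pair-encoded subwords, but the principle is the same): the model only ever sees and emits the integer IDs.

```python
# Toy vocabulary: to the model, words are nothing but integer indices.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
id_to_word = {i: w for w, i in vocab.items()}

sentence = "the cat sat on the mat"
ids = [vocab[w] for w in sentence.split()]
print(ids)                                    # [0, 1, 2, 3, 0, 4] -- what the model sees
print(" ".join(id_to_word[i] for i in ids))   # words again -- only for us humans
```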

Now, one more point to be noted: this massive amount of data is probably the most extensive text corpus ever collected by human beings.

I've been playing with GPT-2 for a while. From time to time, it speaks (ok, generates) Java and JavaScript code, standard web server logs, and so on. Even pornography from time to time. Even base64-encoded texts. That means that the trainers exposed it to that kind of data; it couldn't generate such stuff without learning it from somewhere. As for GPT-3, its text corpus is probably larger by orders of magnitude. But, again, the technology inside that black box remains the same: pure statistics.

Consequently, the texts generated by both GPT-2 and GPT-3 are simply mirrors of the texts the trainers exposed them to during the training process. This fact has one remarkable consequence: both GPT-2 and GPT-3 mirror the digitized content found here and there, all around the web. That means all the human biases, misconceptions, misunderstandings, misbeliefs, and so on are simply injected into the neurons of the models. Now, think about the volume of accurate, scientifically proven and validated content on the Internet versus all of the so-called "I know the truth" content. Consider this for a moment, and you'll have a profound insight into what is inside GPT-2.

And it is even worse for GPT-3. GPT-2 is already overpopulated with irrelevant text data, remember the Java code and server logs? GPT-3 is much more overpopulated with irrelevant texts, simply because the more you want to put in, the less ability you have to control what you are putting in. In other words: take literally everything you can find, put it into the number-crunching machine, and train the model.

Whatever nonsense you get from GPT-3 is there simply because there is much more nonsense out there than scientifically meaningful text. Hey, the number of scientists and professionals is a couple of orders of magnitude lower than the number of YouTube watchers who got their "relevancy" by reading posts on Facebook and Twitter. Ok, there are scientists on Facebook and Twitter, I follow several of them myself, but the volume of the available text is what counts here.

Several decades ago, humans believed that widespread ignorance was caused by quality information being hard to reach. These days, we know that it wasn't. And GPT-3 simply reflects that. It is nothing more than a statistically averaged human being. It's us, and we have to cope with that.

To be honest, I couldn't believe it when I read how much "scientific" research has been conducted to prove that GPT-3 "doesn't reason". Such a perfect example of wasting precious Ph.D. hours. They could just learn how GPT-[23] works, and they would quickly figure out what they can expect from it.

All of this is well elaborated, in a politically correct voice, in the following Yannic Kilcher video:


Hey, there is still a chance to find some quality out there in the wild :-)

Let's cheer for that.
