Scaling laws meet Hunger for clickbait

On Thursday, November 9th, The Information - a rising star of the new wave of paid-subscription newsletters - published an exclusive piece with the super-provocative title "OpenAI Shifts Strategy as Rate of 'GPT' AI Improvements Slows".

This started a great week for every AI-hater in the universe, disturbing our brave new AI world more than the recent election of Elon Musk.

As ChatGPT would've said, let's delve into the reasons why this is nonsense, and why it was published and is still being propagated - for example, in a Reuters article published on November 15th.

Understanding the scaling law

A "scaling law" is not an actual law of physics or math or biology, it is just an observation akin to "Moore's law" //Moore’s law explains the rate of improvement of chip technology//. The scaling law is a simple one: LLMs will improve if fed with more data and more computing power ("compute"): when we use X amount of data and Y amount of compute we will get GPT-3, and when we use 10*X data and 10*Y of compute we get GPT-4 that is much better.

This law has held for many years, and the companies that build models have a strong conviction in it. Investors share this conviction and pour hundreds of millions of dollars into those companies.

Obviously, when "The Information" reported that "some researchers at OpenAI believe Orion isn't reliably better than its predecessor in handling certain tasks. Orion performs better at language tasks but may not outperform previous models at tasks such as coding, according to an OpenAI employee," it sounded like a bomb.

The possibility of a wall that stops the scaling law has been discussed for as long as the scaling law has existed. Last year Dario Amodei, the CEO of Anthropic, suggested there was a 10% chance that AI systems could stagnate due to insufficient data. Notably, he no longer thinks so, as was clear in his recent interview with Lex Fridman.

Academic researchers are trying to build a mathematical model of the "lack of data" barrier. This June, a group of researchers from several universities and a research institute called Epoch.ai published a paper titled "Will we run out of data? Limits of LLM scaling based on human-generated data".
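The core of such a model is a crossing point: training-data demand grows geometrically while the stock of human-generated text is roughly fixed. A toy version of that calculation is sketched below; all three numbers are hypothetical placeholders, not the paper's actual estimates.

```python
# A toy projection of the "lack of data" barrier. The numbers here are
# hypothetical placeholders, NOT estimates from the Epoch.ai paper:
# a fixed stock of usable human text versus geometrically growing demand.

STOCK_TOKENS = 3e14        # hypothetical stock of usable text, in tokens
demand_tokens = 2e13       # hypothetical training-data demand in year 0
GROWTH_PER_YEAR = 2.0      # hypothetical: demand doubles every year

year = 0
while demand_tokens < STOCK_TOKENS:
    demand_tokens *= GROWTH_PER_YEAR
    year += 1
print(f"With these toy numbers, demand overtakes the stock in ~{year} years.")
```

Whatever the exact inputs, the structure of the argument is the same: a geometric curve always crosses a fixed ceiling eventually, and the debate is only about when.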

The clickbait and the reality

Every article saying that "a scaling law has hit a wall" speaks in three voices:

1. The voice of the journalist, who says things like "the scaling law is not working" or "AI companies are facing troubles";

2. The voice of an anonymous "AI researcher at a leading firm" saying that some experimental models are not improving as much as they should;

3. Comments from well-known experts who say that there are various ways to increase the quality of the models.

I think there are two reasons for this "scandal of the century".

The first is simple and material: young journalists fight for the traffic their texts can bring to their publication, so they are directly motivated to pick the most radical possible interpretation of reality.

The second is an unwanted outcome of OpenAI's own communication. The company knows that its reasoning model (o1-preview) is, for the time being, the only one of its kind, and it decided it was important to highlight that the new generation of reasoning models can be trained on existing datasets.

The Reuters article quotes OpenAI researcher Noam Brown: "It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer."

There are two more revealing expert perspectives. Sonya Huang, a partner at Sequoia Capital, points to a shift that will "move from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference."

Jensen Huang, co-founder and CEO of Nvidia, adds: "We've now discovered a second scaling law, and this is the scaling law at a time of inference..."
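The "second scaling law" both quotes point at, spending more compute at inference time rather than at training time, is easy to illustrate with the simplest version of the idea: sample the same model several times and keep its most frequent answer (self-consistency, or majority voting). The sketch below assumes a hypothetical `generate` callable standing in for any LLM API call; it is not OpenAI's actual method.

```python
# A minimal sketch of inference-time scaling via majority voting.
# `generate` is a hypothetical stand-in for any LLM API call.
import random
from collections import Counter
from typing import Callable

def majority_vote(generate: Callable[[str], str], prompt: str, n: int) -> str:
    """Sample the model n times and return its most frequent answer.
    Larger n means more inference-time compute and, typically, a more
    reliable answer: quality scales with inference, not model size."""
    answers = [generate(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# A hypothetical noisy model: correct only 60% of the time.
def noisy_model(prompt: str) -> str:
    return "42" if random.random() < 0.6 else str(random.randint(0, 9))

random.seed(0)
print(majority_vote(noisy_model, "What is 6 x 7?", n=1))   # often wrong
print(majority_vote(noisy_model, "What is 6 x 7?", n=25))  # almost always "42"
```

Twenty-five samples cost twenty-five times the inference compute, yet can rival a far larger model's single answer, which is precisely the trade Noam Brown's poker example describes.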

Claude’s opinion

The reality of AI scaling is more nuanced than dramatic headlines suggest. While traditional scaling approaches may face new challenges, the field is actively evolving beyond simple parameter counting. The emergence of new scaling laws around inference and the shift toward optimizing existing models suggest not a plateau, but a transformation in how we approach AI advancement. Rather than witnessing the end of scaling laws, we're seeing their evolution - from brute force expansion to sophisticated optimization and novel architectural approaches.

The media's rush to declare the end of scaling progress reveals more about contemporary tech journalism than about the actual state of AI development. As we've seen repeatedly in tech history, apparent plateaus often precede breakthrough innovations - they're pauses for reflection and refinement rather than permanent barriers.
