Scaling laws meet Hunger for clickbait

On Thursday, November 9th, The Information - a rising star of the new wave of paid-subscription newsletters - published an exclusive piece with the super-provocative title "OpenAI Shifts Strategy as Rate of 'GPT' AI Improvements Slows".

This started a great week for every AI-hater in the universe, disturbing our brave new AI world more than the recent election of Elon Musk.

As ChatGPT would've said, let's delve into the reasons why this is nonsense, and why it was published and is still being propagated - for example, in a Reuters article published on November 15th.

Understanding the scaling law

A "scaling law" is not an actual law of physics or math or biology, it is just an observation akin to "Moore's law" //Moore’s law explains the rate of improvement of chip technology//. The scaling law is a simple one: LLMs will improve if fed with more data and more computing power ("compute"): when we use X amount of data and Y amount of compute we will get GPT-3, and when we use 10*X data and 10*Y of compute we get GPT-4 that is much better.

This law has held for many years, and the companies that build models have a strong conviction in it. Investors share this conviction and pour hundreds of millions of dollars into those companies.

Obviously, when "The Information" reported that "some researchers at OpenAI believe Orion isn't reliably better than its predecessor in handling certain tasks. Orion performs better at language tasks but may not outperform previous models at tasks such as coding, according to an OpenAI employee," it sounded like a bomb.

The possibility of a wall that stops the scaling law has been discussed for as long as the scaling law has existed. Last year Dario Amodei, the CEO of Anthropic, suggested there was a 10% chance that AI systems could stagnate due to insufficient data. Notably, he no longer thinks so, as was clear in his recent interview with Lex Fridman.

Academic researchers are trying to build a mathematical model of the "lack of data" barrier. This June, a group of researchers from several universities and a research institute called Epoch.ai published a paper titled "Will we run out of data? Limits of LLM scaling based on human-generated data".
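The core of such a model is a crossing point: training-data demand grows geometrically while the stock of human-generated text is roughly fixed. A toy version of that calculation is sketched below; all three numbers are hypothetical placeholders, not the paper's actual estimates.

```python
# A toy projection of the "lack of data" barrier. The numbers here are
# hypothetical placeholders, NOT estimates from the Epoch.ai paper:
# a fixed stock of usable human text versus geometrically growing demand.

STOCK_TOKENS = 3e14        # hypothetical stock of usable text, in tokens
demand_tokens = 2e13       # hypothetical training-data demand in year 0
GROWTH_PER_YEAR = 2.0      # hypothetical: demand doubles every year

year = 0
while demand_tokens < STOCK_TOKENS:
    demand_tokens *= GROWTH_PER_YEAR
    year += 1
print(f"With these toy numbers, demand overtakes the stock in ~{year} years.")
```

Whatever the exact inputs, the structure of the argument is the same: a geometric curve always crosses a fixed ceiling eventually, and the debate is only about when.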

The clickbait and the reality

Every article saying that "a scaling law has hit a wall" speaks in three voices:

1. The voice of the journalist, who says things like "the scaling law is not working" or "AI companies are facing troubles";

2. The voice of an anonymous "AI researcher at a leading firm" saying that some experimental models are not improving as much as they should;

3. Comments from well-known experts who say that there are various ways to increase the quality of the models.

I think there are two reasons for this "scandal of the century".

The first is simple and material: young journalists fight for the traffic their texts can bring to their publication, so they are directly motivated to pick the most radical possible interpretation of reality.

The second is an unwanted outcome of OpenAI's own communication. The company knows that its reasoning model (o1-preview) is, for the time being, the only one of its kind, and it decided it was important to highlight that the new generation of reasoning models can be trained on existing datasets.

The Reuters article quotes OpenAI researcher Noam Brown: "It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer."

There are two more revealing expert perspectives. Sonya Huang, a partner at Sequoia Capital, points to a shift that will "move from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference."

Jensen Huang, co-founder and CEO of Nvidia, adds: "We've now discovered a second scaling law, and this is the scaling law at a time of inference..."
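The "second scaling law" both quotes point at, spending more compute at inference time rather than at training time, is easy to illustrate with the simplest version of the idea: sample the same model several times and keep its most frequent answer (self-consistency, or majority voting). The sketch below assumes a hypothetical `generate` callable standing in for any LLM API call; it is not OpenAI's actual method.

```python
# A minimal sketch of inference-time scaling via majority voting.
# `generate` is a hypothetical stand-in for any LLM API call.
import random
from collections import Counter
from typing import Callable

def majority_vote(generate: Callable[[str], str], prompt: str, n: int) -> str:
    """Sample the model n times and return its most frequent answer.
    Larger n means more inference-time compute and, typically, a more
    reliable answer: quality scales with inference, not model size."""
    answers = [generate(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# A hypothetical noisy model: correct only 60% of the time.
def noisy_model(prompt: str) -> str:
    return "42" if random.random() < 0.6 else str(random.randint(0, 9))

random.seed(0)
print(majority_vote(noisy_model, "What is 6 x 7?", n=1))   # often wrong
print(majority_vote(noisy_model, "What is 6 x 7?", n=25))  # almost always "42"
```

Twenty-five samples cost twenty-five times the inference compute, yet can rival a far larger model's single answer, which is precisely the trade Noam Brown's poker example describes.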

Claude’s opinion

The reality of AI scaling is more nuanced than dramatic headlines suggest. While traditional scaling approaches may face new challenges, the field is actively evolving beyond simple parameter counting. The emergence of new scaling laws around inference and the shift toward optimizing existing models suggest not a plateau, but a transformation in how we approach AI advancement. Rather than witnessing the end of scaling laws, we're seeing their evolution - from brute force expansion to sophisticated optimization and novel architectural approaches.

The media's rush to declare the end of scaling progress reveals more about contemporary tech journalism than about the actual state of AI development. As we've seen repeatedly in tech history, apparent plateaus often precede breakthrough innovations - they're pauses for reflection and refinement rather than permanent barriers.
