They (LLMs) are all the same - myth or fact?
Credits to Maxime Labonne for this image - https://x.com/maximelabonne/status/1816416043511808259


This is the third article in the AI series, and it addresses something that keeps coming up in discussions.

FACT: The performance of all major LLMs is currently converging. In fact, the image above shows that the gap between closed- and open-source foundation models is also shrinking.

MYTH: With every new model release, we see the gap between the performance of the models decrease; hence foundation models are going to be commoditized.

The myth stems from the following assumptions:

  1. We are running out of data to train on, hence no one has an edge
  2. Data is the only blocker, and we are at the end of what can be squeezed out of the Transformer model

The reality is far from this. To understand why, let's look under the hood at the different forces at work. Three voices that have alluded to them are:

  1. Leopold Aschenbrenner (former OpenAI engineer), in his manifesto. The full 165 pages here.
  2. Eric Schmidt (former Google CEO) - his recent talk at Stanford (they took down the video, but parts of it are available again)
  3. Ilya Sutskever's view on the Transformer model

Here's the summary:

  1. We are NOT running out of data. See below
  2. The true blockers are: Capital & Power

Data isn't a blocker

  • Re-training on the same data works wonders for improving model performance (Leopold)
  • Training on artificial, structured data works pretty well, i.e., when annotated data is fed to models instead of raw data from the internet, model performance is better and a lot less data is required
  • The capital required for new models is growing exponentially. $1 billion and $100 billion models are being seriously considered; they are no longer thought experiments.
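To make the "exponential capital" point concrete, here is a back-of-envelope sketch. The starting cost and the 10x growth factor per model generation are illustrative assumptions, not reported figures:

```python
# Back-of-envelope: what roughly 10x cost growth per model generation looks like.
# The starting cost (~$100M) and the 10x factor are assumptions for illustration.
def projected_costs(start_cost_usd: float, growth_factor: float, generations: int) -> list[float]:
    """Return the projected training cost for each successive model generation."""
    return [start_cost_usd * growth_factor**i for i in range(generations)]

costs = projected_costs(start_cost_usd=100e6, growth_factor=10, generations=4)
for gen, cost in enumerate(costs, start=1):
    print(f"Generation {gen}: ${cost / 1e9:.1f}B")
```

Under these assumptions, four generations take you from $0.1B to $100B - which is why the $1 billion and $100 billion figures are no longer thought experiments.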

Why Power & Capital

  • The reasons above require an enormous amount of power: think about training GPT-4 10-100 times on slight variations of reasoned data
  • Those NVIDIA clusters are hungry babies! :)
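A quick sketch of why those clusters are "hungry babies". The cluster power draw, run length, and repeat count below are all illustrative assumptions, not measured figures:

```python
# Back-of-envelope: energy consumed by repeated large training runs.
# All three constants are assumptions chosen for illustration.
CLUSTER_POWER_MW = 25   # assumed sustained draw of a large GPU cluster
RUN_DAYS = 90           # assumed length of one training run
REPEATS = 100           # "train GPT-4 100 times on variations of the data"

hours = RUN_DAYS * 24
energy_per_run_gwh = CLUSTER_POWER_MW * hours / 1000  # MW * h = MWh; /1000 -> GWh
total_gwh = energy_per_run_gwh * REPEATS

print(f"One run: {energy_per_run_gwh:.0f} GWh; {REPEATS} runs: {total_gwh:.0f} GWh")
```

Even with these modest assumptions, 100 runs lands in the thousands of gigawatt-hours - the scale where power contracts, not GPUs, become the constraint.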

When you put all of the above into perspective, it becomes clear that this story hasn't fully played out yet.

And as we get closer to 2029, expect:

  • Divergence in model performance
  • Capital dictating consolidation among companies
  • The war for AI supremacy moving from companies to nation states (currently being played out in the shadows)


#AI #AItrends #AI-for-CIOs

Here are the links to the previous articles in this series:

2nd Article - Fourth Revolution

1st Article - AI : Where are the use cases

Nikhil Kodilkar

Director Strategic Group | Lead by example

6 months

Something to confirm this hypothesis: Microsoft recently signed one of the biggest power deals ever - ~$800 million/yr for 20 years, i.e. ~$16 BILLION, for one nuclear reactor.

Tausif Sheikh

Software Engineer and Chat Bot Developer - Full Stack (Node JS, React, React Native, AI, ML)

6 months

Very informative. Thank you, Nikhil.
