Everything you need to know about DeepSeek, the US-China AI race, and the golden age of AI, in less than 60 seconds.
If you follow AI news, you surely read about DeepSeek this weekend: a Chinese startup that released two very impressive models (v3, their GPT-4 equivalent, and r1, their o1 equivalent). Here's the kicker: they made them fully open source and (apparently) trained them at a tenth of the cost of comparable LLMs from the likes of OpenAI, Meta, and Google.
This has MAJOR (MAJOR) implications for the AI world and is probably the biggest AI moment since the launch of ChatGPT in 2022.
Here’s what you need to know:
I won’t go into the conspiracy theories; you can find plenty of those online. The one thing that does strike me as a big coincidence is that these models were released during President Trump’s first week in office, and the release has been interpreted as China’s gift to the world.
What is that gift to the world?
Let’s break it down further.
1. It’s possible to train world-class models for under $100M (and A LOT of data)
Up until this release, the world was following OpenAI’s lead. OpenAI had a simple message: “all you need is compute”, i.e., more machines (GPUs) to train AI models. According to OpenAI’s theory, the more GPUs, the better the model.
Thus, the assumption went, to build AGI (or ASI, the holy grail of AI systems) you need massive, ever-expanding levels of compute. That’s why Sam Altman and President Trump announced Project Stargate last week (a $500bn investment in GPU clusters). With DeepSeek reportedly costing a little over $50M to train, that thesis is now heavily in question (which is why Nvidia’s stock is dropping today).
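For a sense of scale, here is a back-of-envelope calculation. The GPU-hour figure is the one DeepSeek’s v3 technical report quotes for its final pretraining run, and the $2/GPU-hour rental price is the paper’s own assumption; the all-in project cost (research, earlier runs, salaries) would land higher, which is how you get to estimates like the ~$50M above.

```python
# Back-of-envelope training cost: GPU-hours x hourly rental price.
# Both figures are from DeepSeek's v3 technical report (final run only).
gpu_hours = 2_788_000        # reported H800 GPU-hours for the final pretraining run
usd_per_gpu_hour = 2.00      # the paper's assumed rental price, not a market quote

final_run_cost = gpu_hours * usd_per_gpu_hour
print(f"Final-run compute: ${final_run_cost / 1e6:.1f}M")        # ~$5.6M

stargate_budget = 500e9      # Project Stargate's announced $500bn
print(f"Stargate is ~{stargate_budget / final_run_cost:,.0f}x that figure")
```

Even if the true all-in number is ten times the paper’s figure, it remains a rounding error next to half a trillion dollars, and that gap is exactly what the market reacted to.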
And if such large clusters are perhaps not required, then the energy to feed them might not be required either (so, less environmental impact).
Word of caution: access to data could still be a bottleneck. If compute was king, then data is queen. The king is dead, long live the queen!
1.5 From the age of training to the age of inference

I read through DeepSeek’s papers and a bunch of posts on this so you don’t have to (skip this section if you’re not interested in technical details). How they managed to pull this off is, in a nutshell, a combination of resource constraints and cleverness: a Mixture-of-Experts architecture that activates only a small fraction of the model’s parameters for each token, Multi-head Latent Attention to cut memory use at inference time, FP8 mixed-precision training, and, for r1, reinforcement learning that teaches the model to reason step by step.
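For the curious, here is a toy illustration (my own sketch, not DeepSeek’s code) of the Mixture-of-Experts idea: a small router picks the top-k experts for each token, so most of the network’s weights sit idle on any given forward pass; that is where much of the compute saving comes from.

```python
import torch
import torch.nn as nn

# Toy Mixture-of-Experts layer: only top_k of n_experts run per token.
# Illustration only -- real MoE layers add load balancing, capacity limits, etc.
class ToyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, dim)
        scores = self.router(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)        # mixing weights for chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):              # run only the selected experts
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

x = torch.randn(4, 64)
print(ToyMoE()(x).shape)  # torch.Size([4, 64])
```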
If the training race is over because of this (mainly, that now “anyone” can train a world-class LLM), then we’re fully in the age of inference, i.e., actually using these models to solve problems they’ve never encountered and building new, useful products.
(BTW - if you're enjoying this read, would you help me share it?)
2. Making these models open source
The massive implications here are:
a) Now “anyone” (with a certain budget) can train a world-class model by replicating DeepSeek’s paper, GPU constraints notwithstanding.
b) There might not be that much money in the hyperscaler business of building these foundational models (why would you pay for an expensive closed-source model if the open-source one is equally good?).
c) The true moat and differentiation is in the application layer.
d) And now, finally, companies building AI applications have a realistic shot at software-like gross margins (high 70s to 80%).
e) For companies like Desteia, hosting models on our own infrastructure removes concerns about how data is used, shared, and accessed (a minimal sketch of what that looks like follows this list).
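To make point (e) concrete, here is a minimal sketch of self-hosted inference with open weights using Hugging Face’s transformers library. The model id points at one of DeepSeek’s published r1 distills (an assumption on my part; substitute any open checkpoint you can download), and the point is that prompts and outputs never leave your own hardware.

```python
# Minimal self-hosted inference sketch (assumes: pip install torch transformers accelerate).
# The model id below is an assumption -- swap in whatever open checkpoint you actually use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "In two sentences: why do open weights matter for data privacy?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Everything above runs locally; no prompt or completion is sent to a third-party API.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```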
This is the moment all of us building in this space have been waiting for.
3. Opening the AI race for (almost) any country and company
A week ago, the entry ticket to the AI race was $500bn (Project Stargate). That’s a price tag very few countries could aspire to; in fact, just 30 countries in the world have a GDP above $500bn. It seemed, then, that the most important technology of our generation could only be in the hands of a literal handful of companies and countries. Perhaps not anymore. What DeepSeek proved is that resource constraints and human ingenuity sometimes go further than deep pockets.
Necessity and constraints are the mother of innovation (and we as entrepreneurs need to constantly remember this). Thus, every country in the world can now participate if it dedicates its brightest minds to the task. (Don’t get me wrong: the US still has the upper hand thanks to its research institutions and compute power. It’s just that the compute moat has shrunk, apparently.)
Mexico in particular (which, despite being a key US ally, also faces GPU purchasing restrictions) can, with reasonable policies, aspire to become a relevant player in this new AI race. What policies could get it there?
Perhaps now, truly, the golden age of AI has begun: one that is, excitingly, fully open source and available for anyone to participate in.
If you liked this post, please share it and comment. I would love to spark the conversation here!