A Catch Up on DeepSeek and ChatGPT
This is a forwarded message I received, and it was a brilliant summary of what's going on:
The New AI Revolution
Many of you may have heard of something called DeepSeek-R1. If you haven't, it is an AI model that rivals the likes of ChatGPT - but from a Chinese company.
I have heard experts talk about its performance. They are absolutely stunned. The "R1" implies that it is a "reasoning" model. So it explains its reasoning to you.
Experts are blown away by its reasoning. They say the model reasons much as a human would, and that it is much better than the industry-leading reasoning model, OpenAI's o1.
There are many other things that make this product even more impressive.
One, the Biden administration restricted the sale of Nvidia's most advanced GPUs (Graphics Processing Units) to China some time back in order to slow China's rise in AI. So DeepSeek-R1 was built on older-generation Nvidia GPUs. And, despite this, DeepSeek-R1 is said to be better than OpenAI's o1, which runs on the latest GPUs.
Two, DeepSeek-R1 was built at a reported cost of about $6 million. In comparison, the big tech majors have cumulatively budgeted $250 billion this year for AI infrastructure.
Three, and probably most significant, DeepSeek is giving away its technology for free. Its code is open source. Even OpenAI can use it. OpenAI's own models are closed source.
The third aspect is the biggest shock to the US tech industry, because if someone can build something so powerful so cheaply and afford to give it away for free, then who will buy the US companies' overpriced products?
But there is an alternative vision: the idea that people are inherently creative and that, left to themselves, they will do creative things for the sheer joy of creation.
Think of the first person who invented the wheel. He wasn't working on a government research contract to investigate more efficient means of transportation.
Think of the cavemen who started painting the bison hunt in the caves of Altamira in Spain or Bhimbetka in Madhya Pradesh in India. These people didn't do these things for a commission. They were inspired to draw what they had experienced and hand down their stories.
As DeepSeek founder Liang Wenfeng says,
"True innovation is driven not only by commercial incentives but also by curiosity and the desire to create."
He also says,
"For technologists, being followed is an achievement. Open sourcing is more of a cultural act than a commercial one. Giving is a form of honor, and it attracts talent by fostering a unique culture."
After DeepSeek-R1 was released, Nvidia lost nearly $600 billion in market value (a 17% drop in its share price - which is ironic, considering DeepSeek uses Nvidia's chips). Alphabet (Google) lost 4%, Microsoft lost 3.8%, the Philadelphia Semiconductor Index fell 9.2%, ASML (the Dutch company that makes the machines used to make Nvidia's chips) sank 7%, and Japan's SoftBank, which is heavily invested in AI, sank 8.3%, among others.
Explained in simple language: the history-changing event of the last few days.
Let me break down why DeepSeek's AI innovations are blowing people's minds (and possibly threatening Nvidia's $2T market cap) in simple terms...
First, some context: Right now, training top AI models is INSANELY expensive. OpenAI, Anthropic, etc. spend $100M+ just on compute. They need massive data centers with thousands of $40K GPUs. It's like needing a whole power plant to run a factory.
DeepSeek just showed up and said "LOL what if we did this for $5M instead?" And they didn't just talk - they actually DID it. Their models match or beat GPT-4 and Claude on many tasks. The AI world is (as my teenagers say) shook.
How? They rethought everything from the ground up. Traditional AI training is like writing every number with 32 bits of precision. DeepSeek was like "what if we just used 8? It's still accurate enough!" Boom - 75% less memory needed.
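To see where the 75% figure comes from, here is a quick back-of-the-envelope sketch. It assumes every weight is stored at one fixed width and borrows the 671B parameter count quoted later in this post; real training mixes precisions, so treat this as illustrative arithmetic rather than DeepSeek's actual memory budget. The function name `weight_memory_gb` is made up for this example.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: the 671B total parameter count quoted later in this post.
PARAMS = 671_000_000_000

fp32_gb = weight_memory_gb(PARAMS, 32)  # 32 bits per number
fp8_gb = weight_memory_gb(PARAMS, 8)    # 8 bits per number
saving = 1 - fp8_gb / fp32_gb           # fraction of memory saved

print(f"32-bit: {fp32_gb:.0f} GB, 8-bit: {fp8_gb:.0f} GB, saving: {saving:.0%}")
# prints: 32-bit: 2684 GB, 8-bit: 671 GB, saving: 75%
```

The saving depends only on the ratio 8/32, so it stays at 75% whatever the model size.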
Then there's their "multi-token" system. Normal AI reads like a first-grader: "The... cat... sat..." DeepSeek reads in whole phrases at once. 2x faster, 90% as accurate. When you're processing billions of words, this MATTERS.
But here's the really clever bit: They built a system of "experts" (known as a mixture-of-experts design). Instead of one massive AI trying to know everything (like having one person be a doctor, lawyer, AND engineer), they have specialized experts that only wake up when needed.
Traditional models? All 1.8 trillion parameters active ALL THE TIME. DeepSeek? 671B total but only 37B active at once. It's like having a huge team but only calling in the experts you actually need for each task.
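The "only call the experts you need" idea can be sketched in a few lines. This is a toy illustration of top-k expert routing in general, not DeepSeek's actual code: the eight experts, the router scores, and the names `top_k_route` and `moe_forward` are all invented for the example.

```python
import math


def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the chosen experts and blend their outputs by router weight."""
    chosen, weights = top_k_route(router_scores, k)
    output = sum(w * experts[i](x) for i, w in zip(chosen, weights))
    return output, chosen


# Hypothetical experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)]
# Hypothetical router scores for one input; a real router learns these.
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

output, used = moe_forward(10.0, experts, router_scores, k=2)
print(used)                 # prints: [4, 1]
print(f"{37 / 671:.1%}")    # share of parameters active, per the figures above: 5.5%
```

Only two of the eight experts do any work for this input; the other six are never called. That is the same reason 37B active out of 671B total means only about 5.5% of the parameters are in play at any one time.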
The results are mind-blowing:
- Training cost: $100M → $5M
- GPUs needed: 100,000 → 2,000
- API costs: 95% cheaper
- Can run on gaming GPUs instead of data center hardware
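For scale, the first two figures in the list above work out as follows. This just takes the numbers as quoted in this post and does the arithmetic; it is not an independent check of them, and the helper name `reduction` is made up.

```python
def reduction(before: float, after: float) -> tuple[float, float]:
    """Return (fractional reduction, how many times smaller)."""
    return 1 - after / before, before / after


# Figures as quoted in the list above.
cost_cut, cost_factor = reduction(100e6, 5e6)      # training cost in dollars
gpu_cut, gpu_factor = reduction(100_000, 2_000)    # number of GPUs

print(f"Training cost: {cost_cut:.0%} lower ({cost_factor:.0f}x cheaper)")
print(f"GPUs needed:   {gpu_cut:.0%} fewer ({gpu_factor:.0f}x fewer)")
# prints: Training cost: 95% lower (20x cheaper)
# prints: GPUs needed:   98% fewer (50x fewer)
```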
"But wait," you might say, "there must be a catch!" That's the wild part - it's all open source. Anyone can check their work. The code is public. The technical papers explain everything. It's not magic, just incredibly clever engineering.
Why does this matter? Because it breaks the model of "only huge tech companies can play in AI." You don't need a billion-dollar data center anymore. A few good GPUs might do it.
For Nvidia, this is scary. Their entire business model is built on selling super expensive GPUs with 90% margins. If everyone can suddenly do AI with regular gaming GPUs... well, you see the problem.
And here's the kicker: DeepSeek did this with a team of <200 people. Meanwhile, Meta has teams where the compensation alone exceeds DeepSeek's entire training budget... and their models aren't as good.
This is a classic disruption story: Incumbents optimize existing processes, while disruptors rethink the fundamental approach. DeepSeek asked "what if we just did this smarter instead of throwing more hardware at it?"
The implications are huge:
- AI development becomes more accessible
- Competition increases dramatically
- The "moats" of big tech companies look more like puddles
- Hardware requirements (and costs) plummet
Of course, giants like OpenAI and Anthropic won't stand still. They're probably already implementing these innovations. But the efficiency genie is out of the bottle - there's no going back to the "just throw more GPUs at it" approach.
Final thought: This feels like one of those moments we'll look back on as an inflection point. Like when PCs made mainframes less relevant, or when cloud computing changed everything.
AI is about to become a lot more accessible, and a lot less expensive. The question isn't if this will disrupt the current players, but how fast?
This is one of the reasons markets fell across the globe.
#DeepSeek #ChatGPT #AI #ArtificialIntelligence
SAFe Scrum Master
1 month ago: Thank you for sharing!!
Chief Technology Officer | Cloud-Based Software Solutions Expert | Enthusiastic Violinist
1 month ago: This is a big moment in the history of AI. https://www.dhirubhai.net/pulse/fuss-deepseek-overblown-reason-jevons-paradox-oliver-kohll-uflne/?trackingId=KO50Yf5HEN29CfTBFNMQzA%3D%3D
Multidisciplinary | Engineer | Technology Innovator | Project Management
1 month ago: How to build your own AI https://www.dhirubhai.net/posts/saket-kumar-pandey-pro_creating-your-own-ai-activity-7291387241417867264-pQfi?utm_source=share&utm_medium=member_
Thanks for the article Pon G Nithya. AutoKeybo runs DeepSeek.