DeepSeek, OpenAI, and the AI Scaling Wars

Sandip Bharati

Healthcare Engineering

Published: January 31, 2025

Chef vs. Buffet: Understanding DeepSeek vs. GPT-4

Let’s start with a simple analogy.

Order from a Master Chef (GPT-4, Dense Model)

This chef knows every cuisine—Italian, Japanese, Indian, French—you name it!
No matter what you order, they’ll cook it from scratch using their full expertise.
Pros: High-quality, versatile, deep knowledge of flavors.
Cons: Takes more effort, uses all kitchen resources every time, expensive to run.

Go to a Smart Buffet (DeepSeek, MoE Model)

This buffet has specialized chefs for different cuisines.
When you order sushi, only the sushi chef works—others stay idle.
Pros: More efficient, faster for common dishes, cheaper to run.
Cons: If you ask for a rare or complex dish, it may not be as polished since no single chef knows everything.

DeepSeek: Hype vs. Reality Check

The buzz around DeepSeek has been intense—some call it a major open-source victory, while others question if big AI labs have been overspending. But let’s break it down beyond the hype.

Open Source vs. Proprietary: The Real Implication

DeepSeek’s execution is impressive, but it doesn’t eliminate the need for large AI models. Instead, it reinforces that MoE (Mixture of Experts) models can be more efficient for certain tasks. Big AI labs like Meta, Google, and Mosaic have explored this before—DeepSeek just did it at scale.

Did It Really Only Cost $5.5M? Not So Fast.

Some argue that DeepSeek’s "$5.5M training cost" proves other AI labs are overpaying, but that’s misleading:

It ignores pre-training R&D costs—DeepSeek had 150 engineers working on this. Their salaries, experiments, and infrastructure don’t come cheap.
GPU pricing isn’t that simple—You can’t just rent 2,048 H100s at $2/hr for a three-month run. Long-term commitments and overhead drive costs up.
DeepSeek reportedly has 50,000 H100s—meaning their actual operational costs are likely closer to $1B per year, on par with other AI labs.

It’s like saying a new smartphone "only costs $200 to make", ignoring the billions spent on R&D, supply chains, and factory setup. The $5.5M number is just the tip of the iceberg.
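To put rough numbers on that iceberg, here is a back-of-the-envelope sketch using only the figures quoted above. It assumes the $2/hr rate can stand in as a rental-equivalent price for an owned fleet running around the clock, and it ignores salaries, power, and networking, so treat the output as an order-of-magnitude illustration rather than an audited cost.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def implied_gpu_hours(total_cost_usd: float, rate_usd_per_gpu_hour: float) -> float:
    """GPU-hours that a given budget buys at a flat hourly rate."""
    return total_cost_usd / rate_usd_per_gpu_hour


def annual_fleet_cost(num_gpus: int, rate_usd_per_gpu_hour: float) -> float:
    """Rental-equivalent cost of keeping a whole fleet busy for one year."""
    return num_gpus * rate_usd_per_gpu_hour * HOURS_PER_YEAR


# Figures taken from the article; the flat rate for the fleet is an assumption.
headline_cost = 5.5e6   # the widely quoted training cost, in USD
rate = 2.0              # USD per GPU-hour
fleet_size = 50_000     # reported GPU count

gpu_hours = implied_gpu_hours(headline_cost, rate)
fleet_cost = annual_fleet_cost(fleet_size, rate)

print(f"GPU-hours behind the headline number: {gpu_hours:,.0f}")
print(f"Rental-equivalent fleet cost per year: ${fleet_cost:,.0f}")
print(f"Fleet cost vs. headline cost: {fleet_cost / headline_cost:.0f}x")
```

Under these assumptions the fleet comes to about $876M per year, which is where the "closer to $1B" estimate lands, and it is more than 150 times the headline training figure.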

Dense vs. Sparse Models: A Technical Shift

DeepSeek uses an MoE model, in which a router activates only a small subset of expert sub-networks for each token instead of the whole network. This makes it cheaper to run than dense models like GPT-4, but also harder to engineer.

Think of it like a Swiss Army knife (GPT-4) vs. a specialized toolset (DeepSeek).

Swiss Army knife: Always on hand, works for everything but not always optimized.
Specialized toolset: More efficient if you know exactly what you need, but not as flexible.
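For readers who want to see what "only parts of the model activate" looks like mechanically, here is a minimal sketch of generic top-k expert routing. It is a toy illustration, not DeepSeek's actual router: the experts are simple scaling functions, and names such as `top_k_route`, `moe_forward`, and `gate_weights` are made up for this example.

```python
import math
import random


def softmax(scores):
    """Turn raw gate scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, gate_weights, k=2):
    """Run only the k selected experts and blend their outputs."""
    # One gate score per expert: dot product of its gate vector with the input.
    gate_scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    routes = top_k_route(gate_scores, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        y = experts[idx](x)  # the other experts are never called
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in routes]


def make_expert(scale):
    """Hypothetical toy expert: scales its input by a fixed factor."""
    return lambda x: [scale * xi for xi in x]


random.seed(0)
NUM_EXPERTS, DIM = 8, 4
experts = [make_expert(s) for s in range(1, NUM_EXPERTS + 1)]
gate_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, active = moe_forward(token, experts, gate_weights, k=2)
print(f"active experts: {active} of {NUM_EXPERTS}")
```

A dense model corresponds to calling all eight experts for every token. With k=2, only a quarter of the expert compute is spent per token, which is the efficiency the buffet analogy describes. The engineering difficulty comes from everything this sketch leaves out, such as keeping the load balanced across experts during training.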

The Scaling Debate: Are Big Models Dead?

While DeepSeek shows MoE’s potential, it doesn’t mean large models are obsolete.

Why Large Models Still Matter:

Big models generate synthetic training data—GPT-4’s dense model helps train smaller ones.
Reasoning & creativity require full activation—DeepSeek may be efficient, but GPT-4’s depth is still unmatched for complex reasoning.
Future AI = Hybrid approach—we’ll see a mix of massive foundation models + efficient MoE models for edge deployments.

What This Means for AI Investment

AI hardware demand isn’t slowing down: even "efficient" models still need massive compute. So keep buying NVIDIA stock.

Data is still king: better models need better data, not just more parameters.

VC focus is shifting: expect more funding for MoE architectures and customized AI models rather than the “one-model-to-rule-them-all” approach.

Final Take

DeepSeek is a huge milestone, but not a revolution. It confirms what experts already knew:

MoE models can be more efficient.
Open-source is catching up to proprietary AI.
AI investment remains a high-stakes, multi-billion-dollar game.

AI isn’t getting cheaper—it’s just getting smarter about where to spend.

Thank you for reading!
