DeepSeek, OpenAI, and the AI Scaling Wars


Chef vs. Buffet: Understanding DeepSeek vs. GPT-4

Let’s start with a simple analogy.

Order from a Master Chef (GPT-4, a Dense Model)

  • This chef knows every cuisine—Italian, Japanese, Indian, French—you name it!
  • No matter what you order, they’ll cook it from scratch using their full expertise.
  • Pros: High-quality, versatile, deep knowledge of flavors.
  • Cons: Takes more effort, uses all kitchen resources every time, expensive to run.

Go to a Smart Buffet (DeepSeek, an MoE Model)

  • This buffet has specialized chefs for different cuisines.
  • When you order sushi, only the sushi chef works—others stay idle.
  • Pros: More efficient, faster for common dishes, cheaper to run.
  • Cons: If you ask for a rare or complex dish, it may not be as polished since no single chef knows everything.
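
In code, the buffet’s routing idea looks roughly like this. It’s a toy sketch with made-up layer sizes, not DeepSeek’s actual architecture: a small gating network scores a pool of experts, and only the top-k of them do any work for a given input.

```python
import numpy as np

# Toy Mixture-of-Experts routing (hypothetical sizes, not DeepSeek's
# real architecture): a gate scores each expert, and only the top-k
# experts process the input while the rest stay idle.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Each "expert" is just a weight matrix here; the gate is a linear scorer.
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d_model, n_experts))

def moe_forward(x):
    """Route x to the top-k experts; everyone else stays idle."""
    scores = x @ gate_w                                # one score per expert
    top = np.argsort(scores)[-top_k:]                  # indices of the k best
    weights = np.exp(scores[top] - scores[top].max())  # softmax over the chosen k
    weights /= weights.sum()
    # Only k of n_experts run: this is where the compute savings come from.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

x = rng.normal(size=d_model)
print(moe_forward(x).shape)  # (16,): same output size, but only 2 of 8 experts worked
```

Real MoE layers (including DeepSeek’s) learn the gate and the experts jointly and add load-balancing tricks so no single expert gets overloaded, but the routing idea is the same.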


DeepSeek: Hype vs. Reality Check

The buzz around DeepSeek has been intense—some call it a major open-source victory, while others question whether big AI labs have been overspending. Let’s break it down beyond the hype.

Open Source vs. Proprietary: The Real Implication

DeepSeek’s execution is impressive, but it doesn’t eliminate the need for large AI models. Instead, it reinforces that Mixture of Experts (MoE) models can be more efficient for certain tasks. Big AI labs like Meta, Google, and Mosaic have explored this before—DeepSeek just did it at scale.


Did It Really Only Cost $5.5M? Not So Fast.

Some argue that DeepSeek’s "$5.5M training cost" proves other AI labs are overpaying, but that’s misleading:

  • It ignores pre-training R&D costs—DeepSeek had 150 engineers working on this. Their salaries, experiments, and infrastructure don’t come cheap.
  • GPU pricing isn’t that simple—you can’t just rent 2,048 H100s at $2/hr for a three-month run. Long-term commitments and overhead drive costs up.
  • DeepSeek reportedly has 50,000 H100s—meaning their actual operational costs are likely closer to $1B per year, on par with other AI labs.

It’s like saying a new smartphone "only costs $200 to make", ignoring the billions spent on R&D, supply chains, and factory setup. The $5.5M number is just the tip of the iceberg.
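
A quick back-of-envelope in Python makes the scale mismatch obvious. The $2/hr rate and the 50,000-GPU fleet are the article’s claims from the bullets above, not audited figures:

```python
# Back-of-envelope GPU economics using the round numbers quoted above.
# Rates and fleet size are the article's claims, not audited figures.

rental_rate = 2.00           # $ per GPU-hour, the commonly quoted figure
training_gpus = 2_048        # GPUs for the headline training run
run_hours = 3 * 30 * 24      # a roughly three-month run

raw_run_cost = training_gpus * run_hours * rental_rate
print(f"Raw rental for the run: ${raw_run_cost / 1e6:.1f}M")           # ~$8.8M

fleet_gpus = 50_000          # the fleet attributed to DeepSeek above
annual_fleet_cost = fleet_gpus * rental_rate * 24 * 365
print(f"Fleet at rental rates:  ${annual_fleet_cost / 1e6:.0f}M/yr")   # ~$876M/yr

# Add salaries for ~150 engineers, failed experiments, and datacenter
# overhead, and "closer to $1B per year" is a plausible total.
```

Even at generous spot prices, just keeping the reported fleet powered on costs two orders of magnitude more per year than the headline training figure.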


Dense vs. Sparse Models: A Technical Shift

DeepSeek uses an MoE model, where only parts of the model activate for each token. This makes it more efficient than dense models like GPT-4, but also harder to engineer. A rough per-token compute comparison follows the analogy below.

Think of it like a Swiss Army knife (GPT-4) vs. a specialized toolset (DeepSeek).

  • Swiss Army knife: Always on hand, works for everything but not always optimized.
  • Specialized toolset: More efficient if you know exactly what you need, but not as flexible.
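
To put rough numbers on the gap: DeepSeek-V3’s published figures are about 671B total parameters with roughly 37B active per token. The dense comparison below is hypothetical, since GPT-4’s parameter count has never been disclosed; it simply assumes a dense model of the same total size.

```python
# Rough per-token compute comparison. DeepSeek-V3's published figures:
# ~671B total parameters, ~37B activated per token. The "dense" model
# is hypothetical (GPT-4's size is not public).

total_params = 671e9
active_params = 37e9

# A transformer's forward pass costs roughly 2 FLOPs per parameter used.
dense_flops = 2 * total_params    # dense model: every weight fires
moe_flops = 2 * active_params     # MoE model: only routed experts fire

print(f"Dense : {dense_flops / 1e9:,.0f} GFLOPs/token")
print(f"MoE   : {moe_flops / 1e9:,.0f} GFLOPs/token")
print(f"Ratio : ~{dense_flops / moe_flops:.0f}x less compute per token")
```

That roughly 18x gap in per-token compute is the economic argument for sparse models; the cost is the extra routing machinery needed to make them train and serve well.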


The Scaling Debate: Are Big Models Dead?

While DeepSeek shows MoE’s potential, it doesn’t mean large models are obsolete.

Why Large Models Still Matter:

  • Big models generate synthetic training data—GPT-4’s dense model helps train smaller ones (see the distillation sketch after this list).
  • Reasoning & creativity require full activation—DeepSeek may be efficient, but GPT-4’s depth is still unmatched for complex reasoning.
  • Future AI = Hybrid approach—we’ll see a mix of massive foundation models + efficient MoE models for edge deployments.
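
To make the first point concrete, here is a minimal, hypothetical sketch of knowledge distillation in PyTorch: a large "teacher" model’s soft predictions supervise a smaller "student". The shapes and temperature are illustrative, not any lab’s actual recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * temperature**2

# Toy usage: a batch of 4 predictions over a 32-word vocabulary.
teacher_logits = torch.randn(4, 32)                      # from the big model
student_logits = torch.randn(4, 32, requires_grad=True)  # from the small model
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()   # gradients flow only to the student
print(float(loss))
```

In practice the teacher signal is often just synthetic text generated by the large model, but the principle is the same: the big model’s knowledge trains the small one.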

What This Means for AI Investment

AI hardware demand isn’t slowing down—even "efficient" models still need massive compute. So keep buying NVIDIA stock.

Data is still king—better models need better data, not just more parameters.

VC focus is shifting—expect more funding for MoE architectures and customized AI models rather than the “one-model-to-rule-them-all” approach.


Final Take

DeepSeek is a huge milestone, but not a revolution. It confirms what experts already knew:

  1. MoE models can be more efficient.
  2. Open-source is catching up to proprietary AI.
  3. AI investment remains a high-stakes, multi-billion-dollar game.

AI isn’t getting cheaper—it’s just getting smarter about where to spend.

Thank you for reading!




