From AlphaGo to ChatGPT to DeepSeek: The Three Defining Moments of Modern AI.

It is fascinating to watch how DeepSeek, the AI subsidiary of a private hedge fund based in Hangzhou, China, created so much news in late January 2025.

After reading news articles and tech journals, combing through message boards, listening to podcasts, and getting geeky with DeepSeek's algorithms throughout this week, I would characterize the overall response to DeepSeek's open-source release as chaotic, confused, biased, and in a state of collective shock.

As someone who came from a software programming background in AI and stumbled into a career in data center hardware sales for the last 20+ years, I find it worthwhile to internalize current events for myself, share my thoughts with other tech enthusiasts, and hopefully help AI novices understand the progression of AI development.

My own journey in AI began in graduate school with GNU Go, a worldwide open-source AI project, where I was able to improve its algorithm and beat the original program by 85% on HPC clusters. Since then, I have formed my own little garage-based firm, worked for large corporations like HP, Microsoft, and Hitachi, and mainly focused on data center architectures and sales management. This career path has given me a holistic experience spanning low-level machine code programming, data center hardware platform design, server/storage/networking implementation, cloud architecture, and internal and external data center infrastructure. While my passion is in sales management, customer interactions, and finance, having absorbed all of the technical knowledge above has given me a better perspective on the technology drivers and trends in each role I've held. I suppose the collective sum of my previous roles truly represents an MoE (Mixture of Experts) model.

As a technologist who has been closely following AI development in both the US and China, I consider DeepSeek one of the three major defining moments in recent AI history. What's remarkable is that each new milestone has arrived more quickly than the one before.

The first defining moment, which transformed the AI field from "AI winter" to "AI spring," was DeepMind's AlphaGo back in 2016. At the time, the ancient Chinese board game of Go was considered by many AI experts to be 'the last refuge of human intelligence,' as it was believed to be impossible to program effectively. From my experience with GNU Go: even though it was a Computer Olympiad winner, it could not beat an entry-level human Go player. This was due to the sheer complexity of the game: the number of possible board variations is so vast that brute-force methods, calculating every possible move, are computationally infeasible. To put it into perspective, Go has roughly 10^170 legal board positions, far more than the estimated 10^80 protons in the known universe.

Remarkably, the human brain can grasp the basics of Go in 10 to 15 minutes, while brute-forcing the game would not be possible for computers even if all of Earth's computing resources were combined into a single supercomputer. It was the consensus in the tech world that if computers could ever reach the level of an 8-dan or 9-dan Go player, they would have reached a level of consciousness, since playing Go well requires human intuition rather than just brute-force calculation.

Sound familiar?

Yes, it would require AGI (Artificial General Intelligence) to master the board game of Go. Or so we thought. Imagine the shock of the AI and tech community when AlphaGo, a deep learning model developed by DeepMind, defeated South Korean pro Go player Lee Sedol (then ranked 11th in the world) in 2016 and Chinese pro Go player Ke Jie (ranked #1 in the world) in 2017. It was such a complete defeat of human players who had studied Go their entire lives that playing against AlphaGo was described by Go professionals as a 'God-like' experience. Even more astonishing was the subsequent release of AlphaGo Zero and AlphaZero, which achieved a 100% win rate against their predecessors.

Yet the AGI moment did not arrive, despite this unprecedented AI achievement. However, DeepMind's contributions have undoubtedly propelled AI forward by leaps and bounds, driving the integration of machine learning into many fields such as biological research, robotics, game theory, and complex problem solving, and firmly securing its place in history.


The second defining moment of AI advancement was the release of the conversational AI ChatGPT in 2022. OpenAI built ChatGPT on its GPT (Generative Pre-trained Transformer) models, which in turn build on Google's transformer architecture research. ChatGPT's release spurred a wave of competing products, including Gemini, Claude, Llama, Ernie, Grok, and Qwen. This is when investment firms started aggressively pouring big money into AI startups. On the hardware side, the biggest winners were NVIDIA, TSMC, and other AI chip producers. On the software side, Microsoft and OpenAI emerged as key players. However, a dark twist emerged: OpenAI, originally founded as a non-profit with a mission to benefit humanity through open research, saw its CEO transform it into a de facto for-profit entity. This shift prioritized the interests of its executives and led to the closure of its codebase. The model proved so commercially successful that the Trump administration committed to a $500 billion investment goal, with OpenAI, SoftBank, and Oracle leading the effort, known as the Stargate project. Despite the controversial privatization of OpenAI, ChatGPT remains one of the most significant milestones in AI advancement since AlphaGo six years prior.


The third and latest defining moment is, of course, the open-source release of DeepSeek, built by a small Chinese hedge fund firm called High-Flyer through its AI subsidiary of the same name. DeepSeek released its AI coder back in November 2023, a powerful V3 version aimed at solving more complex tasks in December 2024, and a lean, mean, energy-efficient reasoning model, R1, on January 20, 2025. DeepSeek has been releasing its models on GitHub and Hugging Face under the MIT open-source license since November 2023, including a small version that can run on a beefy laptop.

What shocked the tech world was the V3 model's reported $5.6M USD training cost, versus the roughly $100M USD typically spent by most AI companies, with training completed over only about two months.

It's noteworthy that High-Flyer has only about 100 employees, and DeepSeek began as a side project to squeeze more computational bandwidth out of the less capable NVIDIA GPUs available to it. Due to the US sanctions on NVIDIA GPU exports, the DeepSeek team was only able to use A100 and H800 chips, which have slower interconnects. Think of this company as an early-stage stock trading firm, similar to the one portrayed in the TV series Billions, but with Bobby Axelrod replaced by a team of math nerds. They figured out that the best way to improve efficiency on a fleet of slower GPUs was not to rely solely on NVIDIA's default CUDA programming, but to drop down to lower-level PTX (Parallel Thread Execution) code. In essence, DeepSeek has demonstrated a groundbreaking approach to achieving strong results with less hardware in a short amount of time, offering the world a new perspective on computational and energy efficiency.

Without getting into deep technical details, other notable techniques and innovations in DeepSeek's models include these elements (a small illustrative sketch of the first one follows this list):

  • MoE (Mixture of Experts)
  • Reinforcement learning
  • Multi-head Latent Attention
  • Multi-Token Prediction
  • DualPipe
  • FP8 Mixed-Precision Training
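
To make the first item on that list a bit more concrete, below is a minimal, hypothetical sketch of top-k expert routing, the core idea behind a Mixture-of-Experts layer: a small gating network selects a few "expert" sub-networks per token, so only a fraction of the model's parameters do work for any given input. This is an illustrative toy in PyTorch under my own simplifying assumptions, not DeepSeek's actual implementation (which adds shared experts, load balancing, and many other refinements).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: each token is routed to its top-k experts."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Each "expert" here is just a small feed-forward block.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        # The gate scores every expert for every token.
        self.gate = nn.Linear(d_model, n_experts)

    def forward(self, x):                         # x: (n_tokens, d_model)
        scores = self.gate(x)                     # (n_tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)      # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e          # tokens whose slot-th choice is expert e
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out

# Usage: 10 tokens of width 64; only 2 of the 8 experts run for each token.
layer = ToyMoELayer()
print(layer(torch.randn(10, 64)).shape)           # torch.Size([10, 64])
```

The appeal of this design is sparsity: the parameter count grows with the number of experts, but the compute per token stays roughly constant because only the top-k experts are evaluated.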

The distillation learning method, layered on top of all of DeepSeek's improvements, is not a new practice in the AI industry. Since the introduction of the distillation training method in 2015, companies like Microsoft, Meta, DeepMind, and many others have used it extensively to train AI models. In this approach, the student model (in this case, DeepSeek) generates its own answers from its dataset and then consults a teacher model, such as Qwen, Llama, ChatGPT, or another AI model. It compares the responses and fine-tunes its results by adopting the better answer. However, this learning method is not perfect. At times it can produce hallucinations or incorrect outputs, and in some cases it may inadvertently embed responses from the parent AI model.
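
For readers who want a feel for what distillation means in code, here is a minimal sketch of the classic logit-distillation recipe from the 2015 work mentioned above: a student model is trained to match the teacher's softened output distribution, blended with the ordinary loss on ground-truth labels. This is a generic, simplified illustration with made-up tensor shapes, not any particular company's training pipeline.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft loss (imitate the teacher) with a hard loss (match the labels)."""
    # Soft targets: compare softened probability distributions with KL divergence.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)        # standard scaling so gradient magnitudes stay comparable
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage with dummy data: a batch of 4 examples and a 10-way output.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)               # in practice: frozen teacher outputs
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(round(loss.item(), 4))
```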

In the world of engineering and software programming, DeepSeek's approach is as original as it gets. This was evident in the earth-shattering market response on January 27, when a sharp selloff in AI-related stocks forced a reassessment of many assumptions about data center spending. The closed nature of OpenAI's model has also come into question.

Since DeepSeek's open-source release of R1, numerous controversies have emerged in the US regarding the authenticity of DeepSeek's code, the validity of its efficiency claims, and even widespread conspiracy theories circulating across various US media outlets. For the general public, for experienced technologists who are not well versed in AI, and even for AI experts who have not researched this topic, the situation is highly confusing because of the sheer amount of disinformation surrounding DeepSeek. Some of it stems from deliberate attacks driven by self-interest, aimed at justifying excessive spending and sustaining continued investment, such as the efforts by Alexandr Wang at Scale AI and by Microsoft to discredit DeepSeek. Other responses take the form of "technical criticism" that lacks a solid foundation but is presented as credible. Meanwhile, many reactions are outright hate speech, fueled by racial prejudice against Chinese people, advocating increased US sanctions and calling for technological warfare.

The most important point I want to bring up is not the technical advancement DeepSeek has brought to the world, but rather how we should respond when advancement happens in a disruptive manner.

It is true that we live in a competitive world, where we compete for education, jobs, and promotions at the individual level, and nations compete at the industry level in energy, science, engineering, healthcare, manufacturing, and more. But at what point do we start to collaborate with one another? In DeepSeek's case, openly releasing methods that provide a massive reduction in computational and energy costs is one of the most altruistic postures I can think of. It is a Nobel Prize-worthy gesture that benefits the whole of humanity. Yet instead of thanking DeepSeek profusely for its profound contributions, the masses in the US are shouting "burn it." What does this say about America's psyche today?

However insignificant we may be as individuals, what each of us says or does has an effect. Will our actions encourage more AI companies to release their models as open source? Or does the current reaction to DeepSeek's open release discourage further collaboration in the tech community? Do we want AI for the good of all people, or as a controlling tool for the 1%? We teach our children to broaden their horizons and learn from others' perspectives; can we do the same? If you think you have already made up your mind, perhaps ponder it a bit longer.


References:

DeepSeek

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

