How DeepSeek overcame US sanctions
Stephanie Arnett/ MIT Technology Review | Rawpixel

The AI community is abuzz over DeepSeek R1, a new open-source reasoning model. The model was developed by the Chinese AI startup DeepSeek, which claims that R1 matches or even surpasses OpenAI’s ChatGPT o1 on multiple key benchmarks but operates at a fraction of the cost. DeepSeek’s success is even more remarkable given the constraints facing Chinese AI companies in the form of increasing US export controls on cutting-edge chips. In this edition of What’s Next in Tech, discover how the company was able to overcome US sanctions to create DeepSeek R1.

With a new reasoning model that matches the performance of ChatGPT o1, DeepSeek managed to turn restrictions into innovation.

Early evidence shows that the US’s export controls on advanced semiconductors are not working as intended. Rather than weakening China’s AI capabilities, the sanctions appear to be driving startups like DeepSeek to innovate in ways that prioritize efficiency, resource-pooling, and collaboration.

To create R1, DeepSeek had to rework its training process to reduce the strain on its GPUs, a variety released by Nvidia for the Chinese market whose performance is capped at half the speed of Nvidia's top products, according to Zihan Wang, a former DeepSeek employee and current PhD student in computer science at Northwestern University.

DeepSeek R1 has been praised by researchers for its ability to tackle complex reasoning tasks, particularly in mathematics and coding. The model employs a “chain of thought” approach similar to that used by ChatGPT o1, which lets it solve problems by processing queries step by step.

Dimitris Papailiopoulos, principal researcher at Microsoft’s AI Frontiers research lab, says what surprised him the most about R1 is its engineering simplicity. “DeepSeek aimed for accurate answers rather than detailing every logical step, significantly reducing computing time while maintaining a high level of effectiveness,” he says.

The company has also released six smaller versions of R1 that are small enough to run locally on laptops. It claims that one of them even outperforms OpenAI’s o1-mini on certain benchmarks.

Although there's a lot of buzz around R1, DeepSeek remains relatively unknown. Read the story to dive deep into how the startup managed to create an AI model that one expert says could be a “truly equalizing breakthrough” despite tight US sanctions.

Get ahead with these related stories:

  1. The second wave of AI coding is here: A string of startups are racing to build models that can produce better and better software. They claim it’s the shortest path to AGI.
  2. AI’s energy obsession just got a reality check: DeepSeek poses a threat to the narrative that more computing power is the only thing that’ll unlock AI breakthroughs.
  3. What’s next for AI in 2025: You already know that agents and small language models are the next big things. Here are five other hot trends you should watch out for this year.

OK Boštjan Dolinšek

It's good news for the AI/ML industry indeed. While we create AI hype across the street, the main issue with this innovation is that we are trying to use more expensive technology to replace cheaper ones. This won't make economic sense and will eventually go bust. DeepSeek shared its breakthroughs, as early OpenAI did and as Meta does now. That lowers the cost, and every AI/ML company can apply it. It will boost AI/ML application development and increase adoption. We still need the compute power.

Joseph Bayana

In the Business of Big Data

4 weeks ago

What's all the shock about low-cost DeepSeek? It's made in China, where they also experiment on their 1.4 billion population using unbridled AI and the readily available big-to-massive-to-humongous data for any and all models. If you think Meta, Microsoft, Amazon, ABC, among others, exploit data from Americans, wait till you find out what China does with its humongous data that's 4 times that of the USA. Kung Hei Fat Choi!

MICHAEL WOO, MBA, CCS

Certified Collection Specialist at Wakefield & Associates, Inc

1 month ago

Impressive

Andrew Fox

UI & UX Product Specialist @ Freelance Digital | Creative / Design Director | Innovator

1 month ago

Restrictions force us to think outside the box, leading us to natural innovation. When resources, time, and options are limited, we find creative solutions to overcome obstacles. This constraint-driven problem-solving encourages efficiency, new perspectives, and unconventional thinking. If you want a creative mind to excel, restrict it! Disruption is king.
