How is DeepSeek different from other AI apps?
Chinese startup DeepSeek has made a groundbreaking development in the AI world with its open-source large language model, R1, which is making waves for its novelty and efficiency. Traditional AI models run every query through the entire model architecture. DeepSeek's R1 instead relies on "inference-time computing," a technique that activates only the parts of the model most relevant to each specific query, saving enormous amounts of computational power and cost.
Leading figures in the tech industry have also praised DeepSeek. Prominent tech investor Marc Andreessen wrote on social media: "DeepSeek R1 is one of the most amazing and impressive breakthroughs I have ever seen, and as open source, a profound gift to the world." Venture capitalist David Sacks argued that DeepSeek's advance will intensify competition: "it shows that the AI race will be very competitive."
Yet some analysts are skeptical that DeepSeek will be adopted by the largest U.S. companies. "No U.S. Global 2000 company is going to use Chinese startup DeepSeek to launch their AI infrastructure and use cases," said analyst Dan Ives. Today, "there is only one chip company in the world launching autonomous, robotics, and broader AI use cases, and that is Nvidia," he said.
DeepSeek, founded in 2023 by Chinese entrepreneur Liang Wenfeng, has rapidly ascended in the AI landscape. Its open-source models became available for download in the United States in early January and quickly surged to the top of the iPhone download charts, surpassing even OpenAI’s ChatGPT app. The company's latest product, the R1 model, has been favorably compared to leading offerings from OpenAI and Meta. Also striking: R1 appears to be more efficient and more affordable to train and develop, and it may not have depended on the most powerful AI accelerators, which U.S. export controls have made harder for Chinese firms to access. On performance, both DeepSeek's R1 and V3 models currently rank within the top 10 at Chatbot Arena, a platform hosted by the University of California, Berkeley.
The company said its models matched or outperformed those of rivals on tasks such as mathematical computation, general knowledge and question-and-answer benchmarks. The cost to train one of its newest models was $5.6 million, compared with estimates of $100 million to $1 billion for similar models from other AI companies last year, though some analysts, such as Stacy Rasgon at Bernstein, caution that DeepSeek's figures are open to misinterpretation. DeepSeek's rise has also rippled through the stock market. After the news broke, major tech stocks began to slide: Microsoft fell 3.7%, Tesla slid 1.3%, Nvidia tumbled 15%, and Broadcom shed 16%. The technology-heavy Nasdaq index sank 3.5%, its third-worst day in the last two years.
One of the key reasons for DeepSeek's efficiency is its "mixture of experts" architecture, a technique that activates only the computing resources needed for a particular task, rather than following the traditional paradigm of scaling up raw computational power. This not only cuts costs but has the potential to democratize AI development and reduce energy consumption.
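To make the idea concrete, here is a minimal sketch of mixture-of-experts routing in plain Python. This is an illustration of the general technique, not DeepSeek's actual implementation (whose gating and expert design are far more sophisticated): a learned gate scores all experts for each input, but only the top-k highest-scoring experts are ever executed.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_weights, top_k=2):
    """Run a token through only its top_k highest-scoring experts.

    token        -- input feature vector (list of floats)
    experts      -- list of callables; each maps a token to an output value
    gate_weights -- one weight row per expert, used to score experts per token
    """
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(scores)
    # Sparse activation: keep the top_k experts; the rest are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalize gate mass over chosen experts
    return sum((probs[i] / norm) * experts[i](token) for i in chosen)

# Demo: four toy "experts" that record when they run; only two ever execute.
calls = []
def make_expert(idx, scale):
    def expert(token):
        calls.append(idx)
        return scale * sum(token)
    return expert

experts = [make_expert(i, s) for i, s in enumerate([1.0, 2.0, 3.0, 4.0])]
gate = [[0.1, 0.2], [0.9, 0.1], [0.3, 0.8], [0.0, 0.1]]
output = moe_forward([1.0, 2.0], experts, gate, top_k=2)
print(len(calls))  # only 2 of the 4 experts were activated
```

The saving is exactly what the paragraph above describes: with, say, 4 experts and top-2 routing, half the expert computation is skipped on every query, and the ratio improves as the expert count grows while top_k stays small.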
The success of DeepSeek underlines the growing importance of open-source AI in driving innovation and competition within the industry. As the AI landscape continues to evolve, the emergence of efficient and cost-effective models like DeepSeek's R1 may prompt a reevaluation of existing development strategies among established tech companies.