Wu Dao 2.0 - what China's state-of-the-art model is capable of and what that means for Europe
Alexander Thamm GmbH


There is ever-increasing competitive pressure in the development of innovative AI models. A year after OpenAI made a huge leap with its GPT-3 model and caused a worldwide stir, researchers from the Beijing Academy of Artificial Intelligence (BAAI) presented Wu Dao 2.0 in early June 2021 - ten times larger than GPT-3 and now the world's largest neural network model.

From a tech perspective, a fascinating announcement. For European and American politics, as well as for the economy, a warning signal not to fall behind completely. Or, in other words: a signal of China's ambition to become the world leader in AI development.

Wu Dao 2.0 surpassed GPT-3 and Google Switch Transformer

It was only in March 2021 that BAAI released Wu Dao 1.0, and just a few months later the research group, together with industry partners such as Xiaomi, Meituan and Kuaishou, unveiled the updated version of the multimodal model.

Wu Dao 2.0, whose name roughly translates as "understanding the laws of nature", has 1.75 trillion parameters. It is thus ten times larger than GPT-3 and beats the size record previously set by Google's Switch Transformer AI language model (1.6 trillion parameters) by 150 billion parameters.

In line with last year's increased move toward multimodal AI systems, Wu Dao 2.0 also learns from image and text data and can flexibly handle complex tasks based on both types of data. It masters tasks such as natural language processing, text generation, image recognition and image generation, and can even predict the 3D structures of proteins, similar to DeepMind's AlphaFold.

Size and robustness of Wu Dao 2.0

The model was trained on 4.9 TB of text and image data, which makes the GPT-3 training set (570 GB of clean data filtered from 45 TB of raw data) look shockingly small in comparison. The data is composed of 1.2 TB of Chinese text data, 2.5 TB of Chinese image data, and 1.2 TB of English text data.

Comparable multimodal approaches include OpenAI's DALL-E and CLIP or Google's LaMDA and MUM. The Chinese model, however, is considerably larger in scale and, according to BAAI researchers, surpasses the previous state of the art (SOTA) on nine widely used AI benchmarks. Each entry below names the benchmark, the task and the previous SOTA model that Wu Dao 2.0 outperformed (a minimal zero-shot sketch follows the list):

· ImageNet (zero-shot): OpenAI CLIP
· LAMA (factual and commonsense knowledge): AutoPrompt
· LAMBADA (cloze tasks): Microsoft Turing NLG
· SuperGLUE (few-shot): OpenAI GPT-3
· UC Merced Land Use (zero-shot): OpenAI CLIP
· MS COCO (text-to-image generation): OpenAI DALL·E
· MS COCO (English image retrieval): OpenAI CLIP and Google ALIGN
· MS COCO (multilingual image retrieval): UC2 (previously the best multilingual and multimodal pre-trained model)
· Multi30K (multilingual image retrieval): UC2
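
Wu Dao 2.0 itself has no public interface, so as an illustration of what a zero-shot benchmark such as ImageNet (zero-shot) looks like in practice, here is a minimal sketch using OpenAI's publicly available CLIP model via the Hugging Face transformers library; the image file and candidate labels are hypothetical placeholders:

```python
# Zero-shot image classification in the CLIP style (pip install transformers torch pillow).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # hypothetical local image
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]  # candidate classes

# The model scores the image against each text label without any task-specific training.
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

In a benchmark such as zero-shot ImageNet, the candidate labels are simply the class names, and the highest-scoring label counts as the prediction.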

Wu Dao 2.0 and FastMoE

If you now ask about usability and commercialization options, the answer you will probably get is FastMoE. This open-source training system implements a Mixture-of-Experts (MoE) architecture similar to the one Google used for its Switch Transformer. Within the large model, any given piece of information is only ever routed to one expert network. This reduces the required computing power, since only certain parts of the model are active at any given time, depending on the information being processed - which is what makes hyperscaling, efficiency and high precision possible. In addition, FastMoE is more flexible than Google's system because it runs on supercomputers as well as on conventional GPUs and therefore does not require proprietary hardware.
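
To make the routing idea concrete, here is a minimal conceptual sketch of top-1 ("Switch-style") expert routing in plain PyTorch. It is not FastMoE's actual API, and the layer sizes are hypothetical; real systems add distributed expert placement, load balancing and fused kernels on top of this idea:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoELayer(nn.Module):
    """Minimal Mixture-of-Experts layer with top-1 routing (conceptual sketch)."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # small gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        gate_probs = F.softmax(self.router(x), dim=-1)   # (num_tokens, num_experts)
        expert_idx = gate_probs.argmax(dim=-1)           # each token picks exactly one expert
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                # Only the selected expert runs for these tokens, so most of the
                # model's parameters stay inactive for any given input.
                out[mask] = gate_probs[mask, e].unsqueeze(-1) * expert(x[mask])
        return out

# Hypothetical usage: 8 experts, but each token activates only one of them.
layer = Top1MoELayer(d_model=64, d_hidden=256, num_experts=8)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

Because only one expert's weights are used per token, the total parameter count can grow with the number of experts while the compute per token stays roughly constant - the property that makes trillion-parameter models such as Switch Transformer and Wu Dao 2.0 trainable in the first place.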

It should be noted that a scientific publication on Wu Dao 2.0 is still pending. Nevertheless, Wu Dao 2.0 appears to deliver noteworthy results on the most important benchmarks.

Wu Dao 2.0 in practice – creating an AI grid

According to Tang Jie, deputy director of the BAAI, one of the goals being pursued is to develop and implement cognitive capabilities in machines so that they can eventually pass Turing tests.

This was demonstrated during the presentation of Hua Zhibing, a virtual student based on Wu Dao 2.0 who learned to compose music, write poetry, paint pictures and write code. Unlike GPT-3, Wu Dao 2.0 appears to come closer to human memory and learning mechanisms in that it does not forget what it has previously learned.

However, this playful avatarization aside, Wu Dao 2.0 should be understood much more as the next milestone on the way toward an area-wide, transformative AI industrial infrastructure, similar to a power grid, that interconnects AI applications and manages capacity intelligently. This effect will be reinforced by vendors using the data that customers provide via the interfaces to expand the training set and thus continuously improve the overall system.

Wu Dao demonstrates the status quo of China's AI strategy

That the Chinese government has been using the potential of artificial intelligence as a strategic advantage in international competition for several years is certainly not a new insight. With Wu Dao 2.0, the first fruits of the AI and Innovation Plan, which called for the establishment of 50 new AI institutes by 2020, are being harvested. Hoping that this was already the "big breakthrough" that China has set as its strategic goal for 2025 may be tempting from a European perspective, but it is also naive.

After all, in 2018 and 2019, the government in Beijing already put more than $50 million into the Beijing Academy of Artificial Intelligence. 

From a research perspective, China can now consider itself the world's leading nation in AI publications and patents. Its global share of AI publications has risen from 4% in 1997 to 28% in 2017, and the trend continues upward. This also indicates the power that China can unleash in the field of AI-enabled businesses, such as voice and image recognition applications.

A challenge for Europe

As a consequence of this trend, offerings from Chinese providers that have already embraced the AI transformation will exert enormous market pressure on European companies and states. A prominent example that has recently sparked geopolitical dynamics is the Chinese social media platform TikTok.

Another effect that should not be underestimated is that AI models always reflect the data and the biases of the people who build them. If development consolidates around English- and Chinese-language models, other cultures will have to fight to have their languages and values taken into account.

This makes it all the more important to emphasize that AI models are an informal indicator of continental or national progress and a key dimension of technological competition between China, the U.S. and Europe.

According to a study by the European Investment Bank, about 80 percent of investments in AI and blockchain technologies are made by the U.S. and China, while Europe accounts for only 7 percent of the invested amount, about 1.75 billion euros.

The latest developments around Wu Dao 2.0 raise fears that Europe could lose its digital sovereignty in the field of AI.

Europe needs to strengthen its AI position

In April 2021, European AI industry associations from Germany, Austria, Sweden, Croatia, Slovenia, the Netherlands, France and Bulgaria approached the EU Commission to draw attention to the situation and propose measures for developing large-scale AI models in Europe.

This is because if Europe does not react quickly, there is a risk that oligopoly or monopoly markets controlled by China and the US will form. The forces and resources allocated to AI at the German and European level must be pooled and invested more heavily in moonshot projects - this is the only way to avoid falling behind.
