What is DeepMind's Gopher?

Join me on this journey in my Newsletter for such topics called AiSupremacy. https://aisupremacy.substack.com/p/coming-soon

The way language models bring AI a step closer to a human-like understanding of language stirs the imagination, doesn't it? That's how I feel, at least, as we head into 2022. AGI might not be achieved in our lifetime, but how we use artificial intelligence in society is already changing the world.

DeepMind’s Work in NLP and Gopher

Google's DeepMind is behind some of the most impressive AI breakthroughs and headline-grabbing advances in the field over the last decade. In recent years, Microsoft-backed OpenAI has stolen some of the limelight.

Not to be outdone, DeepMind recently trained Gopher, a 280-billion-parameter AI language model. You can read DeepMind's blog post of December 8, 2021 here.

Language, and its role in demonstrating and facilitating comprehension - or intelligence - is a fundamental part of being human. That artificial intelligence is now reaching a comparable moment in language-related tasks is incredible.

DeepMind and OpenAI both claim to have relevance to the future of AGI, or artificial general intelligence. While that’s debatable, what they are doing with language models is impressive.

As part of a broader portfolio of AI research, firms are building GPT-3-like models, and GPT-4 should be announced soon, perhaps in early 2023. So what is Gopher? You can read its academic paper here.

In the quest to explore language models and develop new ones, DeepMind trained a series of transformer language models of different sizes, ranging from 44 million to 280 billion parameters (the largest model they named Gopher).

  • Based on the Transformer architecture and trained on a 10.5TB corpus called MassiveText.
  • Gopher outperformed the current state-of-the-art on 100 of 124 evaluation tasks.
  • DeepMind’s research in 2021 investigated the strengths and weaknesses of those different-sized models, highlighting areas where increasing the scale of a model continues to boost performance – for example, in areas like reading comprehension, fact-checking, and the identification of toxic language.
  • They also surfaced results where model scale does not significantly improve results — for instance, in logical reasoning and common-sense tasks.
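To get a feel for what "44 million to 280 billion parameters" means architecturally, here is a rough back-of-the-envelope sketch of how decoder-only transformer parameter counts scale with depth and width. The shapes and vocabulary size below are illustrative assumptions, not the exact configurations from the Gopher paper:

```python
def transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter count for a decoder-only transformer.

    Each layer has ~4 * d_model**2 attention weights (Q, K, V, output
    projections) plus ~8 * d_model**2 feed-forward weights (two matrices
    with a hidden size of 4 * d_model), i.e. ~12 * d_model**2 in total.
    Token embeddings add vocab_size * d_model. Biases, layer norms, and
    positional parameters are ignored.
    """
    per_layer = 12 * d_model**2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

# Illustrative shapes only, not the paper's exact configurations.
for name, layers, width in [("small", 16, 2048), ("large", 80, 16384)]:
    billions = transformer_params(layers, width, 32_000) / 1e9
    print(f"{name}: ~{billions:.1f}B parameters")
```

The "large" shape lands in the same order of magnitude as Gopher's 280 billion, which is the main point: parameter count grows linearly with depth but quadratically with width.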

Above: Performance on the Massive Multitask Language Understanding (MMLU) benchmark broken down by category. Gopher improves upon prior work across several categories.

The model and several experiments were described in a paper published on arXiv. As part of their research effort in general AI, the DeepMind team trained Gopher and several smaller models to explore the strengths and weaknesses of large language models (LLMs).

Given DeepMind's unparalleled history of AI developments, it was surprising that it hadn't yet made an appearance in the flourishing area of large language models. Some people didn't catch the news about Gopher in December 2021.

So why is it important?

In particular, the researchers identified tasks where increased model scale led to improved accuracy, such as reading comprehension and fact-checking, as well as those where it did not, such as logical and mathematical reasoning.

The team evaluated Gopher on a large number of NLP benchmarks, including Massive Multitask Language Understanding (MMLU) and BIG-bench, and compared its performance to several baseline models such as GPT-3, noting a general trend: Gopher showed consistent improvement on knowledge-intensive tasks, but less on reasoning-intensive ones.
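Mechanically, that kind of benchmark comparison boils down to tallying per-task accuracy deltas between two models. The scores below are invented purely for illustration (the paper reports the real numbers), but they show the shape of the finding: knowledge-heavy tasks move a lot, reasoning-heavy ones barely move.

```python
# Invented per-task accuracies for illustration; the paper reports the
# real benchmark numbers.
gopher   = {"fact_checking": 0.78, "reading_comp": 0.74, "logic": 0.41, "math": 0.38}
baseline = {"fact_checking": 0.61, "reading_comp": 0.60, "logic": 0.40, "math": 0.37}

for task, score in gopher.items():
    delta = score - baseline[task]
    print(f"{task:14s} baseline={baseline[task]:.2f} gopher={score:.2f} delta={delta:+.2f}")

# Knowledge-heavy tasks show large deltas; reasoning-heavy ones barely move.
big_gains = [t for t, s in gopher.items() if s - baseline[t] >= 0.05]
print("large gains on:", ", ".join(big_gains))
```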

Bigger is not always Better

Gopher, like GPT-3, is an autoregressive, transformer-based dense LLM: basically, it predicts the next word given a text history.

  • Language models predict the next item, or token, in a sequence of text given the previous tokens; when such a model is used iteratively, with the predicted output fed back as input, the model is termed autoregressive.
  • Autoregressive language models based on the Transformer deep-learning architecture have set state-of-the-art performance records on many NLP tasks, and many researchers have developed very large-scale models.
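The loop described above can be sketched concretely. The toy "model" below uses bigram counts instead of a transformer, which is a deliberate simplification, but the sampling loop (predict, append, feed back, repeat) is the same autoregressive loop a model like Gopher runs:

```python
import random

# Toy next-token "model": bigram counts from a tiny corpus. A real LLM
# such as Gopher replaces this lookup table with a transformer, but the
# generation loop below is unchanged.
corpus = "the cat sat on the mat and the cat slept".split()
followers = {}
for prev, nxt in zip(corpus, corpus[1:]):
    followers.setdefault(prev, []).append(nxt)

def generate(prompt: str, n_tokens: int, seed: int = 0) -> str:
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(n_tokens):
        # Autoregression: the model's own prediction is appended to the
        # context and fed back in as input for the next step.
        candidates = followers.get(tokens[-1])
        if not candidates:
            break
        tokens.append(rng.choice(candidates))
    return " ".join(tokens)

print(generate("the", 5))
```

Swapping the lookup table for a neural network changes the quality of the predictions enormously, but not the structure of the loop.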

According to DeepMind's results, Gopher's capabilities exceed those of existing language models on a number of key tasks. This includes the Massive Multitask Language Understanding (MMLU) benchmark, where Gopher demonstrates a significant advance towards human-expert performance over prior work.

2021 was a pivotal, even transformational, year for advances in A.I. and for large language models in particular, and the pace is only intensifying.

As always, it's interesting to read the comments on Hacker News and Reddit on such announcements for further insights.

Gopher's rank on several NLP benchmarks can be found on the Papers with Code website.

Gopher, GLaM vs. Other Large Language Models

DeepMind's research went on to say that Gopher almost halves the accuracy gap from GPT-3 to human-expert performance and exceeds forecaster expectations.
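Since all three quantities sit on the same accuracy scale, "halving the gap" is simple arithmetic. The numbers below are placeholders chosen for illustration, not the paper's reported figures:

```python
# Hypothetical accuracies in percent; the paper reports the real MMLU figures.
gpt3, gopher, human_expert = 50.0, 72.0, 90.0

gap_before = human_expert - gpt3    # 40 points
gap_after = human_expert - gopher   # 18 points
closed = 100 * (1 - gap_after / gap_before)
print(f"gap closed: {closed:.0f}%")  # → gap closed: 55%
```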

That's a bold and interesting conclusion from the Google-owned AI research firm. DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and a research laboratory founded in September 2010.

OpenAI, DeepMind and Microsoft Research are three of my favorite groups of AI researchers to watch in 2022.

In the research paper, DeepMind draws a comparison between existing models and Gopher, concluding that Gopher lifts performance over current state-of-the-art language models on roughly 81% of tasks with comparable results. This holds notably in knowledge-intensive domains like fact-checking and general knowledge.

Gopher demonstrates improved language modelling on 11 of 19 tasks, in particular on books and articles.

According to them, Gopher showed the most uniform improvement across reading comprehension, humanities, ethics, STEM and medicine categories. It showed a general improvement in fact-checking.

This is good news for Alphabet, which has footed DeepMind's considerable yearly bill. If the lab wasn't profitable in 2021, I do expect it will be in 2022. Google likes the long game in A.I. development.

Join me on this journey of getting to know AI better in my Newsletter for such topics called AiSupremacy. https://aisupremacy.substack.com/p/coming-soon

Thanks for reading!

Michael Spencer

A.I. Writer, researcher and curator - full-time Newsletter publication manager.
