What is DeepMind's Gopher?
Michael Spencer
A.I. Writer, researcher and curator - full-time Newsletter publication manager.
Join me on this journey in my newsletter on these topics, AiSupremacy. https://aisupremacy.substack.com/p/coming-soon
The way artificial intelligence language models bring AI a step closer to human understanding of language stirs the imagination, doesn't it? That's how I feel, at least, as we inch into 2022. AGI might not be achieved in our lifetime, but how we use artificial intelligence in society is changing the world.
DeepMind’s Work in NLP and Gopher
Google’s DeepMind is behind some of the most impressive AI breakthroughs and headline-grabbing advances in the field over the last decade. In recent years, Microsoft-backed OpenAI has stolen some of the limelight.
Not to be outdone, DeepMind recently trained Gopher, a 280-billion-parameter AI language model. You can read DeepMind’s blog post of December 8th, 2021 here.
Language, and its role in demonstrating and facilitating comprehension - or intelligence - is a fundamental part of being human. That artificial intelligence has arrived at such a moment in its awakening to language-related tasks is incredible.
DeepMind and OpenAI both claim to have relevance to the future of AGI, or artificial general intelligence. While that’s debatable, what they are doing with language models is impressive.
As part of a broader portfolio of AI research, firms are building GPT-3-like models, and GPT-4 should be announced soon, perhaps in early 2023. So what is Gopher? You can read its academic paper here.
In the quest to explore language models and develop new ones, DeepMind trained a series of transformer language models of different sizes, ranging from 44 million parameters to 280 billion parameters (the largest model they named Gopher).
Above: Performance on the Massive Multitask Language Understanding (MMLU) benchmark broken down by category. Gopher improves upon prior work across several categories.
The model and several experiments were described in a paper published on arXiv. As part of their research effort in general AI, the DeepMind team trained Gopher and several smaller models to explore the strengths and weaknesses of large language models (LLMs).
Given DeepMind’s unparalleled history of AI developments, it was surprising that the lab hadn’t yet made an appearance in the flourishing area of large language models (LLMs). Some people missed the news about Gopher in December 2021.
So why is it important?
In particular, the researchers identified tasks where increased model scale led to improved accuracy, such as reading comprehension and fact-checking, as well as those where it did not, such as logical and mathematical reasoning.
The team evaluated Gopher on a large number of NLP benchmarks, including Massive Multitask Language Understanding (MMLU) and BIG-bench, and compared its performance to several baseline models such as GPT-3, noting a general trend: Gopher showed consistent improvement on knowledge-intensive tasks, but less on reasoning-intensive ones.
Bigger is not always Better
Gopher, like GPT-3, is an autoregressive, transformer-based dense LLM: in essence, it predicts the next word given a text history.
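To make "autoregressive" concrete, here is a minimal, hypothetical sketch of greedy next-token generation in Python. The `toy_logits` function is a hard-coded stand-in invented purely for illustration; in a real LLM like Gopher, a transformer forward pass produces a score for every token in a vocabulary of tens of thousands at each step.

```python
# Toy sketch of autoregressive generation (not DeepMind's code).
# A real model replaces toy_logits with a transformer forward pass.

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def toy_logits(history):
    """Hypothetical stand-in for a model: score every vocabulary
    token given the text history (here, only the last word matters)."""
    follows = {"the": "cat", "cat": "sat", "sat": "on",
               "on": "mat", "mat": "<eos>"}
    scores = {tok: 0.0 for tok in VOCAB}
    prev = history[-1]
    if prev in follows:
        scores[follows[prev]] = 1.0  # strongly prefer one continuation
    return scores

def generate(prompt, max_tokens=10):
    """Autoregressive loop: repeatedly append the highest-scoring
    next token (greedy decoding) until <eos> or the length limit."""
    history = list(prompt)
    for _ in range(max_tokens):
        scores = toy_logits(history)
        next_tok = max(scores, key=scores.get)
        history.append(next_tok)
        if next_tok == "<eos>":
            break
    return history

print(generate(["the"]))  # → ['the', 'cat', 'sat', 'on', 'mat', '<eos>']
```

The loop is the essential idea: each predicted token is fed back in as part of the history for the next prediction, which is what "autoregressive" means.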
According to DeepMind’s research results, the capabilities of Gopher exceed those of existing language models on a number of key tasks. This includes the Massive Multitask Language Understanding (MMLU) benchmark, where Gopher demonstrates a significant advancement towards human expert performance over prior work.
2021 was a pivotal year for advances in A.I. and a transformational one for large language models, and the pace is only intensifying.
As always, it’s interesting to read the comments on Hacker News and Reddit on such announcements for further insights.
Gopher's ranking on several NLP benchmarks can be found on the Papers with Code website.
Gopher, GLaM Vs Other Large Language Models
DeepMind’s research went on to say that Gopher almost halves the accuracy gap from GPT-3 to human expert performance and exceeds forecaster expectations.
That’s a bold and interesting conclusion from the Alphabet-owned AI research firm. DeepMind Technologies is a British artificial intelligence research laboratory founded in September 2010 and a subsidiary of Alphabet Inc.
OpenAI, DeepMind and Microsoft Research are three of my favorite groups of AI researchers to watch in 2022.
In the research paper, DeepMind draws a comparison between existing models and Gopher. DeepMind concluded that Gopher lifts performance over current state-of-the-art language models on roughly 81% of the tasks with comparable results, most notably in knowledge-intensive domains like fact-checking and general knowledge.
Gopher demonstrates improved language modelling on 11 of 19 tasks, in particular on books and articles.
According to DeepMind, Gopher showed the most uniform improvement across the reading comprehension, humanities, ethics, STEM and medicine categories, along with a general improvement in fact-checking.
This is good news for Alphabet, which has footed DeepMind’s bill at considerable yearly cost. If it wasn’t profitable in 2021, I expect it will be in 2022. Google plays the long game in A.I. development.
Join me on this journey of getting to know AI better in my newsletter on these topics, AiSupremacy. https://aisupremacy.substack.com/p/coming-soon
Thanks for reading!