What Are the Emergent Abilities of Language Models?

Emergent Abilities of Language Models

If you enjoy articles about A.I. at the intersection of breaking news, join AiSupremacy here (follow the link below). I cannot continue to write without community support. For the price of a cup of coffee, join 91 other paying subscribers.

https://aisupremacy.substack.com/subscribe

AI scientists are studying the “emergent” abilities of large language models


Hey Guys,

I hope you had a good summer! A quick look at the subreddit for Stable Diffusion raises my eyebrows.

See my related Poll Post on this topic.

Recently we’ve seen some hype around the performance of larger fine-tuned language models. In recent years, scaling up the size of language models has been shown to be a reliable way to improve performance on a range of natural language processing (NLP) tasks.

I’ve been going crazy for Stable Diffusion, Midjourney and DALL-E 2 in terms of ease of text-to-image art, landscape and creative generation. Indeed, large language models (LLMs) have become the center of attention and hype because of their seemingly magical abilities to produce long stretches of coherent text, do things they weren’t trained on, and engage (to some extent) in topics of conversation that were thought to be off-limits for computers.

A LinkedIn News story has the ridiculous headline, AI is Getting Good and Fast. I think what the editor means is the emergent abilities of language models. He points to some vague NYT article.

But what are we actually talking about? Even as OpenAI’s DALL-E 2 becomes nearly obsolete with the rise of the open-source Stable Diffusion, the world of LLMs moves forward quickly.

The topic is fairly interesting with quite a few papers on the subject.

You are reading AI Supremacy, one of the fastest-growing AI newsletters born in 2022 on Substack. You can consider upgrading for access to more articles per month and access to locked archive posts.

Today’s language models at the scale of 100B or more parameters achieve strong performance on tasks like sentiment analysis and machine translation, even with few or no training examples. DeepMind seems bullish about this too, but then again Google sees it as its job to be bullish about A.I., a field where it ranks as a global leader.

A recent paper (June 2022) sheds some light on this: a new study by researchers at Google, Stanford University, DeepMind, and the University of North Carolina at Chapel Hill explores novel tasks that LLMs can accomplish as they grow larger and are trained on more data.

For me this is a rather important paper.

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.

Google is right about a few things: a previous paper had explored how, with chain-of-thought prompting, language models of sufficient scale (~100B parameters) can solve complex reasoning problems that are not solvable with standard prompting methods. According to the authors, this amounts to “basic reasoning.”



What is emergence?

This new study is focused on emergence in the sense that has long been discussed in domains such as physics, biology, and computer science.

To identify emergent abilities in large language models, the researchers looked for phase transitions, where below a certain threshold of scale, model performance is near-random, and beyond that threshold, performance is well above random.
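To make that concrete, here is a minimal sketch (my own, not from the paper) of how you might flag such a phase transition in an accuracy-versus-scale curve. All the numbers below are made up for illustration:

```python
import numpy as np

# Hypothetical scale/accuracy points: performance hovers near a
# random-guess baseline until some scale, then jumps well above it.
model_params = np.array([1e8, 1e9, 1e10, 1e11, 1e12])    # model sizes (made up)
accuracy     = np.array([0.25, 0.26, 0.27, 0.55, 0.72])  # task accuracy (made up)
random_baseline = 0.25   # e.g., 4-way multiple choice
margin = 0.10            # how far above random counts as "emergence"

above = accuracy > random_baseline + margin
if above.any():
    # First scale that clears the margin is the apparent threshold.
    threshold = model_params[np.argmax(above)]
    print(f"Emergence threshold: ~{threshold:.0e} parameters")
else:
    print("No emergent jump detected at these scales")
```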

I wonder as usual if the hype is justified. Certainly LLMs have been seductive for the public. Oddly, I noticed TechCrunch had actually covered this topic back in April. Could the emergent abilities of LLMs truly lead to an “a-ha” moment? There have been some rumblings from DeepMind that this is indeed the case.

“This distinguishes emergent abilities from abilities that smoothly improve with scale: it is much more difficult to predict when emergent abilities will arise,” Bommasani said.

But then again, it is sort of their job to be excited. The researchers cannot even agree on how to measure scale.

Scale can be measured in different ways, including computation (FLOPs), model size (number of parameters), or data size. In their study, the researchers focus on computation and model size, but stress that “there is not a single proxy that adequately captures all aspects of scale.”
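For intuition on the compute proxy, a common rule of thumb from the scaling-laws literature estimates training compute as roughly 6 × parameters × training tokens. Here is a quick back-of-the-envelope sketch, treating that formula as an approximation, not an exact measure:

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs (~6 * N * D rule of thumb)."""
    return 6.0 * n_params * n_tokens

# Example: a GPT-3-scale model (175B parameters, ~300B training tokens,
# per the reported figures).
flops = training_flops(175e9, 300e9)
print(f"~{flops:.2e} FLOPs")  # ~3.15e+23 FLOPs
```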

Whereas standard prompting asks the model to directly give the answer to a multi-step reasoning problem, chain-of-thought prompting induces the model to decompose the problem into intermediate reasoning steps, in this case leading to a correct final answer.
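Here is a toy illustration of the difference, in the style of the examples in the chain-of-thought paper. The ask_model stub is a hypothetical placeholder for whatever LLM API you would actually wire in:

```python
# Standard prompting: ask for the answer directly.
standard_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A:"
)

# Chain-of-thought prompting: the exemplar answer spells out the
# intermediate reasoning steps, nudging the model to do the same.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"
)

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in: connect this to your LLM API of choice."""
    raise NotImplementedError
```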



Large language models are an especially interesting case study because they have shown very clear signs of emergence. LLMs are very large transformer neural networks, often spanning hundreds of billions of parameters, trained on hundreds of gigabytes of text data.

Google will now be slowly rolling out public access to its controversial LaMDA bot. It does not help that a former employee claimed it is a “sentient AI”, an explosive accusation to make at Alphabet.

LLMs aren’t the holy grail, but in a world of headlines we can just pretend that they are. According to the researchers and Google, language models have revolutionized natural language processing (NLP) in recent years. It is now well-known that increasing the scale of language models (e.g., training compute, model parameters, etc.) can lead to better performance and sample efficiency on a range of downstream NLP tasks (Devlin et al., 2019; Brown et al., 2020, inter alia).

That being said, I personally hope that LaMDA 2, Google’s pride and joy, will be a good AI companion. OpenAI’s DALL-E 2 (backed by Microsoft) wasn’t the heir apparent to commercialize AI image generation that some imagined it would be, with so many more accessible and open-source competitors.

  1. Going back to the paper, in their study the researchers tested several popular LLM families, including LaMDA, GPT-3, Gopher, Chinchilla, and PaLM. They chose several tasks from BIG-Bench, a crowd-sourced benchmark of over 200 tasks “that are believed to be beyond the capabilities of current language models.”
  2. They also used challenges from TruthfulQA, Massive Multi-task Language Understanding (MMLU), and Word in Context (WiC), all benchmarks designed to test the limits of LLMs in tackling complicated language tasks (a minimal evaluation sketch follows this list).
  3. The researchers also took extra effort to test the LLMs on multi-step reasoning, instruction following, and multi-step computation.
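For readers curious what such an evaluation looks like mechanically, here is a minimal multiple-choice scoring loop in the spirit of MMLU or WiC. The items and the predict_choice function are hypothetical stand-ins; real harnesses typically score per-option log-likelihoods rather than free-form answers:

```python
# Toy MMLU-style items: a question, candidate choices, and the index
# of the correct choice. These two items are made up for illustration.
items = [
    {"question": "Which planet is closest to the Sun?",
     "choices": ["Venus", "Mercury", "Earth", "Mars"], "answer": 1},
    {"question": "What is 7 * 8?",
     "choices": ["54", "56", "64", "48"], "answer": 1},
]

def predict_choice(question: str, choices: list[str]) -> int:
    """Hypothetical model call: return the index of the chosen option."""
    raise NotImplementedError

def accuracy(items) -> float:
    """Fraction of items where the model picks the labeled answer."""
    correct = sum(predict_choice(it["question"], it["choices"]) == it["answer"]
                  for it in items)
    return correct / len(items)
```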



Is emergence at scale in LLM performance real?


The findings of the study show that scale is highly correlated with the emergence of new abilities. Each of the LLM families, which come in different sizes, shows random or below-random performance on these tasks below a certain size.

DeepMind asserted its own convictions.

Emergence is when quantitative changes in a system result in qualitative changes in behavior.

So when Meta and Tesla upgrade their supercomputers, they are on to this trend: once scale reaches a certain threshold, “emergent abilities” can appear.

The researchers explored emergence with respect to model scale, as measured by training compute and number of model parameters. Specifically, they define emergent abilities of large language models as abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models.

The implications are really fascinating.

Emergent abilities in LLMs seem to refer to a “sudden jump in accuracy” that then continues to improve as the model grows larger.


The data on this is really promising and I’m surprised this isn’t a bigger deal in the headlines.


“Large models are used for zero-shot scenarios or few-shot scenarios where little domain-tailored training data is available, and usually work okay generating something based on a few prompts,” says Fangzheng Xu, a Ph.D. student at Carnegie Mellon. This in itself has future implications for emergent abilities.
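If you haven’t seen the distinction, here is what zero-shot versus few-shot prompting looks like in practice, using toy sentiment strings of my own:

```python
# Zero-shot: the model gets only the task, no examples.
zero_shot = 'Review: "The film was a waste of two hours."\nSentiment:'

# Few-shot: a handful of labeled exemplars are prepended so the model
# can pick up the task format without any fine-tuning.
few_shot = (
    'Review: "An absolute delight from start to finish."\nSentiment: positive\n\n'
    'Review: "Clumsy plot and wooden acting."\nSentiment: negative\n\n'
    'Review: "The film was a waste of two hours."\nSentiment:'
)
```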

Importantly, the presence of emergent abilities in large language models also shows that we can’t predict the capabilities of LLMs by extrapolating the performance of smaller-scale models.
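A toy illustration of why that extrapolation fails: fit a smooth trend to small-model scores hovering near random, then compare the prediction with a (made-up) emergent jump at large scale. None of these numbers come from the paper:

```python
import numpy as np

# Small-model results: accuracy barely above a 0.25 random baseline.
small_params = np.array([1e8, 3e8, 1e9, 3e9])
small_acc    = np.array([0.25, 0.25, 0.26, 0.27])  # illustrative only

# Linear fit in log-parameter space, using the small models alone.
coeffs = np.polyfit(np.log10(small_params), small_acc, deg=1)
predicted_at_100b = np.polyval(coeffs, np.log10(1e11))

actual_at_100b = 0.60  # hypothetical emergent jump at 100B parameters
print(f"extrapolated: {predicted_at_100b:.2f}, actual: {actual_at_100b:.2f}")
# The smooth extrapolation lands near random; the emergent result does not.
```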

A huge caveat, of course, is that some studies show that when a neural network provides correct results, it is often mapping inputs to outputs without learning causal relations, common sense, or other knowledge underlying the learned skill.

Emergent performance is still very exciting, since deep learning’s results in the last decade have at times been less than stellar and progress rather slow.

What do you think?

Leave a comment

You are reading AI Supremacy, one of the fastest-growing AI newsletters born in 2022 on Substack. You can consider upgrading for access to more articles per month and access to locked archive posts.

Upgrade for Better Coverage

If you enjoy articles about A.I. at the intersection of breaking news, join AiSupremacy here (follow the link below). I cannot continue to write without community support. For the price of a cup of coffee, join 91 other paying subscribers.

https://aisupremacy.substack.com/subscribe
