Many-Shot In-Context Learning
Today's paper explores many-shot in-context learning, where large language models (LLMs) are provided with hundreds or thousands of examples at inference time in order to learn new tasks. The authors leverage the recently expanded context windows of LLMs like Gemini 1.5 Pro to investigate performance gains from few-shot to many-shot learning across a wide range of tasks.
Overview
This paper tests many-shot in-context learning on a broad set of tasks. Because it can be hard to obtain many high-quality human-written examples for the context, the authors also propose Reinforced ICL and Unsupervised ICL. Reinforced ICL replaces human-written rationales with model-generated ones, which are filtered by checking their final answers for correctness. Unsupervised ICL goes further and includes only problems, rather than problem-solution pairs, in the prompt.
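To make the two strategies concrete, here is a minimal sketch in Python. Note that `generate_rationale` and `extract_answer` are hypothetical stand-ins for the model call and answer parsing, not functions from the paper, and the prompt template is an assumption.

```python
def reinforced_icl_examples(problems, answers, generate_rationale, extract_answer,
                            samples_per_problem=4):
    """Reinforced ICL sketch: sample model-generated rationales and keep only
    those whose extracted final answer matches the known correct answer."""
    examples = []
    for problem, gold in zip(problems, answers):
        for _ in range(samples_per_problem):
            rationale = generate_rationale(problem)   # zero- or few-shot model sample
            if extract_answer(rationale) == gold:     # filter on answer correctness
                examples.append((problem, rationale))
                break                                 # keep one correct rationale per problem (a simplification)
    return examples


def unsupervised_icl_prompt(problems, query):
    """Unsupervised ICL sketch: the prompt lists only problems, no solutions."""
    shots = "\n\n".join(f"Problem: {p}" for p in problems)
    return f"{shots}\n\nProblem: {query}\nSolution:"
```

Next, let's look at the tasks used for evaluation: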
- Machine translation: Using up to 997 translation pairs as in-context examples, performance improves by 4.5% on English to Kurdish and 1.5% on English to Tamil compared to 1-shot prompts.
- Summarization: Using up to 500 (news article, summary) pairs, performance approaches that of models fine-tuned on the XSum and XLSum datasets.
- Planning in the logistics domain: Success rate improves substantially with up to 800 in-context examples of planning problems and solutions.
- Learning code verifiers: Using up to 512 (problem, code solution) pairs labeled for correctness, the model becomes better at verifying code solutions (a prompt-assembly sketch follows this list).
- Problem-solving on MATH and GSM8K: Both Reinforced ICL (model-generated rationales) and Unsupervised ICL (problems only) outperform prompts built from human-written solutions.
- Question-answering on GPQA: Reinforced ICL matches the performance of state-of-the-art few-shot models.
- Algorithmic reasoning on BIG-Bench Hard: Reinforced ICL outperforms human-written chain-of-thought prompts on 8 challenging tasks.
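As a rough illustration of the code-verifier setup above, the sketch below assembles a many-shot prompt from labeled (problem, solution, verdict) triples; the template and function names are assumptions, not the paper's exact format.

```python
def verifier_prompt(labeled_examples, problem, candidate_solution):
    """Build a many-shot code-verification prompt.

    labeled_examples: list of (problem, solution, is_correct) triples,
    e.g. up to 512 of them as in the paper's experiments.
    """
    shots = []
    for prob, sol, ok in labeled_examples:
        verdict = "Correct" if ok else "Incorrect"
        shots.append(f"Problem: {prob}\nSolution:\n{sol}\nVerdict: {verdict}")
    # The model is asked to complete the verdict for the new candidate solution.
    query = f"Problem: {problem}\nSolution:\n{candidate_solution}\nVerdict:"
    return "\n\n".join(shots + [query])
```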
Key points
1) Many-shot learning leads to significant performance gains over few-shot learning across machine translation, summarization, planning, code verification, problem-solving, question-answering and algorithmic reasoning tasks.
2) With sufficient examples, many-shot learning can overcome pre-training biases and adapt to non-natural language tasks that are difficult for few-shot learning.
3) Performance is still sensitive to example ordering even with many shots.
Conclusion
This work thoroughly evaluates many-shot in-context learning across multiple tasks, showing that with enough in-context examples, large language models can become more versatile and adaptable without task-specific fine-tuning. For more information, please consult the full paper.
Congrats to the authors for their work!
Agarwal, Rishabh, et al. "Many-Shot In-Context Learning." arXiv preprint arXiv:2404.11018 (2024).