Meet LIMA: A New 65B Parameter LLaMa Model Fine-Tuned On 1000 Carefully Curated Prompts And Responses
SWARNODIP NAG
MCA @CU '26 | AI/ML Researcher | Data Scientist | Computer Vision | Python (Django) Developer | FSD Enthusiast | Graphic Designer | VITMEE '24 GMR: 23 | WBJECA '24 GMR: 55
By being pretrained to predict the next token at an astoundingly large scale, language models learn general-purpose representations that can transfer to nearly any language-understanding or generation task. To enable this transfer, a variety of language model alignment strategies have been proposed, chiefly instruction tuning over large datasets with millions of examples and, more recently, Reinforcement Learning from Human Feedback (RLHF) gathered over millions of interactions with human annotators. However, existing alignment techniques demand large compute and specialized data resources to reach ChatGPT-level performance.
The authors of a new paper show, however, that given a strong pretrained language model, very good performance can be obtained by fine-tuning on just 1,000 carefully chosen training examples. Their hypothesis is that alignment can be a quick, lightweight process in which the model learns the format or style of interacting with users, exposing the skills and knowledge it already acquired during pretraining. To test this idea, they collect 1,000 examples that resemble authentic user prompts paired with high-quality responses. They select 750 of the best questions and answers from online communities such as Stack Exchange and wikiHow, screening them for quality and diversity.
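As a rough illustration of what that kind of quality-and-diversity screening might look like, here is a minimal Python sketch. The field names (score, tags, title, accepted_answer), thresholds, and per-topic cap are illustrative assumptions, not the authors' actual pipeline.

```python
# A minimal sketch of quality/diversity filtering over community Q&A posts.
# Field names and thresholds are assumptions for illustration only.
def select_examples(posts, per_topic_limit=2, min_score=50):
    """Keep highly upvoted Q&A pairs, capping each topic to preserve diversity."""
    selected, per_topic = [], {}
    for post in sorted(posts, key=lambda p: p["score"], reverse=True):
        topic = post["tags"][0] if post["tags"] else "untagged"
        if post["score"] < min_score or per_topic.get(topic, 0) >= per_topic_limit:
            continue
        selected.append({"prompt": post["title"],
                         "response": post["accepted_answer"]})
        per_topic[topic] = per_topic.get(topic, 0) + 1
    return selected

# Illustrative placeholder posts; a real run would read a Stack Exchange dump.
posts = [
    {"score": 120, "tags": ["python"], "title": "How do I reverse a list?",
     "accepted_answer": "Use reversed() or slicing: my_list[::-1]."},
    {"score": 30, "tags": ["python"], "title": "What is a decorator?",
     "accepted_answer": "A callable that wraps another callable."},
]
print(select_examples(posts))  # keeps only the post with score >= 50
```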
They also manually compose 250 question-and-answer examples, emphasizing a consistent response style in the vein of an AI assistant and optimizing for task diversity. Researchers from Meta AI, Carnegie Mellon University, the University of Southern California, and Tel Aviv University train LIMA, a pretrained 65B-parameter LLaMa model fine-tuned on this collection of 1,000 examples, and compare it against contemporary language models and products on 300 challenging test prompts.
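For a concrete picture of the fine-tuning step, below is a minimal sketch using the Hugging Face transformers and datasets libraries. The checkpoint name, hyperparameters, and the curated_examples placeholder are assumptions; the paper fine-tunes the 65B LLaMa with its own training setup.

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder for the ~1,000 curated {"prompt", "response"} pairs described above.
curated_examples = [
    {"prompt": "How do I reverse a list in Python?",
     "response": "Use reversed() or slicing: my_list[::-1]."},
]

model_name = "huggyllama/llama-7b"  # stand-in; LIMA fine-tunes the 65B model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMa tokenizers ship without one
model = AutoModelForCausalLM.from_pretrained(model_name)

# Concatenate prompt and response into a single training sequence.
ds = (Dataset.from_list(curated_examples)
      .map(lambda ex: {"text": ex["prompt"] + "\n\n" + ex["response"]})
      .map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
           remove_columns=["prompt", "response", "text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lima-ft", num_train_epochs=15,
                           per_device_train_batch_size=1, learning_rate=1e-5),
    train_dataset=ds,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```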
In a study of human preference, LIMA surpasses OpenAI's RLHF-trained DaVinci003 as well as a 65B-parameter reproduction of Alpaca trained on 52,000 examples.
Although human annotators often prefer GPT-4, Claude, and Bard responses over LIMA's, this is far from universal: LIMA produces equivalent or preferable responses in 43%, 46%, and 58% of cases, respectively. Repeating the preference annotations with GPT-4 as the annotator confirms these findings. Evaluated on an absolute scale, 88% of LIMA's responses meet the prompt's requirements and 50% are rated excellent. Ablation experiments show significant gains from improving data quality but sharply diminishing returns from increasing data quantity without also increasing prompt diversity.
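To make those preference numbers concrete, here is a tiny sketch of how pairwise comparisons could be tallied into the "equivalent or preferable" rates quoted above. The judgments list is hypothetical placeholder data, not the authors' annotations.

```python
from collections import Counter

def equivalent_or_preferred(judgments, system="LIMA"):
    """Fraction of comparisons where `system` tied or won outright."""
    counts = Counter(judgments)  # e.g. Counter({"GPT-4": 57, "LIMA": 28, "tie": 15})
    return (counts[system] + counts["tie"]) / sum(counts.values())

# Hypothetical per-prompt verdicts from annotators comparing LIMA vs. GPT-4.
judgments = ["GPT-4"] * 57 + ["LIMA"] * 28 + ["tie"] * 15
print(f"{equivalent_or_preferred(judgments):.0%}")  # -> 43%
```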
Furthermore, they find that LIMA can carry on coherent multi-turn dialogue despite having seen no dialogue examples during fine-tuning, and adding just 30 hand-crafted dialogue chains to the training set improves this ability further. Overall, these striking results demonstrate the power of pretraining and its importance relative to large-scale instruction tuning and reinforcement learning approaches: a strong pretrained language model can be tuned to produce outstanding, competitive responses across a wide range of prompts using only 1,000 well-chosen examples.
There are, however, drawbacks to this strategy. First, the mental effort required to craft such examples is substantial and difficult to scale up. Second, while LIMA typically produces strong responses, an unlucky sample during decoding or an adversarial prompt can still yield a weak one; LIMA is less robust than product-grade models.
Nevertheless, the evidence presented in this work shows that the difficult problem of alignment can be addressed in a remarkably simple way.
Thanks for reading!