Tuning LLMs: a galaxy of endless possibilities
Cristiano De Nobili, PhD
Physicist |↑↓⟩ | Lead AI Scientist | Lecturer & Speaker
This is the 4th article of Beyond Entropy, a space where the chaos of the future, the speed of emerging technologies and the explosion of opportunities are slowed down, allowing you to turn (qu)bits into dreams.
Since LLMs have been around, the number of new techniques in the world of AI has been growing almost as quickly as the number of new galaxies made visible by the James Webb Telescope. It is easy to lose the thread, miss the newest methods and get confused in a universe of new names.
The purpose of this article is to shed light on some new techniques whose names sound deceptively similar to those of already established methods.
The outline will be:
1. Delving into Instruction Tuning
2. Delving into Prompt Tuning
Let’s start!
1. Delving into Instruction Tuning
Instruction Tuning is not Fine Tuning! It is the next step forward. Before diving into it, let us recap some basics.
Adaptive Tuning and Fine Tuning
Starting from a pre-trained Large Language Model, we say that we perform fine tuning when we want to train it on a specific task (and therefore on specific data) in a supervised setting. For instance, we might fine tune a generic language model to distinguish ironic and sarcastic sentences.
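As a minimal sketch of this standard supervised step (assuming a Hugging Face style workflow; the checkpoint, dataset name and hyperparameters below are purely illustrative, not the only choice):

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Any labelled irony/sarcasm corpus would do; "tweet_eval/irony" is one public option.
dataset = load_dataset("tweet_eval", "irony")

checkpoint = "distilbert-base-uncased"  # stand-in for "a generic pre-trained model"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="irony-ft", num_train_epochs=3,
                           per_device_train_batch_size=16, learning_rate=2e-5),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
)
trainer.train()  # supervised fine tuning: all weights are updated on the task data
```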
Sometimes, before fine tuning on task-specific data, the model might undergo an additional pre-training phase on a large corpus of domain-specific text, helping it adapt to the vocabulary and style of the target domain (this is called Domain Adaptive Tuning). In the previous example, we could adaptively tune our model on a huge collection of comic books.
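To make the idea concrete, here is a bare-bones sketch of domain-adaptive tuning, assuming a masked-language-model checkpoint and a hypothetical plain-text file comic_books.txt holding the domain corpus:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Hypothetical domain corpus: one plain-text file of comic-book dialogue.
corpus = load_dataset("text", data_files={"train": "comic_books.txt"})

checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Same self-supervised objective as pre-training, just on domain-specific text.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-adapted", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()  # afterwards, fine tune this adapted model on the task-specific data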
To recap, the standard procedure is: pre-training on a large generic corpus, then (optionally) domain adaptive tuning on domain-specific text, and finally fine tuning on task-specific labelled data.
Instruction Tuning
Given a LLM, the problem we want to solve is:
How to turn a vanilla next-word predictor to a model that generates a good answer given a prompt?
Mathematically speaking, starting from a model trained on the basic language-modelling objective P(word | context), how can we transform it into a model that, given a task/prompt, optimizes P(answer | prompt)?
This is where Instruction Tuning comes in. It bridges the gap between the next-word prediction objective and the user's wish for the model to follow specific instructions. This is achieved by fine tuning an LLM in the standard supervised way on pairs of instructions and their corresponding outputs, where each data sample may carry a different instruction/task from the next.
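A minimal sketch of the data side, assuming a small causal LM as a stand-in and an Alpaca-style template (the samples, template and checkpoint are illustrative only):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # small stand-in for a real LLM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Each sample carries its own instruction: mixing many tasks is the whole point.
samples = [
    {"instruction": "Translate to French: I love astronomy.",
     "output": "J'adore l'astronomie."},
    {"instruction": "Is this sentence ironic? 'Oh great, another Monday.'",
     "output": "Yes, it is ironic."},
]

def format_sample(s):
    # Illustrative Alpaca-style template; real datasets use richer variants.
    return f"### Instruction:\n{s['instruction']}\n\n### Response:\n{s['output']}"

batch = tokenizer([format_sample(s) for s in samples],
                  return_tensors="pt", padding=True, truncation=True)

# Plain next-token loss over the formatted pairs: instruction tuning is
# ordinary supervised fine tuning on (instruction, answer) text.
# (In practice, padding and prompt tokens are usually masked out with -100.)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
```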
Instruction Tuning vs Fine Tuning
There are several other methods for adapting LLMs to new tasks, such as zero-shot prompting, few-shot prompting and supervised (standard) fine tuning. Zero-shot and few-shot prompting are simple, fast and inexpensive ways to adapt an LLM to a new task. Supervised fine tuning of an LLM on a specific task is accurate, but time-consuming and expensive.
While standard fine tuning focuses on a specific task (e.g. NER), instruction tuning can train the model simultaneously on many tasks, as mentioned before.
If you wish to better grasp instruction tuning, you can find more in this blog post by Sebastian Ruder. To get some practice, I found this notebook by Philipp Schmid where LLaMA 2 is instruction tuned. If you are looking instead for a collection of open-source instruction tuning datasets, check out this GitHub repository (text and multimodal sets).
2. Delving into Prompt Tuning
Prompt Tuning is not Prompt Engineering! It's the next step forward. To avoid confusion, let's dive in to demystify these concepts.
The Craft of Prompt Engineering
Prompt Engineering is the widely known art of designing hard prompts by hand, i.e. optimal static prompts that serve to instantiate an LLM and direct its behaviour towards the desired results without further training the model (i.e. without updating its weights). It is sometimes referred to as in-context prompting.
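For instance, a hand-crafted few-shot (hard) prompt for the irony task could look like the sketch below; the checkpoint is just a placeholder and no weights are updated at any point:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # placeholder; in practice a much larger instruction-following LLM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# A static, hand-written (hard) prompt: pure in-context prompting, no training step.
hard_prompt = (
    "Classify each sentence as ironic or literal.\n"
    "Sentence: 'Oh great, another Monday.' -> ironic\n"
    "Sentence: 'The meeting starts at 9 am.' -> literal\n"
    "Sentence: 'I just love being stuck in traffic.' ->"
)

inputs = tokenizer(hard_prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=3)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```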
Despite the enormous success of Prompt Engineering, it has several limitations. Crafting the perfect prompt demands a lot of trial and error. In addition, it is limited by the context window.
Introducing Prompt Tuning
This is where Prompt Tuning comes in, introducing soft prompts (original paper). These are additional trainable tokens that are prepended to the hard prompt. By training these additional parameters in a supervised way on a given downstream task, we tailor the model's response more precisely. The advantage of Prompt Tuning over standard fine tuning is that only a few new embedding weights are trained, instead of the entire pre-trained LLM. For these reasons, soft prompts are usually referred to as AI-designed prompts.
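Here is a bare-bones sketch of the idea, assuming a frozen causal LM and a handful of trainable virtual-token embeddings (the checkpoint, example text and hyperparameters are illustrative; libraries such as Hugging Face peft ship a ready-made implementation):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Freeze the whole pre-trained LLM: only the soft prompt will be trained.
for p in model.parameters():
    p.requires_grad = False

num_virtual_tokens = 20
embed_dim = model.get_input_embeddings().embedding_dim
# The soft prompt: a small matrix of trainable "virtual token" embeddings.
soft_prompt = torch.nn.Parameter(torch.randn(num_virtual_tokens, embed_dim) * 0.02)

def forward_with_soft_prompt(input_ids, labels):
    token_embeds = model.get_input_embeddings()(input_ids)          # (B, T, D)
    prompt = soft_prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
    inputs_embeds = torch.cat([prompt, token_embeds], dim=1)        # prepend soft prompt
    # Ignore the loss on the virtual-token positions.
    pad = torch.full((input_ids.size(0), num_virtual_tokens), -100)
    full_labels = torch.cat([pad, labels], dim=1)
    return model(inputs_embeds=inputs_embeds, labels=full_labels)

optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)
batch = tokenizer(["The review 'what a waste of time' is negative."], return_tensors="pt")
out = forward_with_soft_prompt(batch["input_ids"], batch["input_ids"].clone())
out.loss.backward()
optimizer.step()  # only the soft-prompt weights move; the LLM stays untouched
```

Only the num_virtual_tokens × embed_dim soft-prompt weights are updated, which is why prompt tuning scales so cheaply across many downstream tasks.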
The intention of this article was not to list all the possible ways of fine-tuning an LLM, but to highlight some of them and show how they differ. Going back to the opening words, there is a galaxy of different techniques and certainly a universe of many more yet to be discovered.
Opportunities, talks, and events
I share some opportunities from my network that you might find interesting:
Job opportunity:
Research opportunity:
Learning opportunities: