Tuning LLMs: a galaxy of endless possibilities
Cristiano De Nobili, PhD
Physicist |↑↓⟩ | Lead AI Scientist | Lecturer & Speaker
This is the 4th article of Beyond Entropy, a space where the chaos of the future, the speed of emerging technologies and the explosion of opportunities are slowed down, allowing you to turn (qu)bits into dreams.
Since LLMs have been around, the number of new techniques in the world of AI has been growing almost as quickly as the number of new galaxies made visible by the James Webb Telescope. It is easy to lose the thread, miss the newest methods and get confused in a universe of new names.
The purpose of this article is to shed light on some new techniques whose names sound deceptively similar to those of already established methods.
The outline will be:
1. Delving into Instruction Tuning
2. Delving into Prompt Tuning
Let’s start!
1. Delving into Instruction Tuning
Instruction Tuning is not Fine Tuning! It is the next step forward. Before diving into it, let us recap some basics.
Adaptive Tuning and Fine Tuning
Starting from a pre-trained Large Language Model, we say that we perform fine tuning when we want to train it on a specific task (and therefore on specific data) in a supervised setting. For instance, we might fine tune a generic language model to distinguish ironic and sarcastic sentences.
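As a minimal sketch of this standard supervised step (assuming a Hugging Face style workflow; the checkpoint, dataset name and hyperparameters below are purely illustrative, not the only choice):

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Any labelled irony/sarcasm corpus would do; "tweet_eval/irony" is one public option.
dataset = load_dataset("tweet_eval", "irony")

checkpoint = "distilbert-base-uncased"  # stand-in for "a generic pre-trained model"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="irony-ft", num_train_epochs=3,
                           per_device_train_batch_size=16, learning_rate=2e-5),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
)
trainer.train()  # supervised fine tuning: all weights are updated on the task data
```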
Sometimes, before fine tuning on task-specific data, the model might undergo an additional pre-training phase on a large corpus of domain-specific text, helping it adapt to the vocabulary and style of the target domain (this is called Domain Adaptive Tuning). In the previous example, we could adaptively tune our model on a huge collection of comic books.
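To make the idea concrete, here is a bare-bones sketch of domain-adaptive tuning, assuming a masked-language-model checkpoint and a hypothetical plain-text file comic_books.txt holding the domain corpus:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Hypothetical domain corpus: one plain-text file of comic-book dialogue.
corpus = load_dataset("text", data_files={"train": "comic_books.txt"})

checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Same self-supervised objective as pre-training, just on domain-specific text.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-adapted", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()  # afterwards, fine tune this adapted model on the task-specific data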
To recap, the standard procedure is: pre-training on a large generic corpus, then (optionally) domain adaptive tuning on domain-specific text, and finally fine tuning on task-specific labelled data.
Instruction Tuning
Given a LLM, the problem we want to solve is:
How to turn a vanilla next-word predictor to a model that generates a good answer given a prompt?
Mathematically speaking, starting from a model trained on the basic language-modelling objective P(word | context), how can we transform it into a model that, given a task/prompt, optimizes P(answer | prompt)?
This is where Instruction Tuning comes in. It bridges the gap between the next-word prediction objective and the user's wish for the model to follow specific instructions. This is achieved by fine tuning an LLM in the standard supervised way on pairs of instructions and their corresponding outputs, where each data sample may carry a different instruction/task from the next.
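A minimal sketch of the data side, assuming a small causal LM as a stand-in and an Alpaca-style template (the samples, template and checkpoint are illustrative only):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # small stand-in for a real LLM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Each sample carries its own instruction: mixing many tasks is the whole point.
samples = [
    {"instruction": "Translate to French: I love astronomy.",
     "output": "J'adore l'astronomie."},
    {"instruction": "Is this sentence ironic? 'Oh great, another Monday.'",
     "output": "Yes, it is ironic."},
]

def format_sample(s):
    # Illustrative Alpaca-style template; real datasets use richer variants.
    return f"### Instruction:\n{s['instruction']}\n\n### Response:\n{s['output']}"

batch = tokenizer([format_sample(s) for s in samples],
                  return_tensors="pt", padding=True, truncation=True)

# Plain next-token loss over the formatted pairs: instruction tuning is
# ordinary supervised fine tuning on (instruction, answer) text.
# (In practice, padding and prompt tokens are usually masked out with -100.)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
```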
Instruction Tuning vs Fine Tuning
There are several other methods for adapting LLMs to new tasks, such as zero-shot prompting, few-shot prompting and supervised (standard) fine tuning. Zero-shot and few-shot prompting are simple, fast and inexpensive ways to adapt an LLM to a new task. Supervised fine tuning of an LLM on a specific task is accurate, but time-consuming and expensive.
While standard fine tuning focuses on a specific task (e.g. NER), instruction tuning can train the model simultaneously on many tasks, as mentioned before.
If you wish to better grasp instruction tuning, you can find more in this blog post by Sebastian Ruder. To get some practice, I found this notebook by Philipp Schmid where LLaMA 2 is instruction tuned. If you are looking instead for a collection of open-source instruction tuning datasets, check out this GitHub repository (text and multimodal sets).
2. Delving into Prompt Tuning
Prompt Tuning is not Prompt Engineering! It's the next step forward. To avoid confusion, let's dive in to demystify these concepts.
The Craft of Prompt Engineering
Prompt Engineering is the widely known art of designing hard prompts by hand, i.e. optimal static prompts that serve to instantiate an LLM and direct its behaviour towards the desired results without further training the model (i.e. without updating its weights). It is sometimes referred to as in-context prompting.
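For instance, a hand-crafted few-shot (hard) prompt for the irony task could look like the sketch below; the checkpoint is just a placeholder and no weights are updated at any point:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # placeholder; in practice a much larger instruction-following LLM
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# A static, hand-written (hard) prompt: pure in-context prompting, no training step.
hard_prompt = (
    "Classify each sentence as ironic or literal.\n"
    "Sentence: 'Oh great, another Monday.' -> ironic\n"
    "Sentence: 'The meeting starts at 9 am.' -> literal\n"
    "Sentence: 'I just love being stuck in traffic.' ->"
)

inputs = tokenizer(hard_prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=3)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```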
Despite the enormous success of Prompt Engineering, it has several limitations. Crafting the perfect prompt demands a lot of trial and error. In addition, it is limited by the context window.
Introducing Prompt Tuning
This is where Prompt Tuning comes in, introducing soft prompts (original paper). These are additional trainable tokens that are prepended to the hard prompt. By training these additional parameters in a supervised way on a given downstream task, we tailor the model's response more precisely. The advantage of Prompt Tuning over standard fine tuning is that only a few new embedding weights are trained, instead of the entire pre-trained LLM. For these reasons, soft prompts are usually referred to as AI-designed prompts.
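Here is a bare-bones sketch of the idea, assuming a frozen causal LM and a handful of trainable virtual-token embeddings (the checkpoint, example text and hyperparameters are illustrative; libraries such as Hugging Face peft ship a ready-made implementation):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "gpt2"  # stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Freeze the whole pre-trained LLM: only the soft prompt will be trained.
for p in model.parameters():
    p.requires_grad = False

num_virtual_tokens = 20
embed_dim = model.get_input_embeddings().embedding_dim
# The soft prompt: a small matrix of trainable "virtual token" embeddings.
soft_prompt = torch.nn.Parameter(torch.randn(num_virtual_tokens, embed_dim) * 0.02)

def forward_with_soft_prompt(input_ids, labels):
    token_embeds = model.get_input_embeddings()(input_ids)          # (B, T, D)
    prompt = soft_prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
    inputs_embeds = torch.cat([prompt, token_embeds], dim=1)        # prepend soft prompt
    # Ignore the loss on the virtual-token positions.
    pad = torch.full((input_ids.size(0), num_virtual_tokens), -100)
    full_labels = torch.cat([pad, labels], dim=1)
    return model(inputs_embeds=inputs_embeds, labels=full_labels)

optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)
batch = tokenizer(["The review 'what a waste of time' is negative."], return_tensors="pt")
out = forward_with_soft_prompt(batch["input_ids"], batch["input_ids"].clone())
out.loss.backward()
optimizer.step()  # only the soft-prompt weights move; the LLM stays untouched
```

Only the num_virtual_tokens × embed_dim soft-prompt weights are updated, which is why prompt tuning scales so cheaply across many downstream tasks.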
The intention of this article was not to list all the possible ways of fine-tuning an LLM, but to highlight some of them and show how they differ. Going back to the opening words, there is a galaxy of different techniques and certainly a universe of many more yet to be discovered.
Opportunities, talks, and events
I share some opportunities from my network that you might find interesting:
Job opportunity:
Research opportunity:
Learning opportunities: