LLM Finetuning

Excited to finish the “Fine Tuning Large Language Models” course by DeepLearning.AI over this weekend (a new ritual I am trying to follow every weekend).

At a high level, you will learn the following:

  • Learn the fundamentals of fine-tuning a large language model (LLM).
  • Understand how fine-tuning differs from prompt engineering, and when to use both.
  • Get practical experience with real datasets, and learn how to apply these techniques to your own projects.

What is Finetuning?

Fine-tuning adapts a base model, pre-trained on general-domain data, to a specific task.

Fine-Tuning (FT) is a common approach for adaptation. During fine-tuning, the model is initialized to the pre-trained weights and biases, and all model parameters undergo gradient updates. A simpler variant is to update only some layers while freezing the others; one such baseline, reported in prior work (Li & Liang, 2021) on GPT-2, adapts just the last two layers (sketched below).
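
Here is a minimal sketch of that partial fine-tuning baseline, assuming PyTorch and the Hugging Face transformers library with GPT-2 as the base model (an illustration, not the exact setup from the paper):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Freeze every parameter first (no gradient updates anywhere).
    for param in model.parameters():
        param.requires_grad = False

    # Unfreeze only the last two transformer blocks, mirroring the
    # "adapt just the last two layers" baseline mentioned above.
    for block in model.transformer.h[-2:]:
        for param in block.parameters():
            param.requires_grad = True

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Training {trainable:,} of {total:,} parameters")

Any standard training loop (or the transformers Trainer) will then update only the unfrozen blocks.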

For example, Code Llama has multiple versions based on different types of domain-specific training (fine-tuning):

  • Foundation models (Code Llama)
  • Python specializations (Code Llama - Python), and
  • Instruction-following models (Code Llama - Instruct) with 7B, 13B, 34B and 70B parameters each.

The following diagram shows how each of the Code Llama models is trained:

(Fig: The Code Llama specialization pipeline; the different stages of fine-tuning are annotated with the number of tokens seen during training. From Llama.meta.com.)

So here is what you need to get started on fine-tuning an LLM for a specific task:

  • A base/foundation model that performs well on the general domain
  • A domain-specific training dataset (the course shares some best practices to consider when applying fine-tuning)
  • Enough memory to load the model, plus compute resources for the training itself (see the rough estimate below)
  • Evaluation of the fine-tuned model, followed by further iteration
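
On the memory point, a useful back-of-the-envelope estimate (an illustrative rule of thumb, not a precise formula) is simply parameter count times bytes per parameter:

    def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
        # Memory for the weights alone; gradients, optimizer states, and
        # activations add substantially more during training.
        return num_params * bytes_per_param / 1e9

    # A 7B-parameter model in fp16/bf16 (2 bytes per parameter):
    print(f"{weight_memory_gb(7e9):.1f} GB")  # ~14.0 GB just to load the weights

Full fine-tuning with an optimizer such as Adam can need roughly three to four times that, since gradients and two optimizer states are stored per trainable parameter.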

Practical approach to fine-tuning:

Although there are several fine-tuning techniques, Parameter-Efficient Fine-Tuning (PEFT) allows one to fine-tune models with minimal resources and cost.

How is fine-tuning different from RAG and prompt engineering?

I recommend this Google AI blog post, which explains the major differences between fine-tuning, RAG, and prompt engineering.

Although any of these paths can get you to the goal, the availability of resources and differing talent requirements can influence the decision.

(Fig: Comparing prompt engineering, RAG, and fine-tuning.)

Parameter-Efficient Fine-Tuning (PEFT):

Parameter-Efficient Fine-Tuning (PEFT) is a family of techniques for fine-tuning pre-trained language models (PLMs) on downstream natural language processing (NLP) tasks while reducing the number of trainable parameters and the computation required. PEFT aims to overcome the limitations of traditional fine-tuning, which demands large amounts of data, computation, and memory.

PEFT methods freeze most (or all) of the pre-trained parameters and train only a small number of additional or carefully selected parameters. This reduces the number of parameters that need to be updated, resulting in faster fine-tuning and lower compute and memory requirements.
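
As a concrete sketch, here is what this looks like with the Hugging Face peft library, using LoRA (described in the next section) as the PEFT method; the base model and hyperparameter values here are illustrative assumptions:

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base = AutoModelForCausalLM.from_pretrained("gpt2")

    config = LoraConfig(
        r=8,                        # rank of the low-rank update matrices
        lora_alpha=16,              # scaling factor applied to the update
        target_modules=["c_attn"],  # GPT-2's fused attention projection
        lora_dropout=0.05,
        task_type=TaskType.CAUSAL_LM,
    )

    model = get_peft_model(base, config)  # base weights are frozen automatically
    model.print_trainable_parameters()    # trainable params are a tiny fraction of the total

The wrapped model trains and saves only the small injected matrices, which is what makes this approach cheap in both compute and storage.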

LoRA: Low-Rank Adaptation of Large Language Models

LoRA is a method for fine-tuning large language models (LLMs) efficiently. It freezes the pre-trained weights of the LLM and injects trainable rank-decomposition matrices into each layer of the Transformer architecture, drastically reducing the number of trainable parameters.
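
To make the idea concrete, here is a toy NumPy illustration (the dimensions and rank are assumed for illustration): rather than updating the full weight matrix W, LoRA learns a low-rank update B @ A and uses W + B @ A in the forward pass.

    import numpy as np

    d, k, r = 768, 768, 8             # layer dimensions and LoRA rank (assumed)
    W = np.random.randn(d, k)         # frozen pre-trained weight, never updated
    A = np.random.randn(r, k) * 0.01  # trainable, r x k
    B = np.zeros((d, r))              # trainable, d x r; zero-initialized so the
                                      # update starts as a no-op

    x = np.random.randn(k)
    y = (W + B @ A) @ x               # forward pass with the low-rank update

    full = W.size                     # parameters a full update would train
    lora = A.size + B.size            # parameters LoRA actually trains
    print(f"LoRA trains {lora:,} vs {full:,} parameters ({lora / full:.1%})")

With rank r = 8 on a 768 x 768 weight, LoRA trains roughly 2% of the parameters a full update would touch.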

There is an exciting Kaggle competition, LLM Prompt Recovery, which relates to fine-tuning with LoRA. Do check it out, and let me know if you would like to collaborate on a solution.

References and Further reading:

  • Li, X. L., & Liang, P. (2021). Prefix-Tuning: Optimizing Continuous Prompts for Generation. arXiv:2101.00190.
  • Hu, E. J., et al. (2021). LoRA: Low-Rank Adaptation of Large Language Models. arXiv:2106.09685.