Maximizing Effectiveness of Large Language Models (LLMs): Fine-Tuning Methods

In our previous article, we explored various techniques for enhancing prompt engineering for Large Language Models (LLMs), such as RAG, CoT, and Prompt Tuning. In this article, we will understand the concept of fine-tuning.

Before diving into this article, it's recommended to read the earlier articles in this series to better grasp the content.


What is Fine-tuning?

Fine-tuning is a form of transfer learning (reusing a model trained on one problem to solve another): the parameters of a pre-trained model are adjusted for a specific task.

Now that we've explored Large Language Models (LLMs) through a question-and-answer approach, let's apply the same format to understand fine-tuning.


1. When to use fine-tuning?

Despite the advancements in prompt engineering techniques, challenges like hallucination persist because base models are trained on broad data, whereas specific tasks may require more targeted responses. We may also want the model to internalize information once rather than fetch it repeatedly. Moreover, sending private data to the LLM with every query raises privacy concerns, which leads to a need for data-protection measures.

Fine-tuning can mitigate these issues by potentially reducing the cost per query and offering better control over LLM behaviour. Developing a custom LLM from scratch is expensive and demands a large amount of data, making fine-tuning a more feasible solution. Hence, fine-tuning enables the model to align with specialized information and desired behaviours.


2. What are different methods of fine-tuning?

  • Unsupervised Fine-Tuning - It involves using unlabeled data or a knowledge base to refine the model, leading to text generation that remains consistent with the pre-trained model's style and content.

A knowledge base refers to information pertaining to a product, service, department, or topic.

  • Supervised Fine-Tuning (SFT) - Uses supervised learning, which involves labeled data, to refine the model for specific tasks such as text classification or sentiment analysis.
  • RLHF - Reinforcement Learning from Human Feedback refines the LLM using human feedback, typically ratings of the model's generated responses. While RLHF can be complex and costly, it proves effective for fine-tuning an LLM when there is a lack of sufficient labeled data.
  • Domain-specific fine-tuning involves refining the model using text specific to a particular industry or domain. This process aims to enhance the model's understanding of context and knowledge relevant to tasks within that specific domain, such as the medical field, finance, and others.
  • Instruction fine-tuning - Along with the training data, instructions or prompts with guidelines (for example, "Summarize the text") are included so that the LLM learns to generate output that follows the instruction (see the example record after this list).
  • Full fine-tuning - It involves updating all trainable parameters of the model based on the new dataset.

This process demands significant computation and memory to manage and process weights, gradients, optimizer states, and other components of the LLM. Another drawback is catastrophic forgetting, wherein the model tends to forget previously learned information while adapting to the new dataset.

  • Parameter Efficient Fine-Tuning (PEFT) - Differs from full fine-tuning by updating only a small subset of parameters while keeping the others frozen. This allows fine-tuning with far fewer trainable parameters, helping to mitigate the drawbacks associated with full fine-tuning.
  • Retrieval-Augmented Fine-Tuning (RAFT) - A fairly new technique, introduced by researchers at UC Berkeley in collaboration with Microsoft, that addresses the limitations of both fine-tuning (approximation and hallucination) and RAG (retrieval issues). The model receives domain-specific fine-tuning before being used with RAG; this preparatory step helps it retrieve and use relevant data effectively, thus enhancing its performance.
  • Reasoning with Reinforced Fine-Tuning (ReFT) - Aims to enhance the reasoning capabilities of LLMs by enabling the model to learn from multiple reasoning paths, unlike approaches such as CoT (Chain-of-Thought) prompting, which typically follow a single path. In ReFT, the LLM first undergoes supervised fine-tuning (SFT), followed by online reinforcement learning (RL). During RL, numerous reasoning paths are automatically sampled for each question, and rewards derived from the ground-truth answers drive the model's improvement in reasoning.
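
To make instruction fine-tuning concrete, here is a hedged sketch of what a single training record might look like. The field names (instruction, input, output) follow the popular Alpaca-style convention and are an assumption for illustration, not a fixed standard:

```python
# One Alpaca-style instruction-tuning record (field names follow a common
# convention, not a universal standard).
record = {
    "instruction": "Summarize the text.",
    "input": "Fine-tuning adjusts a pre-trained model's parameters "
             "for a specific task using a smaller, task-specific dataset.",
    "output": "Fine-tuning adapts a pre-trained model to a specific task "
              "using a small dataset.",
}

# During supervised fine-tuning, the instruction and input form the prompt,
# and the model is trained to produce the output as the completion.
prompt = f"{record['instruction']}\n\n{record['input']}"
target = record["output"]
print(prompt, "\n---\n", target)
```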


3. How can we categorize Parameter Efficient Fine-Tuning (PEFT)?

PEFT (Parameter Efficient Fine-Tuning) has gained popularity due to its significant advantages, notably its low computational and data requirements, so understanding it better is important. PEFT methods fall into three broad categories:

  • Selective - Fine-tunes only a selected subset of the LLM's existing parameters to enhance its performance. Examples of this approach include techniques like BitFit, Child-Tuning, and Diff Pruning.
  • Reparameterized - Uses low-rank transformations to reduce the number of trainable parameters relative to the original weight matrices. Two sub-families are commonly distinguished: low-rank decomposition methods and LoRA derivatives. LoRA and QLoRA are popular implementations of these techniques.

LoRA

LoRA (Low-Rank Adaptation) freezes the pre-trained model weights and introduces trainable rank-decomposition matrices into each layer of the Transformer architecture, significantly reducing the number of trainable parameters for downstream tasks. LoRA performs better than soft prompts and closely approximates the performance of full fine-tuning.
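
As a minimal sketch of how LoRA is applied in practice, here is an example using the Hugging Face peft library. The base model (gpt2), the rank r=8, and the c_attn target module are illustrative assumptions, not recommendations from this article:

```python
# A minimal LoRA sketch using Hugging Face transformers + peft.
# Assumptions: gpt2 as base model, rank r=8, targeting its attention projection.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the decomposition matrices
    lora_alpha=16,              # scaling factor for the LoRA updates
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
# Only the small low-rank matrices are trained; the pre-trained
# weights stay frozen, so trainable parameters drop dramatically.
```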

  • Additive - Keeps the original parameters of the LLM unchanged while adding a small number of trainable parameters to the model. Two common types of additive fine-tuning are adapters, which add new trainable layers to the model architecture, and soft prompts, which we discussed in a previous article (a sketch follows below).
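
For the additive category, here is a minimal soft-prompt (prompt tuning) sketch, again with the peft library; the gpt2 base model and the choice of 20 virtual tokens are illustrative assumptions:

```python
# A minimal prompt-tuning (soft prompt) sketch: the base model stays
# frozen and only a small set of virtual token embeddings is trained.
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

pt_config = PromptTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=20,  # length of the learned soft prompt
)

model = get_peft_model(base_model, pt_config)
model.print_trainable_parameters()  # only the virtual-token embeddings train
```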

For more detailed information, please refer to the paper: https://arxiv.org/pdf/2312.12148.pdf


4. How to decide which fine-tuning method to use from the above?

The decision on which fine-tuning method to use depends on factors such as the specific use case, available computational resources, budget constraints, the quality of available data, and the chosen base model. It's advisable to begin with simpler approaches and progressively explore more complex fine-tuning methods. This allows for evaluating performance and understanding any drawbacks of the fine-tuned model along the way.


5. What makes a good dataset, and how much data is needed?

Using high-quality labeled data alongside clear prompts or instructions can significantly enhance the effectiveness of fine-tuning.

The quantity of data needed varies with factors such as the fine-tuning method employed and the base model being used. For instance, OpenAI suggests around 50-100 examples for fine-tuning GPT-3.5 Turbo, whereas other models might require a larger dataset.
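
For reference, OpenAI's fine-tuning API expects training data as chat-formatted JSONL. Here is a sketch of one training example; the content is invented for illustration, and only the structure follows OpenAI's documented format:

```python
# One line of an OpenAI-style fine-tuning JSONL file (chat format).
# The example content is invented; only the structure follows the API docs.
import json

example = {
    "messages": [
        {"role": "system", "content": "You are a concise medical assistant."},
        {"role": "user", "content": "What does 'hypertension' mean?"},
        {"role": "assistant", "content": "Hypertension means high blood pressure."},
    ]
}
print(json.dumps(example))  # each training example is one JSON object per line
```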


Conclusion:

Fine-tuning may not always be the optimal approach, especially when there's a scarcity of data or computational resources. In cases where prompt engineering is executed effectively, fine-tuning may even be unnecessary. The decision largely depends on factors such as the desired latency, behavioural adjustments, and privacy considerations.


Next week we will be looking at how to implement fine-tuning. Stay tuned!

Happy reading! For more such articles, subscribe to my newsletter: https://lnkd.in/guERC6Qw

I would love to connect with you on Twitter: @MahimaChhagani. Feel free to contact me via email at [email protected] for any inquiries or collaboration opportunities.
