Fine tuning Large Language Models (using Instruction Tuning and RLHF)

Today I am going to talk about fine-tuning large language models (LLMs). Let me first give you a very brief background on what LLMs are.

LLMs are decoder-only models and a major step towards generative AI. An LLM generates text, so it can be used for downstream production tasks such as Q&A, conversational AI, sentiment analysis, and text summarization. LLMs have taken the world of NLP by storm, and all the big tech companies have come out with their own.

Where LLMs benefit industry the most is in being adapted to a specific downstream task, and for this we need to fine-tune them for whatever task we want to accomplish.

But how do you do fine-tuning? What fine-tuning methods are available? Where do we get the data? These are some of the questions I will try to answer today.

For fine-tuning LLMs, two methods are available:

  • Instruction Tuning - instructing the model to perform some task.
  • Reinforcement Learning from Human Feedback (RLHF)

What is the difference between the two? Instruction tuning fine-tunes the LLM on labelled pairs of instructions and responses. RLHF fine-tunes the model using human feedback: humans rate or rank the model's outputs, and the model is optimised against that preference signal.

Let me explain the steps, specifically for instruction-tuning an LLM.

Step 1: Provide supervised training data to the model, as below.

An example data record for instruction-tuning the model would be:

{
  "instruction": "List all non-stop flights from Singapore to SFO in descending order of fares.",
  "context": "Singapore Airlines, Emirates, and Qatar Airways fly nonstop to SFO and have high fares.",
  "response": "SG 202, EM-204, QTR-909, AIR-404.",
  "category": "closed_qa"
}

Here "instruction" is the Q&A task given to the model, "context" is the supporting information for that task, and "response" is the supervised answer the model should learn to generate.

The important point here is that such supervised training data could number as few as 20 records, and the model can still be fine-tuned to give good results. That is what makes LLMs so powerful: you can bootstrap to production-grade usage with very little data.
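As a concrete sketch, a record like the one above is usually flattened into a single training prompt before tuning. The template below (### Instruction / ### Context / ### Response section headers) is an illustrative assumption, not a required format; any consistent template works as long as training and inference use the same one:

```python
# Flatten one instruction record into a single training prompt string.
# The "### ..." section markers are an assumed template, not a standard.
def format_record(record: dict) -> str:
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Context:\n{record['context']}\n\n"
        f"### Response:\n{record['response']}"
    )

record = {
    "instruction": "List all non-stop flights from Singapore to SFO "
                   "in descending order of fares.",
    "context": "Singapore Airlines, Emirates, and Qatar Airways fly "
               "nonstop to SFO and have high fares.",
    "response": "SG 202, EM-204, QTR-909, AIR-404.",
    "category": "closed_qa",
}

prompt = format_record(record)
print(prompt)
```

Applying this to every record gives the text dataset that the trainer in Step 3 consumes.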

Step 2: Choose the LLM to fine-tune and define the configuration.

Take any LLM, such as "openlm-research/open_llama_7b_700bt_preview" from the Hugging Face Hub.

Define a LoRA config to train the LLM efficiently. LoRA (Low-Rank Adaptation) trains only small low-rank update matrices instead of the full weights, and it is often combined with quantization of the base model (QLoRA) to reduce memory further. (Refer to my other article on what LoRA is and its huge importance in fine-tuning LLMs.)
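A minimal sketch of this step, assuming the Hugging Face `transformers` and `peft` libraries are installed; the LoRA hyperparameters (r, alpha, dropout) are illustrative placeholders, not tuned values:

```python
# Sketch: load a base model from the Hugging Face Hub and attach a LoRA
# adapter with `peft`. Hyperparameter values here are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "openlm-research/open_llama_7b_700bt_preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=8,              # rank of the low-rank update matrices
    lora_alpha=16,    # scaling factor for the update
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train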

Step 3: Supervised tuning of the LLM on the above instruction dataset.

Use the "trl" library's SFTTrainer to train the model construct defined in Step 2. The "trl" library is available from Hugging Face.
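A minimal sketch of the training call, assuming the model, tokenizer, and LoRA config from Step 2 already exist, along with a `dataset` of formatted prompt strings. SFTTrainer's keyword arguments change between trl releases, so check your installed version's documentation; all values below are illustrative:

```python
# Sketch: supervised fine-tuning with trl's SFTTrainer.
# Assumes `model`, `lora_config`, and `dataset` from the earlier steps.
from trl import SFTTrainer
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./open-llama-instruct",   # assumed output path
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-4,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=lora_config,
)
trainer.train()
trainer.save_model("./open-llama-instruct")
```

Because only the LoRA adapter weights are trainable, the saved artifact is small compared to the 7B base model.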

Step 4: Run inference (at production time) on the fine-tuned model:

Instruction: "List all non-stop flights from SFO to Singapore in descending order of fares."

Model Response:

Emirates nonstop - $5,000; Qatar nonstop - $4,000
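Inference can be sketched as below, assuming `model` and `tokenizer` are the fine-tuned artifacts from the earlier steps. The prompt must use the same template the model was trained on; the "### ..." markers here are the same assumed template, and the generation parameters are illustrative:

```python
# Sketch: generate a response from the fine-tuned model.
# Assumes `model` and `tokenizer` from the training steps above.
prompt = (
    "### Instruction:\n"
    "List all non-stop flights from SFO to Singapore "
    "in descending order of fares.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```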

Instruction-based fine-tuning also supports zero-shot use, meaning the prompt at inference time provides no examples of the task.

Why does this work? First, the model's weights are fine-tuned on instruction-response pairs. Second, the LLM is a causal language model (CausalLM), meaning it generates each next word based on the earlier words (THIS IS KEY). This is implemented with masked (causal) attention.
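The masked-attention constraint can be illustrated in a few lines, independent of any LLM library: position i may only attend to positions up to i, which is exactly what lets the model learn next-word prediction. A self-contained NumPy sketch:

```python
# Causal (masked) attention sketch: block each position from attending
# to future positions, then normalise the surviving scores with softmax.
import numpy as np

seq_len = 4
# Lower-triangular mask: row i marks the positions token i may attend to.
mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))

scores = np.random.rand(seq_len, seq_len)    # raw attention scores
scores = np.where(mask, scores, -np.inf)     # future positions blocked
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)

# Row 0 attends only to itself; the last row attends to every position.
print(weights.round(2))
```

Every entry above the diagonal comes out exactly zero, so each token's prediction depends only on the tokens before it.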

To summarise, instruction tuning is a game changer for making LLMs work on production-grade tasks. This is revolutionary. Just see the potential of what can be done with small datasets and about 100 lines of code.

So this is it for today. I will cover RLHF in my next article next week.

Hope you all have a good read.

Disclaimer: Opinions and views expressed above are the author's own and have no bearing on, or affiliation with, the author's current or any past employers.

Credits:

https://huggingface.co/

Image Credit:

https://www.analyticsvidhya.com/blog/2023/07/build-your-own-large-language-models/






