In the AI Toolbox series, we aim to provide you with key insights into important tools for building AI systems. In the previous edition, we looked at Vector Search; in this edition, we will consider fine-tuning.

Key Terms and Concepts

Training Data Sets
Hyperparameters
Models
Validation

Introduction

Consider high-performance cars that race on many different tracks in international or national series such as Formula 1, NASCAR, or IndyCar. Even if your car is designed to be fast on racetracks around the world, you still need to configure it for the nuances of each particular track to get the most out of it.

But what do you do when you have a model that has already been trained on a large, general data set but want to tailor it for use with a task-specific dataset?

Today, we have many pre-trained models available to us. You may also have trained your own corporate model with your data.

Fine-tuning is a powerful machine learning and artificial intelligence technique that allows practitioners to adapt pre-trained models to specific tasks or domains. It leverages the knowledge gained from the initial training and then optimizes it for a particular application.

In this article, we explore this concept further:

Why has it come about?
When should you use it?
What are its limitations?

Finally, we will provide a practical example of its implementation so that you can start incorporating it into your ML and AI implementations.

FINE TUNING – A PRIMER

When to Use Fine-Tuning

The key advantage of fine-tuning is that it allows us to benefit from the features and patterns learned by the original model while adapting it to our specific needs. When task-specific data is limited, a fine-tuned model will also generally outperform one trained from scratch.

Fine-tuning is particularly effective in the following scenarios:

Limited Datasets:

Fine-tuning allows a pre-trained model to be adapted with less data, as the model already captures general features from its initial training. This reduces the risk of overfitting and enhances performance, even with smaller datasets[1][2].

Similar Tasks:

When the new task is closely related to the task on which the model was pre-trained, fine-tuning the higher layers of the model often suffices. The lower layers, which learn more generic features, can remain largely unchanged. This ensures a faster and more efficient training process[3].

Time and Resource Constraints:

Fine-tuning is computationally efficient compared to training from scratch because it starts from weights that have already been learned and so needs far less training. Techniques like parameter-efficient fine-tuning (PEFT) and partial fine-tuning update only a subset of the parameters, which further reduces computational costs and memory requirements[2].

You spend your time adjusting the model, its layers, and its hyperparameters to reach the required output, rather than creating a model from scratch.
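Partial fine-tuning, as described above, can be sketched in PyTorch by freezing the lower layers so that only the top layer receives gradient updates. The small network below is a hypothetical stand-in for a pre-trained model rather than a real checkpoint, and the layer sizes are assumptions chosen purely for illustration.

```python
import torch.nn as nn

# Hypothetical stand-in for a pre-trained network (layer sizes are illustrative).
model = nn.Sequential(
    nn.Linear(16, 32),  # lower layer: generic features
    nn.ReLU(),
    nn.Linear(32, 32),  # middle layer
    nn.ReLU(),
    nn.Linear(32, 2),   # top layer: task-specific head
)

# Freeze every layer except the final one so its weights are not updated.
for layer in list(model.children())[:-1]:
    for param in layer.parameters():
        param.requires_grad = False

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable} of {total}")
```

In a sketch like this, you would then pass only the parameters that still require gradients to the optimizer, which is where the savings in computation and memory come from.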

So now that you've decided to use fine-tuning, let's look at how you go about implementing it.

Steps for Fine-Tuning a Model

Fine-tuning typically involves the following steps:

1. Start with a pre-trained model

2. Replace the final layer(s) of the model to match the new task

3. Update the hyperparameters

4. Train the model on a new dataset, usually with a lower learning rate

5. Validate and adjust the model to ensure the best fit
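The five steps above can be sketched end to end in PyTorch. This is a minimal illustration under stated assumptions, not a recipe: the "pre-trained" model is a small randomly initialised stand-in, the data is synthetic, and the two learning rates are illustrative values rather than recommendations.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Step 1: hypothetical stand-in for a pre-trained model (backbone + 10-class head).
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
model = nn.Sequential(backbone, nn.Linear(16, 10))

# Step 2: replace the final layer to match the new 2-class task.
model[1] = nn.Linear(16, 2)

# Steps 3 and 4: set hyperparameters, using a lower learning rate for the
# pre-trained backbone than for the freshly initialised head.
optimizer = torch.optim.AdamW([
    {"params": model[0].parameters(), "lr": 1e-5},
    {"params": model[1].parameters(), "lr": 1e-3},
])
loss_fn = nn.CrossEntropyLoss()

# Synthetic data, split into training and validation sets.
x = torch.randn(40, 8)
y = (x[:, 0] > 0).long()
x_train, y_train, x_val, y_val = x[:32], y[:32], x[32:], y[32:]

for epoch in range(5):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()

# Step 5: validate on data the model has not been trained on.
model.eval()
with torch.no_grad():
    val_accuracy = (model(x_val).argmax(dim=1) == y_val).float().mean().item()
print(f"validation accuracy: {val_accuracy:.2f}")
```

Using separate parameter groups is one way to keep the pre-trained weights close to their starting point while the new head learns quickly; a single low learning rate for all parameters is a common alternative.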

When Not to Use Fine-Tuning

While fine-tuning is a powerful technique, it's not always the best approach. Let’s look at some situations where you might want to consider alternatives:

Significantly Different Tasks: If your target task is very different from the original task the model was trained on, fine-tuning may not be effective. In such cases, training from scratch or using a different architecture might be more appropriate[4].

Sufficient Data and Resources: If you have a large, high-quality dataset and ample computational resources, training a custom model from scratch might yield better results tailored to your specific problem[5][6].

Regulatory or Explainability Requirements: In some cases, using a pre-trained model might raise concerns about model interpretability or compliance with regulatory standards. In such situations, developing a custom model with a known architecture and training process might be necessary[5].

Overfitting Concerns: Fine-tuning can sometimes lead to overfitting, especially when the new dataset is small. If you notice that your fine-tuned model performs well on the training data but poorly on new, unseen data, you might need to explore other approaches[4][6].
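The overfitting symptom described in the last point, good results on training data but poor results on unseen data, can be spotted by tracking a validation loss alongside the training loss. The sketch below uses early stopping, a common mitigation that the article itself does not name; the loss values are hypothetical numbers made up for illustration, and `early_stopping_epoch` is an illustrative helper rather than a library function.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch with the best validation loss,
    stopping once the loss has failed to improve for `patience` epochs."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Hypothetical curves: training loss keeps falling while validation loss turns upward.
train_losses = [0.90, 0.60, 0.40, 0.25, 0.15, 0.08]
val_losses = [0.95, 0.70, 0.62, 0.66, 0.74, 0.85]

best = early_stopping_epoch(val_losses)
gap = val_losses[-1] - train_losses[-1]
print(f"best epoch: {best}, final train/validation gap: {gap:.2f}")
```

A widening gap between the two curves is the signal to stop training, reduce the number of layers being updated, or gather more data.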

RAG vs Fine-Tuning

We have previously discussed RAG [7] as a mechanism for improving the ability of LLMs to provide a more contextual response.

Fine-tuning involves training an LLM on a smaller, specialized dataset to adjust its parameters for specific tasks, while RAG involves augmenting an LLM with access to a dynamic, curated database to improve its outputs.

Let's compare the key considerations and properties of the two approaches.

PURPOSE

Fine-tuning: Adapts a pre-trained model to perform well on a specific task or domain.

RAG: Enhances a model's ability to generate accurate and relevant responses by incorporating external knowledge.

FUNDAMENTALS

Fine-tuning: Involves additional training of a pre-trained model on a task-specific dataset.

RAG: Combines a pre-trained language model with a retrieval system that fetches relevant information from an external knowledge base.

MODEL MODIFICATION

Fine-tuning: The underlying model is modified; its weights, and potentially its architecture, are updated.

RAG: Doesn't modify the underlying language model but augments its input with retrieved information.

DATA USAGE

Fine-tuning: Retraining requires a labeled dataset specific to the target task.

RAG: Uses a large knowledge base or corpus of documents that can be queried during inference.

FLEXIBILITY

Fine-tuning: The new model is specialized for a particular task or domain, and may no longer be as suitable for general applications.

RAG: New information is added to the knowledge base, so it can adapt to different topics without retraining. The base model remains the same, allowing greater flexibility in its application.

UPDATING KNOWLEDGE

Fine-tuning: Requires retraining of the model to incorporate new knowledge.

RAG: Knowledge is updated through a separate system (such as a vector database), which can easily be added to as new information becomes available.

COMPUTATIONAL RESOURCES

Fine-tuning: Training can be computationally intensive, requiring additional resources and cost to fine-tune the model.

RAG: May require more resources during inference because of the retrieval step.

EXPLAINABILITY

Fine-tuning: The decision-making process can be less transparent, creating concerns for causal AI.

RAG: Often more explainable, as you can see which documents were retrieved to inform the response.
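The RAG side of the comparison above can be sketched in a few lines: the model is left untouched, and its input is augmented with text retrieved from a knowledge base that can grow without retraining. This is a toy illustration under stated assumptions. Real systems typically retrieve with vector embeddings, whereas this sketch ranks by simple word overlap, and `retrieve`, `build_prompt`, and the knowledge base entries are all hypothetical.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by the number of words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents):
    """Augment the question with retrieved context; the model is not modified."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base.
knowledge_base = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available on weekdays from 9am to 5pm.",
]
# New knowledge is added by appending a document, with no retraining.
knowledge_base.append("Shipping to Australia takes 5 to 7 business days.")

question = "How long does shipping to Australia take?"
prompt = build_prompt(question, knowledge_base)
print(prompt)
```

Because the retrieved document is part of the prompt, it is also visible to whoever reviews the response, which is the explainability benefit noted above.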

Example: Fine-Tuning a BERT Model for Sentiment Analysis

Now, let's examine a practical example of fine-tuning using the BERT (Bidirectional Encoder Representations from Transformers) model for sentiment analysis.

For this example, we will use the Hugging Face Transformers library, which provides easy-to-use implementations of many popular pre-trained models.

The Python code sample below demonstrates how to fine-tune a BERT model for sentiment analysis. We break it down into sections for easier explanation.

Install the necessary libraries

pip install transformers torch numpy scikit-learn

Import the libraries and methods required

import numpy as np
import torch
# AdamW is imported from torch.optim; the copy in transformers is deprecated
from torch.optim import AdamW
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertForSequenceClassification, BertTokenizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

Load the pre-trained model. We are using BERT in this example.

# Load pre-trained BERT model and tokenizer
model_name = 'bert-base-uncased'
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = BertTokenizer.from_pretrained(model_name)

Prepare the data set used to tune the model

# Prepare your dataset (example data)
texts = ["I love this product!", "This movie was terrible.", "The service was okay."]
labels = [1, 0, 1]  # 1 for positive, 0 for negative

Now we tokenize and encode the texts for the model

# Tokenize and encode the texts
encodings = tokenizer(texts, truncation=True, padding=True, max_length=128, return_tensors="pt")
input_ids = encodings['input_ids']
attention_mask = encodings['attention_mask']

Load the data set. We wrap the tensors in a TensorDataset and serve them in batches with a DataLoader.

# Create DataLoader
dataset = TensorDataset(input_ids, attention_mask, torch.tensor(labels))
train_loader = DataLoader(dataset, batch_size=2, shuffle=True)

Set up the optimizer. In this case we are using AdamW, a variant of the Adam optimization algorithm that decouples weight decay from the gradient update and is widely used for fine-tuning transformer models. You may use other optimizers, such as RMSProp and Adadelta, depending on the use case.

# Set up optimizer
optimizer = AdamW(model.parameters(), lr=2e-5)

Now we create the fine-tuning training loop with the model created above.

# Fine-tuning loop
num_epochs = 3
for epoch in range(num_epochs):
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        # Unpack the batch in the order used to build the TensorDataset
        batch_input_ids, batch_attention_mask, batch_labels = batch
        outputs = model(batch_input_ids, attention_mask=batch_attention_mask, labels=batch_labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()

Now for the final validation of the output, to check how the fine-tuned model behaves on a new sentence.

# Evaluation
model.eval()
with torch.no_grad():
    inputs = tokenizer("This product exceeded my expectations!", return_tensors="pt")
    outputs = model(**inputs)
    # .item() converts the single-element tensor to a plain Python int
    prediction = torch.argmax(outputs.logits, dim=1).item()

print(f"Sentiment: {'Positive' if prediction == 1 else 'Negative'}")

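When you execute the Python script, you will see messages from the libraries as the model loads, followed by a final result like the following (with only three training examples, the predicted label can vary between runs):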

Sentiment: Positive
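A single test sentence says little about how well the model fits. The walkthrough imports train_test_split and accuracy_score without using them, so the following is a minimal sketch of how they might be applied to hold back a validation set. The example texts are hypothetical, and the predictions are placeholders standing in for the fine-tuned model's output, not real results.

```python
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical labelled examples (1 = positive, 0 = negative).
texts = [
    "I love this product!", "This movie was terrible.",
    "Great value for money.", "I would not buy this again.",
    "Absolutely fantastic service.", "The quality is awful.",
    "Works exactly as described.", "A complete waste of time.",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Hold back 25% of the examples; stratify keeps both classes in each split.
train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# Placeholder predictions standing in for the model's output on val_texts.
val_predictions = [1 for _ in val_texts]

accuracy = accuracy_score(val_labels, val_predictions)
print(f"train size: {len(train_texts)}, validation size: {len(val_texts)}")
print(f"validation accuracy: {accuracy:.2f}")
```

In practice you would tokenize train_texts for the fine-tuning loop and replace the placeholder list with the model's predictions for val_texts.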

There you have it: a fine-tuned model! Happy modelling.

Looking Forward

Fine-tuning and RAG are two great methods for improving the outcome of using a Large Language Model. It's important to take into consideration your use case, cost, speed, scalability, and the traceability of decisions within your solution when choosing which approach to take. Prompt engineering, which we did not discuss here, is another mechanism for improving the outcome from the use of Large Language Models.

In future articles we will discuss these further to help you design your AI/ML solutions.

About the Co-Authors

Paul-Benjamin Ramírez is the CTO of Automi and writes about creativity, data and security, regulations, and AI.

David Willett is a technical leader in AI/ML implementations and a keen researcher in models and approaches who makes AI/ML accessible by demystifying the terminology.

References

[1] "Fine-Tuning Pre-Trained Models: Unlocking the Power of Generative AI Applications." Webisoft (2024)

[3] "Fine-Tuning AI Models with Your Organization's Data: A Comprehensive Guide." ITMAGINATION (2024)

[4] "Adapting AI Models: The Strategic Choice Between Fine-Tuning and RAG." Radiansys (2024)

[5] "Rush to Fine-Tune LLMs." Thoughtworks (2024)

[6] "The Ultimate Guide to LLM Fine Tuning: Best Practices & Tools." Lakera (2023)

[7] "The AI Tool Box: #1 Combatting Hallucinations with Retrieval-Augmented Generation (RAG)." Ramírez (2024)

AI ToolBox #3: Fine-Tuning in Machine Learning and AI

Paul-Benjamin Ramírez

Co-Founder and CTO @ Automi | Sales and Project Manager | Engineering | Patent-Pending Inventor | Adjunct Fellow UNSW
