Beyond Prompts: Fine-Tuning Your LLM
Hari Galla
WHY FINE-TUNING?
While both prompt engineering and fine-tuning aim to enhance the capabilities of large language models (LLMs), they tackle different challenges. Here's a breakdown of some key limitations addressed by fine-tuning but not by prompt engineering, along with illustrative examples:
Prompt Challenge 1: Knowledge Gap
Example: Imagine asking an LLM to diagnose an illness. A well-crafted prompt can guide it through symptoms, but without medical knowledge, the LLM might miss crucial details.
Solution: Fine-tuning exposes the LLM to a vast dataset of labeled medical cases, equipping it with the knowledge needed for accurate diagnoses.
Prompt Challenge 2: Limited Control
Example: You ask an LLM to write a persuasive essay. While a prompt can outline the arguments, the LLM may still struggle to maintain a coherent flow or address counter-arguments effectively.
Solution: Fine-tuning can train the LLM on specific reasoning patterns and argument structures, enabling it to construct logical arguments and build a compelling case.
Fine-Tuning LLMs for Real-World Tasks: A Step-by-Step Approach
Data Acquisition:
# The instruction dataset to use
dataset_name = "mlabonne/guanaco-llama2-1k"
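The line above only names the dataset; the dataset object handed to the trainer later still has to be loaded. A minimal sketch using the Hugging Face datasets library (assuming the train split of the instruction dataset named above):
# Load the instruction dataset from the Hugging Face hub
from datasets import load_dataset
dataset = load_dataset(dataset_name, split="train")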
Model Selection:
# The model that you want to train from the Hugging Face hub
model_name = "NousResearch/Llama-2-7b-chat-hf"
# Fine-tuned model name
new_model = "Llama-2-7b-chat-finetune"
Specify Fine-Tuning Parameters
################################################################################
# QLoRA parameters
################################################################################
# LoRA attention dimension
lora_r = 64
# Alpha parameter for LoRA scaling
lora_alpha = 16
# Dropout probability for LoRA layers
lora_dropout = 0.1
################################################################################
# bitsandbytes parameters
################################################################################
# Activate 4-bit precision base model loading
use_4bit = True
# Compute dtype for 4-bit base models
bnb_4bit_compute_dtype = "float16"
# Quantization type (fp4 or nf4)
bnb_4bit_quant_type = "nf4"
# Activate nested quantization for 4-bit base models (double quantization)
use_nested_quant = False
################################################################################
# SFT parameters
################################################################################
# Maximum sequence length to use
max_seq_length = None
# Pack multiple short examples in the same input sequence to increase efficiency
packing = False
# Load the entire model on GPU 0
device_map = {"": 0}
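The TrainingArguments call in the next section references a number of training hyperparameters (output_dir, learning_rate, and so on) that are not defined in the snippets above. The values below are one plausible starting point for a short QLoRA run on a single GPU; treat them as illustrative defaults rather than the author's original settings:
################################################################################
# TrainingArguments parameters (illustrative values - tune for your hardware)
################################################################################
# Output directory where checkpoints and logs are stored
output_dir = "./results"
# Number of training epochs
num_train_epochs = 1
# Enable fp16/bf16 training (set bf16 = True on Ampere or newer GPUs)
fp16 = False
bf16 = False
# Batch size per GPU for training
per_device_train_batch_size = 4
# Number of update steps to accumulate gradients for
gradient_accumulation_steps = 1
# Maximum gradient norm (gradient clipping)
max_grad_norm = 0.3
# Initial learning rate (AdamW optimizer)
learning_rate = 2e-4
# Weight decay applied to all layers except bias/LayerNorm weights
weight_decay = 0.001
# Optimizer to use
optim = "paged_adamw_32bit"
# Learning rate schedule
lr_scheduler_type = "cosine"
# Number of training steps (-1 means derive from num_train_epochs)
max_steps = -1
# Fraction of steps used for linear warmup
warmup_ratio = 0.03
# Group sequences of similar length into batches (saves memory, speeds up training)
group_by_length = True
# Save a checkpoint every X update steps
save_steps = 25
# Log training metrics every X update steps
logging_steps = 25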
Fine-Tuning Configuration
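Before configuring the tokenizer and LoRA adapters, the quantized base model itself has to be loaded - the model variable passed to SFTTrainer below is otherwise undefined. A minimal sketch using the bitsandbytes parameters from the previous section (it assumes the transformers, peft, trl, bitsandbytes and accelerate packages are installed, and also pulls in the imports used by the rest of this walkthrough):
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer
# Build the 4-bit quantization config from the parameters defined earlier
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)
# Load the base model in 4-bit precision on GPU 0
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
model.config.use_cache = False
model.config.pretraining_tp = 1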
# Load LLaMA tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right" # Fix weird overflow issue with fp16 training
# Load LoRA configuration
peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)
# Set training parameters
training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
)
# Set supervised fine-tuning parameters
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
Model Training & Saving
# Train model
trainer.train()
# Save trained model
trainer.model.save_pretrained(new_model)
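Note that calling save_pretrained on a PEFT-wrapped model stores only the LoRA adapter weights, so the original base model is still needed to reload it. A quick sanity check with the transformers text-generation pipeline confirms the fine-tuned model responds in the expected format (the prompt below is purely illustrative):
from transformers import pipeline
# Generate a short completion with the fine-tuned model (Llama 2 chat prompt format)
prompt = "What is a large language model?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]["generated_text"])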
Conclusion: Remember, choosing the right technique depends on your needs. Prompting offers flexibility, while fine-tuning empowers the LLM with deeper knowledge and stronger control - like choosing the perfect tools for the job!