Google Colab Notebook to Fine-Tune BLOOMZ-3B #LLM Step by Step Using #LoRA

Notebook Link: https://gist.github.com/chakra4/a0787cefeab4943b87cf8e046609a9fa

Model: I fine-tuned BLOOMZ-3B, a large language model trained by the BigScience team. The full paper is available here!

Dataset: I used the lamini_docs question-answering dataset from Hugging Face.
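
A minimal loading sketch with the Hugging Face datasets library; the Hub id lamini/lamini_docs is my assumption and may differ from the exact copy used in the notebook.

```python
from datasets import load_dataset

# Hub id assumed; adjust if the notebook pulls lamini_docs from a different path.
dataset = load_dataset("lamini/lamini_docs")
print(dataset["train"][0])  # each record pairs a question with its answer
```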

Training Infra: I used a Google Colab T4 runtime with 16GB of memory, so full fine-tuning was not feasible.

Why full fine-tuning failed on this infra: loading a 3B-parameter model in BFLOAT16 requires about 6GB of memory for inference, but training it requires roughly 120GB.

  • Training needs roughly 20x the memory of inference, to hold the Adam optimizer states, gradients, activations, and temporary variables.
  • Loading a 1-billion-parameter model in BFLOAT16 precision requires 2GB of memory, since each parameter takes 2 bytes.
  • Training that same 1-billion-parameter model in BFLOAT16 precision therefore requires about 2GB x 20 = 40GB of memory (see the quick calculation below).
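
As a quick sanity check, here is a minimal Python sketch of the same back-of-the-envelope arithmetic; the 20x training multiplier is the rule of thumb quoted above, not a measured value.

```python
# Back-of-the-envelope memory estimate for full fine-tuning in BF16.
# The 20x training multiplier (Adam states, gradients, activations,
# temporaries) is a rule of thumb, not a measurement.
BYTES_PER_PARAM_BF16 = 2
TRAINING_MULTIPLIER = 20

def memory_gb(num_params: int) -> tuple[float, float]:
    """Return (inference GB, full fine-tuning GB) for a BF16 model."""
    inference = num_params * BYTES_PER_PARAM_BF16 / 1e9
    return inference, inference * TRAINING_MULTIPLIER

for name, params in [("1B", 1_000_000_000), ("3B (BLOOMZ-3B)", 3_000_000_000)]:
    inf, train = memory_gb(params)
    print(f"{name}: ~{inf:.0f} GB inference, ~{train:.0f} GB full fine-tuning")
# 1B: ~2 GB inference, ~40 GB full fine-tuning
# 3B (BLOOMZ-3B): ~6 GB inference, ~120 GB full fine-tuning
```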

LoRA: I fine-tuned with LoRA and achieved good results in 3 epochs, since the pre-trained model already has strong language understanding.

  • For fine-tuning LLMs, we don't need to update every single parameter.
  • With LoRA we generally fine-tune around 1% of the parameters.
  • Researchers have found that applying LoRA to just the self-attention layers of the model is often enough to achieve performance gains.
  • The idea behind LoRA is to take the attention weight matrices, such as the query and key projections, and inject low-rank matrices alongside them.
  • Only these low-rank matrices are updated during fine-tuning, while the pre-trained weights stay frozen (see the PEFT sketch after this list).
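
Here is a minimal sketch of how such a LoRA setup could look with the Hugging Face PEFT library. The rank, alpha, and dropout values are illustrative rather than the notebook's exact settings, and target_modules points at BLOOM's fused query_key_value attention projection instead of separate query/key layers.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "bigscience/bloomz-3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                # rank of the injected low-rank matrices
    lora_alpha=32,                       # scaling applied to the LoRA update
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # BLOOM fuses Q/K/V into one projection
)

# Freeze the pre-trained weights and attach trainable low-rank adapters.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically around 1% (or less) of all params
```

From here the wrapped model can be passed to a standard Trainer, and only the small adapter weights need to be saved after training.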
