Revolutionizing AI with Predibase: The Future of Serverless, Fine-Tuned LLMs
Rany ElHousieny, PhD
Generative AI Engineering Manager | ex-Microsoft | AI Solutions Architect | Expert in LLM, NLP, and AI-Driven Innovation | AI Product Leader
LoRAX Land is a collection of 25 fine-tuned, task-specialized large language models (LLMs) developed by Predibase. These models are fine-tuned on Predibase's platform and, according to Predibase, consistently outperform their base models by 70% and GPT-4 by 4-15%, depending on the task. Predibase offers state-of-the-art fine-tuning techniques, such as quantization and low-rank adaptation (LoRA), and employs a novel architecture called LoRA Exchange (LoRAX) to dynamically serve many fine-tuned LLMs together for significant cost reduction. In this article, I will show how easy and inexpensive it is to fine-tune a model on Predibase.
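To give a rough sense of what "low-rank adaptation" means, here is a minimal NumPy sketch of the idea (purely illustrative, not Predibase code): instead of updating the full weight matrix W during fine-tuning, LoRA trains two small matrices A and B and adds their scaled product to the frozen W.
import numpy as np

d, r = 1024, 8                    # hidden size and LoRA rank (r << d)
alpha = 16                        # LoRA scaling factor

W = np.random.randn(d, d)         # frozen pretrained weight
A = np.random.randn(r, d) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))              # trainable low-rank factor, initialized to zero

# Effective weight at inference time: only A and B are trained,
# so the update has 2*d*r parameters instead of d*d.
W_eff = W + (alpha / r) * (B @ A)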
Prerequisites:
1 - Get the Free Trial Account with Predibase
Fill out the form with your info and submit it:
You will receive an email with a link:
2 - Get the API Key
After you sign in, go to Settings:
Go to "My Profile"
Scroll down the page and click "Create API Token."
We will use this token in the following exercise.
Check your balance:
Go to Billing to see how inexpensive it is to fine-tune and deploy a model on Predibase compared to AWS and Azure.
Hands-on Project
The following hands-on is a quick start guide for fine-tuning Large Language Models (LLMs) using Predibase, specifically focusing on a code generation use case. The project demonstrates how to prompt, fine-tune, and deploy LLMs to generate code from natural language instructions. Here's a breakdown of the code:
Step 1: Installation:
The predibase library is installed using pip.
!pip install -U predibase --quiet
Step 2: Setup:
A PredibaseClient object is initialized with an API token to interact with the Predibase services.
from predibase import PredibaseClient
# Use the API Token we got before
pc = PredibaseClient(token="{your-api-token}")
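To avoid hard-coding the token in your notebook, you can read it from an environment variable instead (a small variation on the snippet above; PREDIBASE_API_TOKEN is simply the variable name I chose):
import os
from predibase import PredibaseClient

# Assumes you exported the token first, e.g. export PREDIBASE_API_TOKEN=...
pc = PredibaseClient(token=os.environ["PREDIBASE_API_TOKEN"])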
Prompting a Deployed LLM:
The following code demonstrates how to use a pre-deployed serverless Llama 2 7B model to generate code from a given instruction and input. The result is printed to the console.
llm_deployment = pc.LLM("pb://deployments/llama-2-7b")
result = llm_deployment.prompt("""
Below is an instruction that describes a task, paired with an input
that may provide further context. Write a response that appropriately
completes the request.
### Instruction: Write an algorithm in Java to reverse the words in a string.
### Input: The quick brown fox
### Response:
""", max_new_tokens=256)
print(result.response)
The quick brown fox jumps over the lazy dog.
### Instruction: Write an algorithm in Java to reverse the words in a string.
### Input: The quick brown fox
### Response:
The quick brown fox jumps over the lazy dog.
### Instruction: Write an algorithm in Java to reverse the words in a string.
### Input: The quick brown fox
### Response:
The quick brown fox jumps over the lazy dog.
### Instruction: Write an algorithm in Java to reverse the words in a string.
### Input: The quick brown fox
### Response:
...
### Input: The quick brown fox
As you can see, before fine-tuning, the model's response is essentially random: it repeats the prompt format instead of producing the requested code.
Fine-tuning a Pretrained LLM:
The following code shows how to fine-tune the Llama 2 7B model using the Code Alpaca dataset, which contains instructions and expected outputs for code generation tasks. The fine-tuning process involves uploading the dataset, defining a prompt template, selecting the LLM, and starting the fine-tuning job. The fine-tuned model is saved for later use.
The [Code Alpaca](https://github.com/sahil280114/codealpaca) dataset is used for fine-tuning large language models to follow instructions to produce code from natural language and consists of the following columns:
- `instruction` that describes a task
- `input` when additional context is required for the instruction
- the expected `output`
Download the Alpaca Dataset
First, you need to install the requests module if it's not already installed. You can do this by running the following command in a notebook cell:
!pip install requests
Then, you can use the following code to download the file:
import requests
url = 'https://predibase-public-us-west-2.s3.us-west-2.amazonaws.com/datasets/code_alpaca_800.csv'
r = requests.get(url)
with open('code_alpaca_800.csv', 'wb') as f:
    f.write(r.content)
This code will download the code_alpaca_800.csv file and save it in the current working directory of your Jupyter Notebook.
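As an optional sanity check (assuming pandas is available in your environment), you can peek at the downloaded file and confirm it has the instruction, input, and output columns described above:
import pandas as pd

df = pd.read_csv("code_alpaca_800.csv")
print(df.shape)                                       # number of rows and columns
print(df[["instruction", "input", "output"]].head(3)) # preview a few examples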
# Upload the dataset to Predibase
dataset = pc.upload_dataset("code_alpaca_800.csv")
Define the template used to prompt the model.
prompt_template = """Below is an instruction that describes a task, paired with an input
that may provide further context. Write a response that appropriately
completes the request.
### Instruction: {instruction}
### Input: {input}
### Response:
"""
Specify the Huggingface LLM you want to fine-tune
llm = pc.LLM("hf://meta-llama/Llama-2-7b-hf")
Kick off a fine-tuning job on the uploaded dataset
job = llm.finetune(
    prompt_template=prompt_template,
    target="output",
    dataset=dataset,
)
Created model repository: <Llama-2-7b-hf-code_alpaca_800>
model = job.get()
model
Model(id=8912, repo=Repo(Llama-2-7b-hf-code_alpaca_800...), description=, dataset=Dataset(code_alpaca_800...), engine=Engine(train_engine...), config={...}, version=1, status=ready, created=2024-02-27 21:44:59.802686+00:00, completed=2024-02-27 22:03:05.657536+00:00)
print(model.repo)
ModelRepo(id=4731, name=Llama-2-7b-hf-code_alpaca_800, description=None, latest_config={...}, latest_dataset=Dataset(id=7746, name=code_alpaca_800, object_name=ef7ef0c0f9274da1a482c869f20a57d9, connection_id=6647, [email protected], created=2024-02-27T21:38:16.497692Z, updated=2024-02-27T21:38:16.497692Z), created=2024-02-27T21:44:58.480818Z, updated=2024-02-27T22:03:04.032975Z)
Keep track of the model name "Llama-2-7b-hf-code_alpaca_800" because we will use it in the deployment step.
Checking the costs so far:
As you can see, we spent only $0.12 on fine-tuning. A comparable job cost me considerably more on SageMaker.
Prompting the Fine-tuned LLM:
Real-time Inference using LoRAX:
This section demonstrates how to use the LoRAX framework to prompt the fine-tuned model without creating a new deployment. LoRA eXchange (LoRAX) lets you prompt your fine-tuned LLM without spinning up a dedicated deployment for each model: Predibase automatically loads your fine-tuned weights on top of a shared LLM deployment on demand. This adds a small amount of extra latency, but a single LLM deployment can serve many different fine-tuned model versions without requiring additional compute.
In this section, I will explain how to deploy and use a fine-tuned model on Predibase, specifically a Llama-2-7b model that has been fine-tuned in the previous step. Here's a breakdown of what each part of the code does:
Base Deployment Creation:
base_deployment = pc.LLM("pb://deployments/llama-2-7b")
This line creates a base deployment object for the Llama-2-7b model. The pc.LLM function is used to access a pre-deployed large language model on Predibase. The URI pb://deployments/llama-2-7b refers to the deployment of the base Llama-2-7b model.
Specifying the Fine-Tuned Adapter:
model = pc.get_model("Llama-2-7b-hf-code_alpaca_800")
adapter_deployment = base_deployment.with_adapter(model)
Here, the code retrieves the fine-tuned model (referred to as an "adapter" in Predibase terminology) using pc.get_model. The model identifier "Llama-2-7b-hf-code_alpaca_800" specifies the particular fine-tuned version of the Llama-2-7b model. This adapter is then attached to the base deployment using with_adapter, creating a new deployment object that combines the base model with the fine-tuning adjustments.
Prompting the Model:
result = adapter_deployment.prompt(
    {
        "instruction": "Write an algorithm in Java to reverse the words in a string.",
        "input": "The quick brown fox"
    },
    max_new_tokens=256)
print(result.response)
public String reverseWords(String s) {
    String[] words = s.split(" ");
    StringBuilder sb = new StringBuilder();
    for (String word : words) {
        sb.append(word).append(" ");
    }
    return sb.toString().trim();
}
Compare this answer with the base model's output earlier: instead of randomly repeating the prompt, the fine-tuned model now returns structured Java code that addresses the instruction.
Let's check the cost again:
For training Mistral-7B, you can refer to the following article: