LLM Deployment Pipeline with Azure and Kubeflow

Deploying models, especially LLM-based applications, on Azure can be a daunting task when done manually. We can automate the deployment pipeline with Kubeflow.

Here is an example of an end-to-end machine learning deployment pipeline using Kubeflow on Azure. It covers setting up a Kubeflow pipeline, training a model, and deploying the model.


Prerequisites:


1. Azure Account: You need an Azure account.

2. Azure Kubernetes Service (AKS): You need a Kubernetes cluster. You can create an AKS cluster via the Azure portal or CLI (see the example command after this list).

3. Kubeflow: You need Kubeflow installed on your AKS cluster. Follow the [Kubeflow on Azure documentation](https://www.kubeflow.org/docs/azure/aks/) to set this up.
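
As an example, a minimal AKS cluster can be created from the CLI (the resource group name, location, cluster name, and node count are placeholders):

```sh
az group create --name <resource-group-name> --location eastus
az aks create --resource-group <resource-group-name> --name <aks-cluster-name> --node-count 2 --generate-ssh-keys
```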


Step 1: Setting Up the Environment


First, ensure you have the Azure CLI and kubectl installed and configured.


```sh

# Install Azure CLI

curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash


# Install kubectl

az aks install-cli


# Log in to Azure

az login


# Set the subscription (if you have multiple subscriptions)

az account set --subscription "<your-subscription-id>"


# Get credentials for your AKS cluster

az aks get-credentials --resource-group <resource-group-name> --name <aks-cluster-name>

```
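
You can sanity-check the connection before moving on:

```sh
# Confirm kubectl is pointed at the AKS cluster
kubectl config current-context
kubectl get nodes
```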


Step 2: Deploying Kubeflow on AKS


Follow the official Kubeflow deployment guide for Azure AKS:

[Deploy Kubeflow on Azure AKS](https://www.kubeflow.org/docs/azure/aks/)


Step 3: Creating a Kubeflow Pipeline


We'll create a simple pipeline that trains and deploys a machine learning model.


Pipeline Definition


Create a file pipeline.py:


```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def train_model() -> str:
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    import joblib

    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2)

    clf = LogisticRegression(max_iter=200)
    clf.fit(X_train, y_train)

    accuracy = clf.score(X_test, y_test)
    print(f"Model accuracy: {accuracy}")

    # This path is local to the component's container; in a real pipeline,
    # write the model to durable storage (e.g. Azure Blob Storage) instead.
    model_path = "/model.pkl"
    joblib.dump(clf, model_path)

    return model_path


# python:3.8-slim does not ship scikit-learn or joblib,
# so install them into the component at runtime.
train_model_op = create_component_from_func(
    train_model,
    base_image='python:3.8-slim',
    packages_to_install=['scikit-learn', 'joblib'],
)


@dsl.pipeline(
    name='Iris Training Pipeline',
    description='A pipeline to train and deploy an Iris classification model.'
)
def iris_pipeline():
    train_task = train_model_op()


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(iris_pipeline, 'iris_pipeline.yaml')
```
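
In practice you would usually chain training and deployment into a single pipeline so the deployment step consumes the training output directly. A minimal sketch, assuming the deploy_model_op defined in Step 6 below is importable here:

```python
@dsl.pipeline(
    name='Iris Train-and-Deploy Pipeline',
    description='Trains the Iris model, then deploys it using the training output.'
)
def iris_train_deploy_pipeline():
    train_task = train_model_op()
    # train_task.output carries the model_path string returned by train_model
    deploy_task = deploy_model_op(train_task.output)
```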


Step 4: Deploying the Pipeline


Upload the pipeline to your Kubeflow instance. First install the Kubeflow Pipelines SDK:

```sh
pip install kfp
```

Then upload the compiled pipeline from Python:

```python
import kfp

# Pass host='<your-kfp-endpoint>' if running outside the cluster
kfp_client = kfp.Client()
kfp_client.upload_pipeline(pipeline_package_path='iris_pipeline.yaml', pipeline_name='Iris Training Pipeline')
```


Step 5: Running the Pipeline


Once the pipeline is uploaded, you can run it via the Kubeflow dashboard or programmatically.


```python

# Run the pipeline

experiment = kfp_client.create_experiment('Iris Experiment')

run = kfp_client.run_pipeline(experiment.id, 'iris_pipeline_run', 'iris_pipeline.yaml')

```


Step 6: Deploying the Model


Assuming the trained model is saved in a storage bucket, you can create a deployment pipeline to deploy the model to Azure Kubernetes Service (AKS).


Model Deployment Component


Create a file deploy.py:


```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def deploy_model(model_path: str):
    # Imports live inside the function because kfp serializes the
    # function body to run it as a standalone component.
    from kubernetes import client, config

    # Use the in-cluster config when running as a pipeline step;
    # fall back to the local kubeconfig for development. Note the
    # pipeline's service account needs RBAC permission to create Deployments.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()

    # Define deployment specs
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="iris-model-deployment"),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector={'matchLabels': {'app': 'iris-model'}},
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={'app': 'iris-model'}),
                spec=client.V1PodSpec(containers=[client.V1Container(
                    name="iris-model",
                    # Assumes an image that serves the trained model over HTTP
                    image="mydockerhub/iris-model:latest",
                    ports=[client.V1ContainerPort(container_port=80)]
                )])
            )
        )
    )

    # Create the deployment in the default namespace
    apps_v1 = client.AppsV1Api()
    apps_v1.create_namespaced_deployment(namespace="default", body=deployment)


deploy_model_op = create_component_from_func(
    deploy_model,
    base_image='python:3.8-slim',
    packages_to_install=['kubernetes'],
)


@dsl.pipeline(
    name='Iris Deployment Pipeline',
    description='A pipeline to deploy an Iris classification model.'
)
def iris_deploy_pipeline(model_path: str):
    deploy_task = deploy_model_op(model_path)


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(iris_deploy_pipeline, 'iris_deploy_pipeline.yaml')
```


Step 7: Running the Deployment Pipeline


Upload and run the deployment pipeline from Python:

```python
# Upload the deployment pipeline
kfp_client.upload_pipeline(pipeline_package_path='iris_deploy_pipeline.yaml', pipeline_name='Iris Deployment Pipeline')

# Run the deployment pipeline
experiment = kfp_client.create_experiment('Iris Deployment Experiment')
run = kfp_client.run_pipeline(experiment.id, 'iris_deploy_pipeline_run', 'iris_deploy_pipeline.yaml', params={'model_path': '<path-to-your-model>'})
```
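
Once the run completes, a quick sanity check from the command line (the names match the deployment created above):

```sh
kubectl get deployment iris-model-deployment
kubectl get pods -l app=iris-model
```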


Conclusion

This end-to-end example demonstrates setting up a Kubeflow pipeline on Azure, training a model, and deploying it to AKS. Customize the model_path, Docker image, and other specifics as needed for your actual use case.


Deploying a Large Language Model (LLM) involves a few additional steps compared to a general machine learning model. Here’s how you can set up an end-to-end deployment pipeline for an LLM using Kubeflow on Azure, similar to the previous example.


Prerequisites


Ensure you have the necessary tools and environment set up as mentioned in the previous steps, including an Azure account, AKS cluster, and Kubeflow.


Step 1: Setting Up the Environment


Use the same steps as before to install Azure CLI, kubectl, and configure your environment.


Step 2: Deploying Kubeflow on AKS


Follow the official Kubeflow deployment guide for Azure AKS:

[Deploy Kubeflow on Azure AKS](https://www.kubeflow.org/docs/azure/aks/)


Step 3: Creating a Kubeflow Pipeline for LLM


Let's create a pipeline that fine-tunes a Hugging Face LLM and deploys it.


Pipeline Definition


Create a file llm_pipeline.py:


```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def train_llm() -> str:
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )
    from datasets import load_dataset

    # Load dataset
    dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

    # Load model and tokenizer
    model_name = "gpt2"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # GPT-2 has no pad token; reuse the end-of-sequence token for padding
    tokenizer.pad_token = tokenizer.eos_token

    def tokenize_function(examples):
        return tokenizer(examples["text"], padding="max_length", truncation=True)

    tokenized_datasets = dataset.map(tokenize_function, batched=True)
    tokenized_datasets = tokenized_datasets.remove_columns(["text"])
    tokenized_datasets.set_format("torch")

    # The collator builds the labels needed for causal language modeling
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

    # Define training arguments
    training_args = TrainingArguments(
        output_dir="./results",
        evaluation_strategy="epoch",
        learning_rate=2e-5,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        num_train_epochs=3,
        weight_decay=0.01,
    )

    # Create Trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=tokenized_datasets["train"],
        eval_dataset=tokenized_datasets["validation"],
        data_collator=data_collator,
    )

    # Train model
    trainer.train()

    # Save model and tokenizer (this path is local to the container;
    # persist to durable storage in a real pipeline)
    model_path = "/model"
    model.save_pretrained(model_path)
    tokenizer.save_pretrained(model_path)

    return model_path


# Install the training dependencies into the slim base image at runtime
train_llm_op = create_component_from_func(
    train_llm,
    base_image='python:3.8-slim',
    packages_to_install=['transformers', 'datasets', 'torch'],
)


@dsl.pipeline(
    name='LLM Training Pipeline',
    description='A pipeline to train and deploy a Large Language Model.'
)
def llm_pipeline():
    train_task = train_llm_op()


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(llm_pipeline, 'llm_pipeline.yaml')
```
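
Fine-tuning even GPT-2 in a CPU-only python:3.8-slim container will be very slow. If your AKS cluster has a GPU node pool, kfp v1 lets you request a GPU on the training step; a minimal sketch (GPU availability is an assumption about your cluster):

```python
@dsl.pipeline(
    name='LLM Training Pipeline (GPU)',
    description='The same pipeline, requesting a GPU for the training step.'
)
def llm_pipeline_gpu():
    train_task = train_llm_op()
    # Ask Kubernetes for one NVIDIA GPU on the training container (kfp v1 API)
    train_task.set_gpu_limit(1)
```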


Step 4: Deploying the Pipeline


Upload the pipeline to your Kubeflow instance. First install the Kubeflow Pipelines SDK:

```sh
pip install kfp
```

Then upload the compiled pipeline from Python:

```python
import kfp

kfp_client = kfp.Client()
kfp_client.upload_pipeline(pipeline_package_path='llm_pipeline.yaml', pipeline_name='LLM Training Pipeline')
```


Step 5: Running the Pipeline


Once the pipeline is uploaded, run it via the Kubeflow dashboard or programmatically.


```python

# Run the pipeline

experiment = kfp_client.create_experiment('LLM Experiment')

run = kfp_client.run_pipeline(experiment.id, 'llm_pipeline_run', 'llm_pipeline.yaml')

```


Step 6: Deploying the Model


Create a deployment pipeline to deploy the LLM to Azure Kubernetes Service (AKS).


Model Deployment Component


Create a file deploy_llm.py:


```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def deploy_llm(model_path: str):
    # Imports live inside the function because kfp serializes the
    # function body to run it as a standalone component.
    from kubernetes import client, config

    # Use the in-cluster config when running as a pipeline step;
    # fall back to the local kubeconfig for development.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()

    # Define deployment specs; the model directory is mounted from a
    # pre-existing PersistentVolumeClaim named "model-pvc"
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="llm-deployment"),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector={'matchLabels': {'app': 'llm'}},
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={'app': 'llm'}),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(
                        name="llm",
                        image="mydockerhub/llm:latest",
                        ports=[client.V1ContainerPort(container_port=80)],
                        volume_mounts=[client.V1VolumeMount(mount_path="/model", name="model-volume")]
                    )],
                    volumes=[client.V1Volume(
                        name="model-volume",
                        persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(claim_name="model-pvc")
                    )]
                )
            )
        )
    )

    # Create the deployment in the default namespace
    apps_v1 = client.AppsV1Api()
    apps_v1.create_namespaced_deployment(namespace="default", body=deployment)


deploy_llm_op = create_component_from_func(
    deploy_llm,
    base_image='python:3.8-slim',
    packages_to_install=['kubernetes'],
)


@dsl.pipeline(
    name='LLM Deployment Pipeline',
    description='A pipeline to deploy a Large Language Model.'
)
def llm_deploy_pipeline(model_path: str):
    deploy_task = deploy_llm_op(model_path)


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(llm_deploy_pipeline, 'llm_deploy_pipeline.yaml')
```
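
Note that the pod template above mounts a PersistentVolumeClaim named model-pvc, which must exist before the deployment starts. A minimal sketch that creates it with kubectl (the size and access mode are assumptions; use a storage class appropriate to your cluster):

```sh
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
```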


Step 7: Running the Deployment Pipeline


Upload and run the deployment pipeline from Python:

```python
# Upload the deployment pipeline
kfp_client.upload_pipeline(pipeline_package_path='llm_deploy_pipeline.yaml', pipeline_name='LLM Deployment Pipeline')

# Run the deployment pipeline
experiment = kfp_client.create_experiment('LLM Deployment Experiment')
run = kfp_client.run_pipeline(experiment.id, 'llm_deploy_pipeline_run', 'llm_deploy_pipeline.yaml', params={'model_path': '<path-to-your-model>'})
```


Conclusion

This example demonstrates how to create a Kubeflow pipeline for training and deploying a Large Language Model (LLM) on Azure Kubernetes Service (AKS). Adjust the model_path, Docker image, and other specifics as needed for your actual use case. The steps involve setting up the pipeline, running the training, and deploying the trained model, all within the Kubeflow framework.


To deploy containerized LLMs with Kubeflow on Azure, you'll need to follow these steps:


1. Containerize Your LLM: Create a Docker image of your LLM application.

2. Push the Docker Image to a Container Registry: Push the Docker image to Azure Container Registry (ACR) or Docker Hub.

3. Create a Kubeflow Pipeline for Deployment: Define a Kubeflow pipeline to deploy your LLM application using the Docker image.

4. Run the Deployment Pipeline: Execute the pipeline to deploy your LLM application on AKS.


Step 1: Containerize Your LLM


Create a Dockerfile for your LLM application.


Example Dockerfile


```Dockerfile

# Use an official Python runtime as a parent image

FROM python:3.11-slim


# Set the working directory in the container

WORKDIR /app


# Copy the current directory contents into the container at /app

COPY . /app


# Install any needed packages specified in requirements.txt

RUN pip install --no-cache-dir -r requirements.txt


# Make port 80 available to the world outside this container

EXPOSE 80


# Define environment variable

ENV NAME World


# Run app.py when the container launches

CMD ["python", "app.py"]

```
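
The Dockerfile installs dependencies from a requirements.txt that isn't shown here. For the Flask app below, a minimal version might look like this (the exact package list is an assumption; pin versions for reproducible builds):

```text
flask
transformers
torch
```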


Example app.py


```python
from flask import Flask, request, jsonify
from transformers import AutoModelForCausalLM, AutoTokenizer

app = Flask(__name__)

# Load the model and tokenizer once at startup
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)


@app.route('/predict', methods=['POST'])
def predict():
    data = request.json
    inputs = tokenizer.encode(data['text'], return_tensors='pt')
    # Cap generation length so requests return promptly
    outputs = model.generate(inputs, max_new_tokens=50)
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return jsonify({'response': response})


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)
```


Build and Push Docker Image


```sh

# Build the Docker image

docker build -t mydockerhub/llm:latest .


# Push the Docker image to Docker Hub or ACR

docker push mydockerhub/llm:latest

```
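
You can smoke-test the image locally before pushing it; a quick sketch (the host port mapping is arbitrary):

```sh
# Run the container locally, mapping container port 80 to host port 8080
docker run --rm -p 8080:80 mydockerhub/llm:latest

# In another terminal, send a test request
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello, world"}'
```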


Step 2: Push Docker Image to Azure Container Registry


If you prefer to use ACR:


```sh

# Log in to Azure

az login


# Create an ACR if you don't have one

az acr create --resource-group <your-resource-group> --name <your-registry-name> --sku Basic


# Log in to the ACR

az acr login --name <your-registry-name>


# Tag the Docker image with the ACR login server name

docker tag mydockerhub/llm:latest <your-registry-name>.azurecr.io/llm:latest


# Push the Docker image to ACR

docker push <your-registry-name>.azurecr.io/llm:latest

```
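
If you use ACR, also make sure the AKS cluster is allowed to pull from it; one way is to attach the registry to the cluster:

```sh
az aks update --resource-group <your-resource-group> --name <aks-cluster-name> --attach-acr <your-registry-name>
```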


Step 3: Create a Kubeflow Pipeline for Deployment


Create a deployment pipeline to deploy the containerized LLM.


Deployment Component


Create a file deploy_llm.py:


```python
import kfp
from kfp import dsl
from kfp.components import create_component_from_func


def deploy_llm(image: str):
    # Imports live inside the function because kfp serializes the
    # function body to run it as a standalone component.
    from kubernetes import client, config

    # Use the in-cluster config when running as a pipeline step;
    # fall back to the local kubeconfig for development.
    try:
        config.load_incluster_config()
    except config.ConfigException:
        config.load_kube_config()

    # Deployment running the containerized LLM
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="llm-deployment"),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector={'matchLabels': {'app': 'llm'}},
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={'app': 'llm'}),
                spec=client.V1PodSpec(containers=[client.V1Container(
                    name="llm",
                    image=image,
                    ports=[client.V1ContainerPort(container_port=80)]
                )])
            )
        )
    )

    # Service exposing the deployment inside the cluster
    service = client.V1Service(
        metadata=client.V1ObjectMeta(name="llm-service"),
        spec=client.V1ServiceSpec(
            selector={'app': 'llm'},
            ports=[client.V1ServicePort(protocol="TCP", port=80, target_port=80)]
        )
    )

    apps_v1 = client.AppsV1Api()
    core_v1 = client.CoreV1Api()

    apps_v1.create_namespaced_deployment(namespace="default", body=deployment)
    core_v1.create_namespaced_service(namespace="default", body=service)


deploy_llm_op = create_component_from_func(
    deploy_llm,
    base_image='python:3.8-slim',
    packages_to_install=['kubernetes'],
)


@dsl.pipeline(
    name='LLM Deployment Pipeline',
    description='A pipeline to deploy a containerized LLM.'
)
def llm_deploy_pipeline(image: str):
    deploy_task = deploy_llm_op(image=image)


if __name__ == '__main__':
    kfp.compiler.Compiler().compile(llm_deploy_pipeline, 'llm_deploy_pipeline.yaml')
```


Step 4: Run the Deployment Pipeline


Upload and run the deployment pipeline from Python:

```python
import kfp

kfp_client = kfp.Client()

# Upload the deployment pipeline
kfp_client.upload_pipeline(pipeline_package_path='llm_deploy_pipeline.yaml', pipeline_name='LLM Deployment Pipeline')

# Run the deployment pipeline
experiment = kfp_client.create_experiment('LLM Deployment Experiment')
run = kfp_client.run_pipeline(
    experiment.id,
    'llm_deploy_pipeline_run',
    'llm_deploy_pipeline.yaml',
    params={'image': '<your-registry-name>.azurecr.io/llm:latest'}
)
```
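
Once the run completes, you can verify the deployment and try the endpoint; a sketch:

```sh
# Verify the deployment and service created by the pipeline
kubectl get deployment llm-deployment
kubectl get service llm-service

# Forward the service locally and send a test request
kubectl port-forward service/llm-service 8080:80 &
curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello"}'
```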


Conclusion

By following these steps, you can deploy a containerized LLM using Kubeflow on Azure. This process involves containerizing your LLM application, pushing the Docker image to a container registry, creating a deployment pipeline in Kubeflow, and running the pipeline to deploy your LLM application on Azure Kubernetes Service (AKS). Adjust the specifics as needed for your actual use case.

You can get more help here. You can also find many Machine Learning and LLM notebooks, including a few for Kubeflow, here.
