
INTEGRATION OF ML/DL WITH DEVOPS

Wang Sherpa

Data Scientist II @ Amphora

Published: May 30, 2020

TASK-3 MLOPS

CHALLENGES:

MLOps level 0 is common in many businesses that are beginning to apply ML to their use cases. This manual, data-scientist-driven process might be sufficient when models are rarely changed or trained. In practice, models often break when they are deployed in the real world. The models fail to adapt to changes in the dynamics of the environment or changes in the data that describes the environment.

SOLUTION:

To address the challenges of this manual process, MLOps practices for CI/CD and CT are helpful. By deploying an ML training pipeline, you can enable CT, and you can set up a CI/CD system to rapidly test, build, and deploy new implementations of the ML pipeline.

This project is a simple, basic version of the solution described above.

Task Description:

Create a container image that has Python 3 and Keras or NumPy installed.
When this image is launched, it should automatically start training the model in the container.
Create a job chain of Job-1 to Job-5 using the Build Pipeline plugin in Jenkins.
Job-1: Pull the GitHub repository automatically whenever a developer pushes to it.
Job-2: By looking at the code or program file, Jenkins should automatically start the container image that has the matching machine learning software and interpreter installed, deploy the code in it, and start training.
Job-3: Train the model and report its accuracy or other metrics.
Job-4: If the accuracy is less than 80%, tweak the machine learning model architecture.
Job-5: Retrain the model, or notify that the best model has been created.
Job-6 (extra, for monitoring): If the container where the app is running fails for any reason, this job should automatically restart the container from the last trained model.

My Steps involved in achieving the above tasks:

1. I used the PyTorch framework here instead of Keras, starting from a prebuilt PyTorch image available at hub.docker.com. Creating the Dockerfile is very easy with this image.

2. I created two Dockerfiles. The first one builds an image without hyperparameter tuning support:

 FROM pytorch/pytorch

 CMD ["python", "train.py"]

The second one builds an image with hyperparameter tuning support. The training script accepts a boolean command-line argument -t; when it is True, it activates the hyperparameter tuning that I have defined inside the training code:

 FROM pytorch/pytorch

 CMD ["python", "train.py", "-t", "True"]
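The parsing code inside train.py is not shown in this post, so the following is only a minimal sketch of how the -t flag could be read, assuming argparse is used. The names `str_to_bool`, `parse_args`, and the long option `--tune` are hypothetical. The value is converted explicitly because `type=bool` would treat any non-empty string, including "False", as true.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert a command-line string such as 'True' or 'false' to a bool."""
    text = value.strip().lower()
    if text in ("true", "1", "yes"):
        return True
    if text in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_args(argv=None):
    """Parse the assumed command-line interface of train.py."""
    parser = argparse.ArgumentParser(description="Train the CNN")
    parser.add_argument(
        "-t", "--tune",            # --tune is a hypothetical long name
        type=str_to_bool,
        default=False,
        help="enable hyperparameter tuning (True/False)",
    )
    return parser.parse_args(argv)


# Same arguments as the CMD line of the second Dockerfile.
args = parse_args(["-t", "True"])
print("hyperparameter tuning enabled:", args.tune)
```

With no arguments, as in the first Dockerfile, `parse_args([])` leaves tuning disabled.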

3. This step creates a build pipeline of JOB-1 to JOB-5. Final look: [screenshot: build pipeline view]

4. JOB-1: This job pulls the GitHub repo whenever it changes, using a GitHub webhook, and copies the files into the folder /pytorch. Choose the GitHub webhook trigger option in the job configuration and update the GitHub webhook accordingly.

Command to copy all files pulled from GitHub to the /pytorch folder:

 sudo cp -v -r -f * /pytorch

5. JOB-2: This job creates a Docker image based on the code inside train.py. Here it creates a PyTorch image, because my code contains a convolutional neural network implemented in PyTorch.

The job's shell script can be written as below to meet the task requirements:

# Build the PyTorch image only when the code defines a CNN
if grep -q Conv2d /pytorch/network.py
then
  # Reuse the image if it has already been built
  if sudo docker images | grep -q pytorch_train_without_hyper
  then
    echo "Required image already exists! Next job will run a container using this image"
  else
    echo "Creating the required image..."
    if sudo docker build -t pytorch_train_without_hyper /pytorch-dockerfile/dockerfile1/
    then
      echo "Image created successfully"
    else
      echo "Something went wrong while creating the image!"
    fi
  fi
else
  echo "Implement for other types of deep learning and machine learning algorithms using elif statements"
fi
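The same "look at the code, then choose an image" idea can be sketched in Python. This is only an illustration, not part of the Jenkins job: the function name `pick_image` is hypothetical, and the second rule (with its image name `sklearn_train`) is an invented example of how other algorithm types could be added. Only the Conv2d rule comes from the script above.

```python
# Each rule maps a marker found in the source code to a Docker image name.
IMAGE_RULES = [
    ("Conv2d", "pytorch_train_without_hyper"),   # from the JOB-2 script
    ("LinearRegression", "sklearn_train"),       # hypothetical extra rule
]


def pick_image(source_code: str):
    """Return the image for the first marker found in the code, else None."""
    for marker, image in IMAGE_RULES:
        if marker in source_code:   # plain substring match, like grep
            return image
    return None


print(pick_image("self.conv1 = nn.Conv2d(3, 16, 3, padding=1)"))
```

Keeping the rules in a list means a new framework needs one extra line rather than another nested branch.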

6. JOB-3: This job creates and runs a container from the appropriate Docker image. Running the container automatically starts training the network for a set number of epochs, and the test accuracy is saved in the accuracy.txt file inside the same folder. Add the command below under Build -> Execute shell.

This code can also be written as below to meet the task requirement:

# Run the training container only when the code defines a CNN
if grep -q Conv2d /pytorch/network.py
then
  # /pytorch is mounted as the container's working directory
  sudo docker run -v /pytorch:/workspace pytorch_train_without_hyper
else
  echo "Other cases can be implemented in the same way using elif statements and grep"
fi

If everything goes well, the output will look like this: [screenshot: console output]. The accuracy is very low because I trained this model for only 1 epoch.
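The part of the training script that writes accuracy.txt is not shown here. A minimal sketch of that step, assuming the accuracy is stored as a plain percentage on a single line (the function names `accuracy_percent` and `save_accuracy` are hypothetical):

```python
from pathlib import Path


def accuracy_percent(predictions, labels) -> float:
    """Share of predictions that equal their labels, as a percentage."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return 100.0 * correct / len(labels)


def save_accuracy(value: float, path="accuracy.txt") -> None:
    """Write the accuracy with two decimals so a later job can read it."""
    Path(path).write_text(f"{value:.2f}\n")


# Toy class indices, for illustration only: 3 of 4 predictions are correct.
print(accuracy_percent([3, 5, 1, 1], [3, 5, 1, 0]))
```

Because /pytorch is mounted into the container as /workspace, a file written to the working directory there is visible to the next Jenkins job on the host.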

7. JOB-4: This job reads the accuracy saved in accuracy.txt and checks whether it reaches the required 80%. If the accuracy is lower than required, the job builds a new Docker image from the Dockerfile saved at /pytorch-dockerfiles/dockerfile2/Dockerfile, unless that image already exists. The code for this Dockerfile is shown in the earlier steps.

If everything goes well here, the output will look like this: [screenshot: console output]
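The accuracy check itself can be sketched as below. This is an assumption about the logic, not the job's actual script: it supposes accuracy.txt holds one number in percent, and the names `read_accuracy` and `needs_tuning` are hypothetical.

```python
from pathlib import Path

REQUIRED_ACCURACY = 80.0  # percent, the threshold given in the task


def read_accuracy(path="accuracy.txt") -> float:
    """Read the single number stored in the accuracy file."""
    return float(Path(path).read_text().strip())


def needs_tuning(accuracy: float, required: float = REQUIRED_ACCURACY) -> bool:
    """True when the model is below the threshold and should be tuned."""
    return accuracy < required


print(needs_tuning(47.5))   # below the threshold
print(needs_tuning(80.0))   # exactly at the threshold counts as good enough
```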

8. JOB-5: This job trains the network with hyperparameter tuning. The hyperparameters supported for tuning here are the learning rate and the optimizer. I could also have added dropout, the number of epochs, and so on, but I did not find good resources on hyperparameter tuning in PyTorch, so I implemented it using simple for loops over lists of parameter values. My VirtualBox VM cannot use CUDA, so adding many hyperparameters would make training take a very long time; that is why I tune only a few. This job builds only if the current accuracy is less than the required accuracy, which is 80% in this case.

Output if this job builds successfully (this is just a sample output): [screenshot: console output]. The same output is also sent to the owner by email.
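The "simple for loops and list of parameters" approach can be sketched as follows. The real tuning code lives in the training script and is not shown here, so this is an assumption about its shape: `grid_search` is a hypothetical name, and `fake_train_and_evaluate` is a stand-in that returns made-up scores so the example runs without PyTorch or a dataset.

```python
def grid_search(train_and_evaluate, learning_rates, optimizers):
    """Try every optimizer/learning-rate pair and keep the best one."""
    best_accuracy = float("-inf")
    best_params = {}
    for optimizer_name in optimizers:
        for learning_rate in learning_rates:
            accuracy = train_and_evaluate(optimizer_name, learning_rate)
            if accuracy > best_accuracy:
                best_accuracy = accuracy
                best_params = {
                    "optimizer": optimizer_name,
                    "learning_rate": learning_rate,
                }
    return best_accuracy, best_params


def fake_train_and_evaluate(optimizer_name, learning_rate):
    """Stand-in for real training. The scores are invented for the demo."""
    base = {"sgd": 60.0, "adam": 70.0}[optimizer_name]
    return base - abs(learning_rate - 0.001) * 1000


best_accuracy, best_params = grid_search(
    fake_train_and_evaluate,
    learning_rates=[0.01, 0.001],
    optimizers=["sgd", "adam"],
)
print(best_accuracy, best_params)
```

In the real script, `train_and_evaluate` would build the optimizer (for example `torch.optim.SGD` or `torch.optim.Adam`) from the name, train the network, and return the test accuracy. The number of training runs is the product of the list lengths, which is why a CPU-only machine limits how many values can be tried.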

Important note about this job: after it completes successfully, it sends an email to the user/owner regardless of the accuracy reached. The email contains the best accuracy the model has reached so far and the hyperparameters that produced it. Tuning does not guarantee that the model will perform well: the accuracy may increase, but there is no guarantee that any of the hyperparameter combinations reaches the target. I designed it this way to keep the Jenkins jobs from entering an infinite loop.

How would the Jenkins jobs enter an infinite loop? Suppose the jobs are chained so that job2 trains a model and, if the accuracy is not high enough, job4 tunes the hyperparameters and invokes job2 again to retrain. In the worst case, none of the hyperparameter combinations gives the required accuracy, so job2 keeps retraining and job4 keeps invoking job2 indefinitely.
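The pipeline in this post avoids the loop by tuning once and then emailing the result. For comparison, a different way to get the same guarantee is to cap the number of retraining rounds. The sketch below illustrates that alternative and is not what the Jenkins jobs do; `retrain_until_good`, `run_round`, and `max_rounds` are hypothetical names.

```python
def retrain_until_good(run_round, required_accuracy=80.0, max_rounds=3):
    """Retrain at most max_rounds times.

    Returns (best_accuracy, rounds_used, reached_target).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    best = float("-inf")
    for round_number in range(1, max_rounds + 1):
        best = max(best, run_round(round_number))
        if best >= required_accuracy:
            return best, round_number, True
    # The cap was hit: stop and report instead of looping forever.
    return best, max_rounds, False


# Invented accuracies for three rounds that never reach 80%.
scores = [50.0, 60.0, 70.0]
print(retrain_until_good(lambda n: scores[n - 1]))
```

Either way, the owner gets the best result found so far, and the decision to try more hyperparameters stays with a person rather than the pipeline.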

EMAIL EXAMPLE:

9. JOB-6 (Extra Job): This job runs once every week. It sends a POST request to JOB-2 so that the model is retrained weekly.

Command:

curl -X POST "https://192.168.225.38:8080/view/Deep_learning/job/JOB-2/build?token=YourToken" --user "username:password"
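The same request can be built from Python with the standard library. This is a minimal sketch, assuming the Jenkins URL, token, and credentials shown in the curl command; `build_trigger_request` is a hypothetical helper, and the example only constructs the request without sending it.

```python
import base64
import urllib.parse
import urllib.request


def build_trigger_request(job_url, token, username, password):
    """Build a POST request that asks Jenkins to start the given job."""
    query = urllib.parse.urlencode({"token": token})
    credentials = f"{username}:{password}".encode("utf-8")
    # HTTP Basic authentication, the equivalent of curl's --user option
    auth_value = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        f"{job_url}/build?{query}",
        method="POST",
        headers={"Authorization": auth_value},
    )


request = build_trigger_request(
    "https://192.168.225.38:8080/view/Deep_learning/job/JOB-2",
    token="YourToken",
    username="username",
    password="password",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```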

Dataset Used:

Convolutional Neural Networks

In this task, we train a CNN to classify images from the CIFAR-10 database. The images in this database are small color images that fall into one of ten classes. [figure: example images from CIFAR-10]

Model Architecture Used

THANK YOU FOR GIVING YOUR PRECIOUS TIME

GitHub Link -> click here
