
MLOps : Integrating Machine Learning with DevOps

Priyanshi Kaila

SDE-II @ Walmart Global Tech. India

Published: May 26, 2020

Around 90% of machine learning models are built but never deployed. Integrating Machine Learning with DevOps offers a way to overcome this failure. In Machine Learning we generally have to find hyper-parameters for training a model, but finding them manually is tiring because it is purely a trial-and-error process, and sometimes we still do not reach the desired accuracy even after a large number of trials. That is why most models fail.

In Machine Learning, a hyperparameter is a parameter whose value is set before the learning process begins.

This is an article taken from a famous website of data science.

I have created a small project that integrates Machine Learning with DevOps to automate the model training process. The pipeline performs the trial and error over the hyper-parameters on its own and keeps the whole process automatic until a desired accuracy is achieved. I have used a Convolutional Neural Network (CNN) for training my model.

The architecture of a CNN is depicted in the picture below.

Here is the link to my GitHub repository.

OUTLINE

1. Create a container image, using a Dockerfile, that has everything required for training your model already installed.

2. Create a job chain of Job1, Job2, Job3, Job4 and Job5 using the Build Pipeline plugin in Jenkins.

3. Job1 : Pull the GitHub repo automatically when a developer pushes to GitHub.

4. Job2 : By looking at the code or program file, Jenkins should automatically start the container whose image has the matching machine learning software and interpreter installed, deploy the code and start training (e.g. if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing installed).

5. Job3 : Train your model and report its accuracy or metrics.

6. Job4 : If the accuracy is less than 80%, tweak the machine learning model architecture.

7. Job5 : Retrain the model, or notify that the best model has been created.

8. Create one extra job, Job6, for monitoring: if the container where the model is being trained fails for any reason, this job should automatically start the container again from where the last trained model left off.

DESCRIPTION

I have taken a Dog and Cat dataset for training my model. The dataset can be anything, depending on what you want your machine learning model to be trained for.

You have to create a container image which contains all the software and libraries necessary to train your model. I have used a CNN model and created the container image for it with a Dockerfile.

After you have written the Dockerfile, you have to build the image from it with the docker build command.

Create a Job chain of Job1, Job2, Job3, Job4 and Job5 using Build Pipeline Plugin in Jenkins.

JOB 1 : Pulling the GitHub repo automatically when a developer pushes to GitHub.

To perform this action automatically, Jenkins has to be reachable from GitHub at a public address. If you have a public IP you can use it; if not, you can use a tool such as ngrok to expose Jenkins through a public URL.

Create a webhook in the GitHub repository to which the developer pushes the code.

Create a Job in Jenkins which takes the code from the GitHub repo when the developer pushes it and copies the code to a folder on our OS.

JOB 2 : By looking at the code or program file, Jenkins should automatically start the respective container

This Job is triggered when the first Job is successfully built.

You have to check the code to know which kind of model the developer has uploaded. To do this, you can search the uploaded file for a keyword that is used only by a specific kind of model. I used the keyword 'Convolution2D' because I used a CNN model.
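The keyword check described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_image` and the image name `cnn-train:v1` are hypothetical, and only the keyword 'Convolution2D' comes from the project itself.

```python
from pathlib import Path

# Hypothetical mapping from a keyword in the source code to a container image.
# 'Convolution2D' is the keyword used in the article; the image name is an assumption.
KEYWORD_TO_IMAGE = {
    "Convolution2D": "cnn-train:v1",
}


def pick_image(code_path):
    """Return the image for the first keyword found in the file, or None."""
    source = Path(code_path).read_text(encoding="utf-8")
    for keyword, image in KEYWORD_TO_IMAGE.items():
        if keyword in source:
            return image
    return None
```

A Jenkins build step could call `pick_image` on the file copied by Job 1 and start a container from the returned image. More model types can be supported by adding entries to the mapping.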

JOB 3 : Train your model and predict accuracy or metrics.

This job runs only after the previous Job is built.

You have to start training your model inside the container, then extract the accuracy of the trained model for further use.

My model achieved an accuracy of around 65% with the hyper-parameters that the developer had given in the code.
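One way to make the accuracy available to the later jobs is to write it to a file at the end of training. The sketch below assumes the training script has a dict shaped like the `history` attribute of a Keras `History` object (for example `{"accuracy": [0.52, 0.61]}`, with the key `"acc"` in older Keras versions). The file name `accuracy.txt` and both function names are hypothetical.

```python
from pathlib import Path


def save_accuracy(history, out_path="accuracy.txt"):
    """Write the last-epoch accuracy as a percentage and return it.

    `history` is assumed to be a dict of per-epoch metric lists.
    """
    values = history.get("accuracy") or history.get("acc")
    if not values:
        raise ValueError("no accuracy values found in history")
    percent = round(values[-1] * 100, 2)
    Path(out_path).write_text(f"{percent}\n", encoding="utf-8")
    return percent


def read_accuracy(path="accuracy.txt"):
    """Read back the percentage written by save_accuracy."""
    return float(Path(path).read_text(encoding="utf-8").strip())
```

The next job can then call `read_accuracy` instead of parsing the training log.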

JOB 4 : If the accuracy is less than 80%, tweak the machine learning model architecture.

Job4 runs after a successful build of Job3.

You have to compare the accuracy of the trained model with the percentage you want your model to achieve. If the accuracy is below the desired percentage, the job tweaks the model, i.e. the model is trained again with different hyper-parameters, and this process continues until the desired accuracy is achieved. My desired accuracy was 80%, so I compared against 80.

I have changed the hyper-parameters only once, but you can change them as many times as you want and in whatever way suits you. All the conditions are written once, and the model is tweaked according to them until the desired accuracy is achieved.
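The compare-and-tweak step could look roughly like the following. The 80% threshold comes from the article, but the tweak rule itself (five more epochs, twice the filters up to a cap of 256) and the keys `epochs` and `filters` are illustrative assumptions, not the rule used in the original project.

```python
TARGET_ACCURACY = 80.0  # percent, the threshold used in the article


def next_hyperparameters(current, accuracy, target=TARGET_ACCURACY):
    """Return tweaked hyper-parameters, or None when the target is already met.

    The tweak rule below is an assumption made for illustration only.
    """
    if accuracy >= target:
        return None
    tweaked = dict(current)  # copy so the caller's dict is left untouched
    tweaked["epochs"] = current["epochs"] + 5
    tweaked["filters"] = min(current["filters"] * 2, 256)
    return tweaked
```

Calling this function in a loop, retraining after each call, repeats the tweak until it returns `None`.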

Here I have made the job fail once the accuracy is greater than 80%, so that it does not go back to the third job, i.e. it does not tweak again.
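This inverted convention (a failed build means "stop tweaking") can be expressed as an exit code. The sketch assumes the accuracy was stored as a plain percentage in a text file; the function name and the file layout are hypothetical.

```python
from pathlib import Path


def exit_code_for(accuracy_file, target=80.0):
    """Return 1 (build fails) once the target is reached, else 0 (build succeeds).

    This mirrors the article's setup, where a failing Job 4 ends the tweak loop.
    """
    accuracy = float(Path(accuracy_file).read_text(encoding="utf-8").strip())
    return 1 if accuracy >= target else 0
```

A script run by the Jenkins job could pass the result to `sys.exit`, since Jenkins marks a shell build step as failed when the command exits with a non-zero status.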

I got an accuracy of around 81% by using the new hyper-parameters.

JOB 5 : Retrain the model, or notify that the best model has been created

This Job runs only when Job 4 fails; otherwise the pipeline keeps tweaking and goes back to Job 3.

You can send an email to the developer, or to whomever you wish, saying that the model has been trained successfully with an accuracy of more than 80%.
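A notification mail of this kind can be built with Python's standard library. This is a minimal sketch, not the mechanism used in the original project: the function names, subject line and message wording are made up for illustration, and the SMTP host, port and credentials are assumptions you would replace with your own mail server's details.

```python
import smtplib
from email.message import EmailMessage


def build_notification(accuracy, sender, recipient):
    """Build the 'model trained' email; the wording is illustrative."""
    msg = EmailMessage()
    msg["Subject"] = f"Model trained: accuracy {accuracy:.2f}%"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Training finished. The model reached an accuracy of {accuracy:.2f}%."
    )
    return msg


def send_notification(msg, host, port=587, user=None, password=None):
    """Send the message over SMTP with STARTTLS; connection details are assumptions."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        if user:
            server.login(user, password)
        server.send_message(msg)
```

Keeping message construction separate from sending makes it possible to check the message text without contacting a mail server.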

As the achieved accuracy was more than 80%, the mail was sent.

JOB 6 : If the container where the model is being trained fails for any reason, this job should automatically start the container again from where the last trained model left off.

You can also create a Job that monitors the container and restarts it whenever it fails for any reason. You can schedule this Job to build periodically, i.e. you can set how often the Job goes to the container and checks whether it is working.
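The monitoring check could be sketched as below, using the `docker inspect` and `docker start` commands through `subprocess`. The function names and the container name are hypothetical. The `run` parameter exists only so the logic can be exercised without Docker installed.

```python
import subprocess


def is_running(name, run=subprocess.run):
    """Return True if `docker inspect` reports the container as running."""
    result = run(
        ["docker", "inspect", "-f", "{{.State.Running}}", name],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def ensure_running(name, run=subprocess.run):
    """Start the container if it is not running; return True if a start was issued."""
    if is_running(name, run):
        return False
    run(["docker", "start", name], capture_output=True, text=True)
    return True
```

Note that `docker start` only brings back a stopped container that still exists. Resuming from the last trained model additionally requires the training script to save its progress somewhere that survives the restart, which this sketch does not cover.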

CONCLUSION

You can make the training of your machine learning model, especially a model that requires hyper-parameter tuning, fully automatic and achieve your desired accuracy without manually changing the hyper-parameters again and again.

Thank you for reading ;)
