Automating Machine Learning Model Training with DevOps: Increasing Model Accuracy by Tweaking Hyperparameters Automatically
INTRODUCTION:
>>What is Machine Learning??
Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks.
>>What is DEVOPS?
DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality.
>>What are Hyperparameters??
In machine learning, a hyperparameter is a parameter whose value is used to control the learning process (e.g., the learning rate, the number of epochs, or the batch size). By contrast, the values of other parameters (typically node weights) are derived via training.
Task Description :
1. Create a container image that has Python3 and Keras or NumPy installed, using a Dockerfile.
2. When we launch this image, it should automatically start training the model in the container.
3. Create a job chain of Job1, Job2, Job3, Job4, and Job5 using the Build Pipeline plugin in Jenkins.
4. Job1: Pull the GitHub repo automatically when a developer pushes to GitHub.
5. Job2: By looking at the code or program file, Jenkins should automatically start the container image that has the respective machine learning software and interpreter installed, deploy the code, and start training (e.g., if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing).
6. Job3: Train the model and report its accuracy or metrics.
7. Job4: If the accuracy metric is less than 80%, tweak the machine learning model architecture.
8. Job5: Retrain the model, or notify that the best model has been created.
9. Create one extra job, Job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically restart the container from where the last trained model left off.
Pre-Requisites:
-Red Hat Linux on a virtual machine
-Jenkins and Docker should be installed on the machine, and their services should be active
-ngrok should be installed on your system
-Here, I have implemented a CNN and used my own choice of data for the model-training code: the Dogs vs. Cats dataset from the Kaggle site https://www.kaggle.com/c/dogs-vs-cats. You will also need some additional software, such as Git Bash, installed on your Windows system.
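As a quick sanity check for these prerequisites, a small script like the sketch below (assuming a systemd-based Red Hat system; service names may differ on yours) can confirm that the Jenkins and Docker services are active:

```shell
#!/bin/bash
# Pre-flight check: confirm the Docker and Jenkins services are active.
# Assumes systemd; adjust the service names if yours differ.
for svc in docker jenkins; do
    if systemctl is-active --quiet "$svc" 2>/dev/null; then
        echo "$svc is active"
    else
        echo "$svc is NOT active -- try: systemctl start $svc"
    fi
done
```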
PROCESS:
1: Create a GitHub repository, set it up in your Git Bash environment, and change the post-commit hook so that it pushes code automatically when you commit. You can check this step in one of my previous articles; here's the link:
https://www.dhirubhai.net/pulse/integrating-jenkinsdocker-git-hub-creating-automated-web-patnaik
and check my GitHub repo for the files I have uploaded (mail.py, code.py, accuracy.txt). GitHub repo link: https://github.com/Pheonix-reaper/Task3_MLOPS
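The post-commit hook mentioned in step 1 can be as small as the sketch below (the branch name `master` is an assumption; use your repo's default branch):

```shell
#!/bin/sh
# Sample .git/hooks/post-commit -- pushes every commit automatically.
# Save it as .git/hooks/post-commit and run: chmod +x .git/hooks/post-commit
git push origin master 2>/dev/null \
    || echo "post-commit: push skipped (no remote configured here)"
```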
2: Open Jenkins in a browser on Windows, using your Red Hat IP and the port Jenkins is allotted. Start ngrok and give it a tunnel to port 80; we get a public link. Use that link as a webhook for our GitHub repo (with the Jenkins GitHub plugin, the webhook endpoint is typically <ngrok-link>/github-webhook/), so that our Red Hat system and GitHub are connected.
3: In your Red Hat system, create a new directory and, inside it, create a Dockerfile. Specify the libraries and tools you need in your Docker OS for this task.
Now create an image from it using the docker build command: docker build -t <imagename>:<tag> <path of dockerfile>
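As an illustration, a minimal Dockerfile for this step could look like the sketch below, written out via a heredoc so the build command can follow it. The base image, package list, and tag (mlops_image:v1) are all assumptions; pick whatever your training code actually needs:

```shell
#!/bin/bash
# Write a minimal Dockerfile for a Python3 + Keras training container.
# Base image and packages are assumptions, not the article's exact file.
cat > Dockerfile <<'EOF'
FROM centos:7
RUN yum install -y python3
RUN pip3 install numpy keras tensorflow pillow
WORKDIR /mlops
CMD ["python3", "code.py"]
EOF
# Build the image (the tag is an example):
docker build -t mlops_image:v1 . 2>/dev/null \
    || echo "docker build skipped (Docker not available here)"
```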
4: Now it's time to create all the jobs in Jenkins. Create Job1 and add a Git SCM trigger so that code is automatically copied into our Red Hat system from the repository. Its build step copies the code to /mlops/ and appends a few lines to code.py so that the final accuracy is written to accuracy.txt:
Code:-
sudo cp * /mlops/
sudo sed -i '$ a\accuracy= int("%2d" % (model.history.history["acc"][no_of_epochs - 1]*100))\nfile1=open("accuracy.txt","w")\nfile1.write("{}".format(accuracy))' /mlops/code.py
5: Create another job, Job2, which checks whether the code we have pushed to GitHub is CNN code. If it is, it automatically launches a Docker OS for us for model training; if it is not, it shows an error message.
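A minimal sketch of what Job2's shell step could look like. The grep-based check, the file path, and the image/container names are my assumptions, not the article's exact script:

```shell
#!/bin/bash
# Job2 sketch: decide whether the pushed code is CNN code and, if so,
# start the matching training container.
detect_cnn() {
    # Treat the file as CNN code if it uses a Keras convolution layer.
    grep -Eq 'Conv2D|Convolution2D' "$1"
}

CODE_FILE="${CODE_FILE:-/mlops/code.py}"
if [ -f "$CODE_FILE" ] && detect_cnn "$CODE_FILE"; then
    echo "CNN code detected -- launching training container"
    docker run -dit --name mlops_container -v /mlops:/mlops mlops_image:v1 \
        || echo "could not start container (is Docker available?)"
else
    echo "ERROR: $CODE_FILE is not CNN code (or does not exist)"
fi
```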
6: Create another job, Job3, which trains the model, finds the accuracy, and stores it in a file called accuracy.txt.
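Job3's build step could be as simple as the following sketch. It assumes the container from Job2 is named mlops_container and that /mlops is bind-mounted from the host, so the accuracy.txt written by code.py lands on the host as well:

```shell
#!/bin/bash
# Job3 sketch: run the training script inside the container. Because
# /mlops is a bind mount, the accuracy.txt that code.py writes is also
# visible on the host, from where it is copied for Job4 to read.
docker exec -w /mlops mlops_container python3 code.py \
    && cp /mlops/accuracy.txt /root/accuracy.txt \
    || echo "training skipped (container not running on this machine)"
```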
7: Install the Downstream plugin to get the extended "build project" action. Create a job, Job4, which checks whether the model has the accuracy we need. If yes, it executes Job5, which sends a mail to the developer; if not, we tweak the hyperparameters and call Job2 again for model training.
Code:
acc=$(sudo cat "/root/accuracy.txt")
acc_req=85
if [ $acc -ge $acc_req ]
then
    echo "Accuracy is good"
    exit 1
else
    sudo sed -i 's/no_of_epochs = 1/no_of_epochs = 3/' /mlops/code.py
    sudo sed -i '/Flatten()/i \model.add(Convolution2D(filters=32,kernel_size=(k_size,k_size),activation="relu"))\nmodel.add(BatchNormalization())\nmodel.add(MaxPooling2D(pool_size= p_size))\nmodel.add(Dropout(0.25))' /mlops/code.py
    sudo sed -i '/sigmoid/i \model.add(Dense(units=256, activation="relu"))\nmodel.add(BatchNormalization())\nmodel.add(Dropout(0.25))' /mlops/code.py
fi
8: Create a job, Job5, which sends a mail to the developer when the model reaches the required accuracy.
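The article's repo uses a mail.py for this notification; an equivalent shell-only sketch (the mail/mailx command and the recipient address are assumptions) could look like:

```shell
#!/bin/bash
# Job5 sketch: mail the developer the final accuracy from /root/accuracy.txt.
# The recipient address is a placeholder.
ACC=$(cat /root/accuracy.txt 2>/dev/null || echo "unknown")
echo "Best model created -- final accuracy: ${ACC}%" \
    | mail -s "MLOps: model training finished" developer@example.com \
    || echo "mail unavailable; fall back to: python3 /mlops/mail.py"
```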
9: Create a job, Job6, which monitors the container we have created. If the container gets destroyed or stops working, it deploys the container again.
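A sketch of Job6's monitor step, under the same naming assumptions as the earlier steps. Because /mlops lives on the host, whatever the model last saved there survives a container crash, which is what lets the redeployed container pick up from the last trained state:

```shell
#!/bin/bash
# Job6 sketch: restart the training container if it has died. The /mlops
# bind mount persists on the host, so saved model state survives a crash.
container_running() {
    docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$1"
}

NAME="${NAME:-mlops_container}"
if container_running "$NAME"; then
    echo "$NAME is running -- nothing to do"
else
    echo "$NAME is down -- redeploying"
    docker start "$NAME" 2>/dev/null \
        || docker run -dit --name "$NAME" -v /mlops:/mlops mlops_image:v1 \
        || echo "could not redeploy $NAME (is Docker available?)"
fi
```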
10: Install the Build Pipeline plugin and create a pipeline for the jobs we have created:
You can see that after the Docker container is created in Job2, if the container fails at any point from Job3 to Job4, Job6 is executed to deploy the container again.
With this, we have completed the task. It is a very powerful setup, as it automatically finds the best configuration for creating an efficient model. This is, in a very basic form, how Amazon SageMaker works; although SageMaker is far more sophisticated, the working principle is almost the same.
I have explained every step I took for this task and why it matters. If you follow all the steps I have mentioned, you can create this setup too.