MLOps - Automated Tuning: Automating Machine Learning using DevOps for Tuning Hyperparameters
Machine learning is the trend of the time, one that almost everyone in CS/IT knows about or aspires to learn. But a key problem in creating the best model is deciding the hyperparameters for it.
What are Hyperparameters?
A hyperparameter is a parameter that is set before the learning process begins, such as the number of layers or the number of neurons in a layer.
These parameters have a direct impact on the accuracy of the model. Since they are set by us and not learned automatically by the machine, it is absolutely necessary to select the best hyperparameter values so as to increase the overall accuracy of the model.
(Please check the cover image for a better understanding)
Choosing the best values is not easy, especially in deep learning, where many trials are needed to find the best ones for a model; doing this manually is both tiring and time-consuming. To solve this problem, our mentor Vimal Daga Sir gave us a unique project that required using DevOps to automate the trial-and-error process and thus select the best hyperparameters.
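To see why manual trial and error quickly becomes tedious, here is a minimal sketch (plain Python, with purely illustrative hyperparameter ranges that are my own assumption, not values from this project) of how fast the number of combinations grows:

```python
from itertools import product

# Hypothetical search ranges for a small CNN (values are illustrative only)
n_filters = [32, 64, 128]
filter_sizes = [3, 5, 7]
pool_sizes = [2, 3]
fc_neurons = [64, 128, 256, 512]

# Every combination is one full training run of the model
combinations = list(product(n_filters, filter_sizes, pool_sizes, fc_neurons))
print(len(combinations))  # 3 * 3 * 2 * 4 = 72 trainings for just one layer
```

Even these tiny ranges demand dozens of full trainings for a single layer, which is exactly the work the setup below hands over to Jenkins.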
Now that we are clear about the objective of the task, let's begin with the procedure to implement it.
I have used 6 jobs in Jenkins to implement this, which are as follows:
Job 1 : Pull GitHub Code
When the developer pushes any code to GitHub, this job copies it into the local repository on our system. For this, I have used Poll SCM to keep checking the remote repository for changes.
sudo cp -v * /home/jyotirmaya/ws/mlops1
The above command transfers the code files copied to Jenkins from GitHub into my local repository /home/jyotirmaya/ws/mlops1.
Job 2 : See Code and Launch
It will do the following tasks :
1) Check whether the code (stored in program.py) is a CNN or not (checked using the program checkcode.py)
2) If the code is a CNN, launch its container from the image (convoimage:v13) created using the Dockerfile.
checkcode.py, as you can see below, is an extremely simple script based on the observation that any Keras CNN model will definitely contain two words that implement its modules: keras and Conv2D. If these words are present, the program prints kerasCNN.
programfile = open('/home/jyotirmaya/ws/mlops1/program.py', 'r')
code = programfile.read()
if 'keras' in code or 'tensorflow' in code:
    if 'Conv2D' in code or 'Convolution' in code:
        print('kerasCNN')
    else:
        print('not kerasCNN')
else:
    print('not deep learning')
As you can see in the image below, I compared the output of the above program in Jenkins and launched my container using the image created from the Dockerfile.
Here is my Docker File Code
The Dockerfile, as you can see here, has just a few lines of code, but the image it builds is 2.62 GB in size, and it took me 13 versions to arrive at one that satisfied my requirements.
The line below runs "python3 /mlops/program.py" as soon as the container is launched, where /mlops/ is the path of the code file program.py inside the Docker container. The directory was created using Docker's volume-mounting feature, as the mlops folder is linked to the local repository on the base OS.
CMD [ "python3","/mlops/program.py" ]
The program.py code was actually LeNet for the MNIST dataset, but I modified the layers part. The configuration of the Convolution and Fully Connected layers, which are in fact the hyperparameters, is now read from a file, input.txt.
convlayers = int(input())
first_layer_nfilter = int(input())
first_layer_filter_size = int(input())
first_layer_pool_size = int(input())
model.add(Conv2D(first_layer_nfilter, (first_layer_filter_size, first_layer_filter_size), padding = "same", input_shape = input_shape))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (first_layer_pool_size, first_layer_pool_size)))

# Subsequent CRP sets
for i in range(1, convlayers):
    nfilters = int(input())
    filter_size = int(input())
    pool_size = int(input())
    model.add(Conv2D(nfilters, (filter_size, filter_size), padding = "same"))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size = (pool_size, pool_size)))

# Fully connected layers (w/ RELU)
model.add(Flatten())
fc_input = int(input())
for i in range(0, fc_input):
    no_neurons = int(input())
    model.add(Dense(no_neurons))
    model.add(Activation("relu"))
I did this because in subsequent runs, when the hyperparameters need to be tweaked to improve accuracy, Jenkins can simply run a program (tweaker.py) that changes the contents of the input file; the hyperparameters change without touching the main code file.
The program tweaker.py is, for me, the soul of this setup, and it is where I began the entire project. Job 4 contains a proper explanation of what it actually does.
Job 3 : Predict Accuracy
The task it performs is very simple: the accuracy, along with the setup details, is deployed on the Apache web server so that the user can directly access and see it at the following URL:
IPofSystem/display_matter.html
sudo cp /home/jyotirmaya/ws/mlops1/display_matter.html /var/www/html
Just write the above command in the Execute Shell of Jenkins job 3.
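How display_matter.html gets its contents is not shown in the article; here is a minimal sketch, assuming program.py writes the accuracy to accuracy.txt and a small helper renders the page. Only the file names accuracy.txt and display_matter.html come from the article; the rendering code itself is my assumption:

```python
# Hypothetical helper: render the latest accuracy as a simple HTML page
def render_page(accuracy_file, html_file):
    with open(accuracy_file) as f:
        accuracy = f.read().strip()
    html = "<html><body><h1>Model accuracy: {}</h1></body></html>".format(accuracy)
    with open(html_file, "w") as f:
        f.write(html)
    return html

# Example: record an accuracy, then render the page job 3 will copy to Apache
with open("accuracy.txt", "w") as f:
    f.write("0.9921")
print(render_page("accuracy.txt", "display_matter.html"))
```

Job 3 then only has to copy the resulting file into /var/www/html, as the shell command above shows.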
Job 4 : Analyse Accuracy and move
This job performs the following tasks :
1) Checks the accuracy; if it is less than required, tweaks the hyperparameters using tweaker.py and triggers job 2 (See Code and Launch) again to launch the container and run the model once more.
2) If the accuracy requirement is met, triggers job 5 (Model Create Success).
Now let's see how tweaker.py tweaks the hyperparameters...
When tweaker.py is called, it compares the old accuracy (initially 0) with the new accuracy obtained from running the container. If the accuracy has increased, it increases the value of the first hyperparameter (here, the number of filters) of the base convolution layer.
It also replaces the old accuracy in the data.txt file with the new one, for use in the next build's comparison.
As soon as the hyperparameter value is changed, job 2 is re-run to measure the accuracy again.
Now, if the accuracy has increased again, the increase was good and the value can be pushed further, so tweaker.py increases that parameter's value once more.
But if it finds that the accuracy has decreased, tweaker.py reverts the parameter to its previous value and starts changing the value of the next hyperparameter (in our case, the filter size).
On every call, it repeats this process until no more hyperparameters in that layer can be increased; when that happens, it adds another layer and repeats the whole process in the new layer.
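The original tweaker.py appears only in images, but the greedy, one-parameter-at-a-time strategy described above can be sketched as follows. The step sizes, parameter names, and helper function here are my assumptions, not the original code:

```python
# Hypothetical sketch of tweaker.py's strategy: keep pushing one
# hyperparameter while accuracy improves; revert and move to the
# next hyperparameter as soon as accuracy drops.
STEP = {"nfilters": 32, "filter_size": 2, "pool_size": 1}
ORDER = ["nfilters", "filter_size", "pool_size"]

def tweak(params, old_acc, new_acc, current):
    """params: the current layer's hyperparameters.
    current: index into ORDER of the parameter being tuned.
    Returns the updated (params, current) pair."""
    name = ORDER[current]
    if new_acc >= old_acc:
        # Accuracy improved (or first run): push the same parameter further
        params[name] += STEP[name]
    else:
        # Accuracy dropped: revert and move on to the next hyperparameter
        params[name] -= STEP[name]
        current += 1
    return params, current

params = {"nfilters": 32, "filter_size": 3, "pool_size": 2}
params, current = tweak(params, 0.0, 0.91, 0)         # improved: grow nfilters
print(params["nfilters"])                              # 64
params, current = tweak(params, 0.91, 0.88, current)   # dropped: revert, next param
print(params["nfilters"], ORDER[current])              # 32 filter_size
```

In the real pipeline, the updated values would be written back to input.txt and the new accuracy to data.txt, and job 2 would be triggered to evaluate them.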
Here are the detailed images to support the written matter.
Above was the explanation of how tweaker.py actually works.
Now let's see the code used to implement this job in Jenkins.
The code shown in the image above is:
if [[ "$(sudo cat /home/jyotirmaya/ws/mlops1/accuracy.txt)" < "0.9999999" ]]
then
    echo "Tweaking The program"
    sudo python3 /home/jyotirmaya/ws/mlops1/tweaker.py
    curl 192.168.43.250:8080/view/Integrate%20Machine%20Learning%20with%20Jenkins/job/See%20code%20and%20Launch/build?token=tweakedNowRun
else
    echo "Merge and Email"
    curl 192.168.43.250:8080/job/Model%20Create%20Success/build?token=modelCreateSuccess
fi
Here 0.9999999 is the target accuracy the model must achieve to be accepted as successful.
The first curl command triggers job 2, since the hyperparameters have been tweaked and are ready to be tested.
The second curl command triggers job 5 on successful model creation.
Job 5 : Model Create Success
This job is triggered when the required accuracy is met; the input file is mailed to the developer so that they know the correct hyperparameter values.
sudo python3 /home/jyotirmaya/ws/mlops1/email.py
Just write the above command in the Execute Shell of the Jenkins job 5.
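email.py is not reproduced in the article; here is a minimal sketch of what it might look like using Python's standard email and smtplib modules. The addresses and SMTP server are placeholders, and the actual sending is commented out so the message can be built and inspected without a mail server:

```python
import smtplib
from email.message import EmailMessage

def build_mail(input_path, sender, recipient):
    # Attach the tuned hyperparameter file so the developer can read the values
    msg = EmailMessage()
    msg["Subject"] = "Model created successfully: tuned hyperparameters attached"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("The model met the target accuracy. Hyperparameters attached.")
    with open(input_path, "rb") as f:
        msg.add_attachment(f.read(), maintype="text", subtype="plain",
                           filename="input.txt")
    return msg

if __name__ == "__main__":
    msg = build_mail("/home/jyotirmaya/ws/mlops1/input.txt",
                     "jenkins@example.com", "developer@example.com")
    # Sending is environment-specific; with an SMTP-over-SSL server it would be:
    # with smtplib.SMTP_SSL("smtp.example.com", 465) as s:
    #     s.login("jenkins@example.com", "app-password")
    #     s.send_message(msg)
```

The attached input.txt holds exactly the values that produced the accepted model, which is what the developer needs to reproduce it.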
Job 6 : Restart Docker
This is a monitoring job, triggered when job 2 fails, i.e. when for any reason the container kerasCNNos fails to complete execution.
It restarts the Docker engine completely to make sure it is working fine, because in our setup this is the most likely cause of a job 2 failure.
And then it triggers job 2 once again.
Using the above setup, I achieved an accuracy of 99.21% with 5 epochs per training run, within the few hours before I manually stopped the builds.
No. of convolve layers : 2
Layer 1
No of filters : 128
Filter Size : 7
Pool Size : 2
Layer 2
No of filters : 2048
Filter Size : 2
Pool Size : 2
No. of FC Layers : 1
Neurons in Layer 1 : 10
Accuracy achieved : 0.9921000003814697
My GitHub Link for the above codes : https://github.com/JyotirmayaV/mlops1/tree/developer
Thank you for reading. This project was truly a great learning experience and taught me many things, especially the real meaning of MLOps.