MLOps - Automated Tuning: Automating Machine Learning using DevOps for Tuning Hyperparameters
https://deepai.org/machine-learning-glossary-and-terms/hyperparameter


Machine Learning is the trend of the time, something almost every CS/IT person knows about or aspires to learn. But the hard part of building the best model is deciding the hyperparameters for it.

What are Hyperparameters?

A hyperparameter is a parameter whose value is set before the learning process begins, such as the number of layers, the number of neurons in a layer, and so on.

These parameters have a direct impact on the accuracy of the model. Since they are set by us and not decided automatically by the machine, it becomes an absolute necessity to select the hyperparameter values that give the best overall accuracy.

(Please check the cover image for a better understanding)
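To make this concrete, here is a minimal Keras sketch (the values are purely illustrative): the number of filters, the filter size, the pool size and the number of neurons in the dense layer are all hyperparameters that we fix before training ever starts.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation

# These values are hyperparameters - chosen by us before training begins
n_filters = 32          # number of filters in the convolve layer
filter_size = 3         # size of each filter
pool_size = 2           # pooling window size
n_neurons = 128         # neurons in the fully connected layer

model = Sequential()
model.add(Conv2D(n_filters, (filter_size, filter_size),
                 padding="same", input_shape=(28, 28, 1)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))
model.add(Flatten())
model.add(Dense(n_neurons))
model.add(Activation("relu"))
model.add(Dense(10, activation="softmax"))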

Choosing the best values for these is not easy, especially in deep learning: it takes many trials to find the ones that finally work best for our model, and doing this manually is both tiring and time-consuming. So, to solve this problem, our mentor Vimal Daga Sir gave us a unique project that requires using DevOps to automate the trial-and-error process and thereby select the best hyperparameters.

Now that we are clear about the objective of the task, let's begin with the procedure to implement it.

I have used 6 Jenkins jobs to implement this, which are as follows:

Job 1 : Pull GitHub Code

Whenever the developer pushes any code to GitHub, this job copies that code into the local repository on our system. For this I have used Poll SCM, which keeps checking the remote repository for changes.

sudo cp -v * /home/jyotirmaya/ws/mlops1

The above command transfers the code files that Jenkins copied from GitHub into my local repository /home/jyotirmaya/ws/mlops1.

Job 2 : See Code and Launch

It performs the following tasks:

1) Check whether the code (stored in program.py) is a CNN or not (this is done by the program checkcode.py).

2) If the code is a CNN, launch its container from the image (convoimage:v13) built from the Dockerfile.

checkcode.py, shown below, is an extremely simple program built on the observation that any Keras CNN model will definitely contain two words that bring in its modules: keras and Conv2D. If these words are present, the program prints kerasCNN.

# checkcode.py : check whether program.py contains a Keras CNN
programfile = open('/home/jyotirmaya/ws/mlops1/program.py', 'r')
code = programfile.read()
programfile.close()

if 'keras' in code or 'tensorflow' in code:
	if 'Conv2D' in code or 'Convolution' in code:
		print('kerasCNN')
	else:
		print('not kerasCNN')
else:
	print('not deep learning')

In this Jenkins job, I compared the output of the above program and, when it printed kerasCNN, launched my container using the image I created from the Dockerfile.


My Dockerfile has just a few lines of code, but the image built from it is 2.62 GB in size, and it took me 13 versions to arrive at the one that satisfied my requirements.

The line below runs "python3 /mlops/program.py" as soon as the container is launched; /mlops/ is the path where program.py can be accessed inside the Docker container. That directory exists thanks to Docker's volume-linking feature: the mlops folder inside the container is linked to the local repository on the base OS.

CMD [ "python3","/mlops/program.py" ]

The program.py code is actually LeNet for the MNIST dataset, but I modified the layers part. The details of the convolve and fully connected layers, which are the actual hyperparameters, are now supplied through a file, 'input.txt'.

# Number of convolve layers, then hyperparameters of the first layer (read from input.txt)
convlayers = int(input())
first_layer_nfilter = int(input())
first_layer_filter_size = int(input())
first_layer_pool_size = int(input())


model.add(Conv2D(first_layer_nfilter, (first_layer_filter_size, first_layer_filter_size),
                 padding = "same", 
                 input_shape = input_shape))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (first_layer_pool_size, first_layer_pool_size)))


#Subsequent CRP sets
for i in range(1,convlayers):
	nfilters = int(input())
	filter_size = int(input())
	pool_size = int(input())

	model.add(Conv2D(nfilters, (filter_size, filter_size),padding = "same"))
	model.add(Activation("relu"))
	model.add(MaxPooling2D(pool_size = (pool_size, pool_size)))


# Fully connected layers (w/ RELU)
model.add(Flatten())


# Number of fully connected layers, then neurons per layer (read from input.txt)
fc_input = int(input())

for i in range(0, fc_input):
	no_neurons = int(input())
	model.add(Dense(no_neurons))
	model.add(Activation("relu"))


I have done this so that in later runs, when the hyperparameters need to be tweaked to improve accuracy, Jenkins can simply run a program (tweaker.py) that changes the contents of the input file, without touching the main code file, and the hyperparameters will change.
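For illustration (the file itself is not shown in the post), input.txt holds one value per line in the exact order the program reads them: the number of convolve layers; then, for each convolve layer, the number of filters, the filter size and the pool size; then the number of FC layers and the neurons in each FC layer. For the final configuration reported at the end of this article it would look like this:

2
128
7
2
2048
2
2
1
10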

For me, the program tweaker.py is the soul of this setup, and it is where I began the entire project. Job 4 below contains a proper explanation of what it actually does.

Job 3 : Predict Accuracy

The task it performs is very simple: the accuracy, along with the setup details, is deployed on the Apache web server so that the user can directly access and view it at the following URL:

IPofSystem/display_matter.html

sudo cp /home/jyotirmaya/ws/mlops1/display_matter.html /var/www/html

Just write the above command in the Execute Shell of Jenkins Job 3.
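The post does not show how display_matter.html itself is produced; as a purely hypothetical sketch, the training code could generate it from accuracy.txt along these lines (the file names come from the article, everything else is assumed):

# Hypothetical sketch: build display_matter.html from accuracy.txt
with open('/home/jyotirmaya/ws/mlops1/accuracy.txt') as f:
    accuracy = f.read().strip()

html = """<html>
  <body>
    <h1>MLOps - Automated Tuning</h1>
    <p>Current model accuracy: {}</p>
  </body>
</html>
""".format(accuracy)

with open('/home/jyotirmaya/ws/mlops1/display_matter.html', 'w') as f:
    f.write(html)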

Job 4 : Analyse Accuracy and move

This job performs the following tasks :

1) Check the accuracy; if it is less than required, tweak the hyperparameters using the program tweaker.py and trigger Job 2 (See Code and Launch) again, so the container starts and the model runs once more.

2) If the accuracy requirement is met, call Job 5 (Model Create Success).

Now let's see how tweaker.py does its tweaking...

When tweaker.py is called, it compares the old accuracy (initially 0) with the new accuracy obtained from running the container. If the accuracy has increased, it increases the value of the first hyperparameter of the base convolve layer (here, the number of filters).

It also replaces the old accuracy in the data.txt file with the newly received one, for use in the next build's comparison.

As soon as the hyperparameter value is changed, Job 2 is re-run to measure the accuracy again.

If the accuracy has increased again, it means the increase was good and the value can be pushed further, so tweaker.py increases that parameter's value once more.

But if it finds that the accuracy has decreased, tweaker.py reverts the parameter to its earlier value and starts changing the value of the next hyperparameter (in our case, the filter size).

On every call it repeats this process until no hyperparameter in that layer can be increased any further; when that happens, it adds another layer and carries out the whole procedure again on the new layer.
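The post describes tweaker.py's behaviour rather than showing its code, so here is a minimal Python sketch of that hill-climbing idea. The file names (input.txt, accuracy.txt, data.txt) come from the article, but the file formats, the step size and the starting values for a new layer are my assumptions:

# tweaker.py (sketch) - greedy tuning of the hyperparameters stored in input.txt.
# Assumed formats: input.txt holds one integer per line (as described above);
# accuracy.txt holds the accuracy of the last run; data.txt holds three numbers:
# best accuracy so far, index of the parameter being tuned, and its previous
# value (seeded initially with something like "0 1 128").

STEP = 2  # assumed factor used when increasing a hyperparameter

def read_values(path):
    with open(path) as f:
        return f.read().split()

def write_values(path, values):
    with open(path, 'w') as f:
        f.write('\n'.join(str(v) for v in values) + '\n')

params = [int(x) for x in read_values('input.txt')]
new_acc = float(read_values('accuracy.txt')[0])
old_acc, index, old_value = [float(x) for x in read_values('data.txt')]
index = int(index)

if new_acc > old_acc:
    # Last change helped (or this is the first run): keep tuning the same parameter.
    old_acc = new_acc
else:
    # Last change hurt: roll the parameter back and move on to the next one.
    params[index] = int(old_value)
    index += 1
    last_conv_index = 3 * params[0]   # last index that belongs to a convolve layer
    if index > last_conv_index:
        # Nothing left to increase in the existing convolve layers:
        # add another convolve layer (assumed starting values) and tune it next.
        insert_at = last_conv_index + 1
        params[insert_at:insert_at] = [32, 3, 2]
        params[0] += 1
        index = insert_at

# Increase the currently targeted hyperparameter for the next run,
# remembering its present value so it can be rolled back if accuracy drops.
old_value = params[index]
params[index] *= STEP

write_values('input.txt', params)
write_values('data.txt', [old_acc, index, old_value])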

That, in a nutshell, is how tweaker.py works.

Now let's look at the code used in the Execute Shell of this Jenkins job:

if [[ "$(sudo cat /home/jyotirmaya/ws/mlops1/accuracy.txt)" < "0.9999999" ]]
then
echo "Tweaking The program"
sudo python3 /home/jyotirmaya/ws/mlops1/tweaker.py
curl 192.168.43.250:8080/view/Integrate%20Machine%20Learning%20with%20Jenkins/job/See%20code%20and%20Launch/build?token=tweakedNowRun
else
echo "Merge and Email"
curl 192.168.43.250:8080/job/Model%20Create%20Success/build?token=modelCreateSuccess
fi

Here 0.9999999 is the target accuracy that must be reached for the model to be accepted as successful. (The [[ ... < ... ]] test is a lexicographic string comparison, which behaves like a numeric one here because both values are decimals of the form 0.xxxxxxx.)

The first curl command triggers Job 2, since the hyperparameters have been tweaked and are ready to be tested.

The second curl command is to trigger job 5 on successful model creation.

Job 5 : Model Create Success

This job is triggered when the required accuracy has been met; it mails the input file to the developer, so the developer knows the final values of the hyperparameters.

sudo python3 /home/jyotirmaya/ws/mlops1/email.py

Just write the above command in the Execute Shell of Jenkins Job 5.
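email.py itself is not shown in the post; below is a minimal sketch of how it could mail input.txt to the developer using Python's standard smtplib. The SMTP server, login and addresses are placeholders, not values from the original setup:

# email.py (sketch) - mail the tuned hyperparameters (input.txt) to the developer.
# Server, login and addresses below are placeholders, not from the original setup.
import smtplib
from email.message import EmailMessage

with open('/home/jyotirmaya/ws/mlops1/input.txt') as f:
    hyperparams = f.read()

msg = EmailMessage()
msg['Subject'] = 'Model created successfully - final hyperparameters'
msg['From'] = 'jenkins@example.com'
msg['To'] = 'developer@example.com'
msg.set_content('The model reached the target accuracy.\n\n'
                'Final hyperparameters (input.txt):\n' + hyperparams)

# Gmail's SMTP-over-SSL endpoint is used here only as an example.
with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login('jenkins@example.com', 'app-password-here')
    server.send_message(msg)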

Job 6 : Restart Docker

This is a monitoring job, called when Job 2 fails, i.e. when for any reason the container kerasCNNos fails to complete its execution.

It restarts the Docker engine completely (for example with systemctl restart docker) to make sure the engine is working fine, because in our setup that is the most likely reason for Job 2 to fail.

And then it triggers job 2 once again.


Using the above setup, I achieved an accuracy of 99.21% with 5 epochs per training run, within a few hours, before I manually stopped the builds.

No. of convolve layers : 2
Layer 1 :
    No. of filters : 128
    Filter size : 7
    Pool size : 2
Layer 2 :
    No. of filters : 2048
    Filter size : 2
    Pool size : 2
No. of FC layers : 1
Neurons in FC layer 1 : 10
Accuracy achieved : 0.9921000003814697


My GitHub Link for the above codes : https://github.com/JyotirmayaV/mlops1/tree/developer

Thank you guys for reading this. This project was truly a great learning experience and it taught me many things, most of all the real meaning of MLOps.
