Integrating Machine Learning with DevOps for the Auto-Tuning of Hyperparameters
Every computer engineer aims to make people's lives easier and more automated. Although machine learning has eased a lot of work and reduced the burden on humans, there are still some limitations it cannot resolve. In the race to make computer systems artificially intelligent, we've come a long way and still have thousands of miles to go.
DevOps is one such boon to humanity that fits perfectly in this automated world. This article not only contains a brief overview of some amazing and interesting technologies, but also integrates them to make a Deep Learning (a part of Machine Learning) project far more interesting.
Jumping to the agenda of this article: I've given a detailed description of the task given by Vimal Daga sir, based on a CNN model integrated with Jenkins, Git, and Docker. We are supposed to build a pipeline of five jobs, each doing its respective task.
Starting with the headline itself: what does "hyperparameter" mean?
A hyperparameter is a value in a machine learning model that is not learned from the data but must be set by us before training. For example, we don't know the number of neurons to use in a hidden layer, or the learning rate to use during training, yet this information must be provided to the model before training begins. There are no firm rules for choosing hyperparameters, which is why the trial-and-error process of tuning them becomes such a tedious task.
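As a minimal illustration (the names and values below are hypothetical, and the training function is only a stand-in), hyperparameters are simply settings fixed before any training happens:

```python
# Hyperparameters are chosen by us before training, not learned from data.
# These names/values are illustrative, not from the actual task.
hyperparameters = {
    "hidden_neurons": 128,   # neurons in the hidden layer
    "learning_rate": 0.001,  # step size used by the optimizer
    "epochs": 5,             # passes over the training data
}

def build_and_train(params):
    """Stand-in for a real training run: just reports the settings used."""
    return (f"training with {params['hidden_neurons']} neurons, "
            f"lr={params['learning_rate']}, epochs={params['epochs']}")

print(build_and_train(hyperparameters))
```

The point is that nothing in the data tells us these numbers; the whole pipeline in this article exists to stop a human from having to guess them by hand.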
The machine learning code is based on a dataset of cats and dogs (data source: kaggle.com). Since model training requires a large number of images, I've used the concept of ImageDataGenerator. I've also used two CRP (Convolution + ReLU + Pooling) layers for better model accuracy.
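A sketch of that architecture in Keras: two CRP blocks followed by a dense head, with ImageDataGenerator augmenting the images so a modest dataset goes further. The input size, filter counts, and dataset path are assumptions for illustration, and the actual training call is left commented out since it needs the dataset on disk:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),  # CRP block 1
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),                           # CRP block 2
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dense(1, activation="sigmoid"),   # binary output: cat vs dog
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# ImageDataGenerator creates augmented variants of each image on the fly.
train_gen = ImageDataGenerator(rescale=1.0 / 255, shear_range=0.2,
                               zoom_range=0.2, horizontal_flip=True)
# training_set = train_gen.flow_from_directory("dataset/training_set",
#                                              target_size=(64, 64),
#                                              class_mode="binary")
# model.fit(training_set, epochs=5)
```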
The job of machine learning is confined to training models and predicting results as and when required. Now comes the use of Git and Jenkins in the task. I've created and initialized a Git repo on my system, which contains my dataset and the ML code file. Although I've pushed the files manually, there is an option to push the repo automatically using a "post-commit" hook. As soon as the repo is pushed to GitHub, it is automatically pulled by Jenkins. The above conditions make up the first Jenkins job.
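A post-commit hook is just a script Git runs after every local commit. A minimal sketch (the remote and branch names are assumptions; adjust for your repo):

```shell
#!/bin/sh
# Save as .git/hooks/post-commit and make it executable:
#   chmod +x .git/hooks/post-commit
# Runs automatically after every local commit.
git push origin master   # push so Jenkins job1 can pull the update
```

With this in place, a plain `git commit` is enough to trigger the whole pipeline.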
The second job, however, is not as smooth as the first one. If Jenkins detects that the pulled code is a CNN model, it automatically launches an environment from an image that contains the required libraries. This image can either be pulled from Docker Hub, or we can create our own; I created my image using a Dockerfile, about which we were given a session. The code in the Jenkins workspace is deployed into the container, and since the image dataset is too large, we can simply mount it instead of copying it.
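A minimal sketch of such a Dockerfile (the base image, library list, and file names are assumptions, not the exact image used in the task):

```dockerfile
# Custom image with the libraries the CNN code needs.
FROM python:3.8
RUN pip install --no-cache-dir tensorflow pillow numpy
WORKDIR /mlcode
CMD ["python3", "model.py"]
```

Mounting the dataset instead of copying it then looks something like `docker run -v /root/dataset:/mlcode/dataset cnn-env` (paths and image name hypothetical), so the large image folder never has to be baked into the image itself.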
The main job, job3, has it all. In this job, we save the trained model and find its accuracy. If the accuracy is found to be less than 80%, there is room for tweaking the model again, and changes can be made to the ML code to increase the accuracy. The changes can range from changing the kernel size to increasing the number of filters to adding another CRP layer.
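The accuracy gate can be sketched in plain Python. The threshold comes from the task; the parameter names and the specific tweak (doubling filters, adding a CRP block) are illustrative choices from the options listed above, not the exact logic of the original job:

```python
# job3 sketch: check the accuracy produced by training and, if it is
# below target, propose a tweaked set of hyperparameters.
THRESHOLD = 0.80

def needs_retraining(accuracy, threshold=THRESHOLD):
    """True when the model has not yet reached the target accuracy."""
    return accuracy < threshold

def tweak(params):
    """One illustrative tweak: more filters and an extra CRP block.
    Real tweaks could also change kernel size, epochs, etc."""
    new = dict(params)
    new["filters"] = params["filters"] * 2
    new["crp_layers"] = params["crp_layers"] + 1
    return new

params = {"filters": 32, "crp_layers": 2}
if needs_retraining(0.76):      # e.g. training reported 76% accuracy
    params = tweak(params)
print(params)
```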
Automation is incomplete if a program is unable to make changes by itself. The fourth job handles this. According to this job, as soon as accuracy detection is done, it first adjusts the hyperparameters and then restarts the whole process from job1 to job4, over and over, until the accuracy reaches 80%. Once the accuracy reaches 80%, a mail is sent.
The fifth job is also critical to the automation. There are many conditions under which the container fails, which hinders the ongoing process. This job runs on a predefined schedule, checking the health of the container/OS. If it sees that the container has terminated for any reason, it relaunches it. I've scheduled the job to run every minute.
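In Jenkins, "every minute" corresponds to the Build-periodically schedule `* * * * *`, and the job body can be a small shell check (the container and image names here are assumptions):

```shell
# Jenkins "Build periodically" schedule:  * * * * *   (every minute)
# Job shell step: relaunch the ML container if it has stopped.
if ! sudo docker ps --format '{{.Names}}' | grep -q '^cnn-env$'; then
    sudo docker start cnn-env || sudo docker run -d --name cnn-env cnn-image
fi
```

`docker ps` lists only running containers, so a missing name means the environment is down and needs restarting.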
This was a complete description, and a great example, of the integration of machine learning with DevOps.