Integrating ML with DevOps
In this article I will explain my first MLOps project, which I completed successfully under Vimal Daga Sir. In this project I will show how we can integrate Machine Learning with DevOps. We will be using Git Bash, GitHub, Jenkins, Docker and some popular Python libraries for Machine Learning such as Keras and TensorFlow.
Problem Statement : Generally in ML we have to keep changing our hyperparameters (number of epochs, number of layers, kernel size, etc.) until we reach a satisfactory accuracy. This is a very tedious task because, to reach our target accuracy, we have to do many hit-and-trial runs, changing various hyperparameters and validating each time. In this world of automation and agile we can't carry this out manually. Moreover, the architecture for training each and every dataset is different. So to achieve this task we need to integrate multiple technologies.
*Task description*
1. Create a container image that has Python3 and Keras or NumPy installed, using a Dockerfile
2. When we launch this image, it should automatically start training the model in the container.
3. Create a job chain of Job1, Job2, Job3, Job4 and Job5 using the Build Pipeline plugin in Jenkins
4. Job1 : Pull the GitHub repo automatically when a developer pushes code to GitHub.
5. Job2 : By looking at the code or program file, Jenkins should automatically start the container of the image that has the respective machine learning software and interpreter installed, deploy the code and start training (e.g. if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing installed).
6. Job3 : Train the model and predict its accuracy or metrics.
7. Job4 : If the accuracy metric is less than 80%, tweak the machine learning model architecture.
8. Job5 : Retrain the model or notify that the best model has been created
9. Create one extra job, Job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically restart the container from where the last trained model left off
The technologies used in this project are
- Git and Github
- Jenkins
- Docker
So to achieve automation we need to create some jobs in Jenkins
So before we start the discussion, there are some prerequisites:
- We need to have RHEL8 installed as a VM on Windows, with Docker and Jenkins installed in it
- In Jenkins we need to install the Build Pipeline and Email Extension plugins
- We need to have Git Bash in our base OS, i.e. Windows
- Before starting the process below, make sure to use the following commands to set up the environment
# Run the code line by line
systemctl start docker
systemctl start jenkins
systemctl stop firewalld
Let's start with our main discussion
1. We need to create a Docker image with the help of a Dockerfile that installs the required libraries in the container, and push it to Docker Hub
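Below is a minimal sketch of such a Dockerfile, written as a heredoc so it can be created straight from the terminal. The base image, the pip packages and the script name cnn_model.py are assumptions for illustration, not the exact file used in the project.

```
# Create a minimal Dockerfile for the training environment (a sketch, not the exact file)
cat > Dockerfile <<'EOF'
FROM python:3.8
# Libraries needed by the CNN training code
RUN pip install --no-cache-dir tensorflow keras numpy pandas
WORKDIR /root
# The training script will be mounted into /root by a Jenkins job later;
# "cnn_model.py" is an assumed filename
CMD ["python3", "cnn_model.py"]
EOF
```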
Make sure to create a Docker Hub repository and then run the following commands
# Run the code line by line
docker build -t keras:v1 .
docker tag keras:v1 syedfaheem/mlops:v3
docker login
docker push syedfaheem/mlops:v3
So we have successfully pushed our image to Docker Hub, and since it sits in the public registry it can be used by anyone in the world
2. Giving Jenkins the power to run various commands and programs in RHEL8
# Run this command from the home directory
gedit /etc/sudoers
Then add the following line to that file
jenkins ALL=(ALL) NOPASSWD: ALL
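As a safer alternative (my suggestion, not part of the original setup), the same entry can be placed in its own drop-in file and validated with visudo:

```
# Add the entry in a drop-in file instead of editing /etc/sudoers directly
echo 'jenkins ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/jenkins
# Check the syntax of the new file before relying on it
visudo -cf /etc/sudoers.d/jenkins
```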
3. Creating the architecture of the ML model
I will be training a CNN model in the container.
You can get the Python code from my GitHub account.
4. Creating a GitHub repo
Make sure you have a GitHub account and create a repository; the code can then be pushed from Git Bash, as sketched below
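From Git Bash on Windows, pushing the model code to that repository looks roughly like this; the repository URL and the file name are placeholders:

```
# Run from the project folder in Git Bash; replace the placeholders with your own values
git init
git add cnn_model.py
git commit -m "Add CNN training code"
git remote add origin https://github.com/<your-account>/mlops-project.git
git push -u origin master
```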
So let's start our discussion of the Jenkins jobs
I have made 5 jobs in Jenkins to achieve this task
Configuring Job1:
As soon as the developer pushes the code to GitHub, Jenkins will detect it and download the code onto our base RHEL8 VM into the "mlops-project" directory
i. First, log in to the Jenkins Web UI with the admin account
ii. Create a job by selecting "New Item" on the Dashboard page, then configure it as shown below, giving a valid URL of the GitHub repo
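In the Build section of Job1, an execute-shell step along these lines copies the pulled code to a fixed location on the VM; the /mlops-project path is my assumption based on the directory name mentioned above:

```
# Job1 (execute shell): copy the workspace pulled from GitHub to the VM
sudo mkdir -p /mlops-project
sudo cp -rvf * /mlops-project/
```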
Configuring Job2:
After the successful completion of Job1, it triggers Job2. Jenkins then analyzes the Python code; if it is CNN code, it first checks whether a container named "mlops" already exists, terminates it if it does, launches a fresh container and automatically starts training the CNN model inside it.
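A sketch of the Job2 execute-shell step, assuming CNN code can be recognised by a keyword such as Conv2D and that the image pushed earlier is used; the container name, mount path and image tag are assumptions:

```
# Job2 (execute shell): detect CNN code, recycle the "mlops" container and start training
if sudo grep -iq "Conv2D" /mlops-project/*.py
then
    # Remove any leftover container with the same name; ignore the error if none exists
    sudo docker rm -f mlops || true
    # Run attached (no -d) so the epoch output appears in the Jenkins console
    sudo docker run --name mlops -v /mlops-project:/root syedfaheem/mlops:v3
fi
```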
Configuring Job3
After the model has been trained and its accuracy calculated, Job2 triggers Job3. This job checks whether the target accuracy has been achieved; if not, it trains the model again and again until the accuracy is achieved.
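A sketch of what the Job3 build step could look like, assuming the training script writes its final accuracy (e.g. 0.94) to /mlops-project/accuracy.txt; the file name and the retraining mechanism shown here are assumptions:

```
# Job3 (execute shell): keep retraining until the target accuracy is reached
accuracy=$(cat /mlops-project/accuracy.txt)
echo "Model accuracy: $accuracy"

# Loop until the accuracy crosses the 90% target;
# awk handles the floating-point comparison that plain bash cannot
while ! awk -v a="$accuracy" 'BEGIN { exit !(a >= 0.90) }'
do
    echo "Accuracy $accuracy below target, retraining with tweaked hyperparameters"
    sudo docker start -a mlops               # re-run the stopped training container
    accuracy=$(cat /mlops-project/accuracy.txt)
done
echo "Target accuracy achieved: $accuracy"   # the downstream trigger now starts Job4
```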
Configuring Job4
After the target accuracy is achieved, Job3 triggers Job4. In Job4 we send an email to the developer or team member stating "model accuracy is beyond 90%"
Now we will configure the Extended Email Notification in the Post-build section
Then open the Advanced settings in the same Extended Email Notification section
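If you'd rather not use the plugin, roughly the same notification can be sent from an execute-shell step with the mailx/s-nail client; this is an alternative suggestion, it needs a working mail transfer agent on the VM, and the recipient address is a placeholder:

```
# Alternative to the Email Extension plugin: send the notification from the shell
# (requires mailx/s-nail and a configured MTA)
echo "Model accuracy is beyond 90%" | mail -s "MLOps pipeline: best model created" developer@example.com
```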
Configuring Job5
Job5 is triggered by Job2. This job monitors whether the container is running; if it is not, it launches the container again
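The monitoring step can be a small execute-shell script like the sketch below, assuming the container keeps the name mlops used in Job2:

```
# Job5 (execute shell): restart the training container if it is not running
if ! sudo docker ps --format '{{.Names}}' | grep -qw mlops
then
    echo "Container mlops is not running, starting it again"
    sudo docker start mlops
fi
```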
Pipeline View
This is the Build Pipeline view. It can be viewed if the Build Pipeline plugin is installed in Jenkins
Outputs
Job1
As soon as the developer pushes the code, this job runs
Job2
After 1st Epoch
After 2nd Epoch
After 3rd Epoch
We achieved an accuracy of more than 94% and a validation accuracy of more than 94%, and this will trigger Job3 and Job5
Job3
As the accuracy is greater than 90% (i.e. 0.9), this will trigger Job4
Job4
This will send an email to the given recipient
Job5
Further scope:
- Instead of running the container and training the model using local resources, it would be better to run these in a cloud such as AWS, GCP or Azure. Using these we can train our model faster and get the output faster
- Instead of using the randint function to change the hyperparameters, I want to change the hyperparameters intelligently so that we can reach the target much more quickly