Machine Learning Pipeline with Jenkins: MLOps
It has been estimated that as many as 90% of AI models created in large enterprises are never put into production. With massive investments in data science teams, platforms, and infrastructure, the number of AI projects is growing dramatically, and so is the number of missed opportunities. Unfortunately, most projects do not show the value that business leaders expect, and they introduce new risks that need to be managed.
The solution to this problem is MLOps:
MLOps delivers the capabilities that data science and IT Ops teams need to work together to deploy, monitor, and manage machine learning models in production and to govern their use in production environments.
Based on the task given by #VimalDaga sir to integrate machine learning with DevOps, I have combined Git, GitHub, Docker, machine learning, and Jenkins to build a pipeline that automates the model-training process without any human intervention.
Task description:
1. Create a container image that has Python 3 and Keras or NumPy installed, using a Dockerfile.
2. When we launch this image, it should automatically start training the model inside the container.
3. Create a job chain of Job1, Job2, Job3, Job4, and Job5 using the Build Pipeline plugin in Jenkins.
4. Job1: Pull the GitHub repo automatically whenever a developer pushes code to GitHub.
5. Job2: By looking at the code or program file, Jenkins should automatically start the container image that already has the respective machine learning environment installed, deploy the code, and start training (e.g., if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing).
6. Job3: Train the model and report its accuracy or metrics.
7. Job4: If the accuracy metric is less than 80%, tweak the machine learning model architecture.
8. Job5: Retrain the model, or notify that the best model has been created.
9. Create one extra job, Job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.
Creating the Docker image:
We can build our own Docker image by writing a Dockerfile, installing the required packages, and specifying the OS we want as the base image. In my case I have used the python:3.7 image as the base; you can use any base of your choice and install all the libraries required for training the CNN model.
This is my Dockerfile. I have also installed some dependencies for OpenCV and enabled memory overcommit, because you may otherwise hit memory issues while running the program, as the container is not allowed to use extra memory by default.
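Since the original Dockerfile was shared as an image, here is a minimal sketch of what it could look like; the exact library list, the /mlops working directory, and the model.py entry point are my assumptions, not the original file:

FROM python:3.7
# system dependencies commonly needed by OpenCV
RUN apt-get update && apt-get install -y libgl1-mesa-glx libglib2.0-0 && rm -rf /var/lib/apt/lists/*
# libraries required for training the CNN model
RUN pip install --no-cache-dir numpy pandas scikit-learn tensorflow keras opencv-python
# folder where the model code will be mounted from the host
WORKDIR /mlops
# start training automatically as soon as the container is launched
CMD ["python3", "model.py"]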
Integrating Jenkins, Docker, and GitHub:
Job 1: Here, as soon as a developer pushes code to GitHub, Jenkins is triggered; it copies the files from the GitHub repository and stores them in a folder on my operating system, RHEL 8.
Github link: https://github.com/shivamagarwal1999/Mlops-Project
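Job1's build step can be a simple execute-shell command; the target folder /root/mlops is my assumption, and the job itself is triggered by Poll SCM or a GitHub webhook:

# copy the freshly cloned Jenkins workspace to a folder on RHEL 8
sudo mkdir -p /root/mlops
sudo cp -rvf * /root/mlops/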
Job 2: Job2 is triggered as soon as Job1 builds successfully. In Job2 we check whether the code uses Keras; if it does, it is run with Docker. The -v option links the folder on RHEL 8 to a folder inside the container, -w sets the working directory inside the container, and the container then starts training the model named model.py. Finally, cp copies accuracy.txt, which contains the accuracy of the model, into the folder on RHEL 8. If Job2 fails for any reason, it triggers Job6.
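A rough sketch of Job2's execute-shell step (the image tag mlops:v1, the container name cnn_env, and the host folder /root/mlops are assumptions):

# remove any leftover container from a previous run, then launch the CNN container only if the code uses Keras
sudo docker rm -f cnn_env 2>/dev/null || true
if sudo grep -iq "keras" /root/mlops/model.py
then
    sudo docker run --name cnn_env -v /root/mlops:/mlops -w /mlops mlops:v1 python3 model.py
    # copy the accuracy file written by model.py back to the host folder
    sudo docker cp cnn_env:/mlops/accuracy.txt /root/mlops/accuracy.txt
fi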
Job 3: Job3 is triggered if Job2 runs successfully. Here we check the accuracy of the trained model. If the accuracy is above 90%, we commit accuracy.txt to Git; a post-commit hook then pushes the commit to GitHub automatically, and Job3 triggers Job5. If the accuracy is less than 90%, Job3 triggers Job4.
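A possible execute-shell step for Job3, assuming accuracy.txt contains a single fraction such as 0.93:

# fail the build if accuracy is below the 90% threshold; the failed build triggers Job4
accuracy=$(cat /root/mlops/accuracy.txt)
if python3 -c "exit(0 if float('$accuracy') >= 0.90 else 1)"
then
    cd /root/mlops
    git add accuracy.txt
    git commit -m "model reached the required accuracy"   # the post-commit hook pushes to GitHub
else
    exit 1
fi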
Job 4: Job4 is triggered when Job3 fails to achieve the required accuracy. Here we go inside the folder in the OS and add three Dense layers to the model; this is done with a sed command that inserts the layers at a particular line in the file. After the layers have been added, Job4 triggers Job2 again to launch a fresh container, retrain the modified model, and compute the new accuracy.
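The tweak itself can be a one-line sed command per layer; the line number 20 and the exact layer definition below are assumptions about model.py's layout:

# insert an extra Dense layer at line 20 of the model file (repeat for each layer to add)
sed -i '20i model.add(Dense(units=64, activation="relu"))' /root/mlops/model.py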
Job 5: Job5 is triggered when the accuracy is above 90% and Job3 builds successfully. In this job we notify the developer by sending an e-mail that the model has been trained properly and gives an accuracy above 90%.
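The notification can be sent from an execute-shell step, assuming the mailx utility is installed and configured on the Jenkins host (Jenkins' built-in E-mail Notification post-build action is an equivalent alternative):

# notify the developer that the best model has been created
echo "Model trained successfully with accuracy above 90%. See accuracy.txt in the repo." | mail -s "MLOps pipeline: best model created" developer@example.com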
Job 6: This job is triggered only when Job2 fails for any reason. In this job, if the container from Job2 has failed, Job6 launches the container again, resumes training, and then triggers Job3 to check the accuracy of the model.
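A minimal sketch of Job6's execute-shell step, reusing the same assumed container name and image as in Job2:

# restart the training container if it is no longer running
if ! sudo docker ps --format '{{.Names}}' | grep -q '^cnn_env$'
then
    sudo docker rm -f cnn_env 2>/dev/null || true
    sudo docker run --name cnn_env -v /root/mlops:/mlops -w /mlops mlops:v1 python3 model.py
fi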
Full Life Cycle of MLOps:
Here is the build history of the jobs, showing the order in which they were run.
That is all about this project.
Conclusion:
I used Jenkins, Git, GitHub, and Docker to integrate machine learning with DevOps. Projects like this save developers a great deal of the time they would otherwise invest in manual trials and testing.