A project on complete automation of DL model auto-tuning:-

Saranya Chattopadhyay

Full Stack Developer - DevSecOps @IBM | DevOps Practitioner | Ex Intern @CommVault, HighRadius | 2x GCP, 1x Microsoft, 1x RedHat Certified Engineer

Published: May 26, 2020

Hey all! Ever thought how cool it would be if an ML / DL model could auto-tune itself to reach a higher accuracy, which in simple words is what we mean by tweaking a model? Well, that's what my friend Saptarsi and I have built in this end-to-end automation project. We request you all to give this article a quick read; we have tried to cover all the concepts used in the project.

Problem Statement:-

1. Create a container image that has Python 3 and Keras or NumPy installed, using a Dockerfile.

2. When we launch this image, it should automatically start to train the model in the container.

3. Create a job chain of job1, job2, job3, job4 and job5 in Jenkins :

i) Job1 : Pull the GitHub repo automatically whenever a developer pushes to GitHub.

ii) Job2 : By looking at the code or program file, Jenkins should automatically start the container image that has the respective machine learning software and interpreter installed, deploy the code and start training (e.g. if the code uses CNN, Jenkins should start the container that already has all the software required for CNN processing installed).

iii) Job3 : Predict accuracy or metrics from the trained model.

iv) Job4 : If the accuracy metric is less than 80%, tweak the machine learning model architecture.

v) Job5 : Retrain the model or notify that the best model has been created.

4. Create one extra job, Job6, for monitoring : If the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.

Solution performed:-

CONFIGURING THE IMAGES AND JOBS

Let's start by building the Docker images on RedHat 8. For this, we are going to use Dockerfiles. We will build two images: one with the environment for ANN and another with the environment for typical ML.

Dockerfile for ANN is -

Dockerfile for ML is -

We will build the images from the Dockerfiles using the docker build command and, once the images are ready, we can view them using the docker images command. (Note : In the screenshot, the images we created are named ml:v1 and cnn:v1. P.S. Please bear with the fact that the image has been named cnn:v1 although it has the environment of ANN :-p)

Switching to Windows 10, let's write the Python code required for the use case. We need one DL script and another script for sending mail when the required accuracy is reached. For the DL code, we chose the very famous MNIST Handwritten Digits dataset. Once the training code (model_code.py) and the mail code (mail_code.py) are written and the initial accuracy has been noted in Accuracy.txt, they will be uploaded to GitHub (you can go through them via the link provided in this post). Along with this, we will also create a post-commit hook so that any changes committed to the local repo are automatically pushed to GitHub, which is yet another automation.

Initially, the GitHub repository would look somewhat like this-

Now our code is in the SCM system, so we are all set to start with the Jenkins jobs.

Configuration of Job1: pull_code: This job pulls the code from GitHub whenever there is a push (for this it polls GitHub every minute) and copies the contents into the /root/mlops_tweaktask folder on RedHat 8. The configuration will be as follows:-
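The copy step of this job can be sketched in Python as below. This is only an illustration: the job itself may simply use a shell copy command, and the function name sync_workspace and the choice to skip the .git folder are assumptions of this sketch, not something taken from the project.

```python
import shutil
from pathlib import Path
from typing import List

def sync_workspace(workspace: str, target: str = "/root/mlops_tweaktask") -> List[str]:
    """Copy everything from the Jenkins workspace into the target folder."""
    src, dst = Path(workspace), Path(target)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        if item.name == ".git":
            continue  # assumption: repository metadata is not needed for training
        if item.is_dir():
            # dirs_exist_ok needs Python 3.8+; it lets repeated builds overwrite in place
            shutil.copytree(item, dst / item.name, dirs_exist_ok=True)
        else:
            shutil.copy2(item, dst / item.name)
        copied.append(item.name)
    return copied
```

Calling sync_workspace with the Jenkins workspace path returns the names of the files and folders it copied, which is handy for the build log.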

Configuration of Job2: launch_image: This job launches the respective image, i.e. ml:v1 if the code is typical ML or cnn:v1 if the code is ANN, and then executes the model_code.py file to train the model and note its accuracy in the Accuracy.txt file. The configuration will be as follows:-
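One way to decide which image to launch is to scan the training script for framework keywords. The sketch below assumes a hypothetical keyword list, a hypothetical container name (trainer) and a hypothetical mount point (/mlops); only the image tags ml:v1 and cnn:v1 and the host folder come from this article.

```python
import re
from typing import List

# Keywords taken as a sign of neural-network code (an assumption of this sketch).
NN_PATTERN = re.compile(r"\b(keras|tensorflow|Sequential|Conv2D|Dense)\b")

def pick_image(source_code: str) -> str:
    """Return the image tag from the article that matches the training script."""
    return "cnn:v1" if NN_PATTERN.search(source_code) else "ml:v1"

def docker_run_args(image: str, name: str = "trainer",
                    host_dir: str = "/root/mlops_tweaktask",
                    mount_point: str = "/mlops") -> List[str]:
    """Build the argv for launching the container in the background."""
    return ["docker", "run", "-dit", "--name", name,
            "-v", f"{host_dir}:{mount_point}", image]
```

The returned list can be passed to subprocess.run on the Jenkins host. Since the image is meant to start training on launch, no extra command is appended after the image name.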

Configuration of Job3: check_accuracy: This job reads the Accuracy.txt file and gets the accuracy of the model. The benchmark accuracy we set is 96%. If the accuracy obtained is less than 96%, the build fails and triggers tweaking of the model. If the required accuracy is achieved, the build succeeds, adds and commits the better accuracy to GitHub, and then triggers the success notification to the developer. Since this job accesses GitHub, we need to give the GitHub credentials in the SCM tab. The configuration will be as follows:-
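The pass/fail decision can be sketched as follows, assuming Accuracy.txt holds a single number. The exact file format is not shown in this article, so the parsing rules here (an optional % sign, and values up to 1 treated as fractions) are assumptions.

```python
def parse_accuracy(text: str) -> float:
    """Parse '94', '94.2', '94%' or a fraction such as '0.942' into a percentage."""
    value = float(text.strip().rstrip("%"))
    # Assumption: values up to 1 are fractions, as Keras reports them.
    return value * 100 if value <= 1 else value

def meets_benchmark(text: str, benchmark: float = 96.0) -> bool:
    """True when the recorded accuracy reaches the benchmark."""
    return parse_accuracy(text) >= benchmark

def exit_code(text: str, benchmark: float = 96.0) -> int:
    """0 lets the Jenkins build succeed; a non-zero value marks it as failed."""
    return 0 if meets_benchmark(text, benchmark) else 1
```

A build step could read the file and pass exit_code(...) to sys.exit, so that a failed build triggers the tweak job and a successful one triggers the mail job.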

Configuration of Job4: tweak_job: Frankly, this job needed the most brainstorming and a lot of research into how to tweak the model. It builds if the third job, which checks the accuracy, fails. Finally, we came up with the lines of code that fulfil our requirements. After tweaking, the job itself adds the tweaked model to GitHub and triggers Job2 again for retraining. Here, we again need to provide the GitHub credentials in the SCM tab. Other configurations are as follows:-
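The tweaking logic itself is not reproduced in this text, so the following is only a hypothetical illustration of what one tweak step could look like. It assumes the hyperparameters live in a small dictionary with the keys hidden_units (a non-empty list) and epochs; both the keys and the strategy are invented for this sketch.

```python
from copy import deepcopy

def tweak(config: dict, max_layers: int = 4) -> dict:
    """Return a new config with one change: add a hidden layer, else add an epoch."""
    new = deepcopy(config)  # leave the caller's config untouched
    if len(new["hidden_units"]) < max_layers:
        # Add a layer half as wide as the last one, but never below 16 units.
        new["hidden_units"].append(max(16, new["hidden_units"][-1] // 2))
    else:
        # The network is already as deep as allowed, so train one epoch longer.
        new["epochs"] += 1
    return new
```

For example, tweak({"hidden_units": [128], "epochs": 2}) yields a config with hidden units [128, 64]. The training script would then build its layers from this config before the next run.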

After the commits of the better accuracy and the tweaked model, the GitHub repository will look somewhat like this-

Configuration of Job5: send_mail: Once Job3 is successful, i.e. when the required accuracy is achieved, this job sends a notification email to the developer about the success. The configuration will be as follows:-
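A mail script of this kind can be written with Python's standard smtplib and email modules. The sketch below is not the project's mail_code.py; the function names, subject line and message text are assumptions, and the SMTP host, port and credentials must be supplied by whoever runs it.

```python
import smtplib
from email.message import EmailMessage

def build_success_mail(accuracy: float, sender: str, recipient: str) -> EmailMessage:
    """Compose the notification; nothing is sent here."""
    msg = EmailMessage()
    msg["Subject"] = f"Model trained: accuracy {accuracy:.2f}%"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"The model reached the required accuracy.\nFinal accuracy: {accuracy:.2f}%\n"
    )
    return msg

def send(msg: EmailMessage, host: str, port: int, user: str, password: str) -> None:
    """Send the message over SMTP with STARTTLS (server details are placeholders)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)
```

Keeping the composing and the sending apart makes it possible to check the message content without contacting a mail server.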

Configuration of Job6: monitoring_job: This job does the work that Kubernetes would otherwise do. If there is a problem with the running image, i.e. if the environment fails, this job relaunches the respective image and resumes the chain. The configuration will be as follows:-
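The monitoring check can be sketched as below. It relies on the docker ps, docker rm and docker run commands; the container name trainer and the mount point /mlops are hypothetical, and the run parameter exists only so the logic can be tried without Docker installed.

```python
import subprocess
from typing import Callable, Set

def running_containers(run: Callable = subprocess.run) -> Set[str]:
    """Names of the containers that are currently running."""
    out = run(["docker", "ps", "--format", "{{.Names}}"],
              capture_output=True, text=True, check=True)
    return set(out.stdout.split())

def ensure_running(name: str, image: str, run: Callable = subprocess.run) -> bool:
    """Relaunch `image` as `name` if it is not running. Returns True on relaunch."""
    if name in running_containers(run):
        return False
    # Remove a stopped container with the same name; failure here is harmless.
    run(["docker", "rm", "-f", name], capture_output=True, text=True, check=False)
    run(["docker", "run", "-dit", "--name", name,
         "-v", "/root/mlops_tweaktask:/mlops", image],
        capture_output=True, text=True, check=True)
    return True
```

Resuming from where the last trained model left off would additionally require the training script to save its model to the mounted folder and reload it on start, which this sketch does not cover.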

WORKING OF THE PROJECT

The initial architecture of the model gave us an accuracy of 94%, while our benchmark was 96%. Thus the job chain was initiated. The initial architecture is as follows:-

After successful running of the chain, our model was tweaked and yielded an accuracy of 97%, thus achieving the benchmark. The tweaked architecture is as follows:-

After all the jobs have run successfully, the Jenkins Dashboard would look like this:-

Mail sent on successful model training and accuracy achievement.

Thus, our end-to-end automation of DL model auto-tuning was successful. This idea can be of great use in industry, since a model with poor accuracy leads to poor predictions across many use cases. Automated tuning can also be much faster, as manually tweaking a model and training it again and again is really time-consuming. If this automation is deployed on a platform like AWS Cloud, even the RAM and CPU limits of a local machine stop being a problem.

We thank our mentor Mr. Vimal Daga Sir from the bottom of our hearts for giving us the opportunity to implement this wonderful automation task. We learned a lot and cleared up many concepts through this task.

A PROJECT BY SAPTARSI ROY AND SARANYA CHATTOPADHYAY

Priyansi .

4 years ago

Wow this is pretty darn good!

Saptarsi Roy

Software Engineer 1 at Dell Technologies || Google Certified Associate Cloud Engineer || Redhat certified Engineer || EX-180 Certified || ARTH-2020 Learner @LW

4 years ago

Well done
