MLOps: Integration Between Machine Learning and DevOps
Vedansh Shrivastava
DevOps Engineer | Cloud Architect | Software Engineer | ECE Graduate | Application Developer
Hello everyone, I am Vedansh Shrivastava. Here we will discuss MLOps, which can solve many of the use cases and problems faced with machine learning models.
So let's start:
Problem Statement:
1. Create a container image that has Python 3 and Keras/NumPy installed, using a Dockerfile.
2. When we launch this image, it should automatically start training the model inside the container.
3. Create a job chain of Job1, Job2, Job3, Job4 and Job5 using the Build Pipeline plugin in Jenkins.
4. Job1: Pull the GitHub repo automatically when a developer pushes code to GitHub.
5. Job2: By looking at the code or program file, Jenkins should automatically start the container image that has the respective machine learning software and interpreter installed, deploy the code into it, and start training (e.g., if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing installed).
6. Job3: Train the model and report its accuracy/metrics.
7. Job4: If the accuracy is less than 80%, tweak the machine learning model architecture.
8. Job5: Retrain the model, or notify that the best model has been created.
9. Create one extra job, Job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically restart the container and resume from where the last trained model left off.
SOLUTION:
Before rushing to the solution, we must understand some keywords and some problems in machine learning:
HYPER-PARAMETERS: These are parameters that we cannot determine with the help of any formula or algorithm, but they are adjustable: we can change them to achieve higher accuracy or different results. Examples are the number of epochs, the size of the filters, the number of neurons, etc.
1) Machine learning models are good at prediction, but the accuracy of a model is a problem. We have to achieve the highest accuracy possible, yet there are many hyper-parameters which, when changed, may or may not affect the accuracy of the model; they may increase or decrease it.
2) Finding the best set of hyper-parameters is very hard for humans and very time- and resource-consuming, so we have created an architecture that does this work fully automatically.
Now, on to the solution:
1. According to the problem statement, we will create Docker images in which we install the Python interpreter and the modules required by the code. If we have CNN code, we will install Python, Keras, TensorFlow and any other needed libraries; if the code is Linear Regression, we will install NumPy, scikit-learn, etc. A sketch of such a Dockerfile is shown below.
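As a minimal sketch (the base image, package list and the CMD path are assumptions, not the author's exact setup), the Dockerfile for the CNN environment could look like this:

# minimal Dockerfile sketch for the CNN training image
# base image and package versions are assumptions
FROM centos:7
RUN yum install -y python3 python3-pip
RUN pip3 install numpy pandas keras tensorflow keras_preprocessing
# directory used for the model code, data and the accuracy file
RUN mkdir -p /model_files
# problem statement step 2: training starts as soon as the container launches
CMD ["python3", "/model_files/model.py"]

An image for the sklearn-style models would be built the same way, just with numpy and scikit-learn installed instead.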
2. Now we get to our Jenkins and start building the job chain:
Job1: This is the first and simplest job. Jenkins just has to go to GitHub, download the code provided by the developer, and copy the whole repo into a folder in the workspace. Our own Python scripts are also in this folder; we will discuss them further ahead. A sketch of this copy step follows.
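As an illustration (the paths are assumptions), the build step of Job1 only needs to copy the freshly pulled repo from the Jenkins workspace into the shared folder that the later jobs and containers use:

# hypothetical Job1 build step: copy the pulled repo to the shared folder
import os
import shutil

src = os.environ.get('WORKSPACE', '/var/lib/jenkins/workspace/job1')  # Jenkins sets WORKSPACE
dst = '/root/task3_data'                                              # assumed shared folder
shutil.copytree(src, dst, dirs_exist_ok=True)  # dirs_exist_ok needs Python 3.8+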
Here both codes are created by the same developer, so he knows the code. But if the machine learning code is written by someone else, we have to set some restrictions, i.e., things that must be in the code: we need the accuracy. We can save the accuracy in a file; the only part that is mandatory in the code is the one that records the accuracy. This snippet is needed at the end of the model file:
scores = model.evaluate(X_test, y_test, verbose=1)
print("test loss", scores[0])
print("test accuracy", scores[1])
accuracy_file = open('/mlops/accuracy.txt', 'w')
accuracy_file.write(str(scores[1]))
accuracy_file.close()
Job2: By looking at the code we have to identify which type of code it is. There are many ways to do this, but the way I used is a Python program that reads the code and searches for specific strings; if a given string is present, we can identify the type. I used simple file handling in Python. If the code is a CNN, Jenkins will identify it and launch the matching Docker image as the environment for it. It will also copy the model code into the attached volume. A sketch of this step follows.
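As an illustration of the idea (the image names, container name and paths are assumptions, not the author's exact setup), the Job2 build step could map the output of the reader script to a docker run command:

# hypothetical Job2 build step: launch the right container for the detected model type
import os

# reader.py (shown later) prints e.g. 'CNN_CODE', 'KNN_CODE' or 'LinearModel'
model_type = os.popen("python3 /root/task3_data/reader.py").read().strip()

# assumed image names, built beforehand from Dockerfiles like the one above
images = {
    'CNN_CODE': 'mlops-cnn:v1',
    'KNN_CODE': 'mlops-sklearn:v1',
    'LinearModel': 'mlops-sklearn:v1',
}

if model_type in images:
    # mount the shared folder as /model_files so the code and the trained model
    # live on the volume and survive container restarts
    os.system("docker run -dit --name mlops_env "
              "-v /root/task3_data:/model_files {}".format(images[model_type]))
else:
    print("Unknown model type:", model_type)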
Job3: Here we train our model. The code must contain a part that takes the accuracy and stores it in a file, here "test_accuracy.txt", created in the same directory in which we have the model and all the data. This is what does all the magic; to understand it, you need to see the code.
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten

model = Sequential()
model.add(Convolution2D(filters=32,kernel_size=(3,3),activation='relu',input_shape=(64,64,3)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(units=128,activation='relu'))
model.add(Dense(units=64,activation='relu'))
model.add(Dense(units=32,activation='relu'))
model.add(Dense(units=16,activation='relu'))
model.add(Dense(units=8,activation='relu'))
model.add(Dense(units=1,activation='sigmoid'))
print(model.summary())

model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])

from keras_preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
training_set = train_datagen.flow_from_directory(
    '/model_files/images/images/train/',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')
test_set = train_datagen.flow_from_directory(
    '/model_files/images/images/test/',
    target_size=(64, 64),
    batch_size=32,
    class_mode='binary')
model.fit(training_set,
    steps_per_epoch=100,
    epochs=10,
    validation_data=test_set,
    validation_steps=800)

scores = model.evaluate(test_set, verbose=1)
print("test loss", scores[0])
print("test accuracy", scores[1])
accuracy_file = open('/model_files/test_accuracy.txt', 'w')
accuracy_file.write(str(scores[1]))
accuracy_file.close()

test_accuracy = scores[1]
import os
if test_accuracy < 0.80:
    print("Accuracy is less than 80, running accuracy file")
    os.system("sudo python3 /model_files/accuracy.py")
else:
    print("Just got the accuracy greater than 80 [Accuracy: {}]".format(test_accuracy))
    model.save("mlops_task3.h5")
This is the code from which we are going to train the model. At the end you can see that if test_accuracy is less than 80%, it runs another script, accuracy.py. That script doesn't do much: it opens the test_accuracy.txt file and reads the value; if it is less than 80%, it runs another file, code_changer.py, which changes the code according to need and tunes some hyper-parameters, and then it runs the model and trains it again.
import os

f = open('/model_files/test_accuracy.txt')
content = f.read()
f.close()

accuracy = float(content) * 100   # the file stores the raw score, e.g. "0.8125"
print("Accuracy is", accuracy)

if accuracy < 80:
    print("Accuracy is less than 80, running hyper-parameter file")
    os.system("python3 /model_files/code_changer.py")
    os.system("python3 /model_files/model.py")
else:
    print("Just got the accuracy greater than 80 [Accuracy: {}]".format(accuracy))
In this way the process runs like a loop until the minimum accuracy is attained; as soon as 80% accuracy is reached, the model is saved under the name "mlops_task3.h5" and the process stops.
CONSOLE OUTPUT:
Job4: This is a very simple job; it only notifies the developer that the model is ready. I used email notification: once Job3 builds stable, this job runs, it fails by design, and the failure triggers the email notification to the address we have given.
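The email itself comes from Jenkins' email notification settings; purely as an illustration of the same idea in script form (the SMTP host, addresses and credentials are placeholders), a notification could also be sent with Python's smtplib:

# hypothetical scripted notification; real values come from your mail setup
import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg['Subject'] = 'MLOps pipeline: best model created'
msg['From'] = 'jenkins@example.com'        # placeholder sender
msg['To'] = 'developer@example.com'        # placeholder recipient
msg.set_content('mlops_task3.h5 reached the target accuracy of 80%.')

with smtplib.SMTP('smtp.example.com', 587) as server:  # placeholder SMTP server
    server.starttls()
    server.login('jenkins@example.com', 'app-password')  # placeholder credentials
    server.send_message(msg)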
Job5: This is the last and final job; after this our model is complete. This job goes and checks whether the containers we used for model training are still running; if they are not running, it notifies the developer. A sketch of the monitoring idea follows.
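A minimal monitoring sketch (the container name mlops_env is an assumption): because the model code and weights live on the attached volume, restarting the container resumes from where the last trained model left off, as Job6 in the problem statement asks:

# hypothetical monitoring step for the training container
import subprocess

# ask docker for the id of a running container with the given name
running = subprocess.run(
    ["docker", "ps", "-q", "-f", "name=mlops_env"],
    capture_output=True, text=True
).stdout.strip()

if not running:
    print("Training container is down, restarting it")
    # the code and the saved model sit on the attached volume, so nothing is lost
    subprocess.run(["docker", "start", "mlops_env"])
else:
    print("Training container is running fine")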
Now we can talk about the Python code. Our code is not so sophisticated; it uses very basic Python file handling.
BUILD PIPELINE VIEW
Reader.py:
We just used normal file handling: first we load the model.py file, open it in read mode and store the data in memory, then we search for specific strings. In a linear model the word "LinearRegression" will surely be there; in KNN code the word "KNeighborsClassifier" will surely be there; similarly, in CNN code the strings "keras" or "tensorflow" and "Conv2D" will be there. In this way we can identify the code.
# reader.py: identify the model type by searching for tell-tale strings
dev_code = open('/root/task3_data/model.py')   # open the developer's code as a normal file
code = dev_code.read()                          # read the whole code into memory
dev_code.close()

if 'LinearRegression' in code:                  # linear models import LinearRegression from sklearn
    print('LinearModel')
elif 'KNeighborsClassifier' in code:            # KNN code imports KNeighborsClassifier
    print('KNN_CODE')
elif 'keras' in code or 'tensorflow' in code:   # deep learning code imports keras/tensorflow
    if 'Conv2D' in code or 'Convolution2D' in code:
        print('CNN_CODE')
    else:
        print('NOT_CNN')
code_changer.py:
In this code we used a different approach for each type of model; here we will mainly discuss the CNN one. We just played with the hyper-parameters: if the network has fewer than three convolution + pooling (CRP) layer pairs we add one more, otherwise we mostly increase the number of epochs, increase the number of filters, and enlarge the kernel and pool sizes.
import re

# read the current model code
file = open("/model_files/model.py")
content = file.read()
file.close()
lines = content.split("\n")

# regex patterns for the hyper-parameters we want to tune
epoch = r"epochs=*\d{1,3}"
pattern2 = r"filters=*\d{1,3}"
pattern3 = r"kernel_size=\(\d{1,3},\d{1,3}\)"
pattern4 = r"pool_size=\(\d{1,3},\d{1,3}\)"

ep = 0
kernel_1 = 0
kernel_2 = 0
pool_size_1 = 0
pool_size_2 = 0
check = 0
index = 0
layers = []
count = 0

# count the convolution + pooling (CRP) layer pairs and remember where they end
for i in range(len(lines) - 1):
    if 'model.add(Convolution2D(' in lines[i] and 'model.add(MaxPooling2D(' in lines[i + 1]:
        print(lines[i])
        print(lines[i + 1])
        layers.extend([lines[i], lines[i + 1]])
        count += 1
        index = i + 2
print("Count is ", count)
layers = [layers[-2], layers[-1]]   # keep only the last CRP pair

if count < 3:
    # fewer than 3 CRP layers: duplicate the last pair to deepen the network
    print("Added One CRP Layer")
    print("Now total number of CRP Layers is ", count + 1)
    lines.insert(index, layers[0])
    lines.insert(index + 1, layers[1])
elif count == 3:
    # already 3 CRP layers: tune filters, kernel size, pool size and epochs instead
    for i in range(len(lines)):
        if 'model.add(Convolution2D(' in lines[i]:
            filters = re.findall(pattern2, lines[i])
            kernel = re.findall(pattern3, lines[i])
            if len(filters) > 0:
                new_filter = int(filters[0].split("=")[-1]) + 5
            if len(kernel) > 0:
                kernel_values = kernel[0].split("=")[-1].split(",")
                kernel_1 = int(kernel_values[0].split('(')[1]) + 2
                kernel_2 = int(kernel_values[1].split(')')[0]) + 2
            if check > 0:
                # skip the first layer so that its input_shape is preserved
                lines[i] = "model.add(Convolution2D(filters={},kernel_size=({},{}),activation='relu'))".format(new_filter, kernel_1, kernel_2)
            check += 1
        elif 'model.add(MaxPooling2D(' in lines[i]:
            result4 = re.findall(pattern4, lines[i])
            if len(result4) > 0:
                pool_values = result4[0].split("=")[-1].split(",")
                pool_size_1 = str(int(pool_values[0].split('(')[1]) + 2)
                pool_size_2 = str(int(pool_values[1].split(')')[0]) + 2)
                lines[i] = "model.add(MaxPooling2D(pool_size=({},{})))".format(pool_size_1, pool_size_2)
        elif 'epochs' in lines[i]:
            result = re.findall(epoch, lines[i])
            if len(result) > 0:
                ep = int(result[0].split("=")[-1]) + 5
                lines[i] = "epochs={},".format(ep)

print("Insert CRP layer at Index {} if Number of CRP Layers are less than 3".format(index))
print("Layers ", layers)

# print the tweaked code to the console log, then write it back to model.py
for line in lines:
    print(line)
content = ""
for new_line in lines:
    content = content + new_line + "\n"
file = open("/model_files/model.py", "w")
file.write(content)
file.close()