Model deployment through an API
Okay, let's face it: almost every newcomer to data science runs into this problem. We work hard to learn new models and techniques, and we build plenty of machine learning models while tweaking their parameters, but when it is time to deploy a model or show it to others, we lack the skills to share it. In my opinion, everyone should know at least the basics of sharing a model, because it gives us confidence and gives other people a way to use our work.
Today we will discuss how to expose a model as an API with an auto-generated Swagger UI. Swagger is one of the most widely used tooling ecosystems for developing APIs with the OpenAPI Specification. It includes both open-source and commercial tools, covering a wide range of needs and use cases.
Step 1: Train the model and pickle it with Python's pickle module
To expose our model as an API, we work in two steps. First, we train the model and serialize it to a binary file with Python's pickle module. Then we load that file in an API app, which takes input from the user and passes it to the model for a prediction.
# importing libraries
import pandas as pd
import numpy as np
import pickle
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
# importing data
data = pd.read_csv('binary.csv')
# checking the head of the data
data.head()
admit gre gpa rank
0 380 3.61 3
1 660 3.67 3
# checking the structure of the data
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 4 columns):
admit 400 non-null int64
gre 400 non-null int64
gpa 400 non-null float64
rank 400 non-null int64
dtypes: float64(1), int64(3)
memory usage: 12.6 KB
# splitting train and test data
X = data[['gre', 'gpa', 'rank']]
y = data['admit']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
# training the model with random forest
model = RandomForestClassifier(n_estimators=250)
model.fit(X_train, y_train)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_impurity_decrease=0.0, min_impurity_split=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=250, n_jobs=None,
oob_score=False, random_state=None, verbose=0,
# evaluating the model
prediction = model.predict(X_test)
print(confusion_matrix(y_test, prediction))
print(classification_report(y_test, prediction))
print(accuracy_score(y_test, prediction))
# confusion matrix
[[59 12]
[33 16]]
# classification_report
precision recall f1-score support
0 0.64 0.83 0.72 71
1 0.57 0.33 0.42 49
micro avg 0.62 0.62 0.62 120
macro avg 0.61 0.58 0.57 120
weighted avg 0.61 0.62 0.60 120
# overall accuracy
0.625
# using pickle to write the model to a binary file
# (the ./Model directory must already exist)
with open('./Model/lr.pkl', 'wb') as model_pkl:
    pickle.dump(model, model_pkl, protocol=2)
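Before moving on, it can help to confirm that a pickled object comes back unchanged. The sketch below is only an illustration of the save-and-load round trip: `model_state`, `predict`, and the sample rows are hypothetical stand-ins built from plain Python types, not the Random Forest trained above, so it runs without scikit-learn.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a fitted model's state (an assumption for this
# sketch; the article's real model is a RandomForestClassifier).
model_state = {"feature_names": ["gre", "gpa", "rank"], "gpa_cutoff": 3.5}


def predict(state, rows):
    """Toy rule: admit (1) when the GPA reaches the stored cutoff."""
    gpa_index = state["feature_names"].index("gpa")
    return [1 if row[gpa_index] >= state["gpa_cutoff"] else 0 for row in rows]


def save_model(state, path):
    """Serialize the object to a binary file."""
    with open(path, "wb") as f:
        pickle.dump(state, f, protocol=2)


def load_model(path):
    """Read the object back. Only unpickle files from a source you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Illustrative rows in (gre, gpa, rank) order.
samples = [(380, 3.61, 3), (660, 3.67, 3), (500, 3.0, 2)]

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"
    save_model(model_state, model_path)
    restored = load_model(model_path)

# The restored object should produce the same predictions as the original.
print(predict(model_state, samples) == predict(restored, samples))
```

The same check works with a real scikit-learn model: compare `model.predict(X_test)` before pickling with the predictions of the reloaded object.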
As you can see, we have built a very simple Random Forest model. Its overall accuracy is quite low, but we will accept that because model quality is not our main objective today. The model takes three inputs (GRE score, GPA, and school rank) and predicts whether a student will be admitted.
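To see where the numbers in the report come from, the metrics can be recomputed by hand from the confusion matrix printed above. This is a small worked sketch, assuming scikit-learn's convention that rows are true classes and columns are predicted classes; the variable names are my own.

```python
# Confusion matrix from the evaluation output above:
# rows are true classes (0, 1), columns are predicted classes (0, 1).
confusion = [[59, 12],
             [33, 16]]

tn, fp = confusion[0]  # true class 0: predicted 0, predicted 1
fn, tp = confusion[1]  # true class 1: predicted 0, predicted 1
total = tn + fp + fn + tp

accuracy = (tn + tp) / total
precision_admit = tp / (tp + fp)
recall_admit = tp / (tp + fn)

# Accuracy of always predicting the majority class (0, not admitted).
baseline = max(tn + fp, fn + tp) / total

print(f"accuracy: {accuracy:.3f}")                    # 0.625
print(f"precision (admit=1): {precision_admit:.2f}")  # 0.57
print(f"recall (admit=1): {recall_admit:.2f}")        # 0.33
print(f"majority baseline: {baseline:.3f}")           # 0.592
```

The model beats the always-predict-0 baseline by only a few points, which is what "quite low" means in practice here.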
Step 2: Create an API app that exposes our model
# importing libraries
import pickle
from flask import Flask, request
from flasgger import Swagger
import numpy as np
import pandas as pd
# loading the pickled model
with open('./Model/lr.pkl', 'rb') as model_file:
    model = pickle.load(model_file)
# initiating the API app and swaggerify it
app = Flask(__name__)
swagger = Swagger(app)
# creating an interface and defining the function
# (the route path '/predict' is an assumption; use any path you like)
@app.route('/predict', methods=['GET'])
def predict_admit():
    """Example endpoint returning a prediction of admit
    ---
    parameters:
      - name: gre_score
        in: query
        type: number
        required: true
      - name: gpa_score
        in: query
        type: number
        required: true
      - name: rank_score
        in: query
        type: number
        required: true
    responses:
      200:
        description: "0: not admitted, 1: admitted"
    """
    # taking user input and converting it from text to numbers
    gre_score = request.args.get("gre_score", type=float)
    gpa_score = request.args.get("gpa_score", type=float)
    rank_score = request.args.get("rank_score", type=float)
    # predicting from the input and returning the result as text
    prediction = model.predict(np.array([[gre_score, gpa_score, rank_score]]))
    return str(prediction)
# open the API through port number 5000
if __name__ == '__main__':
    app.run(port=5000)
# app running at port number 5000
* Serving Flask app "__main__" (lazy loading)
* Environment: production
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
* Debug mode: off
* Running on (Press CTRL+C to quit)
That's it! We have created an API that exposes our model to other environments.
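One thing worth tightening in an app like this: query-string values arrive as text, and a missing or non-numeric value should be rejected before it reaches the model. The helper below is a minimal sketch of that idea, not part of the original app. `parse_features` is a hypothetical name, and it accepts any mapping of parameter names to strings, such as Flask's `request.args`.

```python
def parse_features(args):
    """Convert query-string values to floats in the order the model expects.

    `args` is any mapping with a .get() method, for example Flask's
    request.args. Raises ValueError with a readable message on bad input.
    (Hypothetical helper, written for this sketch.)
    """
    names = ("gre_score", "gpa_score", "rank_score")
    features = []
    for name in names:
        raw = args.get(name)
        if raw is None:
            raise ValueError(f"missing query parameter: {name}")
        try:
            features.append(float(raw))
        except ValueError:
            raise ValueError(f"{name} must be a number, got {raw!r}") from None
    return features


# Example with a plain dict standing in for the request's query string.
print(parse_features({"gre_score": "660", "gpa_score": "3.67", "rank_score": "3"}))
```

Inside the endpoint, one option is to catch the `ValueError` and return the message with a 400 status, so the caller sees what went wrong instead of a server error.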
Let's run our API:
Image 1: Overall interface of our API
Image 2: Getting input from the user
Image 3: Publishing the result using our Random Forest model
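Besides the Swagger UI, the endpoint can also be called from code. Here is a minimal client sketch using only the standard library. The base URL `http://localhost:5000/predict` is an assumption (substitute the host and route your app actually uses), and the helper names are hypothetical. It relies on the endpoint returning the prediction as text such as `[1]`, which is what `str(prediction)` produces for a one-element NumPy array.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the running app; change host and path to match yours.
BASE_URL = "http://localhost:5000/predict"


def build_predict_url(base_url, gre_score, gpa_score, rank_score):
    """Build the GET URL with the three query parameters the endpoint reads."""
    query = urlencode({
        "gre_score": gre_score,
        "gpa_score": gpa_score,
        "rank_score": rank_score,
    })
    return f"{base_url}?{query}"


def parse_prediction(body):
    """Turn the endpoint's text response, e.g. '[1]', into an int."""
    return int(body.strip().strip("[]"))


def fetch_prediction(base_url, gre_score, gpa_score, rank_score):
    """Call the running API and return 0 (not admitted) or 1 (admitted)."""
    url = build_predict_url(base_url, gre_score, gpa_score, rank_score)
    with urlopen(url) as response:
        return parse_prediction(response.read().decode("utf-8"))


# Building the URL needs no server; fetch_prediction needs the app running.
url = build_predict_url(BASE_URL, 660, 3.67, 3)
print(url)
```

With the Flask app running, `fetch_prediction(BASE_URL, 660, 3.67, 3)` would return the model's answer for that applicant.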
Happy learning!
Note: This should not be treated as a production-level deployment, and it will not work with multimedia files.