Model deployment through an API

Firoj Kawser Jubayer

Manager, Scorecard/Model at Standard Chartered Bank

Published: January 6, 2019

Okay, let's face it: every newcomer to data science runs into this problem. We work hard to learn new models and techniques and build plenty of awesome machine learning models by tweaking different parameters, but when it comes to deploying a model or showing it to others, we lack the skills to share it. In my opinion, everyone should know at least the basics of sharing a model, because it gives us confidence and gives other people the tools to use it.

Today we will discuss how to expose a model as an API with an auto-generated Swagger UI. Swagger is one of the most widely used tooling ecosystems for developing APIs with the OpenAPI Specification. It consists of both open-source and professional tools, catering to almost every need and use case.

Step 1: Train the model and pickle it using python pickle object

To expose our model as an API, we work in two steps. First, we train the model and save it to a binary file with Python's pickle module. Then we load that file in an API app, which takes input from the user and runs it through the model.
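Before the real training code, here is a minimal sketch of the save-and-load round trip on its own. The dict is only a hypothetical stand-in for a trained model, and the temporary file path is an assumption made for this illustration; a fitted scikit-learn estimator is written and read the same way.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: a plain dict of "learned" values.
# (Hypothetical; used only to keep this sketch dependency-free.)
model = {"name": "toy-model", "threshold": 600}

# Step 1: write the object to disk in binary mode ('wb').
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Step 2: read it back in binary mode ('rb'), e.g. inside the API process.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True: an equal copy of the original object
```

One caution worth keeping in mind: unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.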

# importing libraries

import pandas as pd 
import numpy as np
import pickle
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

# importing data

data = pd.read_csv('binary.csv')

# checking the head of the data

data.head(2)

admit	gre	gpa	rank
   0	380	3.61	3
   1	660	3.67	3

# checking the structure of the data

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 400 entries, 0 to 399
Data columns (total 4 columns):
admit    400 non-null int64
gre      400 non-null int64
gpa      400 non-null float64
rank     400 non-null int64
dtypes: float64(1), int64(3)
memory usage: 12.6 KB

# splitting train and test data

X= data[['gre','gpa','rank']]
y= data['admit']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, 
                                                          random_state=222)

# training the model with random forest

model = RandomForestClassifier(n_estimators=250)
model.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=250, n_jobs=None,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

# evaluating the model

prediction = model.predict(X_test)

print(confusion_matrix(y_test, prediction))
print('\n')
print(classification_report(y_test, prediction))
print('\n')
print(accuracy_score(y_test, prediction))

# confusion matrix

[[59 12]
 [33 16]]

# classification_report

              precision    recall  f1-score   support

           0       0.64      0.83      0.72        71
           1       0.57      0.33      0.42        49

   micro avg       0.62      0.62      0.62       120
   macro avg       0.61      0.58      0.57       120
weighted avg       0.61      0.62      0.60       120


# overall accuracy

0.625

# using pickle to write the model to a binary file
# (the ./Model directory must already exist)

with open('./Model/lr.pkl', 'wb') as model_pkl:
    pickle.dump(model, model_pkl, protocol=2)

As you can see, we have built a very simple random forest model. Its overall accuracy is quite low (0.625), but we will accept that, since model quality is not our main objective today. The model takes three inputs, GRE score, GPA and school rank, and predicts whether a student will be admitted.
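To see where the numbers in the report come from, the short sketch below recomputes accuracy, precision and recall by hand from the confusion matrix printed above. It assumes scikit-learn's convention that rows are true classes and columns are predicted classes; the variable names are my own.

```python
# Confusion matrix printed above: rows are true classes, columns are predictions.
tn, fp = 59, 12   # true class 0: predicted 0, predicted 1
fn, tp = 33, 16   # true class 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision_1 = tp / (tp + fp)   # of the predicted admits, how many were real admits
recall_1 = tp / (tp + fn)      # of the real admits, how many the model found

print(total)                   # 120
print(accuracy)                # 0.625
print(round(precision_1, 2))   # 0.57
print(round(recall_1, 2))      # 0.33
```

The low recall for class 1 shows that the model misses about two thirds of the students who were actually admitted, which is the main reason the overall accuracy is weak.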

Step 2: Create an API app for our model to expose

# importing libraries

import pickle
from flask import Flask, request
from flasgger import Swagger
import numpy as np
import pandas as pd

# loading the pickled model from its file path

with open('./Model/lr.pkl', 'rb') as model_file:
    model = pickle.load(model_file)

# initiating the API app and swaggerify it

app = Flask(__name__)
swagger = Swagger(app)

# creating an interface and define the function

@app.route('/predict')
def predict_admit():
    
    """Example endpoint returning a prediction of admit
    ---
    parameters:
      - name: gre_score
        in: query
        type: number
        required: true
      - name: gpa_score
        in: query
        type: number
        required: true
      - name: rank_score
        in: query
        type: number
        required: true
    responses:
      200:
        description: "0: not admitted, 1: admitted"
    """

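    # taking user input; type=float converts each query-string value
    # from text to a number (the result is None if conversion fails)

    gre_score = request.args.get("gre_score", type=float)
    gpa_score = request.args.get("gpa_score", type=float)
    rank_score = request.args.get("rank_score", type=float)

    # predicting the input and returning it

    prediction = model.predict(np.array([[gre_score, gpa_score, rank_score]]))
    return str(prediction)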

# open the API through port number 5000

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

# app running at port number 5000

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

That's it! We have created an API that exposes our model to other environments.

Let's run our API:

Image 1: Overall interface of our API

Image 2: Getting input from the user

Image 3: Publishing the result using our Random Forest model
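The request that the Swagger page sends can also be made from code. The sketch below uses only the standard library and assumes the app from Step 2 is running locally on port 5000; the helper names build_url and fetch_prediction are my own, not part of Flask or flasgger. With flasgger's default settings, the Swagger page itself should be reachable at /apidocs on the same host and port.

```python
import urllib.parse
import urllib.request

# Assumption: the Flask app from Step 2 is running locally on port 5000.
BASE_URL = "http://localhost:5000/predict"


def build_url(gre_score, gpa_score, rank_score):
    """Build the /predict URL with the three query parameters."""
    query = urllib.parse.urlencode({
        "gre_score": gre_score,
        "gpa_score": gpa_score,
        "rank_score": rank_score,
    })
    return f"{BASE_URL}?{query}"


def fetch_prediction(gre_score, gpa_score, rank_score):
    """Send the GET request and return the response body as text."""
    url = build_url(gre_score, gpa_score, rank_score)
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


print(build_url(660, 3.67, 3))
# http://localhost:5000/predict?gre_score=660&gpa_score=3.67&rank_score=3
```

Calling fetch_prediction(660, 3.67, 3) while the server is up returns the same text that the Swagger page displays as the response body.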

Happy learning!

Note: This should not be treated as a production-level deployment, and it will not work with multimedia files.
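One small step toward a sturdier endpoint is checking the inputs before they reach the model. The following is only a sketch under my own assumptions: parse_inputs is a hypothetical helper, and it accepts any mapping of parameter names to strings, such as Flask's request.args.

```python
def parse_inputs(args):
    """Convert raw query-string values to floats, or raise ValueError.

    `args` is any mapping of parameter name to string, such as request.args.
    """
    names = ("gre_score", "gpa_score", "rank_score")
    values = []
    for name in names:
        raw = args.get(name)
        if raw is None:
            raise ValueError(f"missing parameter: {name}")
        try:
            values.append(float(raw))
        except ValueError:
            raise ValueError(f"{name} must be a number, got {raw!r}") from None
    return values


print(parse_inputs({"gre_score": "660", "gpa_score": "3.67", "rank_score": "3"}))
# [660.0, 3.67, 3.0]
```

Inside the endpoint, a caught ValueError could be turned into a 400 response instead of letting the request fail with a server error.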
