Logistic Regression
This is a follow-up to my previous post on linear regression, part of my road to understanding machine learning. As a recap, linear regression fits a straight line to data that relates an independent variable, such as time, to a dependent variable, in our case the sales profit. From that straight line we can then predict what the profit will be at a given time.
Logistic regression is concerned with classification. Using the data from the previous post, I can say that, according to our business rules, a sales profit below K400 is bad (0) and one above K400 is good (1). In other words, we are separating our points into two discrete values: the profit is either good or bad, or in binary, 0 or 1.
days,sales,productivity
0,0,0
5,100,0
10,300,0
15,4000,1
20,8000,1
25,9000,1
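The labelling rule described above can be sketched in a few lines of plain Python. This is only an illustration: the `THRESHOLD` constant and the `label` helper are names I made up for the sketch, and treating a profit of exactly K400 as bad is my assumption, since the rule only talks about values below and above K400.

```python
import csv
import io

# The same rows as the table above, kept in memory instead of a file.
raw = """days,sales,productivity
0,0,0
5,100,0
10,300,0
15,4000,1
20,8000,1
25,9000,1
"""

THRESHOLD = 400  # business rule from the text (hypothetical constant name)


def label(sales: float) -> int:
    """Return 1 (good) if sales is above the threshold, otherwise 0 (bad).

    Assumption: exactly K400 counts as bad; the text does not say.
    """
    return 1 if sales > THRESHOLD else 0


rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    derived = label(float(row["sales"]))
    # Print the sales value, the class stored in the data, and the derived class.
    print(row["sales"], row["productivity"], derived)
```

For every row in the table, the derived class matches the `productivity` column.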
To view our data points, let's plot them.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model as lm

# Read the data from the CSV file
df = pd.read_csv('sales_classification.csv')

# Plot sales against the productivity class (0 = bad, 1 = good)
plt.xlabel('sales')
plt.ylabel('productivity')
plt.scatter(df.sales, df.productivity, color="red", marker="+")
plt.show()
The resulting plot.
With linear regression we fit a straight line, whose formula is shown below. With logistic regression, the curve we are trying to fit is a sigmoid curve:
##Linear
y= mx + c
##Logistic
y(x) = 1 / (1 + e^(-x)), where e is Euler's number
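To get a feel for the sigmoid formula, here is a minimal sketch that evaluates it directly. The function name `sigmoid` is my own choice, and this is just the bare formula, not what scikit-learn runs internally.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic (sigmoid) function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Large negative inputs approach 0, large positive inputs approach 1,
# and an input of 0 gives exactly 0.5.
for x in (-5, 0, 5):
    print(x, round(sigmoid(x), 4))
```

In logistic regression, the input to the sigmoid is the straight line mx + c, so the output can be read as the probability of the point belonging to class 1.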
Next comes the actual fitting (training) and prediction. We predict the class for sales profits of K399 and K10000.
# Features must be 2D (a DataFrame); the target should be 1D (a Series)
x_sales = df[["sales"]]
y_productivity = df["productivity"]
model = lm.LogisticRegression()
# Train the model, i.e. fit the sigmoid curve to the data
model.fit(x_sales, y_productivity)
# Predict the class for sales profits of K399 and K10000
print("Predicted class of a sales profit of K399 ..", model.predict([[399]]))
print("Predicted class of a sales profit of K10000 ..", model.predict([[10000]]))
The results.
Predicted class of a sales profit of K399 .. [0]
Predicted class of a sales profit of K10000 .. [1]
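Behind each predicted class there is a probability coming from the sigmoid curve. The sketch below, which assumes the same six rows typed in as NumPy arrays rather than read from the CSV file, uses scikit-learn's `predict_proba` to show it. The variable names are mine, and the exact probabilities depend on the solver and default settings, so I am not quoting any numbers here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same data as the table: sales as the single feature, productivity as the class.
X = np.array([[0], [100], [300], [4000], [8000], [9000]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# Sales profits we want to classify.
X_new = np.array([[399], [10000]])

# One row per input: column 0 is P(class 0), column 1 is P(class 1).
proba = model.predict_proba(X_new)
labels = model.predict(X_new)

for sales, p, label in zip(X_new.ravel(), proba, labels):
    print(f"K{sales}: P(bad)={p[0]:.3f}, P(good)={p[1]:.3f}, class={label}")
```

The predicted class is simply the one with the higher probability, which for two classes means a cut-off at 0.5.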
Note: Please check out DigitalSreeni's YouTube channel if I am confusing you.