GEOMETRIC INTUITION OF LOGISTIC REGRESSION



From the term "logistic regression" you might think this is a regression problem, but it is not: it is a classification problem.

Logistic regression is a statistical model that uses a logistic function to model a binary dependent variable. In geometric terms, logistic regression tries to find a line or plane that best separates the two classes. It works well when the dataset is almost or perfectly linearly separable.

The equation of the plane π is w^Tx + b = 0, where w is the normal to the plane and b is the intercept. If the plane π passes through the origin, then b = 0 and the equation reduces to

w^Tx = 0.

Overall, we have to find the w and b corresponding to a plane π that separates the positive and negative points.

Suppose we take a data point xi as our query point and we want to find the distance of that point from the plane π. The distance di is written as:

di = w^Txi / ||w||

If w is a unit vector and the plane passes through the origin, then

di = w^Txi.
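As a quick illustration, here is a minimal NumPy sketch of this distance computation (the vectors below are made-up values, not from any particular dataset):

import numpy as np

# Made-up example: a plane through the origin with normal vector w
w = np.array([2.0, 1.0])        # normal to the plane (not a unit vector)
x_i = np.array([1.0, 3.0])      # query point

# Signed distance d_i = w^T x_i / ||w||
d_i = w.dot(x_i) / np.linalg.norm(w)

# If w is first normalised to a unit vector, the distance is simply w^T x_i
w_unit = w / np.linalg.norm(w)
print(d_i, w_unit.dot(x_i))     # both ~2.236, positive: x_i lies on the side of w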

So this is the distance of the point xi from the plane. But how do we decide whether this distance should be treated as positive or negative?


If xi and the normal w lie on the same side of the plane (in the same direction), then the distance di > 0 and yi is positive.

If xj and the normal w lie on opposite sides (in opposite directions), then the distance dj < 0 and yj is negative.


CLASSIFIER

If w^Txi > 0 then yi = +1

If w^Txi < 0 then yi = -1
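A minimal sketch of this sign-based decision rule (the weight vector and the two query points are again made-up values):

import numpy as np

def predict(w, x):
    # Sign-based classifier: +1 if w^T x > 0, -1 if w^T x < 0
    return 1 if w.dot(x) > 0 else -1

w = np.array([2.0, 1.0])                      # hypothetical normal vector
print(predict(w, np.array([1.0, 3.0])))       # w^T x = 5    > 0 -> +1
print(predict(w, np.array([-2.0, 0.5])))      # w^T x = -3.5 < 0 -> -1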

There are 4 cases that we can consider:

Case 1: If the class label is positive, i.e. yi = +1, and xi lies on the same side as w, i.e. w^Txi > 0, then the classifier predicts the class as positive, so its prediction is correct. Here yi * w^Txi > 0, since positive * positive = positive > 0.

Case 2: If the class label is negative, i.e. yi = -1, and xi lies on the opposite side, i.e. w^Txi < 0, then the classifier predicts the class as negative, so its prediction is correct. Here yi * w^Txi > 0, since negative * negative = positive > 0.

Case 3: If the class label is positive, i.e. yi = +1, and xi lies on the opposite side, i.e. w^Txi < 0, then the classifier predicts the class as negative, so its prediction is wrong. Here yi * w^Txi < 0, since positive * negative = negative < 0.

Case 4: If the class label is negative, i.e. yi = -1, and xi lies on the same side as w, i.e. w^Txi > 0, then the classifier predicts the class as positive, so its prediction is wrong. Here yi * w^Txi < 0, since negative * positive = negative < 0.

We want the classifier to be as good as possible, i.e. we want the maximum number of correctly classified points, which means we want as many points as possible to satisfy yi * w^Txi > 0. Given a fixed training dataset (xi, yi), we want to choose the w that maximizes the sum of yi * w^Txi over all points.
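The following sketch illustrates this criterion on a tiny made-up dataset: it counts how many points satisfy yi * w^Txi > 0 and computes the sum we would like to be as large as possible.

import numpy as np

# Toy dataset (illustrative values): rows of X are points x_i, y holds labels +1/-1
X = np.array([[1.0, 3.0], [2.0, 1.0], [-1.0, -2.0], [-3.0, 0.5]])
y = np.array([1, 1, -1, -1])
w = np.array([2.0, 1.0])        # a candidate normal vector

signed = y * (X @ w)            # y_i * w^T x_i for every point
print((signed > 0).sum())       # 4: all points correctly classified
print(signed.sum())             # 19.5: the quantity we would like to be large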

HOW OUTLIERS IMPACT THE MODEL

Suppose we take two planes π1 and π2, each of which is a candidate for separating the positive and negative class data points.

Suppose there is one extreme outlier in the dataset. If we compute the sum of yi * w^Txi for π1 we get a negative value, and for π2 we get a positive value, so by this criterion we would conclude that π2 is the better classifier. But this is not true: plane π1 actually classifies more points correctly than plane π2. A single outlier can therefore have a large impact on the model.
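The sketch below makes this concrete with made-up signed values yi * w^Txi for six points under each plane: π1 classifies five of the six points correctly, but its sum is dragged negative by one extreme outlier, while π2 classifies only four points correctly yet ends up with the larger (positive) sum.

import numpy as np

# Made-up signed values y_i * w^T x_i for six points under the two planes
pi1 = np.array([2.0, 2.0, 2.0, 2.0, 2.0, -100.0])   # 5 of 6 correct, one extreme outlier
pi2 = np.array([-2.0, -2.0, 2.0, 2.0, 2.0, 3.0])    # only 4 of 6 correct

print((pi1 > 0).sum(), (pi2 > 0).sum())   # 5 vs 4: pi1 is the better classifier
print(pi1.sum(), pi2.sum())               # -90.0 vs 5.0: the sum wrongly prefers pi2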

To protect the model from such outliers, we have to modify our objective function.

MODIFYING THE OBJECTIVE FUNCTION USING SQUASHING

To modify our objective function we will use a squashing technique. The idea is:

  1. If the distance of a point from the plane is small, we use it as it is.
  2. If the distance of a point from the plane is large, we squash it into a smaller value.

Applying a squashing function of this kind to each term of our objective protects the model from such outliers.

We will use the sigmoid function as this squashing function.

Sigmoid Function

The sigmoid function is written as:

σ(x) = 1/(1 + e^-x)

The maximum value of the sigmoid function is 1.

The minimum value of the sigmoid function is 0.
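A minimal implementation of the sigmoid, with a quick check of its squashing behaviour on small and large inputs:

import numpy as np

def sigmoid(x):
    # Logistic (sigmoid) function: squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))      # 0.5
print(sigmoid(1.0))      # ~0.731
print(sigmoid(100.0))    # ~1.0: a very large distance no longer dominates
print(sigmoid(-100.0))   # ~0.0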

Our objective function will be:

w* = argmax_w Σi σ(yi * w^Txi)

That is, we choose the w that maximizes the sum of the sigmoid-squashed signed values yi * w^Txi over all training points.
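Applying this squashed objective to the same made-up values from the outlier example above, π1 now scores higher than π2, matching the fact that it classifies more points correctly:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# The same made-up signed values y_i * w^T x_i from the outlier example
pi1 = np.array([2.0, 2.0, 2.0, 2.0, 2.0, -100.0])
pi2 = np.array([-2.0, -2.0, 2.0, 2.0, 2.0, 3.0])

# Squashed objective: sum of sigmoid(y_i * w^T x_i)
print(sigmoid(pi1).sum())   # ~4.40 -> pi1 is now preferred, as it should be
print(sigmoid(pi2).sum())   # ~3.83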