Linear Regression: How to find the line of best fit?
Debi Prasad Rath
@AmazeDataAI- Technical Architect | Machine Learning | Deep Learning | NLP | Gen AI | Azure | AWS | Databricks
Hi connections. Trust you are doing well. In this article we will discuss how to find the best fit line in linear regression. This is a continuation of our linear regression article series. Let us get started.
A quick recap: in the earlier post we discussed the linear regression model in overview. Well, it turns out that every time a model makes a prediction, there is an error associated with it. Error is nothing but the difference between the predicted and actual values of the target/response variable. Recollect that the task is to fit a line, that is, to build a model with the least error. But before that, let us look at how error can be defined.
loss/error = y_predicted - y_actual # equation 1
loss/error = abs(y_predicted - y_actual) # equation 2
loss/error = (y_predicted - y_actual) ** 2 # equation 3
NOTE:- this interpretation of error (equation 1) will not make sense in the context of model error analysis. This is due to the fact that positive errors and negative errors might cancel out. So, what is next? Let us redefine error. <take absolute error>
NOTE:- the equation 2 definition of error will also not make sense. Why? Taking absolute values sounds like an effective solution, but two or more models might produce the same total absolute error, so we cannot always single out one best line. In addition, the absolute value function is not differentiable at zero, which complicates optimization. So what is next? Let us redefine error. <take squared error>
NOTE:- the equation 3 definition of error makes sense in true terms. The squared error is always non-negative, it penalizes larger errors more heavily, and the total sum of squared errors will generally differ from one model to another, so models can be compared. In this way, we are able to remove the shortcomings of the earlier approaches (equations 1 and 2).
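To see why these definitions behave differently, here is a minimal sketch in Python (the numbers are made up purely for illustration): a model that over-predicts one point by 1 and under-predicts another by 1 reports a raw error of 0 under equation 1, while equations 2 and 3 both report the mistakes.

import numpy as np

# Tiny illustrative dataset (hypothetical numbers).
y_actual = np.array([3.0, 5.0, 7.0])
y_predicted = np.array([4.0, 4.0, 7.0])  # over-predicts once, under-predicts once

raw_error = np.sum(y_predicted - y_actual)               # equation 1: +1 and -1 cancel to 0.0
absolute_error = np.sum(np.abs(y_predicted - y_actual))  # equation 2: 2.0
squared_error = np.sum((y_predicted - y_actual) ** 2)    # equation 3: 2.0

print(raw_error, absolute_error, squared_error)  # 0.0 2.0 2.0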
By convention, the sum of squared errors (SSE) is the quantity used to find the line of best fit, and the algorithm built on it is known as "Ordinary Least Squares" or OLS regression. Precisely, the model that has the least sum of squared errors will be considered the "line of best fit".
I know, this is a lot to grasp. But with time it will become easy going. In short, you can say that the line of best fit is found by minimizing the sum of squared errors between the observed and predicted values to arrive at a geometric equation. I hope this makes sense. Now that we have introduced a new algorithm, OLS, to find the line of best fit, a new question comes to mind: how does OLS actually achieve the least squares? The short answer is the "cost function".
A cost function is needed to find the line of best fit because there might be many lines that represent the same X(s) --> y mapping relation. The line with the minimum sum of squared errors will be the line of best fit. In other words, the cost function is the objective function we minimize, here the SSE. Recollect that the equation of linear regression contains unknown beta terms, which need to be estimated optimally so that the cost function is at its minimum. Let us define the cost function of linear regression as mentioned below,
cost_function = 1/num_rows * sum(((Beta0 + Beta1 * X1_i) - y_i) ** 2) # summed over every row i
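As a concrete illustration, here is a minimal sketch of this cost function in Python with NumPy (the data and the generating line y = 2 + 3x are made up for the example):

import numpy as np

def cost_function(beta0, beta1, x, y):
    # Mean squared error of the line y_hat = beta0 + beta1 * x.
    y_predicted = beta0 + beta1 * x
    return np.mean((y_predicted - y) ** 2)

# Hypothetical data that roughly follows y = 2 + 3x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.1, 7.9, 11.2, 13.8])

print(cost_function(2.0, 3.0, x, y))  # ~0.025, small cost near the true line
print(cost_function(0.0, 0.0, x, y))  # ~101.1, much larger cost for a bad line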
In the case of linear regression, the mean squared error cost function is used: the average squared error between the predicted and observed values. The goal is to find the values of the beta terms for which this cost is smallest; since the cost function is convex in the beta terms, it has a single global minimum. Mathematically, that minimum can be found by taking the gradient/derivative of the cost function with respect to each beta term and equating it to 0.
d/dBeta1 (cost_function) = 0 # solve to get the Beta1 value
d/dBeta0 (cost_function) = 0 # solve to get the Beta0 value
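For simple linear regression, solving these two equations by hand gives the familiar closed-form formulas: Beta1 is the covariance of x and y divided by the variance of x, and Beta0 = mean(y) - Beta1 * mean(x). Here is a minimal sketch (ols_fit is an illustrative name, and the data is the same made-up set as above):

import numpy as np

def ols_fit(x, y):
    # Closed-form OLS estimates obtained by setting both derivatives to zero.
    x_mean, y_mean = x.mean(), y.mean()
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.1, 7.9, 11.2, 13.8])

print(ols_fit(x, y))  # ~(2.15, 2.94), close to the generating values 2 and 3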
Solving these equations directly gives the OLS estimates. An alternative, iterative route to the same minimum is an algorithm known as "gradient descent", one of the most widely used algorithms behind the latest innovations in all of machine learning and deep learning. Intuitively, gradient descent keeps updating the values of the beta terms until the cost function reaches its optimal solution. Today we have covered a lot. We will continue from here in the next post on "gradient descent".
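As a small preview of the next post, here is a minimal gradient descent sketch (the learning rate and iteration count are arbitrary illustrative choices, not tuned values):

import numpy as np

def gradient_descent(x, y, learning_rate=0.01, num_iterations=5000):
    beta0, beta1 = 0.0, 0.0                       # start from an arbitrary guess
    n = len(x)
    for _ in range(num_iterations):
        error = (beta0 + beta1 * x) - y           # current residuals
        beta0 -= learning_rate * (2 / n) * np.sum(error)      # step along d(cost)/d(Beta0)
        beta1 -= learning_rate * (2 / n) * np.sum(error * x)  # step along d(cost)/d(Beta1)
    return beta0, beta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.1, 7.9, 11.2, 13.8])

print(gradient_descent(x, y))  # converges near the closed-form values above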
I hope you have enjoyed reading this article and learnt something new. Please feel free to add or comment on this post. Take care. Happy learning.