Linear Regression

Today, we’re diving into the math behind one of the most fundamental models in machine learning: linear regression. This is the first model you’ll typically learn when starting out in the field.



DEFINITION

Linear regression is a way to find the straight line that best fits a set of data points. It lets you predict one value from another by modeling the relationship between them. The formula of linear regression is:

ŷ = c + mx

where, ŷ -> predicted value

m -> slope; it tells how much the predicted value changes for each unit change in the input value

c -> intercept; it tells you the predicted value when the input variable is 0

x -> input value (the data point)

[Figure: a fitted regression line through the data points]
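To make the formula concrete, here is a minimal Python sketch; the slope and intercept values below are made up for illustration, not fitted from data:

```python
# Prediction with the line y_hat = c + m*x.
# m and c are illustrative values, not fitted from data.
m, c = 2.0, 1.0

def predict(x):
    """Return the predicted value y_hat = c + m*x."""
    return c + m * x

print(predict(0))  # 1.0 -> the intercept: the prediction when x is 0
print(predict(3))  # 7.0 -> each unit of x adds m = 2.0 to the prediction
```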


In the notation commonly used in the research literature, the same line is written as the hypothesis hθ(x) = θ0 + θ1x, where θ0 is the intercept (c) and θ1 is the slope (m).

With more than one input feature, the formula generalizes to hθ(x) = θ0 + θ1x1 + θ2x2 + ... + θnxn.

Linear regression is used for:

  • Supervised Learning - a type of machine learning where the model is trained on labeled data.
  • Regression Problems - it predicts a continuous outcome, like forecasting sales or estimating prices.

IMPORTANT TERMINOLOGIES –

  1. Residual Error: It's the difference between the actual value and what the model predicts. If the residual error is 0 on the training data, it may indicate that the model is overfitting.
  2. Cost Function: It measures the error between predicted and actual values. The cost function varies depending on the model. In linear regression, the cost function used is Mean Squared Error.

Cost Function:

J(θ) = (1/(2m)) * Σ (h_θ(x^(i)) - y^(i))^2, summed over the m training samples

where J(θ) -> cost function,

m -> number of training samples,

h_θ(x^(i)) -> predicted value,

y^(i) -> actual value
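As a quick sketch, the cost function can be coded directly from the formula; the toy points below are the ones used later in the walkthrough, with the intercept fixed at 0 for simplicity:

```python
# Mean Squared Error cost J(theta1) = (1/(2m)) * sum((h - y)^2),
# with hypothesis h = theta1 * x (intercept fixed at 0 for simplicity).
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

def cost(theta1):
    m = len(xs)  # number of training samples
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

print(cost(0.0))  # 14/6 ~ 2.33
print(cost(1.0))  # 0.0 -> the line y = x fits these points exactly
```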

3. Gradient Descent ("repeat until convergence") - an iterative process that repeatedly updates the model parameters (θ) to reduce the cost function until it reaches the global minimum.

θ1 := θ1 - α * (∂/∂θ1) J(θ1)

where α is the learning rate, and (∂/∂θ1) J(θ1) is the derivative of the cost function, i.e., the slope

CASE I - when the slope is +ve:

θ1 := θ1 - α(+ve)

So, the value of θ1 decreases.

[Figure: gradient step when the slope is +ve]

CASE II - when the slope is -ve:

θ1 := θ1 - α(-ve)

θ1 := θ1 + α(+ve)

So, the value of θ1 increases.

[Figure: gradient step when the slope is -ve]

CASE III - when the slope is nearly 0:

θ1 := θ1 - α(0)

θ1 := θ1 - 0

So, the value of θ1 stays unchanged.
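The three cases above can be sketched in a few lines of Python; the slope values are made up purely to show the sign behavior:

```python
alpha = 0.1  # learning rate

def update(theta1, slope):
    # Gradient descent step: move theta1 against the sign of the slope.
    return theta1 - alpha * slope

print(update(2.0,  4.0))  # 1.6 -> positive slope: theta1 decreases
print(update(2.0, -4.0))  # 2.4 -> negative slope: theta1 increases
print(update(2.0,  0.0))  # 2.0 -> slope of 0: theta1 is unchanged
```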

4. Learning Rate - a hyperparameter that controls the step size at each iteration while moving toward a minimum of the cost function. It determines how quickly or slowly the model updates its parameters (weights) during gradient descent.

CASE I - learning rate too high: the model may overshoot the minimum, leading to divergence.

[Figure: oscillating, diverging steps when the learning rate is too high]

CASE II - learning rate too low: the model converges very slowly, taking more time to reach the optimal solution.

[Figure: many small steps when the learning rate is too low]
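Both effects can be seen on the toy points used in the walkthrough below, (1,1), (2,2), (3,3), where the optimal slope is θ1 = 1. This is a sketch; the specific alpha values are just illustrative:

```python
xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]

def grad(theta1):
    # dJ/dtheta1 = (1/m) * sum((theta1*x - y) * x)
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m

def run(alpha, steps=50):
    # Plain gradient descent from theta1 = 0 for a fixed number of steps.
    theta1 = 0.0
    for _ in range(steps):
        theta1 -= alpha * grad(theta1)
    return theta1

print(run(0.1))    # converges close to 1.0
print(run(0.001))  # still far from 1.0 after 50 steps: too slow
print(run(0.5))    # huge magnitude: overshoots and diverges
```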
How is the value of θ calculated?

Let’s break down the calculation of θ step by step.

To keep it simple, we'll use basic data points: (1,1), (2,2), and (3,3), with the intercept θ0 fixed at 0 and learning rate α = 0.1.

STEP 1 - Define the hypothesis function and cost function for simple linear regression (with θ0 = 0):

  • Hypothesis Function:

hθ(x) = θ1x

  • Cost Function:

J(θ1) = (1/(2m)) * Σ (θ1x^(i) - y^(i))^2

STEP 2 - Assume a random value for θ1 and calculate the cost function.

Let's assume θ1 = 0:

J(0) = (1/(2·3)) * ((0-1)^2 + (0-2)^2 + (0-3)^2) = 14/6 ≈ 2.33

STEP 3 - Minimize the cost function using gradient descent. The gradient is (∂/∂θ1) J(θ1) = (1/m) * Σ (θ1x^(i) - y^(i)) x^(i).

Iteration I - θ1 = 0:

gradient = (1/3) * ((0-1)·1 + (0-2)·2 + (0-3)·3) = -14/3 ≈ -4.667

θ1 := 0 - 0.1 · (-4.667) = 0.467

Iteration II - θ1 = 0.467:

gradient = (1/3) * ((0.467-1)·1 + (0.933-2)·2 + (1.4-3)·3) ≈ -2.489

θ1 := 0.467 - 0.1 · (-2.489) = 0.716

Iteration III - θ1 = 0.716:

gradient = (1/3) * ((0.716-1)·1 + (1.431-2)·2 + (2.147-3)·3) ≈ -1.327

θ1 := 0.716 - 0.1 · (-1.327) = 0.848

STEP 4 - Continue repeating Step 3 until the cost function hits its lowest value (here, θ1 approaches 1).
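The steps above can be reproduced in a short loop; this sketch hard-codes the toy data, the starting value θ1 = 0, and α = 0.1:

```python
xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]
alpha = 0.1
theta1 = 0.0  # initial guess

for i in range(1, 4):
    m = len(xs)
    # Gradient of the cost: (1/m) * sum((theta1*x - y) * x)
    g = sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m
    theta1 -= alpha * g  # gradient descent update
    print(f"Iteration {i}: theta1 = {theta1:.3f}")
# Iteration 1: theta1 = 0.467
# Iteration 2: theta1 = 0.716
# Iteration 3: theta1 = 0.848
```

Running more iterations drives θ1 toward 1, the slope that fits (1,1), (2,2), (3,3) exactly.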


Interesting Questions —

Q - Why do we divide the MSE by 2 in the cost function?

A - Let's look at both scenarios:

Situation I - cost function without the 1/2:

J(θ) = (1/m) * Σ (h_θ(x^(i)) - y^(i))^2

(∂/∂θ1) J(θ) = (2/m) * Σ (h_θ(x^(i)) - y^(i)) x^(i)

When calculating the gradient, the extra factor of 2 (which appears after taking the derivative of the square) makes the math a bit messier.

Situation II - cost function with the 1/2:

J(θ) = (1/(2m)) * Σ (h_θ(x^(i)) - y^(i))^2

(∂/∂θ1) J(θ) = (1/m) * Σ (h_θ(x^(i)) - y^(i)) x^(i)

When we divide the cost function by 2, the factor of 2 from differentiating the square cancels out, simplifying the gradient and making the calculations cleaner.
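A quick numeric check (a sketch using central finite differences) confirms that the 1/2 only rescales the gradient, and that the rescaled gradient matches the clean analytic form (1/m) * Σ (h - y) * x:

```python
xs, ys = [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]
m = len(xs)

def J_no_half(t):
    # Cost without the 1/2 factor.
    return sum((t * x - y) ** 2 for x, y in zip(xs, ys)) / m

def J_half(t):
    # Cost with the 1/2 factor.
    return sum((t * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def num_grad(J, t, h=1e-6):
    # Central finite-difference approximation of dJ/dt.
    return (J(t + h) - J(t - h)) / (2 * h)

# Analytic gradient of the halved cost at t = 0.5: (1/m) * sum((h - y) * x)
analytic = sum((0.5 * x - y) * x for x, y in zip(xs, ys)) / m
print(num_grad(J_half, 0.5), analytic)  # these two agree
print(num_grad(J_no_half, 0.5))         # twice as large
```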


Q - Why do we take the derivative of the cost function while updating θ?

A - The derivative gives the slope of the cost function at the current θ, which tells gradient descent in which direction (and roughly how far) to adjust the parameters to reduce the error.




Finally —

I hope this blog clarifies linear regression for you!

Got a particular ML topic you’re curious about? Drop your suggestions in the comments, and I’ll do my best to cover them. Thanks for reading!

Feel free to hit me up on LinkedIn. Coffee's on me (virtually, of course)!


