Simple Linear Regression

Every journey has a starting point, and for machine learning algorithms that starting point is usually considered to be linear regression. So let's try to understand it.

Linear regression:

1] What is linear regression?

2] How does it work?

3] Understanding the concept, slopes, and intercepts

What is Linear Regression?

  • Linear Regression is a method that helps us predict numbers using the data that we give it.

Every model does that, so what's different here? Let's find out.


Can it solve any problem?

  • Linear regression cannot solve every problem. It can solve problems whose output is continuous in nature (meaning it is measured rather than counted), such as a price or a score.

Example:

  • What is the price of a car, based on its fuel type or any other factor?
  • What will my CGPA be if I study for a certain number of hours?
  • What would my package be if I got a 7 CGPA?


There are some assumptions for Linear Regression:

  • Independent variables should be linearly related to the dependent variable

We can check this by visualizing the data with a scatter plot.

Example: CGPA increases as study hours increase.

  • There should be minimal variance, meaning the data points should not lie very far from the mean (the average of the column). Linear Regression uses the means internally to compute the line, so values far from the mean (outliers) can distort the result.


In this article we will focus on Simple Linear Regression, which uses a single input variable. Now let's dive into how it works.

When we work with Linear Regression, we need to find the best-fit line.

What is a best-fit line?

The best-fit line is the line that passes as close as possible to the data points, in other words the line that makes the least error.

As we have all studied, the formula for a line is:

Line: y = mx + b

Mathematical intuition: we need to find the values of m (slope) and b (intercept). Here x is the input, for example the size of a mobile phone, and y is the value we want to predict, for example its price.

What does m mean?

Let's say we have data that includes the size of a mobile phone and its price, where the price changes depending on the size.

According to Linear Regression, we compute: price = m*size + b

m = the weightage: how important the size is in determining the price.

If the value of m is high, the price of the mobile depends a lot on its size, and vice versa.

b = the offset that supports m: it is the predicted value when x is very low or 0.
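To make the roles of m and b concrete, here is a minimal sketch in Python. The values of m and b and the phone sizes are made up for illustration only; the function name predict_price is also just an assumption.

```python
# Hypothetical values, chosen only for illustration.
m = 3000.0   # slope: price added for each extra inch of screen size
b = 2000.0   # intercept: predicted price when the size is 0


def predict_price(size):
    """Apply the line equation y = m*x + b to one phone size."""
    return m * size + b


small = predict_price(5.0)   # 3000*5 + 2000
large = predict_price(6.0)   # 3000*6 + 2000

# Increasing the size by 1 changes the prediction by exactly m.
print(small, large, large - small)
```

A larger m would make the predicted price react more strongly to size, while b only shifts every prediction up or down by the same amount.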

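There are two common ways to find m and b: Ordinary Least Squares (OLS), which computes them directly from a formula, and Gradient Descent, which improves them step by step. The question that arises here is: if both solve the problem, then what is the difference?

  • Ordinary Least Squares works well when there are few dimensions (columns).
  • When there are a large number of dimensions, Gradient Descent is much more efficient than OLS.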
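The step-by-step idea behind Gradient Descent can be sketched as follows. This is a minimal illustration rather than the article's own code: the data, the learning rate, the number of steps, and the choice of mean squared error as the quantity to reduce are all assumptions.

```python
# Made-up data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = m*x + b by repeatedly stepping against the gradient
    of the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point with the current m and b.
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. m and b.
        grad_m = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Move a small step in the direction that reduces the error.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


m_gd, b_gd = gradient_descent(xs, ys)
print(round(m_gd, 3), round(b_gd, 3))
```

On this tiny dataset the loop should settle very close to m = 2 and b = 1. A learning rate that is too large can make the updates diverge, so it usually needs tuning for other data.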

Concepts in mathematical terms

Let's see what slope and intercept are in mathematical terms.

Slope: It tells us how much the value of y changes when x increases by one unit. It is computed using every data point:

m = Σ (Xi - X_bar) * (Yi - Y_bar) / Σ (Xi - X_bar)^2

Yi = each data point in Y

Y_bar = mean of column y

Xi = each data point in X

X_bar = mean of column x

Intercept: The intercept is where the line crosses the y-axis, that is, the value of y when x is 0.

b = Y_bar - m * X_bar

Y_bar = mean of column y

X_bar = mean of column x

m = the slope value from the above calculation
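The slope and intercept calculation can be tried out in a few lines of Python. This sketch assumes the standard least-squares formulas (written out in the comments) and uses made-up data, so the numbers are only illustrative.

```python
# Made-up data for illustration (it lies exactly on y = 2x + 1).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [3.0, 5.0, 7.0, 9.0, 11.0]

x_bar = sum(X) / len(X)   # mean of column x
y_bar = sum(Y) / len(Y)   # mean of column y

# Slope: m = sum((Xi - X_bar) * (Yi - Y_bar)) / sum((Xi - X_bar)^2)
numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
denominator = sum((xi - x_bar) ** 2 for xi in X)
m = numerator / denominator

# Intercept: b = Y_bar - m * X_bar
b = y_bar - m * x_bar

print(m, b)  # prints: 2.0 1.0
```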

We talked about the best-fit line, and we know it is the line that makes the least error. But how do we find the error?

The error is simply the difference between the actual value and the predicted value (which we get by applying the m and b values to the equation).

We keep changing the values of m and b until we get the line that gives the least error.
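Here is a small sketch of how the error of a candidate line could be measured. Squaring each difference and averaging (the mean squared error) is an assumption made here so that positive and negative errors do not cancel out; the data and the two candidate lines are made up.

```python
# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def mean_squared_error(m, b, xs, ys):
    """Average of (actual - predicted)^2 over all points for y = m*x + b."""
    squared = [(y - (m * x + b)) ** 2 for x, y in zip(xs, ys)]
    return sum(squared) / len(squared)


# Two candidate lines: the second one fits this data exactly.
error_poor = mean_squared_error(1.0, 0.0, xs, ys)
error_good = mean_squared_error(2.0, 1.0, xs, ys)

print(error_poor, error_good)  # prints: 18.0 0.0
```

The line with the smaller error is the better fit, which is exactly the comparison used when searching for the best values of m and b.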

Linear regression with code

Let's try to put this into code without using a ready-made model from any library.

1] Importing required libraries:

2] Reading the data:

The placement data contains two columns: CGPA and Package.
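The original file is not shown here, so the following is only a sketch of how such data could be read with Python's built-in csv module. The column names cgpa and package and all of the values are hypothetical stand-ins.

```python
import csv
import io

# Hypothetical stand-in for the placement file; the real data and its
# exact column names are not shown in the article.
raw_csv = """cgpa,package
6.1,2.6
6.9,3.1
7.4,3.4
8.0,3.9
8.6,4.2
"""

# csv.DictReader yields one dict per row, keyed by the header names.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

print(len(rows), rows[0])
```

For a file on disk, the same reader can be given a file object opened with open(path, newline="") instead of the in-memory string. Note that the values come back as strings and need converting with float() before any math.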

3] Fetching X and Y

4] Splitting the data into training and testing set
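Steps 3 and 4 could look roughly like the sketch below. The (cgpa, package) pairs are made up, and the helper train_test_split is a hand-written assumption for illustration, not a library function; the 80/20 ratio and the seed are arbitrary choices.

```python
import random

# Hypothetical (cgpa, package) pairs standing in for the placement data.
data = [(6.1, 2.6), (6.9, 3.1), (7.4, 3.4), (8.0, 3.9), (8.6, 4.2),
        (5.8, 2.4), (7.1, 3.3), (9.0, 4.5), (6.5, 2.9), (7.8, 3.7)]

X = [cgpa for cgpa, _ in data]        # input column
Y = [package for _, package in data]  # output column


def train_test_split(X, Y, test_ratio=0.2, seed=2):
    """Shuffle the row indices and hold out test_ratio of them for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(X) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [Y[i] for i in train_idx], [Y[i] for i in test_idx])


X_train, X_test, Y_train, Y_test = train_test_split(X, Y)
print(len(X_train), len(X_test))  # prints: 8 2
```

Shuffling the indices, rather than the two lists separately, keeps every CGPA paired with its own package value.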

5] Creating a class to bundle all of this together, with m and b initialized to 0

6] Creating a fit method that takes the data and applies the formulas for m and b mentioned above

7] Creating a predict method that takes a test point and applies the values of m and b we found

Formula: y = mx + b

8] Creating an object of the class, through which we can call the methods (fit and predict)

Using the fit method, which applies the data points to the formulas

Predicting using the predict method
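Steps 5 to 8 can be sketched as one small class. The class name SimpleLinearRegression, the attribute names, and the training data are assumptions made for illustration; the fit method assumes the standard least-squares formulas for m and b.

```python
class SimpleLinearRegression:
    """Hypothetical class that fits the line y = m*x + b."""

    def __init__(self):
        # m and b start at 0 until fit() is called.
        self.m = 0.0
        self.b = 0.0

    def fit(self, X, Y):
        """Compute m and b from the means of X and Y."""
        x_bar = sum(X) / len(X)
        y_bar = sum(Y) / len(Y)
        num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
        den = sum((xi - x_bar) ** 2 for xi in X)
        self.m = num / den
        self.b = y_bar - self.m * x_bar
        return self

    def predict(self, x):
        """Apply y = m*x + b to a single test point."""
        return self.m * x + self.b


# Made-up training data that lies exactly on y = 2x + 1.
model = SimpleLinearRegression()
model.fit([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 7.0, 9.0, 11.0])
prediction = model.predict(6.0)

print(model.m, model.b, prediction)  # prints: 2.0 1.0 13.0
```

With real placement data, fit would be called on the training CGPA and package values, and predict on a CGPA from the test set.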


This was a very simple explanation of linear regression to get us started. We saw how slope and intercept work, both with a simple explanation and with the math.



