Everything that you should know about Linear Regression in python

Data is the most powerful asset in today’s world: more than 2.5 quintillion bytes of data are produced every single day, and over 90% of the world’s data has been generated in the last two years alone.

Every sector uses data as a key tool to grow its business, and every industry wants to integrate artificial intelligence into its operations. Machine learning and data science skills are in high demand, with more than 1 million new jobs expected to be created in the next 5 years.

Linear regression is one of the most basic statistical algorithms in machine learning.

In this tutorial, you will learn about linear regression and its various implementations in Python.

After reading this blog post, you will be able to answer all of the following questions.



What is linear regression?

Linear regression is a statistical model that examines the linear relationship between a dependent variable and one independent variable (simple linear regression) or several independent variables (multiple linear regression).

A linear relationship means that when one variable changes, the other changes at a constant rate: in a positive linear relationship both variables move in the same direction, while in a negative linear relationship one goes up as the other goes down.

Let us understand this with an example: in a company XYZ, John’s salary is directly proportional to the number of hours he works. This is a positive linear relationship between John’s salary and his working hours.

The price of a laptop decreases over time, which shows a negative linear relationship between laptop price and time.
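The two relationships above can be made concrete with the Pearson correlation coefficient, which is close to +1 for a positive linear relationship and close to −1 for a negative one. A minimal sketch with made-up numbers (the data below are illustrative, not from the article):

```python
import numpy as np

# Hypothetical data: hours worked vs. salary (positive linear relationship)
hours = np.array([10, 20, 30, 40, 50], dtype=float)
salary = np.array([500, 1000, 1500, 2000, 2500], dtype=float)

# Hypothetical data: years since release vs. laptop price (negative linear relationship)
years = np.array([0, 1, 2, 3, 4], dtype=float)
price = np.array([1200, 1000, 850, 700, 600], dtype=float)

# Pearson correlation: close to +1 => positive linear, close to -1 => negative linear
print(np.corrcoef(hours, salary)[0, 1])  # near +1
print(np.corrcoef(years, price)[0, 1])   # near -1
```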

Let us understand a little bit of math behind linear regression

The equation of linear regression is

                           Y = mX + b        

where

  • Y is the output variable: the value we are trying to predict,
  • X is the input variable: the variable we use to make predictions,
  • m is the slope, which determines the effect of X on Y,
  • and b is the intercept (also called the bias): the value of Y when X is zero.
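The equation Y = mX + b can be evaluated directly in Python. A minimal sketch with assumed values for m, b, and X (chosen purely for illustration):

```python
import numpy as np

# Illustrative slope and intercept (assumed, not from the article)
m, b = 2.0, 1.0

# A few input values
X = np.array([0.0, 1.0, 2.0, 3.0])

# Predictions from the line Y = mX + b
Y = m * X + b
print(Y)  # [1. 3. 5. 7.]
```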

As we have seen in the previous blog post, one of the assumptions of regression is that the output variable must be continuous.

In regression, we try to minimize the error by finding the “line of best fit”: the line that minimizes the distance between our predictions and the actual outputs. Geometrically, we are shrinking the vertical distances between the line and the data points. To measure this error, we minimize the residual sum of squares (RSS); dividing the RSS by the number of data points gives the mean squared error (MSE).

You can check out the following article written by Patrick and his team, which clearly explains the math behind linear regression.




More articles by Abhishek Kumar Singh
