
Linear Regression

Linear and logistic regression are among the first algorithms students learn on the statistics and data science learning path. There are many other forms of regression, each used depending on the context and type of problem, but linear regression is considered an essential concept of data science and machine learning. In this document, we explain linear regression and how to perform it in a Python environment.

Linear Regression in Data Science and Machine Learning

In data science, the concepts of Linear Regression are taught in Statistics and in Machine Learning as part of the Supervised Machine Learning Methods.

Problem Statements:

A restaurant chain wants to understand future revenue and profits.
What will the property rates in Gurugram be over the next five years?
How many customers will place orders on our web platform this Diwali?
What will be the future sales in the coming festive season?

Can you think of some more problems where simple predictions are required?

This is exactly where linear regression plays its role. The above problem statements can be addressed with the help of linear regression.

So what is linear regression?

If we simply search on Google for 'what is linear regression', wikipedia.com gives a one-line answer: “Linear regression is the most basic and commonly used predictive analysis”.

In simple language, linear regression is the simplest form of predictive analysis: it uses one set of variables to predict the value of another.

Dependent and Independent Variables:

The variable which we want to predict is known as the dependent variable and the variables which are used to predict the other variable are known as independent variables.

The regression equation:

The linear regression predicts the dependent variable by estimating the coefficients of the independent variables through a linear equation:

Yi = B0 + B1Xi + Ei

Where:

Yi is the dependent variable

B0 is the constant (intercept)

B1 is the slope

Xi is the independent variable

Ei is the random error

[Figure: graph of the linear regression]

Random Errors AKA Residuals:

Random errors are also known as residuals. The residual for each observation is the difference between the actual value and the predicted value, which follows from rearranging the regression equation:

Ei = Yactual - Ypredicted

Where Ypredicted = B0 + B1Xi
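
The residual calculation can be sketched in a few lines of Python. This is only an illustration: the coefficients and data points are made up, and the residual is taken as actual minus predicted so that it matches Yi = B0 + B1Xi + Ei.

```python
# Hypothetical coefficients and data, chosen only for illustration.
b0, b1 = 2.0, 0.5
x = [1.0, 2.0, 3.0, 4.0]
y_actual = [2.4, 3.1, 3.4, 4.2]

# Predicted value for each observation: Ypredicted = B0 + B1 * Xi
y_predicted = [b0 + b1 * xi for xi in x]

# One residual per observation: Ei = Yactual - Ypredicted
residuals = [ya - yp for ya, yp in zip(y_actual, y_predicted)]

for xi, e in zip(x, residuals):
    print(f"x={xi:.1f}  residual={e:+.2f}")
```

A positive residual means the line predicts too low for that point, and a negative residual means it predicts too high.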

The best fit line in linear regression

As we can see in the above graph, taking the independent variable on the X-axis and the dependent variable on the Y-axis, we can draw a scatter plot. The best fit line is the line that captures the trend in the plot with the minimum sum of squared errors.
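
For a single independent variable, the line with the smallest sum of squared errors has a well-known closed form: the slope is the covariance of X and Y divided by the variance of X, and the intercept makes the line pass through the point of means. A minimal sketch, assuming a hypothetical helper named fit_line and made-up data:

```python
def fit_line(x, y):
    """Return (b0, b1) for the line that minimizes the sum of squared errors."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: forces the line through (x_mean, y_mean).
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Illustrative data lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_line(x, y)
print(f"intercept={b0:.2f}, slope={b1:.2f}")
```

In practice a library routine would normally do this work; the hand-written version is only meant to show what "best fit" is computing.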

The Evaluation Metrics:

Evaluation metrics are used to assess the strength of the linear regression model. They tell us how accurately our model predicts with respect to the actual observed values. There are two main metrics used to evaluate a regression model.

1. R-Squared or Coefficient of Determination: The value of R-squared usually ranges from 0 to 1 (it can fall below 0 on unseen data when the model predicts worse than simply using the mean). The higher the value, the better our model fits the data. It describes the share of the variance in the dependent variable that our model has captured.

Mathematically it is represented as follows:

R^2 = 1 - (RSS / TSS)

Where RSS stands for Residual Sum of Squares and TSS stands for Total Sum of Squares

RSS is measured by summing the squared differences between the actual and predicted outputs:

RSS = Σ (Yactual - Ypredicted)^2

TSS is measured by summing the squared differences between each data point of the target variable and the mean of the target variable. Mathematically it is represented as follows:

TSS = Σ (Yactual - Ymean)^2

2. Root Mean Square Error (RMSE): It is the square root of the mean of the squared residuals and is represented mathematically by the following formula:

RMSE = sqrt(RSS / n)

Where n is the number of observations.
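
Both metrics can be computed directly from the actual and predicted values. The sketch below assumes the standard definitions (RSS as the sum of squared residuals, TSS as the sum of squared deviations from the mean, RMSE as the square root of RSS divided by the number of observations); the function name and the numbers are made up for illustration.

```python
import math


def r_squared_and_rmse(y_actual, y_predicted):
    """Compute R-squared and RMSE from paired actual and predicted values."""
    n = len(y_actual)
    y_mean = sum(y_actual) / n
    # Residual Sum of Squares: squared gaps between actual and predicted.
    rss = sum((ya - yp) ** 2 for ya, yp in zip(y_actual, y_predicted))
    # Total Sum of Squares: squared gaps between actual values and their mean.
    tss = sum((ya - y_mean) ** 2 for ya in y_actual)
    r2 = 1 - rss / tss
    rmse = math.sqrt(rss / n)
    return r2, rmse


y_actual = [3.0, 5.0, 7.0, 9.0]
y_predicted = [2.0, 5.0, 7.0, 10.0]
r2, rmse = r_squared_and_rmse(y_actual, y_predicted)
print(f"R-squared={r2:.3f}, RMSE={rmse:.3f}")
```

R-squared is unitless, while RMSE is expressed in the units of the target variable, which is why the two are usually reported together.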

Linear Regression Assumptions

Linearity: The relationship between the independent variable X and the dependent variable Y should be linear.
Independence: Observations should be independent of each other. There should be no correlation between the errors of different observations.
Homoscedasticity: The variance of the residuals should be the same for any value of the X variables.
Normality: The residuals should be approximately normally distributed, with a mean equal to or near zero.
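
A few of these assumptions can be given a rough numerical check on the residuals of a fitted line. The sketch below uses synthetic data and simple summary numbers (residual mean, lag-1 correlation, a variance ratio between the lower and upper half of X, and skewness). These particular checks are an illustrative choice, not a formal test procedure, and residual plots are usually examined alongside them.

```python
import numpy as np

# Synthetic data that satisfies the assumptions by construction.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=x.size)

# np.polyfit returns coefficients from highest degree to lowest.
b1, b0 = np.polyfit(x, y, deg=1)
residuals = y - (b0 + b1 * x)

# Mean of the residuals should be at or near zero.
mean_resid = residuals.mean()

# Independence: neighbouring residuals should be roughly uncorrelated.
lag1_corr = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]

# Homoscedasticity: residual variance should be similar across the X range.
half = x.size // 2
var_ratio = residuals[half:].var() / residuals[:half].var()

# Normality (rough): skewness near zero suggests a symmetric distribution.
skewness = np.mean((residuals - mean_resid) ** 3) / residuals.std() ** 3

print(f"mean={mean_resid:.4f}  lag1_corr={lag1_corr:.3f}")
print(f"var_ratio={var_ratio:.3f}  skewness={skewness:.3f}")
```

A lag-1 correlation far from zero, a variance ratio far from one, or strong skewness would each be a hint that one of the assumptions deserves a closer look.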

Overfitting and Underfitting in Linear Regression

Overfitting in Linear Regression: When the model starts fitting itself to the noise in the data rather than to the significant variables, its performance on test data and unseen future data suffers. This is called overfitting.

Dealing with Overfitting

The following are the methods of dealing with overfitting in linear regression:

Cross-validation
Regularization
If there is too little data, add more, cleaner data
If there are too many variables, remove some with feature selection
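
Cross-validation and regularization are often used together. As a rough sketch, the example below scores a plain linear regression and a ridge (L2-regularized) regression with 5-fold cross-validation in scikit-learn. The data is synthetic, with many noise variables and few rows, and the penalty strength alpha=10.0 is an arbitrary illustrative value rather than a tuned one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# Synthetic setup prone to overfitting: 30 variables, only 60 rows.
rng = np.random.default_rng(42)
n_samples, n_features = 60, 30
X = rng.normal(size=(n_samples, n_features))
# Only the first variable truly drives the target; the rest are noise.
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=n_samples)

models = {
    "plain": LinearRegression(),
    "ridge": Ridge(alpha=10.0),  # alpha is an assumed, untuned value
}

# Mean R-squared over 5 folds, each fold scored on data the model did not see.
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    cv_scores[name] = scores.mean()
    print(f"{name}: mean cross-validated R-squared = {cv_scores[name]:.3f}")
```

Because each fold is scored on held-out rows, a model that has memorized noise tends to show a lower cross-validated score than its training score would suggest.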

Underfitting in Linear Regression: When our regression model learns too little, ignoring some of the variables or patterns in the data points, it does not fit the data well and its predictions suffer. This is called underfitting.

Methods to Deal with Underfitting

Increase the model complexity to fit the data well
Remove noise from the data
Add more variables and data points
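
One common way to increase model complexity while staying within linear regression is to add polynomial terms of the existing variables. The sketch below, on synthetic curved data, compares a straight-line fit with a degree-2 fit built from scikit-learn's PolynomialFeatures; the choice of degree 2 is an assumption that happens to match how the data was generated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a curved (quadratic) relationship.
rng = np.random.default_rng(1)
X = np.linspace(-3, 3, 100).reshape(-1, 1)
y = X[:, 0] ** 2 + rng.normal(scale=0.2, size=100)

# A straight line is too simple for this data and underfits.
straight = LinearRegression().fit(X, y)

# Adding a squared term lets the same linear method follow the curve.
curved = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

r2_straight = straight.score(X, y)
r2_curved = curved.score(X, y)
print(f"straight line R-squared: {r2_straight:.3f}")
print(f"degree-2 model R-squared: {r2_curved:.3f}")
```

A low score even on the training data, as with the straight line here, is the typical symptom of underfitting.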

Bias Variance Trade-Off in Linear Regression

Bias: Bias is the error introduced by the simplifying assumptions a model makes so that the target variable is easier to predict.
Variance: Variance is the amount by which the estimate of the target variable changes when the model is given new training data.
The Trade-Off: Our regression model has to find the balance between bias and variance, as the two tend to have an inverse relationship: reducing bias by making the model more flexible typically increases variance, and vice versa.
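
The trade-off can be observed empirically by refitting a model on many training sets and measuring how far the average prediction is from the truth (bias) and how much the predictions move between fits (variance). The sketch below is a simulation under stated assumptions: the true relationship is taken to be sin(x), the X values are held fixed so that only the noise changes between training sets, and polynomial degrees 1 and 5 stand in for a simple and a flexible model.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_f(x):
    # Assumed "true" relationship, used only for this simulation.
    return np.sin(x)


x_train = np.linspace(0, 6, 30)     # fixed training inputs
x_test = np.linspace(0.5, 5.5, 50)  # points where predictions are compared


def bias_variance(degree, n_repeats=200, noise=0.3):
    """Estimate squared bias and variance of a polynomial fit of given degree."""
    preds = np.empty((n_repeats, x_test.size))
    for i in range(n_repeats):
        # New training data: same inputs, fresh random noise on the target.
        y_train = true_f(x_train) + rng.normal(0, noise, x_train.size)
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        preds[i] = np.polyval(coeffs, x_test)
    bias_sq = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    return bias_sq, variance


results = {d: bias_variance(d) for d in (1, 5)}
for d, (b, v) in results.items():
    print(f"degree {d}: bias^2={b:.4f}  variance={v:.4f}")
```

In this setup the straight line shows the larger bias and the degree-5 polynomial the larger variance, which is the pattern the trade-off describes.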

Steps to Perform Linear Regression in Python

Install Python
Open a notebook
Import the NumPy, pandas, matplotlib.pyplot and sklearn libraries
Read the data file
Make a data frame
Perform exploratory data analysis with NumPy, pandas, and Matplotlib
Split the data into dependent and independent variables
Split the data into train and test sets
Perform the linear regression
Check the model performance
Check for underfitting and overfitting
Tune the model to improve the performance
Perform the prediction
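
The steps above can be sketched end to end as follows. This is a minimal outline rather than a complete workflow: instead of reading a real data file it builds a synthetic data frame, the column names ad_spend and sales are hypothetical, and plotting and tuning are left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Stand-in for reading a data file: synthetic, hypothetical ad spend vs. sales.
rng = np.random.default_rng(7)
ad_spend = rng.uniform(0, 100, size=200)
sales = 50.0 + 3.0 * ad_spend + rng.normal(scale=10.0, size=200)
df = pd.DataFrame({"ad_spend": ad_spend, "sales": sales})

# Exploratory data analysis: a quick statistical summary of each column.
print(df.describe())

# Split into independent (X) and dependent (y) variables.
X = df[["ad_spend"]]
y = df["sales"]

# Hold out 20% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Perform the linear regression on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# Check performance; a train score far above the test score hints at overfitting.
train_r2 = model.score(X_train, y_train)
y_pred = model.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
test_rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"train R-squared={train_r2:.3f}  test R-squared={test_r2:.3f}")
print(f"test RMSE={test_rmse:.2f}")

# Perform a prediction for a new, unseen value.
new_point = pd.DataFrame({"ad_spend": [60.0]})
predicted_sales = model.predict(new_point)[0]
print(f"predicted sales at ad_spend=60: {predicted_sales:.1f}")
```

With a real dataset, the synthetic block at the top would be replaced by loading the file into a data frame, and the remaining steps would stay broadly the same.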

Qualitative Questions:

What are the evaluation metrics?
Where do we use regularization techniques?
What are the applications of regression analysis?
List five use cases of regression analysis.
What is Bias and Variance Trade-Off?
What are underfitting and overfitting?
What is the error term in the regression equation?
What are the regression assumptions?

Coding Questions:

How do you split data into train and test sets in a Python environment?
Which popular Python library is used for machine learning?
Perform the linear regression on sklearn Boston House Prices data.
Assess the model performance of the Boston Price prediction.

References:

Analytics Vidhya
Boston University
Wikipedia
Kaggle

Teaching Note: Linear Regression Explained

Brijesh Kumar Awasthi

Technology Consultant | Data Science | AI | UGC-NET | ISB | Google PMP
