Regression: Evaluation Metrics/Loss Functions

A beginner-friendly introduction to the Evaluation Metrics of Regression.

Whenever we create a model, we need to check if our model is working correctly.

Since we created a model to predict the price of a house in our last article, we want to ensure that the model predicts correctly.

What does it mean to predict correctly?

  • Predicting correctly means the predicted value is the same as the actual value. For example, if the price of the house is 400,000 and our model predicts 400,000, then our model is correct. That would be the perfect world.

However, the real world is rarely perfect, and the models we create often predict incorrectly.

For example, it might predict a price of 399,000 instead of 400,000. That is pretty close, so we can say the model is good. But if it predicts 300,000, which is not even close to 400,000, we say the model is not doing well.

So, how good our model is depends on how close its predictions are to the actual price.

Being closer to the actual price means the difference between the actual and predicted price is close to 0.

For example:

400,000 - 399,000 = 1,000 (better prediction)

400,000 - 300,000 = 100,000 (not a good prediction)

i.e., 1,000 is closer to 0 than 100,000.

However, in Machine Learning, our model needs a lot of data to understand the relationship between the features and the target value.

This means our table of actual and predicted values will have many rows, one for each house in the dataset.

How can we evaluate our model in this case?

Well, it turns out we can do it by finding the sum of the differences between actual and predicted values.


Let's understand what I am talking about here.

Let's say we have two machine learning models: Model 1 and Model 2.

Predictions of Model 1

Predictions of Model 2

In Model 2, I moved the predicted prices closer to the actual values.

This results in a smaller sum of differences for Model 2 than for Model 1 (71,000 versus 102,000), confirming that the better model has the smaller total difference.
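To make the calculation concrete, here is a minimal Python sketch. The individual prices below are invented purely for illustration (only the idea matters), but the totals match the ones above.

# Hypothetical actual prices and two sets of predictions (illustration only;
# the individual numbers are invented, not taken from the tables above).
actual        = [400_000, 350_000, 500_000]
model_1_preds = [399_000, 300_000, 449_000]   # further from the actual values
model_2_preds = [399_000, 330_000, 450_000]   # closer to the actual values

def sum_of_differences(actual, predicted):
    """Sum of (actual - predicted) over all observations."""
    return sum(a - p for a, p in zip(actual, predicted))

print(sum_of_differences(actual, model_1_preds))  # 102000
print(sum_of_differences(actual, model_2_preds))  # 71000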

Two things we have learned so far:

  1. The predicted value should be close to the actual value.
  2. The sum of differences between the actual and predicted values is smaller for a better model.

Note: This difference is called an error (for obvious reasons). It is also called a loss, and the function that calculates it is called a loss function. When we use it to evaluate our model, we call it an evaluation metric.

While creating a Machine Learning model, we aim to decrease this error as much as possible.

When our model predicts the exact prices, the sum of errors becomes 0.

So, the sum of errors can be as low as 0.


But now we have one problem: what happens if we increase the size of our dataset?

Let's say we increased the dataset, and our Model 2 predicted this.

Suddenly, our sum of errors is 121,000, far more than Model 1's 102,000. Does this mean Model 1 is better now?

No, Model 2 is still superior, but we need to adjust how we calculate the error. Instead of comparing the sums of errors, we need to compare their averages.

In other words, we compare the mean of the errors/losses/differences.


For Model 1, with 3 observations, the total error is 102,000, making the average 102,000 / 3 = 34,000.

For Model 2, with 4 observations, the total error is 121,000, resulting in an average of 121,000 / 4 = 30,250.

Interpreting this, we can say that Model 2 predicts the house price with a mean error of 30,250, while Model 1's mean error is 34,000. Since we want a lower average error, Model 2 is the better choice.
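A quick check of that arithmetic in Python, using only the totals and observation counts quoted above:

# Mean error = sum of errors / number of observations
model_1_mean_error = 102_000 / 3   # 34,000
model_2_mean_error = 121_000 / 4   # 30,250

print(model_1_mean_error)  # 34000.0
print(model_2_mean_error)  # 30250.0
print(model_2_mean_error < model_1_mean_error)  # True: Model 2 wins on mean error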


What we have understood till now:

  1. The predicted value should be close to the actual value.
  2. The sum of differences between the actual and predicted values is smaller for a better model.
  3. We compare the mean of the losses to compare models. The lower the mean, the better the model.

Before we move on to the actual functions, let's consider one more example.

These are the predictions of Model 1. Let's raise the prediction for the 1st observation above its actual value.

When the predicted price is higher than the actual price, the difference becomes negative, which drags the sum down (see the first observation).

Now, when we calculate the mean, 50,000/3 equals 16,666.66.

This value is lower than that of Model 2. However, it's apparent that Model 1's predictions are not superior to those of Model 2.

The issue arises from the difference in the first observation. When calculating the differences, we have to take into account that some differences can be negative. These negatives should be converted to positives before finding the sum and mean.

So, (actual - predicted) should always contribute a positive amount. We can ensure this by taking the absolute value of each difference or by squaring it.
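Here is a minimal sketch of the problem and the two fixes, with invented numbers: a large overshoot and a large undershoot cancel each other out in the raw sum, but not once we take absolute values or squares.

# Hypothetical differences (actual - predicted) for three observations:
# one prediction overshoots (negative difference), two undershoot.
differences = [-50_000, 1_000, 49_000]

raw_sum      = sum(differences)                  # 0 -> looks "perfect", but is misleading
absolute_sum = sum(abs(d) for d in differences)  # 100000 -> reflects the real total error
squared_sum  = sum(d ** 2 for d in differences)  # big errors get penalised even more

print(raw_sum, absolute_sum, squared_sum)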

These two fixes, absolute values and squares, are exactly what the common loss functions are built on, and, let me tell you, they are worth understanding.


Let's look at different loss functions now.

Mean Absolute Error (MAE)

I am sure you guessed what would happen in this loss function.

We find the mean of the absolute errors (the absolute differences between the actual and predicted values).

Python Implementation:

from sklearn.metrics import mean_absolute_error

# y_actual holds the true prices, y_pred the model's predictions (array-likes of equal length)
mae = mean_absolute_error(y_actual, y_pred)
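If you want to see the arithmetic spelled out, here is a small self-contained sketch (the prices are invented for illustration) that computes MAE by hand and checks it against scikit-learn:

import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical house prices, for illustration only
y_actual = np.array([400_000, 250_000, 320_000])
y_pred   = np.array([399_000, 260_000, 310_000])

# MAE by hand: the mean of the absolute differences
manual_mae = np.mean(np.abs(y_actual - y_pred))

print(manual_mae)                             # 7000.0
print(mean_absolute_error(y_actual, y_pred))  # 7000.0 (same value)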

Mean Squared Error (MSE)

You guessed it right: we find the mean of the squared errors.

Python Implementation:

from sklearn.metrics import mean_squared_error

# Note: MSE is in squared units (e.g., price squared), so its scale differs from MAE
mse = mean_squared_error(y_actual, y_pred)
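One useful property of squaring: large errors are penalised much more heavily than small ones. Here is a small sketch with invented numbers, showing two prediction sets that share the same MAE but have very different MSEs:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_actual    = np.array([400_000, 400_000])
even_preds  = np.array([390_000, 390_000])  # two medium errors of 10,000
spiky_preds = np.array([400_000, 380_000])  # one error of 0, one of 20,000

# Both prediction sets have the same MAE (10,000)...
print(mean_absolute_error(y_actual, even_preds),
      mean_absolute_error(y_actual, spiky_preds))

# ...but squaring makes the single large error stand out: the second MSE is twice as big
print(mean_squared_error(y_actual, even_preds),
      mean_squared_error(y_actual, spiky_preds))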

Root Mean Square Error (RMSE)

We first find the mean of the squared errors (the MSE) and then take its square root.

Python Implementation:

Note: root_mean_squared_error was added in scikit-learn version 1.4. On older versions, you can simply take the square root of mean_squared_error.

from sklearn.metrics import root_mean_squared_error

# RMSE is back in the same units as the target (price), unlike MSE
rmse = root_mean_squared_error(y_actual, y_pred)
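RMSE is just the square root of MSE, which brings the error back into the same units as the price. A quick sketch with invented numbers (and if your scikit-learn is older than 1.4, the math.sqrt line is all you need):

import math
from sklearn.metrics import mean_squared_error, root_mean_squared_error

y_actual = [400_000, 250_000, 320_000]
y_pred   = [399_000, 260_000, 310_000]

mse  = mean_squared_error(y_actual, y_pred)  # in squared price units
rmse = math.sqrt(mse)                        # back in price units

print(rmse)
print(root_mean_squared_error(y_actual, y_pred))  # same value (requires sklearn >= 1.4)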

These three metrics/loss functions are enough for now to get started and to build a strong base for regression problems.

I have added these Python codes to the Kaggle notebook.

There are more functions like this, and they try to fix different problems. I will be writing about them individually in later articles.

And yes, there are other evaluation metrics for classification models: accuracy, precision, and recall are some examples.

Thank you so much for reading this article. I would really appreciate your feedback, which will help me improve future articles.

Next, we will talk about another algorithm: Decision Tree.

Stay tuned.

Also, here are the links for the previous two articles:

  1. Why should I learn from the Beginning?
  2. Linear Regression: Introduction

