There are various ways to evaluate a machine-learning model. Bias and variance are two such measures: they help with parameter tuning and with choosing the better-fitted model among several candidates.
- Bias is the inability of a model to capture the true relationship in the data, which causes a difference (an error) between the model's predicted values and the actual values.
- These differences between the actual (or expected) values and the predicted values are known as bias error, or error due to bias.
- Bias is a systematic error that occurs due to wrong assumptions in the machine learning process.
Let Y be the true value of a parameter, and let Y^ be an estimator of Y based on a sample of data. Then the bias of the estimator Y^ is given by:
Bias(Y^) = E(Y^) - Y
- where E(Y^) is the expected value of the estimator Y^. A small sketch of estimating this quantity empirically follows the list below.
- It measures how well the model is able to fit the data.
- Low Bias : A low bias value means fewer assumptions are made about the form of the target function. In this case, the model matches the training dataset closely.
- High Bias : A high bias value means more assumptions are made about the form of the target function. In this case, the model does not match the training dataset closely.
- A high-bias model is unable to capture the underlying trend of the dataset.
- Such a model is considered an underfitting model and has a high error rate, typically because the algorithm used is too simple.
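The bias formula above can be made concrete with a short experiment: train the same deliberately simple model on many different training samples, average its predictions at one point, and subtract the true value there. The following is only a minimal sketch of that idea; it assumes NumPy and scikit-learn are available, and true_f, the sample sizes, and the evaluation point are illustrative choices rather than anything prescribed by this article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)             # the "true" target function (illustrative)

x_eval = np.array([[0.3]])                    # point at which bias is measured
y_true = true_f(x_eval).ravel()[0]            # Y: the true value at that point

preds = []
for _ in range(200):                          # many independent training samples
    x_train = rng.uniform(0, 1, size=(30, 1))
    y_train = true_f(x_train).ravel() + rng.normal(0, 0.1, size=30)
    # A straight line fitted to a sine wave: a deliberately high-bias model.
    model = LinearRegression().fit(x_train, y_train)
    preds.append(model.predict(x_eval)[0])    # Y^ for this training sample

preds = np.array(preds)
bias = preds.mean() - y_true                  # Bias(Y^) = E(Y^) - Y
print(f"estimated bias at x = 0.3: {bias:.3f}")
```

Because a straight line cannot follow a sine wave, the averaged prediction stays far from the true value and the printed bias is clearly non-zero.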
- Variance is the measure of spread in data from its mean position.
- In machine learning, variance is the amount by which the performance of a predictive model changes when it is trained on different subsets of the training data.
- More specifically, variance measures how sensitive the model is to the particular subset of the training data it was trained on.
Let Y be the actual values of the target variable, and Y^ the predicted values of the target variable. Then the variance of a model can be measured as the expected value of the squared difference between the predicted values and the expected value of the predicted values.
Variance = E[(Y^ - E[Y^])^2]
- where E[Y^] is the expected value of the predicted values. An empirical sketch of this quantity is given after the list below.
- Low Variance : Low variance means that the model is less sensitive to changes in the training data and produces consistent estimates of the target function across different subsets of data from the same distribution. When low variance is paired with high bias, this is the underfitting case, where the model fails to capture the signal in both the training and test data.
- High Variance : High variance means that the model is very sensitive to changes in the training data, so its estimate of the target function changes significantly when it is trained on different subsets of data from the same distribution. This is the overfitting case, where the model performs well on the training data but poorly on new, unseen test data: it fits the training data so closely that it fails to generalize.
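Mirroring the bias sketch above, the variance formula can be estimated by training the same model class on many different subsets and measuring how far its predictions at one point spread around their own mean. Again this is only a sketch under the same assumptions (NumPy and scikit-learn available); the fully grown decision tree is just one convenient high-variance model, not something this article mandates.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)              # the same illustrative target function

x_eval = np.array([[0.3]])                     # point at which variance is measured

preds = []
for _ in range(200):                           # different training subsets
    x_train = rng.uniform(0, 1, size=(30, 1))
    y_train = true_f(x_train).ravel() + rng.normal(0, 0.1, size=30)
    # An unpruned decision tree memorises each noisy sample: a high-variance model.
    model = DecisionTreeRegressor().fit(x_train, y_train)
    preds.append(model.predict(x_eval)[0])     # Y^ for this training sample

preds = np.array(preds)
variance = np.mean((preds - preds.mean()) ** 2)   # Variance = E[(Y^ - E[Y^])^2]
print(f"estimated variance at x = 0.3: {variance:.4f}")
```

Re-running the loop with the linear model from the bias sketch would give a much smaller number, which is exactly the low-variance / high-bias contrast described above.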
Different Combinations of Bias-Variance
There can be four combinations between bias and variance.
- High Bias, Low Variance : A model with high bias and low variance is said to be underfitting.
- High Variance, Low Bias : A model with high variance and low bias is said to be overfitting.
- High Bias, High Variance : A model with both high bias and high variance is unable to capture the underlying patterns in the data (high bias) and is also too sensitive to changes in the training data (high variance). As a result, it will produce inconsistent and, on average, inaccurate predictions.
- Low Bias, Low Variance : A model with low bias and low variance captures the underlying patterns in the data (low bias) and is not too sensitive to changes in the training data (low variance). This is the ideal scenario for a machine learning model, as it generalizes well to new, unseen data and produces consistent, accurate predictions. In practice, however, this ideal is rarely fully achievable.
Now we know that the ideal case is low bias and low variance, but in practice it is rarely achievable. So we trade bias off against variance to reach a balance between the two.
If the algorithm is too simple, it tends to sit in the high-bias, low-variance condition and is error-prone because it underfits. If the algorithm is too complex, it tends to sit in the high-variance, low-bias condition.
- In the latter condition, the model will not perform well on new entries. There is a sweet spot between these two conditions, known as the Trade-off or Bias-Variance Trade-off.
- Using the Bias-Variance Trade-off, we try to minimize the total error of the model:
Total Error = Bias^2 + Variance + Irreducible Error
- The best fit is given by the hypothesis at the trade-off point. Plotted against model complexity, bias-squared falls while variance rises, so the total error traces a U-shaped curve whose minimum marks the trade-off.
- That trade-off point is the best point to choose for training the algorithm, as it gives low error on the training data as well as on the test data; a small numerical sketch of this sweep is given below.
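To see the trade-off numerically, the decomposition above can be estimated for models of increasing complexity, for example polynomial regressions of growing degree. This is a sketch under the same assumptions as before (NumPy and scikit-learn; the degrees, sample sizes, and noise level are illustrative), computing bias^2 and variance over a grid of test points by repeated training.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
noise = 0.2                                    # irreducible noise level (illustrative)

def true_f(x):
    return np.sin(2 * np.pi * x)

x_test = np.linspace(0, 1, 50).reshape(-1, 1)
y_test = true_f(x_test).ravel()

for degree in [1, 3, 6, 12]:                   # model complexity axis
    preds = []
    for _ in range(100):                       # many training subsets per degree
        x_tr = rng.uniform(0, 1, size=(40, 1))
        y_tr = true_f(x_tr).ravel() + rng.normal(0, noise, size=40)
        model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
        preds.append(model.fit(x_tr, y_tr).predict(x_test))
    preds = np.array(preds)                    # shape: (runs, test points)
    bias2 = np.mean((preds.mean(axis=0) - y_test) ** 2)   # Bias^2 term
    var = np.mean(preds.var(axis=0))                       # Variance term
    total = bias2 + var + noise ** 2                       # plus irreducible error
    print(f"degree {degree:2d}: bias^2={bias2:.3f}  variance={var:.3f}  total={total:.3f}")
```

Very low degrees show a large bias^2 term, very high degrees show a large variance term, and the printed total error is smallest somewhere in between, which is exactly the trade-off point described above.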