Bias-Variance Tradeoff

What is bias, variance, and the bias-variance trade-off in Machine Learning?

A machine learning model’s performance is considered good based on how accurately it predicts and how well it generalizes to an independent test dataset.

Generalization error is defined as follows:

Generalization error = Reducible error + Irreducible error

Irreducible error cannot be reduced no matter which algorithm or which data you use.

Reducible error has two components: bias and variance.

Generalization error = Bias² + Variance + Irreducible error
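This decomposition can be checked numerically. The sketch below (pure Python, with assumed toy values for the target and noise level) "trains" a trivial model — predicting the mean of a noisy sample — on many independent training sets, then compares bias² + variance + irreducible noise against the squared error measured on fresh noisy test labels:

```python
import random
import statistics

random.seed(0)

# Assumed toy setup: a constant true target with Gaussian label noise.
TRUE_Y = 3.0
NOISE_SD = 1.0

def train_and_predict(n_samples: int) -> float:
    """'Train' a trivial model: predict the mean of a noisy training sample."""
    sample = [TRUE_Y + random.gauss(0, NOISE_SD) for _ in range(n_samples)]
    return statistics.mean(sample)

# Refit on many independent training sets to estimate bias and variance.
preds = [train_and_predict(n_samples=10) for _ in range(20_000)]
mean_pred = statistics.mean(preds)

bias_sq = (mean_pred - TRUE_Y) ** 2
variance = statistics.pvariance(preds)
irreducible = NOISE_SD ** 2  # variance of the noise on a fresh test label

# Measured expected squared error against fresh noisy test labels.
test_errors = [(p - (TRUE_Y + random.gauss(0, NOISE_SD))) ** 2 for p in preds]
gen_error = statistics.mean(test_errors)

print(f"bias^2      ≈ {bias_sq:.3f}")
print(f"variance    ≈ {variance:.3f}")
print(f"irreducible ≈ {irreducible:.3f}")
print(f"sum         ≈ {bias_sq + variance + irreducible:.3f}")
print(f"measured generalization error ≈ {gen_error:.3f}")
```

The sum of the three terms and the directly measured error should agree up to Monte Carlo noise.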

Bias:

Bias refers to the simplifying assumptions a model makes to make the target function easier to learn.

Bias measures how far the predicted values are from the actual values: if the average of the predicted values is far from the actual values, the bias is high.

  • Low Bias: the model makes fewer assumptions about the target function.
  • High Bias: the model makes stronger assumptions about the target function.

Examples of low-bias machine learning algorithms include Decision Trees, k-Nearest Neighbors and Support Vector Machines.

Examples of high-bias machine learning algorithms include Linear Regression, Linear Discriminant Analysis, and Logistic Regression.

Variance:

Variance is the amount by which the estimate of the target function would change if different training data were used.

Variance tells us how scattered the predicted values are around the actual values.

  • Low Variance: small changes to the estimate of the target function with changes to the training dataset.
  • High Variance: large changes to the estimate of the target function with changes to the training dataset.

Examples of low-variance machine learning algorithms include Linear Regression, Linear Discriminant Analysis, and Logistic Regression.

Examples of high-variance machine learning algorithms include Decision Trees, k-Nearest Neighbors and Support Vector Machines.
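This low-variance/high-variance split can be demonstrated directly: refit each model on many resampled training sets and measure how much its prediction at a fixed test point jitters. The sketch below (pure Python, with an assumed toy linear dataset) compares ordinary least squares with 1-nearest-neighbor:

```python
import random
import statistics

random.seed(1)

def make_train(n=30):
    # Assumed toy data: y = 2x plus Gaussian noise.
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 2) for x in xs]
    return xs, ys

def linreg_predict(xs, ys, x0):
    # Ordinary least squares with one feature (a low-variance model).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (x0 - mx)

def knn1_predict(xs, ys, x0):
    # 1-nearest-neighbor (a high-variance model): copy the closest label.
    return min(zip(xs, ys), key=lambda p: abs(p[0] - x0))[1]

x0 = 5.0
lin_preds, knn_preds = [], []
for _ in range(2000):
    xs, ys = make_train()
    lin_preds.append(linreg_predict(xs, ys, x0))
    knn_preds.append(knn1_predict(xs, ys, x0))

print("linear regression prediction variance:", statistics.pvariance(lin_preds))
print("1-NN prediction variance:", statistics.pvariance(knn_preds))
```

The 1-NN prediction swings with every resampled training set, while the least-squares prediction stays comparatively stable.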


Overfitting:

High variance leads to an overfit model: it performs well on the training data but poorly on the test data.

We can identify high variance when the training error is low and the test error is high.

Underfitting:

High bias leads to an underfit model: it performs poorly even on the training data, and no better on the test data.

We can identify high bias when the training error itself is high (and the test error is similarly high).
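The two diagnostic rules above can be folded into a small helper. The function and its 0.1 threshold are illustrative assumptions for this sketch, not a standard API:

```python
def diagnose(train_error: float, test_error: float, tol: float = 0.1) -> str:
    """Rough diagnosis from train/test error (threshold is an assumed example)."""
    if train_error > tol:               # high error even on data the model has seen
        return "underfitting (high bias)"
    if test_error - train_error > tol:  # large generalization gap
        return "overfitting (high variance)"
    return "good fit"

print(diagnose(0.02, 0.35))  # → overfitting (high variance)
print(diagnose(0.30, 0.33))  # → underfitting (high bias)
print(diagnose(0.05, 0.08))  # → good fit
```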

Below is the bull's eye diagram for different cases of bias and variance.


How to reduce bias:

  1. Add more features to the model.
  2. Decrease the regularization strength.
  3. Make the model more complex (e.g. move to a higher-order polynomial).
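Points 1 and 3 can be illustrated with a toy experiment: on data that is truly quadratic, a model that only sees the raw feature x stays badly biased, while adding an x² feature removes most of that bias. The dataset and noise level here are assumed toy values:

```python
import random
import statistics

random.seed(2)
# Assumed toy data: a genuinely quadratic target with small noise.
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [x * x + random.gauss(0, 0.3) for x in xs]

def fit_one_feature(feats, ys):
    # Simple least squares: y ≈ a + b * feat.
    n = len(feats)
    mf, my = sum(feats) / n, sum(ys) / n
    b = (sum((f - mf) * (y - my) for f, y in zip(feats, ys))
         / sum((f - mf) ** 2 for f in feats))
    return my - b * mf, b

def mse(feats, ys, a, b):
    return statistics.mean((y - (a + b * f)) ** 2 for f, y in zip(feats, ys))

# Under-featured model: linear in x (high bias on a quadratic target).
a1, b1 = fit_one_feature(xs, ys)
# Adding the right feature (x^2) lets the model match the target's shape.
sq = [x * x for x in xs]
a2, b2 = fit_one_feature(sq, ys)

print("train MSE with feature x   :", mse(xs, ys, a1, b1))
print("train MSE with feature x^2 :", mse(sq, ys, a2, b2))
```

The residual error with the x² feature drops close to the noise floor, while the linear-in-x model cannot get there no matter how much data it sees.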

How to reduce variance:

  1. Clean the data before fitting the model.
  2. Feed more training data to the model.
  3. Increase the regularization strength.
  4. Reduce the number of features.
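Point 3 can be seen in a toy experiment: L2 (ridge) shrinkage lowers the variance of the fitted slope across resampled small datasets. The closed form b = Σxy / (Σx² + λ) used below is for a single centered feature without an intercept; the data and λ are assumed toy values:

```python
import random
import statistics

random.seed(3)

def slope_estimates(lam: float, trials: int = 3000):
    """Ridge slope b = Σxy / (Σx² + λ) on small noisy samples."""
    out = []
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(8)]
        ys = [1.5 * x + random.gauss(0, 2) for x in xs]
        out.append(sum(x * y for x, y in zip(xs, ys))
                   / (sum(x * x for x in xs) + lam))
    return out

ols = slope_estimates(lam=0.0)    # no regularization
ridge = slope_estimates(lam=5.0)  # stronger regularization shrinks the estimate

print("OLS slope variance  :", statistics.pvariance(ols))
print("ridge slope variance:", statistics.pvariance(ridge))
```

The regularized estimator is deliberately biased toward zero, but its estimates scatter far less from one training set to the next — exactly the trade being made.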

Bias - Variance Trade-off:

While creating a model, we have to tune the hyperparameters so that bias and variance are traded off against each other.

If the bias of the model is high and the variance is low, the model underfits and is less efficient.

If the bias of the model is low and the variance is high, the model overfits and is less efficient.

If bias and variance are balanced against each other, the model is efficient.
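The trade-off can be seen with a k-nearest-neighbor regressor, where k controls complexity: k = 1 is low bias/high variance, k = n (predicting the global mean) is high bias/low variance, and an intermediate k tends to minimize test error. A toy sweep under an assumed quadratic dataset:

```python
import random
import statistics

random.seed(4)

def knn_predict(train, x0, k):
    # Average the labels of the k nearest training points.
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return statistics.mean(y for _, y in nearest)

def avg_test_mse(k, trials=300):
    errs = []
    for _ in range(trials):
        # Assumed toy data: y = x^2 plus unit Gaussian noise.
        train = [(x, x * x + random.gauss(0, 1))
                 for x in (random.uniform(-2, 2) for _ in range(40))]
        test = [(x, x * x + random.gauss(0, 1))
                for x in (random.uniform(-2, 2) for _ in range(40))]
        errs.append(statistics.mean(
            (knn_predict(train, x, k) - y) ** 2 for x, y in test))
    return statistics.mean(errs)

results = {k: avg_test_mse(k) for k in (1, 5, 40)}
for k, e in results.items():
    print(f"k={k:2d}  avg test MSE ≈ {e:.2f}")
```

k = 1 overfits (high variance), k = 40 collapses to a constant predictor (high bias), and the intermediate k = 5 gives the lowest test error — the sweet spot of the trade-off.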


— Sunil Kumar Cheruku