Bias-Variance Tradeoff

What is bias, variance, and the bias-variance trade-off in Machine Learning?

A machine learning model’s performance is considered good based on how accurately it predicts and how well it generalizes to an independent test dataset.

Generalization error is defined as follows:

Generalization error = Reducible error + Irreducible error

Irreducible error cannot be reduced no matter which algorithm or which data you use.

Reducible error has two components: bias and variance.

Generalization error = Bias² + Variance + Irreducible error
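This decomposition can be checked numerically. The sketch below (pure Python, with assumed toy values for the target and noise level) "trains" a trivial model — predicting the mean of a noisy sample — on many independent training sets, then compares bias² + variance + irreducible noise against the squared error measured on fresh noisy test labels:

```python
import random
import statistics

random.seed(0)

# Assumed toy setup: a constant true target with Gaussian label noise.
TRUE_Y = 3.0
NOISE_SD = 1.0

def train_and_predict(n_samples: int) -> float:
    """'Train' a trivial model: predict the mean of a noisy training sample."""
    sample = [TRUE_Y + random.gauss(0, NOISE_SD) for _ in range(n_samples)]
    return statistics.mean(sample)

# Refit on many independent training sets to estimate bias and variance.
preds = [train_and_predict(n_samples=10) for _ in range(20_000)]
mean_pred = statistics.mean(preds)

bias_sq = (mean_pred - TRUE_Y) ** 2
variance = statistics.pvariance(preds)
irreducible = NOISE_SD ** 2  # variance of the noise on a fresh test label

# Measured expected squared error against fresh noisy test labels.
test_errors = [(p - (TRUE_Y + random.gauss(0, NOISE_SD))) ** 2 for p in preds]
gen_error = statistics.mean(test_errors)

print(f"bias^2      ≈ {bias_sq:.3f}")
print(f"variance    ≈ {variance:.3f}")
print(f"irreducible ≈ {irreducible:.3f}")
print(f"sum         ≈ {bias_sq + variance + irreducible:.3f}")
print(f"measured generalization error ≈ {gen_error:.3f}")
```

The sum of the three terms and the directly measured error should agree up to Monte Carlo noise.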

Bias:

Bias refers to the simplifying assumptions a model makes to make the target function easier to learn.

Bias measures how far the predicted values are from the actual values: if the average of the predicted values is far from the actual values, the bias is high.

  • Low Bias: the model makes fewer assumptions about the target function.
  • High Bias: the model makes stronger assumptions about the target function.

Examples of low-bias machine learning algorithms include Decision Trees, k-Nearest Neighbors and Support Vector Machines.

Examples of high-bias machine learning algorithms include Linear Regression, Linear Discriminant Analysis, and Logistic Regression.

Variance:

Variance is the amount by which the estimate of the target function would change if different training data were used.

Variance tells us how scattered the predicted values are around the actual values.

  • Low Variance: small changes to the estimate of the target function with changes to the training dataset.
  • High Variance: large changes to the estimate of the target function with changes to the training dataset.

Examples of low-variance machine learning algorithms include Linear Regression, Linear Discriminant Analysis, and Logistic Regression.

Examples of high-variance machine learning algorithms include Decision Trees, k-Nearest Neighbors and Support Vector Machines.
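This low-variance/high-variance split can be demonstrated directly: refit each model on many resampled training sets and measure how much its prediction at a fixed test point jitters. The sketch below (pure Python, with an assumed toy linear dataset) compares ordinary least squares with 1-nearest-neighbor:

```python
import random
import statistics

random.seed(1)

def make_train(n=30):
    # Assumed toy data: y = 2x plus Gaussian noise.
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 2) for x in xs]
    return xs, ys

def linreg_predict(xs, ys, x0):
    # Ordinary least squares with one feature (a low-variance model).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (x0 - mx)

def knn1_predict(xs, ys, x0):
    # 1-nearest-neighbor (a high-variance model): copy the closest label.
    return min(zip(xs, ys), key=lambda p: abs(p[0] - x0))[1]

x0 = 5.0
lin_preds, knn_preds = [], []
for _ in range(2000):
    xs, ys = make_train()
    lin_preds.append(linreg_predict(xs, ys, x0))
    knn_preds.append(knn1_predict(xs, ys, x0))

print("linear regression prediction variance:", statistics.pvariance(lin_preds))
print("1-NN prediction variance:", statistics.pvariance(knn_preds))
```

The 1-NN prediction swings with every resampled training set, while the least-squares prediction stays comparatively stable.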


Overfitting:

High variance leads to an overfit model: it performs well on the training data but poorly on the test data.

We can identify high variance when the training error is low and the test error is high.

Underfitting:

High bias leads to an underfit model: it performs poorly even on the training data, and no better on the test data.

We can identify high bias when the training error itself is high (and the test error is similarly high).
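The two diagnostic rules above can be folded into a small helper. The function and its 0.1 threshold are illustrative assumptions for this sketch, not a standard API:

```python
def diagnose(train_error: float, test_error: float, tol: float = 0.1) -> str:
    """Rough diagnosis from train/test error (threshold is an assumed example)."""
    if train_error > tol:               # high error even on data the model has seen
        return "underfitting (high bias)"
    if test_error - train_error > tol:  # large generalization gap
        return "overfitting (high variance)"
    return "good fit"

print(diagnose(0.02, 0.35))  # → overfitting (high variance)
print(diagnose(0.30, 0.33))  # → underfitting (high bias)
print(diagnose(0.05, 0.08))  # → good fit
```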

Below is the bull's eye diagram for different cases of bias and variance.


How to reduce bias:

  1. Add more features to the model.
  2. Decrease the regularization strength.
  3. Make the model more complex (e.g. move to a higher-order polynomial).
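Points 1 and 3 can be illustrated with a toy experiment: on data that is truly quadratic, a model that only sees the raw feature x stays badly biased, while adding an x² feature removes most of that bias. The dataset and noise level here are assumed toy values:

```python
import random
import statistics

random.seed(2)
# Assumed toy data: a genuinely quadratic target with small noise.
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [x * x + random.gauss(0, 0.3) for x in xs]

def fit_one_feature(feats, ys):
    # Simple least squares: y ≈ a + b * feat.
    n = len(feats)
    mf, my = sum(feats) / n, sum(ys) / n
    b = (sum((f - mf) * (y - my) for f, y in zip(feats, ys))
         / sum((f - mf) ** 2 for f in feats))
    return my - b * mf, b

def mse(feats, ys, a, b):
    return statistics.mean((y - (a + b * f)) ** 2 for f, y in zip(feats, ys))

# Under-featured model: linear in x (high bias on a quadratic target).
a1, b1 = fit_one_feature(xs, ys)
# Adding the right feature (x^2) lets the model match the target's shape.
sq = [x * x for x in xs]
a2, b2 = fit_one_feature(sq, ys)

print("train MSE with feature x   :", mse(xs, ys, a1, b1))
print("train MSE with feature x^2 :", mse(sq, ys, a2, b2))
```

The residual error with the x² feature drops close to the noise floor, while the linear-in-x model cannot get there no matter how much data it sees.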

How to reduce variance:

  1. Clean the data before fitting the model.
  2. Feed more training data to the model.
  3. Increase the regularization strength.
  4. Reduce the number of features.
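Point 3 can be seen in a toy experiment: L2 (ridge) shrinkage lowers the variance of the fitted slope across resampled small datasets. The closed form b = Σxy / (Σx² + λ) used below is for a single centered feature without an intercept; the data and λ are assumed toy values:

```python
import random
import statistics

random.seed(3)

def slope_estimates(lam: float, trials: int = 3000):
    """Ridge slope b = Σxy / (Σx² + λ) on small noisy samples."""
    out = []
    for _ in range(trials):
        xs = [random.gauss(0, 1) for _ in range(8)]
        ys = [1.5 * x + random.gauss(0, 2) for x in xs]
        out.append(sum(x * y for x, y in zip(xs, ys))
                   / (sum(x * x for x in xs) + lam))
    return out

ols = slope_estimates(lam=0.0)    # no regularization
ridge = slope_estimates(lam=5.0)  # stronger regularization shrinks the estimate

print("OLS slope variance  :", statistics.pvariance(ols))
print("ridge slope variance:", statistics.pvariance(ridge))
```

The regularized estimator is deliberately biased toward zero, but its estimates scatter far less from one training set to the next — exactly the trade being made.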

Bias - Variance Trade-off:

While creating a model, we have to tune the hyperparameters so that bias and variance are traded off against each other.

If the bias of the model is high and the variance is low, the model underfits and is less efficient.

If the bias of the model is low and the variance is high, the model overfits and is less efficient.

If bias and variance are balanced against each other, the model is efficient.
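The trade-off can be seen with a k-nearest-neighbor regressor, where k controls complexity: k = 1 is low bias/high variance, k = n (predicting the global mean) is high bias/low variance, and an intermediate k tends to minimize test error. A toy sweep under an assumed quadratic dataset:

```python
import random
import statistics

random.seed(4)

def knn_predict(train, x0, k):
    # Average the labels of the k nearest training points.
    nearest = sorted(train, key=lambda p: abs(p[0] - x0))[:k]
    return statistics.mean(y for _, y in nearest)

def avg_test_mse(k, trials=300):
    errs = []
    for _ in range(trials):
        # Assumed toy data: y = x^2 plus unit Gaussian noise.
        train = [(x, x * x + random.gauss(0, 1))
                 for x in (random.uniform(-2, 2) for _ in range(40))]
        test = [(x, x * x + random.gauss(0, 1))
                for x in (random.uniform(-2, 2) for _ in range(40))]
        errs.append(statistics.mean(
            (knn_predict(train, x, k) - y) ** 2 for x, y in test))
    return statistics.mean(errs)

results = {k: avg_test_mse(k) for k in (1, 5, 40)}
for k, e in results.items():
    print(f"k={k:2d}  avg test MSE ≈ {e:.2f}")
```

k = 1 overfits (high variance), k = 40 collapses to a constant predictor (high bias), and the intermediate k = 5 gives the lowest test error — the sweet spot of the trade-off.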


— Sunil Kumar Cheruku