Your Machine Learning (ML) Model Is Wrong, Now What?
In ML parlance, the bias-variance trade-off means that a model must strike a happy medium between underfitting and overfitting.
If the model has high bias, it underfits because it has not learned enough about the relationship between the features and the labels. In contrast, a high-variance model has overlearned, or memorized, the training data and cannot generalize to unseen data, resulting in an overfit model.
As ML practitioners, we are seekers of the low-bias, low-variance model, the "Holy Grail." By increasing model complexity, we decrease bias but increase variance; by reducing model complexity, we increase bias but decrease variance. The bias-variance trade-off is a perfect illustration of the no free lunch theorem.
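To make that "complexity dial" concrete, here is a minimal sketch, assuming scikit-learn and a DecisionTreeRegressor whose max_depth acts as the complexity knob (these choices are mine for illustration, not the article's): shallow trees underfit (high bias), while very deep trees fit the training set almost perfectly yet do worse on held-out data (high variance).

```python
# Illustrative sketch (assumed tooling: scikit-learn). Tree depth plays the role
# of the "complexity dial": low depth -> underfitting (high bias),
# unconstrained depth -> overfitting (high variance), visible as a growing gap
# between training and test error.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=500, n_features=10, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in [1, 2, 4, 8, 16, None]:  # None lets the tree grow fully
    model = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"max_depth={depth}: train MSE={train_mse:.1f}, test MSE={test_mse:.1f}")
```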
To improve model performance, we need a super algorithm, or meta-algorithm, that combines several ML models and gives us access to the proverbial bias/variance "dial," which leads us to a quick synopsis of the ensemble methods.
- Bagging (bootstrap aggregating, e.g., Random Forest) reduces high variance. How? Assuming that the models in the ensemble do not all make the same errors on the test set, averaging their individual predictions cancels out those errors and yields better predictions, much like asking the audience (the wisdom of the crowd).
- Boosting (e.g., XGBoost, AdaBoost, GBM) constructs an ensemble model with more capacity than the individual member models, reducing bias more than variance. Each successive model focuses "the learning" on the examples the previous model got wrong.
- Stacking is similar to boosting but combines the outputs of several different ML models (the base learners) and feeds them into a secondary model that produces the final prediction. It decreases variance but also helps control high bias. (A minimal sketch of all three methods follows this list.)
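Below is a minimal sketch of the three ensemble families, assuming scikit-learn; the synthetic dataset, the chosen estimators, and the hyperparameters are illustrative assumptions rather than recommendations from the article.

```python
# Illustrative comparison of bagging, boosting, and stacking (assumed tooling:
# scikit-learn). A single decision tree serves as a baseline.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import (
    RandomForestClassifier,      # bagging: averages many decorrelated trees
    GradientBoostingClassifier,  # boosting: each tree corrects the previous ones' errors
    StackingClassifier,          # stacking: a meta-learner combines base-model outputs
)

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "single tree (baseline)": DecisionTreeClassifier(random_state=0),
    "bagging (random forest)": RandomForestClassifier(n_estimators=200, random_state=0),
    "boosting (GBM)": GradientBoostingClassifier(random_state=0),
    "stacking": StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
            ("gbm", GradientBoostingClassifier(random_state=0)),
        ],
        final_estimator=LogisticRegression(),  # the secondary (meta) model
    ),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

In practice the bagged and boosted ensembles typically beat the single tree, and stacking can squeeze out a little more by letting the meta-learner weigh the base models' strengths.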
Back to the no free lunch theorem: although ensemble learning lets us regulate the bias-variance trade-off, it also increases training time (i.e., compute resources) and design time (i.e., which models and architectures to choose), and it decreases model interpretability.