What Is Your Model Hiding? A Tutorial on Evaluating ML Models
Emeli Dral
Co-founder and CTO of Evidently AI | Machine Learning Instructor with 100K+ students
Imagine you trained a machine learning model. Maybe even a couple of candidates to choose from.
You ran them on the test set and got some quality estimates. Overall, they perform as well as they can, given the limited data at hand.
Now it is time to decide if any of them is good enough for production use. How do you evaluate and compare your models beyond the standard performance checks?
In this tutorial, we will walk through an example of how to assess your model in more detail.
Case in point: predicting employee attrition
We will be working with a fictional dataset from a Kaggle competition. The goal is to identify which employees are likely to leave the company soon.
Let's assume we ran our fair share of experiments. We tried out different models, tuned hyperparameters, and estimated performance intervals with cross-validation.
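For illustration, here is a minimal sketch of such an interval check with scikit-learn's cross_val_score. The synthetic dataset, model settings, and fold count are placeholders standing in for the real attrition data and the actual experiment setup.

```python
# Sketch of a cross-validated interval assessment (illustrative setup;
# synthetic data stands in for the real employee attrition dataset).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=1500, n_features=20, weights=[0.85, 0.15], random_state=42
)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

for name, model in candidates.items():
    # Cross-validated ROC AUC gives a mean score plus a spread across folds,
    # which tells us more than a single number.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: ROC AUC {scores.mean():.3f} +/- {scores.std():.3f}")
```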
We ended up with two technically sound models that look equally good.
Next, we checked their performance on the test set. Here is what we got:
- A Random Forest model with a ROC AUC score of 0.795
- A Gradient Boosting model with a ROC AUC score of 0.803
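As a point of reference, a single-point test set check could look like the snippet below, which continues the sketch above. The split parameters and variable names are again illustrative, not the original pipeline.

```python
# Continuing the sketch above: a held-out split and a single-point
# ROC AUC check for each candidate model.
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

for name, model in candidates.items():
    model.fit(X_train, y_train)
    # ROC AUC is computed on the predicted probability of the positive class
    probs = model.predict_proba(X_test)[:, 1]
    print(f"{name}: test ROC AUC {roc_auc_score(y_test, probs):.3f}")
```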
Both our models seem fine. Much better than random guessing, so we definitely have some signal in the data.
The ROC AUC scores are close. Given that these are just single-point estimates, we can assume the performance is about the same.
Which of the two should we pick?
Same quality, different qualities
In the complete tutorial, we look at the models in more detail and visualize their performance using the Evidently library.
For example, we discover that our first model makes only a few very confident predictions. The second gives us more room to adjust the decision threshold and take advantage of the precision-recall trade-off.
Depending on the use case, one can work better than the other.
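To make the idea concrete without reproducing the full tutorial, here is a plain scikit-learn stand-in for that threshold analysis, continuing the sketch above. The probability band and the example thresholds are arbitrary illustrative values; the tutorial itself relies on Evidently's visual reports rather than this code.

```python
# Plain scikit-learn stand-in for the threshold analysis; continues the
# sketch above (the candidate models are already fitted).
import numpy as np
from sklearn.metrics import precision_score, recall_score

for name, model in candidates.items():
    probs = model.predict_proba(X_test)[:, 1]
    # Share of "uncertain" predictions: a model that pushes almost everything
    # towards 0 or 1 leaves little room to move the decision threshold.
    uncertain = np.mean((probs > 0.25) & (probs < 0.75))
    print(f"{name}: {uncertain:.0%} of predictions fall between 0.25 and 0.75")
    # The precision-recall trade-off at a few example thresholds.
    for threshold in (0.3, 0.5, 0.7):
        preds = (probs >= threshold).astype(int)
        precision = precision_score(y_test, preds, zero_division=0)
        recall = recall_score(y_test, preds)
        print(f"  threshold {threshold}: precision={precision:.2f}, recall={recall:.2f}")
```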
Read on for the full tutorial with code: https://evidentlyai.com/blog/tutorial-2-model-evaluation-hr-attrition