Evaluation Metrics
Sanjay Lalwani
Data Scientist at Siemens | Speaker | Ex-Infoscian | AI Educator | Azure*2
You have cleaned the data, applied EDA and feature engineering techniques, and trained a model, but how do you decide whether the model is doing well or badly? That's where evaluation metrics come into the picture. Selecting an appropriate evaluation metric is important because it can change which model you select and whether you put that model into production.
Evaluation metrics for Classification
This section focuses on evaluation metrics for classification; regression and clustering metrics are covered in the later sections.
To understand classification evaluation metrics, let's first understand the confusion matrix.
Confusion Matrix:
We will take the YouTube Kids example, where you are trying to classify whether a video is kid-safe or not, that is, whether it is fine for a child to watch a particular video.
A confusion matrix has 4 values:
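True Positive (TP): the video is kid-safe (the positive class here) and the model predicts kid-safe.
True Negative (TN): the video is not kid-safe and the model predicts not kid-safe.
False Positive (FP): the video is not kid-safe but the model predicts kid-safe.
False Negative (FN): the video is kid-safe but the model predicts not kid-safe.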
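To make the four values concrete, here is a minimal sketch that counts them from a list of actual and predicted labels. It assumes 1 means kid-safe (the positive class) and 0 means not kid-safe; the function name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted kid-safe, really kid-safe
        elif predicted != positive and actual != positive:
            tn += 1  # predicted unsafe, really unsafe
        elif predicted == positive and actual != positive:
            fp += 1  # predicted kid-safe, really unsafe
        else:
            fn += 1  # predicted unsafe, really kid-safe
    return tp, tn, fp, fn


# Hypothetical labels: 1 = kid-safe, 0 = not kid-safe
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```

The four counts always add up to the number of videos, which is a quick sanity check on any confusion matrix.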
Now let's look at the different evaluation metrics:
1. Accuracy:
Accuracy = (TP + TN) / (TP + FP + FN + TN)
2. Precision:
Precision = TP / (TP + FP)
3. Recall:
Recall = TP / (TP + FN)
4. F1 Score:
F1 Score = (2 * Precision * Recall) / (Precision + Recall)
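The four formulas so far can be sketched in a few lines of plain Python. The counts below are hypothetical, and returning 0.0 when a denominator is zero is only one possible convention, chosen here to keep the sketch from crashing.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or 0.0 if the denominator is 0 (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion matrix counts."""
    accuracy = safe_divide(tp + tn, tp + fp + fn + tn)
    precision = safe_divide(tp, tp + fp)
    recall = safe_divide(tp, tp + fn)
    f1 = safe_divide(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for 100 videos
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, precision (40/45) is higher than recall (40/50), and F1 lands between the two because it is their harmonic mean.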
5. Specificity:
Specificity = TN / (TN + FP)
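Specificity is the same idea as recall, applied to the negative class: of all videos that are actually negative, what share did the model label negative? A minimal sketch with hypothetical counts (the zero-denominator fallback is an assumed convention):

```python
def specificity(tn, fp):
    """True negative rate: share of actual negatives predicted as negative."""
    actual_negatives = tn + fp
    return tn / actual_negatives if actual_negatives else 0.0


# Hypothetical counts: 50 actual negatives, 45 of them predicted correctly
value = specificity(tn=45, fp=5)
print(f"specificity: {value:.2f}")
```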
One last point:
Precision-Recall Tradeoff: Ideally, both precision and recall would be high. In practice, pushing precision higher usually lowers recall, and vice versa. For example, for YouTube Kids, false positives are the main concern: we don't want kids to see violent videos, so we favor high precision, even though that may mean missing some kid-safe videos such as cartoons (lower recall).
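One common way to see this tradeoff is to move the decision threshold of a model that outputs a score. The sketch below assumes such a model exists; the scores, labels and thresholds are made up, with 1 meaning kid-safe.

```python
def precision_recall_at(threshold, scores, y_true):
    """Label a video kid-safe (1) when its score >= threshold, then score the result."""
    y_pred = [1 if score >= threshold else 0 for score in scores]
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(y_true, y_pred) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores (higher = more likely kid-safe) and true labels
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
y_true = [1, 1, 1, 0, 1, 0, 1, 0]

results = {}
for threshold in (0.25, 0.50, 0.75):
    results[threshold] = precision_recall_at(threshold, scores, y_true)
    precision, recall = results[threshold]
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.75 lifts precision while recall drops, which is the strict setting a kid-safe filter would lean toward.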
Evaluation metrics for Regression
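1. Mean Absolute Error (MAE)
MAE = (sum of |actual - predicted|) / (number of data points)
2. Mean Squared Error (MSE)
MSE = (sum of (actual - predicted)^2) / (number of data points)
3. Root Mean Squared Error (RMSE)
RMSE = sqrt(MSE)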
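A minimal sketch of the three error metrics in plain Python, assuming the usual definitions: MAE is the mean of the absolute errors, MSE is the mean of the squared errors, and RMSE is the square root of MSE. The sample values are made up.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted) squared."""
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of MSE, expressed in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


# Hypothetical actual and predicted values
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
```

Because MSE squares each error, the single 1.5-unit miss contributes more than half of the squared error here, while it contributes only half of the absolute error.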
4. R-squared
R-squared = 1 - (sum of squares of residuals) / (total sum of squares)
5. Adjusted R-Squared
Adjusted R-Squared = 1 - ((1 - R-Squared) * (n - 1) / (n - k - 1))
where n = number of data points in the dataset
k = number of independent variables
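Both formulas can be sketched directly from their definitions. The data below is made up, and k = 1 is an assumption standing in for a model with a single independent variable.

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_residual = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_total = sum((a - mean_true) ** 2 for a in y_true)
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of independent variables k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical actual and predicted values
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n=len(y_true), k=1)  # k=1 is assumed
print(f"R-squared={r2:.3f} Adjusted R-squared={adj_r2:.3f}")
```

Adjusted R-squared comes out lower than R-squared here, and the gap widens as k grows relative to n.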
Evaluation metrics for Unsupervised Learning: Clustering
1. Rand Index or Rand measure
RI = (Number of agreeing pairs) / (Total number of pairs)
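Here a pair of points "agrees" when both labelings put the two points in the same cluster, or both put them in different clusters. A minimal sketch that counts pairs directly (the labels are made up, and the function name is illustrative):

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two labelings of the same length, with at least 2 points")
    agreeing = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        if same_in_a == same_in_b:  # both "same cluster" or both "different"
            agreeing += 1
        total += 1
    return agreeing / total


# Hypothetical ground-truth clusters and predicted clusters for 6 points
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 2, 2, 2]

ri = rand_index(labels_true, labels_pred)
print(f"Rand Index: {ri:.2f}")
```

The cluster ids themselves do not matter, only which points are grouped together, so relabeling the clusters leaves the score unchanged.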
2. Adjusted Rand Index
ARI = (RI - Expected_RI) / (max(RI) - Expected_RI)
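The expected value in this formula is usually computed from a contingency table of the two labelings, using counts of same-cluster pairs. The sketch below uses that standard pair-counting form, which is equivalent to the formula above; the labels are made up, and returning 1.0 when the denominator is zero is an assumed convention.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Rand index corrected for chance, via pair counts from the contingency table."""
    n = len(labels_a)
    # Pairs placed together by both labelings
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together by each labeling on its own
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # degenerate case; assumed convention
    return (together_both - expected) / (maximum - expected)


# Hypothetical ground-truth clusters and predicted clusters for 6 points
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 2, 2, 2]

ari = adjusted_rand_index(labels_true, labels_pred)
print(f"Adjusted Rand Index: {ari:.3f}")
```

For these labels the plain Rand Index is 0.8, while the ARI is noticeably lower because part of that agreement would be expected by chance.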