Evaluation Metrics for ML Binary Classification Models

Evaluating an ML model is not an easy task. A large amount of data, with a suitable distribution, has to be prepared and properly annotated for testing, and human error in annotation cannot be ignored. We can use this data whenever we test the model, during or after training. In a running system, we may also observe the model for some time: we collect and annotate inputs, then check the inferences to see how the model is performing. When we call a model good or bad, we speak in numbers, and these numbers mean different things for different types of models. In regression, for example, we look at the error the model gives for a given input. In binary classification, we need multiple inputs to find the ratio of good to bad results. In this article I will share some metrics commonly used for binary classification models, all derived from the confusion matrix.

Confusion matrix

A confusion matrix is a table layout for visualizing the performance of a model. Each row of the matrix represents the instances of an actual class and each column represents the instances of a predicted class, or vice versa. In binary classification, the answer is either True (positive) or False (negative). The matrix uses four values to describe the performance:

  1. Case 1: True Positive (TP): The model correctly classified a positive instance.
  2. Case 2: False Positive (FP): The model classified the instance as positive, but it is actually negative.
  3. Case 3: True Negative (TN): The model correctly classified a negative instance.
  4. Case 4: False Negative (FN): The model classified the instance as negative, but it is actually positive.
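
The four cases can be counted directly from paired lists of actual and predicted labels. The following is a minimal Python sketch; the function name confusion_counts and the sample labels are made up for illustration and do not come from any particular library.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN and FN for boolean labels (True = positive class)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1      # Case 1: predicted positive, actually positive
        elif p and not a:
            fp += 1      # Case 2: predicted positive, actually negative
        elif not p and not a:
            tn += 1      # Case 3: predicted negative, actually negative
        else:
            fn += 1      # Case 4: predicted negative, actually positive
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical labels, invented for this example only.
actual = [True, True, True, False, False, True]
predicted = [True, False, True, False, True, True]
counts = confusion_counts(actual, predicted)
print(counts)
```

Every input falls into exactly one of the four cases, so the four counts always add up to the number of inputs.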

The following table shows a possible structure of a confusion matrix.

Structure (picture taken from

An example is as follows.

Example (picture taken from

So, if every result from the model matches case 1 or case 3, we are in a perfect world. In reality, the model will also produce results in case 2 and case 4, due to the probabilistic nature of ML.

Metrics

Here we are. What is the accuracy of the model? Simply the fraction of correct answers over the total population. The total population is the sum of the instances of all four cases, and the correct answers are those in case 1 and case 3. So we have the metric Accuracy, which is (TP+TN)/(TP+FP+TN+FN). In our case it is (6+3)/12 = 0.75.

Now we have another question: how precisely are the positive instances identified? That is, of the instances the model called positive, how many are actually positive, i.e. how accurate are the positive predictions? This metric is called Precision and is calculated as TP/(TP + FP). A higher value means fewer false positives.

Next question: among all the actual positive instances, how many were recalled by the model? This metric is called Recall and is calculated as TP/(TP + FN). A higher value means fewer false negatives.

When the data are imbalanced, or when the costs of a false positive and a false negative differ, we need both Precision and Recall to understand the performance of the model. Hence we need a balance between these two values, so we take their average. A suitable average for two ratios such as these is the harmonic mean, calculated for Precision and Recall as 2/(1/Precision+1/Recall), which is known as the F-1 Score. A high value means the model has both high precision and high recall.
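
To see why the harmonic mean is preferred over a plain average here, the short Python sketch below compares the two for a deliberately lopsided pair of values. The numbers are hypothetical, and returning 0 when either input is 0 is a common convention assumed for this example.

```python
def arithmetic_mean(p, r):
    """Plain average of two values."""
    return (p + r) / 2


def harmonic_mean(p, r):
    """Harmonic mean, equivalent to 2 / (1/p + 1/r).

    Returns 0.0 when either input is 0 (assumed convention) to avoid
    dividing by zero.
    """
    if p == 0 or r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Hypothetical model: every positive prediction is right (precision 1.0),
# but it finds only one in ten actual positives (recall 0.1).
precision, recall = 1.0, 0.1
print(round(arithmetic_mean(precision, recall), 3))  # 0.55
print(round(harmonic_mean(precision, recall), 3))    # 0.182
```

The plain average hides the poor recall, while the harmonic mean stays close to the smaller of the two values, so a high score is only possible when both are high.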

So we have the following metrics from the confusion matrix:

Accuracy = (TP+TN)/(TP+FP+TN+FN)

Precision = TP/(TP + FP)

Recall = TP/(TP + FN)

F-1 Score = 2/(1/Precision+1/Recall) = 2 x Precision x Recall / (Precision + Recall)
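
These formulas translate directly into code. The sketch below is one possible Python version; the function name classification_metrics is hypothetical. The text gives TP = 6, TN = 3 and a total of 12 for the example, but not how the remaining 3 errors split, so FP = 1 and FN = 2 are assumed here purely for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute Accuracy, Precision, Recall and F-1 Score from the four counts.

    A ratio with a zero denominator is reported as 0.0 (assumed convention).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# TP = 6, TN = 3, total = 12 come from the example in the text;
# FP = 1 and FN = 2 are an ASSUMED split of the remaining 3 errors.
metrics = classification_metrics(tp=6, fp=1, tn=3, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The accuracy of 0.75 does not depend on the assumed split, because it only uses TP + TN and the total. Precision, Recall and F-1 Score would change with a different split of FP and FN.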

Each of these metrics has its own meaning and gives an idea of the performance of the model, but the results depend on the data. Accuracy is simple to calculate and intuitive for binary classification, but it can mislead on imbalanced data. For imbalanced data, we may choose Precision, Recall or F-1 Score depending on our target: Precision when minimizing false positives, Recall when minimizing false negatives, and F-1 Score when both matter.
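
A small Python sketch can illustrate how accuracy misleads on imbalanced data. The data set below is entirely hypothetical: 990 negatives, 10 positives, and a trivial "model" that always predicts negative.

```python
# Hypothetical imbalanced test set: 990 negatives and 10 positives.
actual = [False] * 990 + [True] * 10

# A trivial "model" that predicts negative for every input.
predicted = [False] * len(actual)

# Count the four confusion matrix cases.
tp = sum(1 for a, p in zip(actual, predicted) if a and p)
fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
fp = sum(1 for a, p in zip(actual, predicted) if not a and p)

accuracy = (tp + tn) / len(actual)
# Recall is reported as 0.0 if there are no actual positives (assumed convention).
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.99
print(f"recall:   {recall:.2f}")    # recall:   0.00
```

The model scores 99% accuracy without finding a single positive instance, which only the Recall value reveals.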

References

https://www.v7labs.com/blog/performance-metrics-in-machine-learning

https://en.wikipedia.org/wiki/Confusion_matrix
