Cyber Attacks and The Confusion Matrix of ML
image by pixabay

What is a Confusion Matrix?

A confusion matrix is a numeric table used to visualize the performance of a classification algorithm, both in machine learning models and in other kinds of statistical classification.

Let's take an example of a confusion matrix for a binary classification machine learning model whose target value is YES/NO. Suppose our model predicts, for 200 students (observations), whether each of them will pass an exam or not. The following is the confusion matrix we got from this model for the 200 test observations.

Let's discuss what our model has predicted

[Image: confusion matrix for the 200 test observations]


Each row of the confusion matrix represents the instances in an actual class while each column represents the instances in a predicted class. -Wikipedia

As the above excerpt from Wikipedia states, the columns of the confusion matrix hold the values predicted by the model, whereas the rows hold the actual values. From this, we can conclude that the data in the above confusion matrix fall into four classes:

  1. Predicted[YES]|Actual[YES]: This class holds the instances that are predicted as YES and are YES in reality, so these are accurate predictions. These are the students who were predicted "will pass" and actually passed the exam.
  2. Predicted[YES]|Actual[NO]: This class holds the instances that are predicted as YES but are NO in reality, so these are wrong predictions. These are the students who were predicted "will pass" but actually failed the exam.
  3. Predicted[NO]|Actual[YES]: This class holds the instances that are predicted as NO but are YES in the real world. These are also wrong predictions, and they are often more severe or critical than the others because the model predicts a negative result for an instance that is actually positive. These are the students who were predicted "will fail" but actually passed the exam.
  4. Predicted[NO]|Actual[NO]: This class holds the instances that are predicted as NO and are NO in reality, so these are accurate predictions. These are the students who were predicted "will fail" and actually failed the exam.
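
The four classes above can be counted directly from pairs of actual and predicted labels. The following is a minimal sketch using only the Python standard library; the six student labels and the helper name confusion_matrix are made up for illustration and are not the article's 200 observations. Rows are actual values and columns are predicted values, following the convention quoted above.

```python
from collections import Counter

# Hypothetical labels for six students (illustration only,
# not the 200 observations discussed in the article).
actual    = ["YES", "YES", "NO", "NO", "YES", "NO"]
predicted = ["YES", "NO",  "NO", "YES", "YES", "NO"]

def confusion_matrix(actual, predicted, labels=("YES", "NO")):
    """Return a nested dict: matrix[actual_label][predicted_label] -> count."""
    counts = Counter(zip(actual, predicted))
    # Outer keys are the rows (actual), inner keys are the columns (predicted).
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}

matrix = confusion_matrix(actual, predicted)
for a in ("YES", "NO"):
    print(f"Actual[{a}]:", matrix[a])
```

Each cell of the nested dictionary corresponds to one of the four classes, for example matrix["NO"]["YES"] counts the students who were predicted "will pass" but actually failed.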

Terminology of Confusion Matrix

[Image: confusion matrix terminology (True Positive, False Positive, False Negative, True Negative)]


The above image shows the terminology used for the cells of a confusion matrix. The four classes discussed in the above example are called True Positive, False Positive, False Negative, and True Negative.

  1. True Positive (TP): These are the cases that are predicted as positive and are positive in the real world.
  2. False Positive (FP): These are the cases that are predicted as positive but are negative in reality.
  3. False Negative (FN): These are the cases that are predicted as negative but are actually positive.
  4. True Negative (TN): These are the cases that are predicted as negative and are negative in reality.
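
As a small illustration of these four terms, the sketch below names the cell that a single prediction falls into and then tallies a few of them. The function name outcome and the sample pairs are hypothetical, and True stands for "positive".

```python
from collections import Counter

def outcome(predicted: bool, actual: bool) -> str:
    """Name the confusion-matrix cell for one prediction (True = positive)."""
    if predicted and actual:
        return "TP"  # predicted positive, actually positive
    if predicted and not actual:
        return "FP"  # predicted positive, actually negative
    if not predicted and actual:
        return "FN"  # predicted negative, actually positive
    return "TN"      # predicted negative, actually negative

# Hypothetical (predicted, actual) pairs, for illustration only.
pairs = [(True, True), (True, False), (False, True), (False, False), (True, True)]
tally = Counter(outcome(p, a) for p, a in pairs)
print(dict(tally))
```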

What is a Cyber Attack?

A cyberattack is any offensive maneuver that targets computer information systems, infrastructures, computer networks, or personal computer devices. -Wikipedia

As the above excerpt from Wikipedia describes, a cyber attack is an attack on servers or computers that sit in the public or private domain of the internet. Most of the time the attackers seek to expose, damage, alter, disable, or steal the data present in the system, or to change the system configuration. All of these activities are carried out without authorization, which is why such attacks are treated as a form of cybercrime.

Cyber Attacks and Machine Learning

To prevent cyberattacks, organizations take many preventive steps and continually try to secure and strengthen their systems. Many organizations have a lot of data to handle, spread across large servers, storage systems, or data centers. Human effort is slow and, in some respects, ineffective when processing data at that scale, and manual work is prone to errors, so we need automated ways to work in this scenario. Machine learning is very useful for helping teams manage their servers and keep them safe.

By combining machine learning with human intelligence we can achieve a lot, and achieve it quickly. Machine learning models can be created and trained on historical data about previous attacks, threats, and patterns of bugs in order to detect attacks. KDD99 (the KDD Cup 1999 data) is one of the well-known datasets used for this purpose. If a similar pattern of activity repeats on the system, the machine learning model can predict whether an attack is happening or not. If the model successfully predicts the attack, the attack can be prevented.

Role of the Confusion Matrix

This is where the confusion matrix becomes important. If the table below is the confusion matrix for the attack-detection machine learning model, we can read it in the same way, as four classes.

[Image: confusion matrix for the attack-detection model]


  1. True Positive (TP): An attack is detected when there actually is an attack.
  2. False Positive (FP): An attack is detected when the situation is actually normal and there is no attack.
  3. False Negative (FN): The situation is classified as normal when there actually is an attack.
  4. True Negative (TN): The situation is classified as normal when it actually is normal and there is no attack.

Two types of errors can occur here.

Type I error

False Positive (FP) is the class that represents this type of error. Here the ML model reports an attack, but in reality there is no attack. This is the less critical type of error: if a scenario is flagged as an attack when it is not one, nothing disastrous happens.

Type II error

False Negative (FN) is the class that represents the second type of error. Here the attack is not detected and the scenario is classified as a normal situation, but in reality an attack is under way. This is the critical kind of error, because the ML model has failed to detect the attack. Since the model did not predict the attack, no preventive measure would be taken.
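
One way to put numbers on these two error types is to express each as a rate: the share of normal situations wrongly flagged as attacks (Type I) and the share of real attacks that went undetected (Type II). The sketch below assumes the four counts are already known; the function name error_rates and the counts passed to it are hypothetical and do not come from the matrix shown in the article.

```python
def error_rates(tp: int, fp: int, fn: int, tn: int):
    """Return (type_1_rate, type_2_rate) for a binary attack detector."""
    # Type I error rate: false alarms among all actually-normal situations.
    type_1 = fp / (fp + tn) if (fp + tn) else 0.0
    # Type II error rate: missed attacks among all actual attacks.
    type_2 = fn / (fn + tp) if (fn + tp) else 0.0
    return type_1, type_2

# Hypothetical counts, chosen only to illustrate the calculation.
type_1, type_2 = error_rates(tp=45, fp=10, fn=5, tn=140)
print(f"Type I (false alarm) rate:    {type_1:.3f}")
print(f"Type II (missed attack) rate: {type_2:.3f}")
```

Because a missed attack is the more serious error, a security team would usually watch the Type II rate most closely, even if reducing it means accepting a few more false alarms.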


Thanks for reading this article.
