CYBER SECURITY AND THE CONFUSION MATRIX
When we get the data, after cleaning, pre-processing and wrangling, the first step is to feed it to a model and, of course, get output in probabilities. But how can we measure the effectiveness of our model? The better the effectiveness, the better the performance, and that is what we want. This is where the confusion matrix comes into the limelight. The confusion matrix is a performance measurement for machine learning classification.
A confusion matrix is a table used to determine the performance of a classification model. We compare the predicted values for the test data with the true values known to us. From this, we know how many cases are classified correctly and how many are classified incorrectly. The table below shows the structure of a confusion matrix.
| | Actual: Positive | Actual: Negative |
| --- | --- | --- |
| Predicted: Positive | True Positive (TP) | False Positive (FP) |
| Predicted: Negative | False Negative (FN) | True Negative (TN) |
Let’s understand the terms used here:
- In a two-class problem, such as attack detection, we assign the normal event as “positive” and the anomaly as “negative”.
- “True Positive” for correctly predicted event values.
- “False Positive” for incorrectly predicted event values.
- “True Negative” for correctly predicted no-event values.
- “False Negative” for incorrectly predicted no-event values.
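As a minimal sketch of how these four counts can be obtained in practice (assuming scikit-learn is available; the label vectors below are made up for illustration, with 1 for the normal/“positive” class and 0 for the anomaly/“negative” class):

```python
# Minimal sketch: deriving the four confusion-matrix counts from labels.
# The label vectors are hypothetical; 1 = normal ("positive"), 0 = anomaly ("negative").
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]   # true values known to us for the test data
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # values predicted by the model

# With labels=[1, 0], rows and columns are ordered positive-first, so ravel()
# returns the counts as TP, FN, FP, TN.
tp, fn, fp, tn = confusion_matrix(y_true, y_pred, labels=[1, 0]).ravel()
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")
```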
Confusion matrices have two types of errors: Type I and Type II.
Now let's look at these terms and their significance in the context of cyber-attack prediction for a better understanding.
An IDS, or Intrusion Detection System, checks for any malicious activity on the system. It monitors the packets coming over the internet using an ML model and predicts whether each packet is normal or an anomaly.
Let's say our model produced the following confusion matrix for the total of 165 packets it examined.

| | Actual: no attack | Actual: attack |
| --- | --- | --- |
| Predicted: no attack | 100 (TP) | 10 (FP) |
| Predicted: attack | 5 (FN) | 50 (TN) |

A total of 165 packets were analyzed by our model in the IDS and classified in the confusion matrix above.
- “Positive” -> Model predicted no attack.
- “Negative” -> Model predicted an attack.
- True Negative: Of the 55 times the model predicted an attack, 50 predictions were correct, meaning an attack actually took place 50 times. Thanks to the prediction, the Security Operations Centre (SOC) receives a notification and can prevent the attack.
- False Negative: Of those same 55 predicted attacks, 5 times the attack did not actually happen. This can be considered a “false alarm” and is also the Type II error.
- True Positive: The model predicted 110 times that an attack would not take place, and 100 of those times no attack actually happened. These are the correct predictions.
- False Positive: 10 times an attack actually took place when the model had predicted that no attack would happen. This is also called the Type I error.
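The counts in this example can be written down directly; the short sketch below just encodes the numbers quoted above and recovers the totals:

```python
# Counts from the IDS example (positive = no attack, negative = attack).
TP = 100  # predicted no attack, and no attack happened
FN = 5    # predicted attack, but no attack happened (false alarm, Type II error)
FP = 10   # predicted no attack, but an attack happened (missed attack, Type I error)
TN = 50   # predicted attack, and an attack happened

total = TP + FN + FP + TN        # 165 packets examined
predicted_no_attack = TP + FP    # 110 "positive" predictions
predicted_attack = TN + FN       # 55 "negative" predictions
print(total, predicted_no_attack, predicted_attack)
```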
Type I error (False Positive)
This type of error can prove to be very dangerous. The model predicted no attack, but in reality an attack takes place; no notification reaches the security team, and nothing can be done to prevent it. The False Positive cases above fall into this category, so one aim of the model is to minimize this value.
Type II error: False Alarm (False Negative)
This type of error is not very dangerous: the system is actually safe in reality, but the model predicted an attack. The team gets notified and checks for any malicious activity, which causes no harm. These cases can be termed false alarms.
Which one to use and where?
This is the most common question that arises while modeling the data, and the answer lies in the domain of the problem statement. Consider these two cases:
1. Suppose you are predicting whether a person will suffer a cardiac arrest. In this scenario you cannot afford misclassification; the predictions made should be as accurate as possible. Here the cost of false negatives is high: a person who was prone to an arrest is predicted as safe. These cases must be avoided, so we need a model with high recall.
2. Suppose a search engine returned random results that are all predicted as positive by the model; there would be very little chance that the user relies on it. In this scenario we need a model with high precision, so that the user experience improves and the website grows in the right direction.
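On the IDS example above (counts taken from the earlier confusion matrix, with “positive” meaning no attack), recall and precision work out as in this short sketch:

```python
# Recall and precision for the IDS example (positive = no attack).
TP, FN, FP, TN = 100, 5, 10, 50

recall = TP / (TP + FN)       # 100 / 105 ≈ 0.95, also called sensitivity or TPR
precision = TP / (TP + FP)    # 100 / 110 ≈ 0.91
print(f"recall={recall:.2f}, precision={precision:.2f}")
```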
CYBER CRIME
Cybercrime is criminal activity that either targets or uses a computer, a computer network or a networked device.
Most, but not all, cybercrime is committed by cybercriminals or hackers who want to make money. Cybercrime is carried out by individuals or organizations.
Some cybercriminals are organized, use advanced techniques and are highly technically skilled. Others are novice hackers.
In the present world, cybercrime offenses are happening at an alarming rate. As the use of the Internet increases, many offenders use it as a means of communication in order to commit crimes. Cybercrime will cost nearly $6 trillion per annum by 2021, as per the 2020 Cybersecurity Ventures report. For illegal activities, cybercriminals use networked computing devices as the primary means of communication with victims' devices, so attackers profit in terms of finance, publicity and more by exploiting vulnerabilities in the system. Cybercrimes are steadily increasing daily.
Security analytics, in association with data analytics approaches, helps us analyze and classify offenses from India-based integrated data that may be either structured or unstructured. The main strength of this work is its test analysis reports, which classify the offenses with 99 percent accuracy.
This is a list of rates that are often computed from a confusion matrix for a binary classifier:
- Accuracy: Overall, how often is the classifier correct?
  - (TP+TN)/total = (100+50)/165 = 0.91
- Misclassification Rate: Overall, how often is it wrong?
  - (FP+FN)/total = (10+5)/165 = 0.09
  - equivalent to 1 minus Accuracy
  - also known as "Error Rate"
- True Positive Rate: When it's actually yes, how often does it predict yes?
  - TP/actual yes = 100/105 = 0.95
  - also known as "Sensitivity" or "Recall"
- False Positive Rate: When it's actually no, how often does it predict yes?
  - FP/actual no = 10/60 = 0.17
- True Negative Rate: When it's actually no, how often does it predict no?
  - TN/actual no = 50/60 = 0.83
  - equivalent to 1 minus False Positive Rate
  - also known as "Specificity"
- Precision: When it predicts yes, how often is it correct?
  - TP/predicted yes = 100/110 = 0.91
- Prevalence: How often does the yes condition actually occur in our sample?
  - actual yes/total = 105/165 = 0.64
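As a minimal sketch, all of the rates above can be reproduced from the four counts of the IDS example:

```python
# Rates derived from the IDS example confusion matrix (positive = no attack).
TP, FN, FP, TN = 100, 5, 10, 50

total = TP + FN + FP + TN      # 165
actual_yes = TP + FN           # 105 packets with no attack
actual_no = TN + FP            # 60 packets with an attack
predicted_yes = TP + FP        # 110 "no attack" predictions

accuracy = (TP + TN) / total             # 0.91
misclassification = (FP + FN) / total    # 0.09, i.e. 1 - accuracy (error rate)
tpr = TP / actual_yes                    # 0.95 (sensitivity / recall)
fpr = FP / actual_no                     # 0.17
tnr = TN / actual_no                     # 0.83 (specificity), i.e. 1 - FPR
precision = TP / predicted_yes           # 0.91
prevalence = actual_yes / total          # 0.64
```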