Confusion Matrix & Cyber Crime

A confusion matrix is a reliable tool for evaluating classification problems. It shows how well or how badly a model performs on each class, which matters when mistakes on different classes have different impact. For example, if the model needs to catch one particular class more than the other, we can build a measure for that from the confusion matrix. Let’s understand this through an example with two classes, 0 and 1. Four scenarios can occur during prediction:

  1. Class is 1 and our model predicted 1 – That’s correct!
  2. Class is 1 and our model predicted 0 – Not good.
  3. Class is 0 and our model predicted 1 – Again not good.
  4. Class is 0 and our model predicted 0 – Correct!

We can arrange all these scenarios in a matrix like this:

                 Predicted 1    Predicted 0
    Actual 1     Correct        Not good
    Actual 0     Not good       Correct

If we consider a 3-class problem, this matrix would be 3×3 (nine cells). Let’s create a few more terms:
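To make the multi-class case concrete, here is a minimal sketch in plain Python that builds a matrix with one row per actual class and one column per predicted class, so three classes give nine cells in total. The function name `confusion_matrix` and the labels are made up for illustration, and the sketch assumes the classes are numbered 0, 1, 2.

```python
def confusion_matrix(actual, predicted, num_classes):
    """Return a num_classes x num_classes grid of counts.

    Row index = actual class, column index = predicted class.
    Assumes labels are integers from 0 to num_classes - 1.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Made-up labels for a 3-class problem (classes 0, 1 and 2).
actual = [0, 1, 2, 2, 1, 0, 2]
predicted = [0, 2, 2, 2, 1, 1, 0]

matrix = confusion_matrix(actual, predicted, num_classes=3)
for row in matrix:
    print(row)
```

The diagonal cells hold the correct predictions; every off-diagonal cell is one specific kind of mistake (for example, actual class 1 predicted as class 2).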

  • If our model predicts class 1, we call the prediction Positive; if it predicts 0, we call it Negative.
  • If the model predicted correctly, we mark it as True; if it failed, we mark it as False.

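So many words. Now let’s combine Positive, Negative, True and False to relabel the matrix above:

                 Predicted 1 (Positive)    Predicted 0 (Negative)
    Actual 1     True Positive (TP)        False Negative (FN)
    Actual 0     False Positive (FP)       True Negative (TN)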

Now let’s understand each term:

  • True Positive – When the actual class of a data point is 1 and the model predicted 1. (The model is truly saying Positive, so you can trust it.)
  • False Negative – When the actual class of a data point is 1 and the model predicted 0. (The model is falsely saying Negative, so it is not reliable.)
  • False Positive – When the actual class of a data point is 0 and the model predicted 1. (The model is falsely saying Positive, so it is not reliable.)
  • True Negative – When the actual class of a data point is 0 and the model predicted 0. (The model is truly saying Negative, so it is trustworthy.)
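The four terms can be counted directly from a list of actual labels and a list of predictions. The following is a small sketch, not a library routine: the function name `count_outcomes` and the sample labels are invented for illustration, and it assumes every label is either 0 or 1.

```python
def count_outcomes(actual, predicted):
    """Count TP, FN, FP and TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1   # actual 1, predicted 1
        elif a == 1 and p == 0:
            counts["FN"] += 1   # actual 1, predicted 0
        elif a == 0 and p == 1:
            counts["FP"] += 1   # actual 0, predicted 1
        else:
            counts["TN"] += 1   # actual 0, predicted 0 (labels assumed to be 0 or 1)
    return counts


# Made-up example data.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = count_outcomes(actual, predicted)
print(counts)
```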

A great model puts very high numbers (or ratios) in True Positive and True Negative compared to the other two cells. But it is not always possible to optimize everything, so we prioritize according to our needs to make the model more useful for the business.

What can we learn from this?

A valid question is what we can do with this matrix. Several important terms are based on it:


  • Precision: It is the portion of the model’s positive predictions that are actually positive. In other words, of everything the model flagged as positive, it tells us how much was really relevant. Its formula is TP / (TP + FP).


  • Recall: It is the portion of actual positives that the model correctly identifies as positive. It is also called True Positive Rate or Sensitivity. Its formula is TP / (TP + FN).


  • F-1 Score: It is the harmonic mean of Precision and Recall. When comparing two models, this metric penalizes extreme values and takes both False Positives and False Negatives into account at the same time. Its formula is 2 * Precision * Recall / (Precision + Recall).


  • Accuracy: It is the portion of values that are identified correctly, irrespective of whether they are positives or negatives. It therefore counts all True Positives and True Negatives. Its formula is (TP + TN) / (TP + TN + FP + FN).
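The four formulas above translate directly into code. This is a minimal sketch with made-up counts; the function names are illustrative, and returning 0.0 when a denominator is zero is a convention assumed here, not something the formulas themselves define.

```python
def precision(tp, fp):
    # TP / (TP + FP); 0.0 if the model made no positive predictions (assumed convention).
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # TP / (TP + FN); 0.0 if there are no actual positives (assumed convention).
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    # Harmonic mean of precision and recall.
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def accuracy(tp, tn, fp, fn):
    # (TP + TN) / (TP + TN + FP + FN)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Made-up counts for illustration.
tp, fn, fp, tn = 30, 10, 20, 40

print("Precision:", precision(tp, fp))
print("Recall:   ", recall(tp, fn))
print("F-1 Score:", round(f1_score(tp, fp, fn), 4))
print("Accuracy: ", accuracy(tp, tn, fp, fn))
```

With these counts, precision is 30/50, recall is 30/40 and accuracy is 70/100, so the F-1 Score lands between precision and recall, closer to the lower of the two.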


Of all these terms, precision and recall are the most widely used. The trade-off between them is a useful measure of how successful a prediction is. Ideally a model has both high precision and high recall, but that is achievable only with perfectly separable data. In practical use cases, the data is usually messy and imbalanced.
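One way to see why imbalanced data calls for precision and recall rather than accuracy alone is the following sketch. The data is invented: 1000 cases of which only 10 are positive, and a hypothetical "lazy" model that always predicts 0.

```python
# 1000 made-up cases: only 10 are positive (class 1).
actual = [1] * 10 + [0] * 990
# A hypothetical model that predicts negative (0) for every case.
predicted = [0] * 1000

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / len(actual)
recall = tp / (tp + fn) if (tp + fn) else 0.0

print("Accuracy:", accuracy)
print("Recall:  ", recall)
```

The accuracy looks excellent because almost every case is negative, yet the recall shows that the model did not catch a single positive case.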

Which one to use and where?

This is the most common question that arises while modelling data, and the answer lies in the domain of the problem statement. Consider these two cases:

1. Suppose you are predicting whether a person will suffer a cardiac arrest. In this scenario, misclassification is very costly. In particular, the cost of a False Negative is high: a person who was at risk of an attack would be predicted as safe. Such cases must be avoided, so in these situations we need a model with high recall.

2. Suppose a search engine returns random results that the model has all predicted as positive. There is then very little chance that users will rely on it. Therefore, in this scenario, we need a model with high precision so that the user experience improves and the website grows in the right direction.
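The two cases can be sketched in code under one extra assumption that is not part of the discussion above: that the model outputs a score between 0 and 1 and we choose a cut-off (threshold) above which a case is predicted as positive. The scores, labels and the function name `precision_recall_at` are made up for illustration.

```python
def precision_recall_at(threshold, actual, scores):
    """Predict 1 when score >= threshold, then return (precision, recall)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels and model scores (assumed to lie between 0 and 1).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

for threshold in (0.25, 0.5, 0.75):
    p, r = precision_recall_at(threshold, actual, scores)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, the lowest threshold catches every positive at the cost of more False Positives, which suits a case like the cardiac arrest example. The highest threshold produces no False Positives but misses half of the positives, which is closer to what the search engine example needs.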


What is cybercrime?

Cybercrime is criminal activity that either targets or uses a computer, a computer network or a networked device.

Most, but not all, cybercrime is committed by cybercriminals or hackers who want to make money. Cybercrime is carried out by individuals or organizations.

Some cybercriminals are organized, use advanced techniques and are highly technically skilled. Others are novice hackers.

Rarely, cybercrime aims to damage computers for reasons other than profit. These could be political or personal.

Example:

In a recent sensational cybercrime, a 16-year-old student of the Air Force Bal Bharti School in New Delhi was arrested for having created a pornographic website. The case, which otherwise would have gathered dust in court, was quickly capped by the juvenile welfare board, which granted bail. The student was also rusticated from school. Yet another fallout of the e-volution happened at Indore, when a group of school students spliced photographs of girls from their school with nude pictures downloaded from the net. No arrests were made, because enforcement agencies had no clear-cut idea about the definition of cybercrime and the laws under which it should be tried.

Though the Information Technology Bill 2000, dealing with cyberlaws, has been passed by the Lok Sabha, the Rajya Sabha and the President of India, awareness of it is still to emerge. Prof. M. S. Raste, principal of Symbiosis Law College, admits that "a major portion of the judicial fraternity has no idea about cyberlaws and their applications". Even if some of the younger lawyers, like advocate Jitendra Patil, are net-savvy and understand the implications of cybercrime, there remains a cloud of confusion when it comes to the matter of jurisdiction. "What if the crime has been committed by a website developer who is not a resident of India? How does one register a crime against an unknown person who has hacked into a confidential site? Such questions need immediate clarification before legal experts can give thought to the enactment of the cyberlaws," opines Prof. Raste. Incidentally, Symbiosis Law College will be starting a one-year diploma course in cyberlaws.

Cybercrime, as Deepak Shikarpur, IT chairman of the Mahratta Chamber of Commerce, Industries and Agriculture (MCCIA), explains, is not restricted to pornography. "Some of the major felonies on the rise are those related to e-commerce. The digital economy is moving at lightning speed and has changed everything, including relationships inside and outside a company's four walls," he says. Ujwal Marathe, a chartered accountant specialising in IT audit who has co-authored a book on cyberlaws with Shikarpur and Sarita Bhave, agrees. "The kind of crimes which have surfaced in the financial sphere of the internet relate to hacking into banking transactions, theft of intellectual property rights, misuse of credit card numbers and misappropriations conducted by employees," informs Marathe. Here again, Indian cyberlaws are being seen as too "open-ended" to curb the adventurous spirit of cyber-thieves. "There are specific issues of jurisdiction which need to be settled," advises Marathe.

However, cyberlaws do give special powers to the police force. A police officer not below the rank of Deputy Superintendent of Police (DSP) can investigate a cybercrime and seize computer equipment from a company or cybercafe without the need for a search or arrest warrant. Deputy Commissioner of Police (DCP) Sanjay Varma states that some senior officers in his team are quite well aware of cybercrimes and cyberlaws, having attended workshops. "But we are yet to investigate into one and get a feel of the procedures because there has been no complaint registered so far. Complaints may not be forthcoming because the public does not know that the police has been equipped to deal with cybercrimes," he says.

There was a complaint in the past, but that was by Pune-based Rohas Nagpal, who filed a case against the website rediff.com for providing access to pornographic sites. "But that was before the cyberlaws came into existence and hence, it was filed under a section of the Indian Penal Code," informs Nagpal. The case is stuck in the high court. Prior to that, two Pune junior college students had successfully hacked into the VSNL site and opened accounts of its subscribers, but no action was taken against them because they had confessed to doing it as a test of their skills.

The biggest problem as of now is for enforcement agencies like the police to understand how to trace the culprits. "Just about anyone can commit a crime from a cybercafe," says Marathe. In such a case, it would be almost impossible to arrive at an identification. DCP Varma says that should such a situation arise, "the help of experts would be taken." The only solution, as suggested by Prof. Marathe, is that the government should step up a promotion drive for these new laws. Till then, the damage will continue, laws or no laws.

Still have a query? Feel free to ask in the comment box.

Thank you! #keeplearning #keepsharing


More articles by Balaji Pandhawale
