K-Means Clustering in the Security Domain
K-Means Clustering:-
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. Typically, unsupervised algorithms make inferences from datasets using only input vectors, without referring to known, or labeled, outcomes.
It groups the data into clusters and offers a convenient way to discover the categories in an unlabeled dataset on its own, without any labeled training data. It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this algorithm is to minimize the sum of squared distances between the data points and the centroids of their assigned clusters.
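To make the objective concrete: in the usual formulation, the quantity k-means minimizes is the sum of squared Euclidean distances from each point to the centroid of its cluster, often called the within-cluster sum of squares. The short sketch below computes it for a toy example. The function name `wcss` and the sample points are assumptions made for illustration only.

```python
def wcss(points, centroids, labels):
    """Within-cluster sum of squared Euclidean distances."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total

# Hypothetical toy data: four 2-D points already split into two clusters.
points = [(1.0, 1.0), (1.0, 3.0), (5.0, 5.0), (7.0, 5.0)]
labels = [0, 0, 1, 1]
centroids = [(1.0, 2.0), (6.0, 5.0)]  # the mean of each cluster

print(wcss(points, centroids, labels))  # 4.0
```

Each point here lies exactly one unit from its centroid, so the total is 4.0. Any other choice of centroids for the same grouping would give a larger value.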
How it Works:-
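The k-means clustering algorithm mainly performs two tasks, repeated until the clusters stop changing: it assigns each data point to its nearest centroid, and it recomputes each centroid as the mean of the points assigned to it.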
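The loop below is a minimal sketch of these two tasks as they are usually formulated: an assignment step and an update step. It is plain Python rather than a library implementation. The function names, the random choice of starting centroids, and the toy points are all assumptions made for illustration.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen data points.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Task 1 (assignment): attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the centroids are static
        labels = new_labels
        # Task 2 (update): move each centroid to the mean of its points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids))
print(labels)
```

Which group receives label 0 depends on the random start, so only the grouping itself is meaningful, not the label numbers.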
Cyber Crime:-
Cybercrime detection is the process of investigating, analyzing, and recovering critical forensic digital data from the networks involved in an attack, whether the Internet, a local network, or both, in order to identify the authors of the digital crime and their true intentions.
Crime Analysis:-
The procedure is:
1. First, we take the crime dataset.
2. Filter the dataset according to the requirements and create a new dataset containing only the attributes needed for the analysis.
3. Read the Excel file of the crime dataset, apply the “Replace Missing Value” operator to it, and execute the operation.
4. Apply the “Normalize” operator to the resulting dataset and execute the operation.
5. Perform k-means clustering on the normalized dataset and execute the operation.
6. In the plot view of the result, plot the data between crimes to obtain the required clusters.
7. Analysis can then be done on the clusters formed.
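Steps 3 and 4 of the procedure can be sketched in plain Python as follows. This is not the implementation behind the operators named above. Mean imputation and min-max scaling are assumptions, since the text does not say which strategy each operator uses, and the sample records and column meanings are hypothetical.

```python
def replace_missing(rows):
    """Replace None entries with the mean of the known values in that column."""
    columns = list(zip(*rows))
    means = []
    for col in columns:
        known = [v for v in col if v is not None]
        means.append(sum(known) / len(known))
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]

def normalize(rows):
    """Min-max scale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical records: [burglaries, assaults] per district; None marks a missing value.
crimes = [[10, 2], [20, None], [30, 6], [None, 4]]
filled = replace_missing(crimes)
scaled = normalize(filled)
print(filled)  # [[10, 2], [20, 4.0], [30, 6], [20.0, 4]]
print(scaled)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [0.5, 0.5]]
```

Normalizing before clustering matters because k-means relies on distances, so an attribute with a large numeric range would otherwise dominate the result. In scikit-learn, the same steps correspond to `SimpleImputer`, `MinMaxScaler`, and `KMeans`.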
Crime Detection:-
Using the steps mentioned in the flow chart, we can create a cluster.
For example, suppose we have a dataset and, after analyzing it, we get the following clusters.
The clusters are found by repeating the process iteratively until the centroids become static. The algorithm converges by recalculating distances and reassigning cases until the clusters no longer change. This is the final solution. The two clusters are shown in different colors: cluster 1 in blue and cluster 2 in red. The changing positions of the centroids are shown in yellow.
Conclusion:-
K-means clustering was able to identify crime patterns from a large number of crimes, making the job of crime detectives easier. The proposed idea has promising value in today's complex crime landscape and can be used as an effective tool by crime detectives and law enforcement organizations for crime detection.
Thank You