Use of K-mean clustering in security domain
Yash Indane
Tech Enthusiast | Integrating Technologies | 1x AWS Certified | 6x Microsoft Certified | Cloud Computing | DevOps
Summer Task-10 & ARTH Task 42
Github ->
What is K-means Clustering?
K-means clustering is one of the simplest and popular unsupervised machine learning algorithms. In other words, the K-means algorithm?identifies k number of centroids, and then allocates every data point to the nearest cluster, while keeping the centroids as small as possible.
Implementation of K-means clustering ->
First get the optimal number of K that is the number of optimal clusters, for getting this we can use dendrograms or visualization of the plot of MSE Vs K.
Let's look at a plot between MSE Vs K
Elbow point is the point from where the mean square error starts to decrease gradually. This point also indicates the optimal number of clusters to be present in the given data points.
Clustering ->
Optimal clusters = 3
Using K-means clustering in Security ->
Internet security has been one of the most important problems in the world. Anomaly detection is the basic method to defend new attack in intrusion detection. Network intrusion detection is the process of monitoring the events occurring in a computing system or network and analyzing them for signs of intrusions, defined as attempts to compromise the confidentiality. A wide variety of data mining techniques have been applied to intrusion detections. In data mining, clustering is the most important unsupervised learning process used to find the structures or patterns in a collection of unlabeled data. We use the K-means algorithm to cluster and analyze the data in this paper. Computer simulations show that this method can detect unknown intrusions efficiently in the real network connections.