Machine learning for anomaly detection

In data mining, anomaly detection refers to the identification of items or events that do not conform to an expected pattern or to the other items in a dataset. These anomalous items often point to an underlying problem such as a structural defect, an error, or fraud. Using machine learning for anomaly detection helps speed up detection.

Intrusions are activities that can damage information systems, and intrusion detection has been gaining broad attention. Anomaly detection can be key to dealing with intrusions, because deviations from normal behavior indicate the presence of intentional or unintentional attacks, defects, faults, and so on. Machine learning algorithms learn from data and make predictions based on that data, which gives companies a simple yet effective way to detect and classify these anomalies. Machine learning techniques for anomaly detection offer a promising alternative for detecting and classifying anomalies based on an initially large set of features.

Here are two machine learning approaches that can enable effective anomaly detection:

Supervised Machine Learning for Anomaly Detection

This method requires a labeled training set that contains both normal and anomalous samples for constructing the predictive model. In theory, supervised methods are expected to provide a better detection rate than unsupervised methods. The most common supervised algorithms are supervised neural networks, parameterization of the training model, support vector machines, k-nearest neighbors, Bayesian networks, and decision trees. K-nearest neighbors (k-NN) is one of the most conventional nonparametric techniques used in supervised learning for anomaly detection. It computes the distances between an unlabeled point and the labeled points in the training set, and then assigns the unlabeled point to the majority class among its k nearest neighbors. The Bayesian network is another popular model; it encodes probabilistic relationships among the variables of interest. This technique is generally used for anomaly detection in combination with statistical schemes. It has several advantages, including the capability of encoding interdependencies between variables and of predicting events, along with the ability to incorporate both prior knowledge and data.
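The k-NN step described above can be sketched in a few lines of Python. This is a minimal illustration, not a production detector: the function name knn_classify, the two features (kilobytes sent and failed logins), and the toy training data are all assumptions made up for this example.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labeled training point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled samples: (kilobytes sent, failed logins).
train_points = [(10, 0), (12, 1), (11, 0), (9, 0), (95, 8), (100, 9), (90, 7)]
train_labels = ["normal", "normal", "normal", "normal",
                "anomalous", "anomalous", "anomalous"]

print(knn_classify(train_points, train_labels, (11, 1)))
print(knn_classify(train_points, train_labels, (97, 8)))
```

In practice the features would usually be scaled first, since a raw Euclidean distance lets the feature with the largest range dominate the vote.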

Unsupervised Machine Learning for Anomaly Detection

These techniques do not require labeled training data. They rest on two basic assumptions. First, they presume that most network connections are normal traffic and only a small percentage is abnormal. Second, they expect malicious traffic to be statistically different from normal traffic. Based on these two assumptions, groups of similar instances that appear frequently are assumed to be normal traffic, and groups that appear infrequently are considered malicious. The most common unsupervised algorithms are self-organizing maps (SOM), K-means, C-means, the expectation-maximization (EM) meta-algorithm, adaptive resonance theory (ART), and the one-class support vector machine. One popular technique is the self-organizing map, whose main objective is to reduce the dimensionality of the data so that it can be visualized.
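The two assumptions above can be illustrated with K-means: cluster the unlabeled data, then treat the members of any small cluster as suspicious. The sketch below is a simplified, pure-Python version written for this article. The helper names (kmeans, flag_rare_clusters), the deterministic farthest-first initialization, the 25% size threshold, and the toy traffic data are all assumptions, not part of any standard library.

```python
import math


def kmeans(points, k, iterations=20):
    """Very small K-means; returns the cluster index assigned to each point."""
    # Assumption: farthest-first initialization, chosen to keep the sketch deterministic.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    def assign():
        # Index of the nearest centroid for every point.
        return [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]

    for _ in range(iterations):
        assignments = assign()
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                # Move the centroid to the mean of its members.
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return assign()


def flag_rare_clusters(points, k=2, max_fraction=0.25):
    """Flag every point that falls in a cluster holding at most max_fraction of the data."""
    assignments = kmeans(points, k)
    sizes = {i: assignments.count(i) for i in range(k)}
    return [sizes[a] / len(points) <= max_fraction for a in assignments]


# Hypothetical unlabeled connections: (kilobytes sent, failed logins).
traffic = [(10, 0), (12, 1), (11, 0), (9, 0), (13, 1),
           (10, 1), (11, 1), (12, 0), (95, 8), (100, 9)]

flags = flag_rare_clusters(traffic)
print([point for point, is_rare in zip(traffic, flags) if is_rare])
```

The threshold and the number of clusters have to be chosen for the data at hand; if anomalies make up a large share of the traffic, the first assumption no longer holds and this approach breaks down.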

Machine learning techniques are now receiving considerable attention from anomaly detection researchers as a way to address the weaknesses of knowledge-based detection techniques.

Anomaly detection can help catch fraud and discover unusual activity in large and complex Big Data sets. This can prove useful in areas such as banking security, natural sciences, medicine, and marketing, which are prone to malicious activities. With machine learning, an organization can intensify its search for anomalies and increase the effectiveness of its digital business initiatives.
