Paper Made Easy: A guide to Hierarchical Classification

[Figure: Flat Classification vs. Hierarchical Classification]

A very large amount of research in the data mining, machine learning, statistical pattern recognition and related research communities has focused on flat classification problems. A flat classification problem refers to a standard binary or multi-class classification problem. On the other hand, many important real-world classification problems are naturally cast as hierarchical classification problems, where the classes to be predicted are organized into a class hierarchy, typically a tree or a DAG (Directed Acyclic Graph).

The task of hierarchical classification, however, needs to be clearly defined, as it is often overlooked or confused with other tasks that are wrongly referred to by the same name.

For example, "Apple" can refer to the fruit or the company.

Issues with existing classification methods:

Let us initially consider two types of conventional classification methods that cannot directly cope with hierarchical classes: two-class and multi-class classifiers.

First, the main difference between a binary classifier and a multi-class classifier is that the binary classifier can only handle two-class problems, whilst a multi-class classifier can, in principle, handle any number of classes.

Second, some multi-class classifiers can also be multi-label, i.e. the classifier can assign more than one class to a given example.

Third, since these types of classifiers were not designed to deal with hierarchical classification problems, they will be referred to as flat classification algorithms.

Fourth, in the context of hierarchical classification, most approaches could be called multi-label.

Existing hierarchical classification methods:

  1. The top-down approach is not a full hierarchical classification approach by itself, but rather a method for avoiding or correcting inconsistencies in class predictions at different levels, during the testing (rather than training) phase.
  2. There are different ways of using local information to create local classifiers, and although most of them are referred to as top-down in the literature, they differ considerably during the training phase and only slightly during the test phase.
  3. Big-bang (or global) classifiers are trained by considering the entire class hierarchy at once, and hence they lack the modularity for local training of the classifier that is a core characteristic of the local classifier approach.

1. Flat Classification Approach

The flat classification approach, the simplest way to deal with hierarchical classification problems, consists of completely ignoring the class hierarchy and typically predicting only classes at the leaf nodes. This approach behaves like a traditional classification algorithm during training and testing. However, it provides an indirect solution to the problem of hierarchical classification, because, when a leaf class is assigned to an example, all of its ancestor classes are also implicitly assigned to that example.
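That implicit ancestor assignment can be made explicit with a simple walk up the hierarchy. Below is a minimal sketch; the two-level class hierarchy and the labels are hypothetical toy data, and a real flat classifier (e.g. any standard multi-class model over the leaf labels) would supply the predicted leaf:

```python
# Hypothetical class hierarchy: each class maps to its parent (None = root).
PARENT = {
    "music": None, "sports": None,
    "rock": "music", "jazz": "music",
    "soccer": "sports", "tennis": "sports",
}

def ancestors(label, parent=PARENT):
    """Walk up the hierarchy from a predicted leaf class to the root."""
    path = []
    while label is not None:
        path.append(label)
        label = parent[label]
    return list(reversed(path))  # root-to-leaf order

# A flat classifier predicts only the leaf class, e.g. "jazz";
# the ancestor classes are then implied by the hierarchy.
print(ancestors("jazz"))  # ['music', 'jazz']
```

The hierarchy itself plays no role in training here, which is exactly the point: it is consulted only after prediction, to recover the implied ancestor classes.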


However, this very simple approach has the serious disadvantage of having to build a classifier that discriminates among a large number of classes (all leaf classes), without exploiting the information about parent-child class relationships present in the class hierarchy.

2. Local Classifiers Approach

Local Classifier Per Node Approach:

The local classifier per node approach consists of training one binary classifier for each node of the class hierarchy (except the root node).
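The idea can be sketched as follows. Everything here is a hypothetical toy setup: each node's "binary classifier" is just a keyword test standing in for a trained per-node model (e.g. a binary SVM with that node's examples as positives):

```python
# One binary classifier per hierarchy node (keyword tests as stand-ins
# for real trained binary models).
NODE_CLASSIFIERS = {
    "music":  lambda doc: "guitar" in doc or "melody" in doc,
    "rock":   lambda doc: "guitar" in doc,
    "jazz":   lambda doc: "melody" in doc,
    "sports": lambda doc: "ball" in doc,
    "soccer": lambda doc: "ball" in doc,
}

def predict_per_node(doc):
    """Ask every node's binary classifier independently:
    does this class apply to the example?"""
    return sorted(node for node, clf in NODE_CLASSIFIERS.items() if clf(doc))

print(predict_per_node("a guitar solo"))  # ['music', 'rock']
```

Note that because each node decides independently, the raw predictions can be hierarchically inconsistent (e.g. a child predicted positive while its parent is negative), which is why this approach is usually paired with a top-down or other consistency-enforcing step at test time.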


Local Classifier Per Parent Node Approach:

In this approach, for each parent node in the class hierarchy, a multi-class classifier (or a problem-decomposition approach with binary classifiers, such as the one-against-one scheme for binary SVMs) is trained to distinguish between its child nodes.
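At test time this naturally supports top-down prediction: starting at the root, each parent's classifier picks one child, and the process descends until a leaf is reached. A minimal sketch with a hypothetical hierarchy, where each per-parent "classifier" is a trivial name-matching rule standing in for a trained multi-class model over that parent's children:

```python
# Hypothetical hierarchy: each parent maps to its child classes.
CHILDREN = {
    "root":  ["music", "sports"],
    "music": ["rock", "jazz"],
}

def choose_child(parent, doc):
    # Stand-in for a multi-class classifier over the parent's children:
    # pick the child whose name appears in the document, else the first child.
    for child in CHILDREN[parent]:
        if child in doc:
            return child
    return CHILDREN[parent][0]

def predict_top_down(doc):
    """Descend from the root, letting each parent's classifier pick a child,
    until a leaf node is reached."""
    path, node = [], "root"
    while node in CHILDREN:  # stop once the current node has no children
        node = choose_child(node, doc)
        path.append(node)
    return path

print(predict_top_down("some smooth jazz tonight"))  # ['music', 'jazz']
```

A design consequence worth noting: each classifier only ever sees examples belonging to its parent's subtree, so an error made near the root cannot be recovered lower down.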


Local Classifier Per Level Approach:

The local classifier per level approach consists of training one multi-class classifier for each level of the class hierarchy.
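A minimal sketch of this idea, again with hypothetical toy data: each entry in the list stands in for a multi-class model trained over all classes at that level, and each level predicts independently of the others:

```python
# One multi-class classifier per hierarchy level (keyword rules as
# stand-ins for real trained models over that level's classes).
LEVEL_CLASSIFIERS = [
    lambda doc: "music" if "song" in doc else "sports",  # level 1 classes
    lambda doc: "jazz" if "sax" in doc else "rock",      # level 2 classes
]

def predict_per_level(doc):
    """Each level's classifier predicts independently; because levels do not
    consult each other, the combined output may violate the hierarchy and
    typically needs a post-processing step to restore consistency."""
    return [clf(doc) for clf in LEVEL_CLASSIFIERS]

print(predict_per_level("a song with a sax solo"))  # ['music', 'jazz']
```

For instance, an input matching "sports" at level 1 but "rock" at level 2 would produce a prediction path that does not exist in the hierarchy, illustrating the main weakness of this approach.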


3. Big-bang (or Global Classifier) Approach

Although the problem of hierarchical classification can be tackled by using the previously described local approaches, learning a single global model for all classes has the advantage that the total size of the global classification model is typically considerably smaller, by comparison with the total size of all the local models learned by any of the local classifier approaches.


In the global classifier approach, a single (relatively complex) classification model is built from the training set, taking into account the class hierarchy as a whole during a single run of the classification algorithm. When used during the test phase, each test example is classified by the induced model, a process that can assign classes at potentially every level of the hierarchy to the test example.

Source: Carlos N. Silla Jr. and Alex A. Freitas, "A Survey of Hierarchical Classification Across Different Application Domains".
