Paper Made Easy: A guide to Hierarchical Classification

[Figure: Flat Classification vs. Hierarchical Classification]

A very large amount of research in the data mining, machine learning, statistical pattern recognition and related research communities has focused on flat classification problems. A flat classification problem refers to a standard binary or multi-class classification problem. On the other hand, many important real-world classification problems are naturally cast as hierarchical classification problems, where the classes to be predicted are organized into a class hierarchy, typically a tree or a DAG (Directed Acyclic Graph).

The task of hierarchical classification, however, needs to be clearly defined, as it is often overlooked or confused with other tasks that are wrongly referred to by the same name.

For example, "Apple" can refer to the fruit or the company.

Issues with existing classification methods:

Let us initially consider two types of conventional classification methods that cannot directly cope with hierarchical classes: two-class and multi-class classifiers.

First, the main difference between a binary classifier and a multi-class classifier is that the binary classifier can only handle two-class problems, whilst a multi-class classifier can, in principle, handle any number of classes.

Second, some multi-class classifiers can also be multi-label, i.e. the classifier can assign more than one class to a given example.

Third, since these types of classifiers were not designed to deal with hierarchical classification problems, they will be referred to as flat classification algorithms.

Fourth, in the context of hierarchical classification, most approaches could be called multi-label.

Existing hierarchical classification methods:

  1. The top-down approach is not a full hierarchical classification approach by itself, but rather a method for avoiding or correcting inconsistencies in class predictions at different levels, during the testing (rather than training) phase.
  2. There are different ways of using local information to create local classifiers, and although most of them are referred to as top-down in the literature, they differ considerably during the training phase and only slightly during the test phase.
  3. Big-bang (or global) classifiers are trained by considering the entire class hierarchy at once, and hence they lack the modularity for local training of the classifier that is a core characteristic of the local classifier approach.

1. Flat Classification Approach

The flat classification approach, the simplest way to deal with hierarchical classification problems, consists of completely ignoring the class hierarchy and typically predicting only classes at the leaf nodes. This approach behaves like a traditional classification algorithm during training and testing. However, it provides an indirect solution to the problem of hierarchical classification, because, when a leaf class is assigned to an example, all of its ancestor classes are also implicitly assigned to that example.
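That implicit ancestor assignment can be made explicit with a simple walk up the hierarchy. Below is a minimal sketch; the two-level class hierarchy and the labels are hypothetical toy data, and a real flat classifier (e.g. any standard multi-class model over the leaf labels) would supply the predicted leaf:

```python
# Hypothetical class hierarchy: each class maps to its parent (None = root).
PARENT = {
    "music": None, "sports": None,
    "rock": "music", "jazz": "music",
    "soccer": "sports", "tennis": "sports",
}

def ancestors(label, parent=PARENT):
    """Walk up the hierarchy from a predicted leaf class to the root."""
    path = []
    while label is not None:
        path.append(label)
        label = parent[label]
    return list(reversed(path))  # root-to-leaf order

# A flat classifier predicts only the leaf class, e.g. "jazz";
# the ancestor classes are then implied by the hierarchy.
print(ancestors("jazz"))  # ['music', 'jazz']
```

The hierarchy itself plays no role in training here, which is exactly the point: it is consulted only after prediction, to recover the implied ancestor classes.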


However, this very simple approach has the serious disadvantage of having to build a classifier that discriminates among a large number of classes (all leaf classes), without exploiting the information about parent-child class relationships present in the class hierarchy.

2. Local Classifiers Approach

Local Classifier Per Node Approach:

The local classifier per node approach consists of training one binary classifier for each node of the class hierarchy (except the root node).
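The idea can be sketched as follows. Everything here is a hypothetical toy setup: each node's "binary classifier" is just a keyword test standing in for a trained per-node model (e.g. a binary SVM with that node's examples as positives):

```python
# One binary classifier per hierarchy node (keyword tests as stand-ins
# for real trained binary models).
NODE_CLASSIFIERS = {
    "music":  lambda doc: "guitar" in doc or "melody" in doc,
    "rock":   lambda doc: "guitar" in doc,
    "jazz":   lambda doc: "melody" in doc,
    "sports": lambda doc: "ball" in doc,
    "soccer": lambda doc: "ball" in doc,
}

def predict_per_node(doc):
    """Ask every node's binary classifier independently:
    does this class apply to the example?"""
    return sorted(node for node, clf in NODE_CLASSIFIERS.items() if clf(doc))

print(predict_per_node("a guitar solo"))  # ['music', 'rock']
```

Note that because each node decides independently, the raw predictions can be hierarchically inconsistent (e.g. a child predicted positive while its parent is negative), which is why this approach is usually paired with a top-down or other consistency-enforcing step at test time.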


Local Classifier Per Parent Node Approach:

In this approach, for each parent node in the class hierarchy, a multi-class classifier (or a problem-decomposition approach with binary classifiers, such as the one-against-one scheme for binary SVMs) is trained to distinguish between its child nodes.
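At test time this naturally supports top-down prediction: starting at the root, each parent's classifier picks one child, and the process descends until a leaf is reached. A minimal sketch with a hypothetical hierarchy, where each per-parent "classifier" is a trivial name-matching rule standing in for a trained multi-class model over that parent's children:

```python
# Hypothetical hierarchy: each parent maps to its child classes.
CHILDREN = {
    "root":  ["music", "sports"],
    "music": ["rock", "jazz"],
}

def choose_child(parent, doc):
    # Stand-in for a multi-class classifier over the parent's children:
    # pick the child whose name appears in the document, else the first child.
    for child in CHILDREN[parent]:
        if child in doc:
            return child
    return CHILDREN[parent][0]

def predict_top_down(doc):
    """Descend from the root, letting each parent's classifier pick a child,
    until a leaf node is reached."""
    path, node = [], "root"
    while node in CHILDREN:  # stop once the current node has no children
        node = choose_child(node, doc)
        path.append(node)
    return path

print(predict_top_down("some smooth jazz tonight"))  # ['music', 'jazz']
```

A design consequence worth noting: each classifier only ever sees examples belonging to its parent's subtree, so an error made near the root cannot be recovered lower down.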


Local Classifier Per Level Approach:

The local classifier per level approach consists of training one multi-class classifier for each level of the class hierarchy.
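A minimal sketch of this idea, again with hypothetical toy data: each entry in the list stands in for a multi-class model trained over all classes at that level, and each level predicts independently of the others:

```python
# One multi-class classifier per hierarchy level (keyword rules as
# stand-ins for real trained models over that level's classes).
LEVEL_CLASSIFIERS = [
    lambda doc: "music" if "song" in doc else "sports",  # level 1 classes
    lambda doc: "jazz" if "sax" in doc else "rock",      # level 2 classes
]

def predict_per_level(doc):
    """Each level's classifier predicts independently; because levels do not
    consult each other, the combined output may violate the hierarchy and
    typically needs a post-processing step to restore consistency."""
    return [clf(doc) for clf in LEVEL_CLASSIFIERS]

print(predict_per_level("a song with a sax solo"))  # ['music', 'jazz']
```

For instance, an input matching "sports" at level 1 but "rock" at level 2 would produce a prediction path that does not exist in the hierarchy, illustrating the main weakness of this approach.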


3. Big-bang (or Global Classifier) Approach

Although the problem of hierarchical classification can be tackled by using the previously described local approaches, learning a single global model for all classes has the advantage that the total size of the global classification model is typically considerably smaller, by comparison with the total size of all the local models learned by any of the local classifier approaches.


In the global classifier approach, a single (relatively complex) classification model is built from the training set, taking into account the class hierarchy as a whole during a single run of the classification algorithm. When used during the test phase, each test example is classified by the induced model, a process that can assign classes at potentially every level of the hierarchy to the test example.

Source: Carlos N. Silla Jr. and Alex A. Freitas, "A Survey of Hierarchical Classification Across Different Application Domains".
