A decision tree or why are we talking about horticulture in data science?


A tree is usually a large perennial deciduous or evergreen woody plant. Large areas of trees are called forests. Trees are common on all continents except Antarctica. Although there is no set minimum height, trees are usually classified as plants higher than 6 m with secondary branches branching from the trunk. However, bonsai are also considered trees, although they are less than a meter high.

But today is not about that...

Decision tree learning is one of the most commonly used methods in data mining. The purpose of the method is to create a model that predicts the value of a dependent variable based on several independent variables. Decision trees have the following properties:

  • The result is easy to visualize
  • The result is easy to interpret and can be used even when the variables are strongly interdependent and carry duplicate information
  • Tends to overfit, and it is difficult to choose the optimal pruning

Each leaf represents a value of the dependent variable given the independent variables that appear along the path from the root of the tree to that leaf. Decision trees can be used to solve both classification and regression problems. Consider a small decision tree that predicts whether a buyer will purchase a computer: the data are first divided into three groups according to the person's age, and one of these groups already forms a "pure" class. The other two groups are then subdivided further until only one class of data remains in each leaf (a minimal sketch of this example follows below). Continuous variables also work well in decision trees. To use them, a split point must be chosen, according to which the values of the variable are divided (usually) into two parts.
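To make the computer example concrete, here is a minimal sketch using scikit-learn; the feature values, encodings, and labels below are made up purely for illustration.

```python
# A minimal sketch of the "will the buyer purchase a computer?" idea,
# assuming scikit-learn is installed; the data below are hypothetical.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: [age, income_level] -> buys_computer,
# where income_level is encoded as 0 = low, 1 = medium, 2 = high.
X = [[25, 2], [34, 1], [45, 0], [52, 2], [23, 0], [40, 1], [60, 2], [30, 0]]
y = [1, 1, 0, 1, 0, 1, 1, 0]  # 1 = buys, 0 = does not buy

clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
clf.fit(X, y)

# Predict for a new, unseen person (age 28, medium income).
print(clf.predict([[28, 1]]))
```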

Various methods can be used to choose the split point, from the simplest, where the median is used (the sample is then divided into two equal parts), to more complex ones, where several candidate splits are evaluated and the optimal one is chosen according to some criterion.
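As a rough illustration of the more complex option, here is a small sketch (plain Python/NumPy, hypothetical data) that scans candidate split points for a continuous variable and keeps the one with the lowest weighted Gini impurity.

```python
import numpy as np

def gini(labels):
    # Gini impurity of a set of class labels.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split_point(values, labels):
    # Try the midpoints between consecutive sorted values and keep the one
    # that gives the lowest weighted Gini impurity of the two parts.
    order = np.argsort(values)
    values, labels = np.asarray(values)[order], np.asarray(labels)[order]
    best_t, best_score = None, np.inf
    for i in range(1, len(values)):
        if values[i] == values[i - 1]:
            continue
        t = (values[i] + values[i - 1]) / 2
        left, right = labels[values <= t], labels[values > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical data: ages and whether the person bought a computer.
ages = [23, 25, 30, 34, 40, 45, 52, 60]
buys = [0, 1, 0, 1, 1, 0, 1, 1]
print(best_split_point(ages, buys))
```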

Combining values. Some decision tree algorithms always split the sample into exactly two parts (binary trees). In this case, if a variable is discrete and takes more than two values, its values are grouped into two groups, and the sample is divided according to that grouping.
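For the discrete case, one brute-force possibility is to enumerate the ways the category values can be put into two groups and score each grouping with the same weighted Gini idea; the sketch below does that for a hypothetical income variable.

```python
from itertools import combinations
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_binary_grouping(categories, labels):
    # Enumerate groupings of the distinct category values into two non-empty
    # groups (complements are scored twice, which is redundant but harmless)
    # and keep the grouping with the lowest weighted Gini impurity.
    categories, labels = np.asarray(categories), np.asarray(labels)
    values = sorted(set(categories))
    best_group, best_score = None, np.inf
    for k in range(1, len(values)):
        for group in combinations(values, k):
            mask = np.isin(categories, group)
            left, right = labels[mask], labels[~mask]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best_group, best_score = set(group), score
    return best_group

# Hypothetical data: income level and whether the person bought a computer.
income = ["low", "medium", "high", "high", "low", "medium", "high", "low"]
buys = [0, 1, 1, 1, 0, 1, 1, 0]
print(best_binary_grouping(income, buys))
```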

Decision trees, like real trees, are pruned, because decision trees tend to fit the training sample data too well (to overfit). To prevent this, the branches close to the leaves of the tree are removed. Pruning methods are divided into two types (a small sketch of both follows after the list):

  • Pre-pruning - in this case, the tree is not grown until the usual stopping condition is reached; instead, some criterion is used to stop the splitting into branches earlier;
  • Post-pruning - the tree is grown fully, and then some algorithm (criterion) is used to prune the branches back.
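One way to express both styles in scikit-learn (my assumption of tooling, not the only option) is to limit tree growth up front for pre-pruning and to use cost-complexity pruning for post-pruning; the iris dataset is used here only so the sketch runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pre-pruning: stop growing branches early with depth / leaf-size limits.
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X, y)

# Post-pruning: grow the full tree first, then use the cost-complexity
# pruning path to pick an alpha and cut branches back.
path = DecisionTreeClassifier(random_state=0).fit(X, y).cost_complexity_pruning_path(X, y)
# Larger ccp_alpha values prune more aggressively; a mid-range candidate
# is taken here purely for illustration.
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]
post_pruned = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X, y)

print(pre_pruned.get_depth(), post_pruned.get_depth())
```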

Several different algorithms have been developed for training trees: ID3, C4.5, CART, CHAID, and so on. However, more about these algorithms in detail a little later, so stay tuned!

In practice, however, it is often not individual trees that are used but a derivative of them: random forests. These methods are classified as "ensemble methods".

Not one tree but many decision trees (e.g. 100 or 1000) are built from a single training sample. To keep the trees from being identical, randomization is used during their construction: each tree is built from a random part of the sample rather than the whole, and the variable for splitting the sample into branches is selected not from all variables but from a random subset of them.

With a decision tree forest, each tree classifies a point on its own, and the final class is selected ("by voting") according to the majority principle. At the same time, this makes it possible to estimate the probability with which a point is assigned to a particular class, as the share of trees that voted for it.
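A minimal random forest sketch along these lines, again assuming scikit-learn and using the iris dataset only so the example is self-contained:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees in the forest
    max_features="sqrt",   # random subset of variables considered at each split
    bootstrap=True,        # each tree sees a random (bootstrap) part of the sample
    random_state=0,
).fit(X, y)

print(forest.predict(X[:1]))        # class chosen by majority vote
print(forest.predict_proba(X[:1]))  # share of trees voting for each class
```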

Properly randomized, random forests are not prone to overfitting, even though they are built from individual trees that are.

Yes, that is the theory, but how do we implement it in practice?

There is no single way to put this into practice, because almost all statistical analysis software has decision tree functionality. Below is one way to implement a decision tree; it is just an example, as there are many different libraries/packages that solve this problem.
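For instance, a short sketch in Python with scikit-learn (other ecosystems, such as R, have equivalent packages) that fits a small tree and prints it as readable rules:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the fitted tree as readable if/else rules.
print(export_text(tree, feature_names=load_iris().feature_names))
```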


