Decision Tree Algorithms: My learning approach in brief
[Image] Left: a decision tree built with a classifier object. Right: R-squared versus alpha, used during the pruning process of a regressor model.

A Decision Tree is a widely used supervised machine learning approach that models the relationship between a dependent variable and one or more independent variables using an inverted-tree-like structure. The most exciting part is that it can be used for both classification and regression: if the dependent variable is categorical or discrete (y/n, T/F, 0/1), we build a Classification Tree; when the dependent variable is continuous (age, income, salary, temperature), we build a Regression Tree. Under the guidance of Frederick Nwanganga throughout this learning journey, the key points I want to highlight are:

  • Recursive Partitioning is the first and one of the most important steps: it repeatedly splits the data into smaller subsets.
  • Recursive partitioning is a greedy algorithm: it makes the best split at each step without considering the future consequences of that decision. This can lead to overfitting, where the model fits the training data very well but its accuracy on test data is much lower.
  • To choose each split, classification trees use Entropy or Gini impurity, while regression trees use the SSR (sum of squared residuals); these are the mathematical measures of how much impurity remains in the resulting partitions.
  • To prevent overfitting, Pruning is used to carefully manage the size of the tree, either during recursive partitioning (pre-pruning) or after it (post-pruning). The pruning algorithm removes nodes that do not contribute significantly to the model's predictions.
  • Decision Trees can be used for both large and small datasets; however, an ample amount of training data generally leads to better predictive accuracy.
  • Splits can sometimes be biased toward features with a large number of unique values.
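The impurity measures named above can be sketched in a few lines of plain Python. This is my own minimal illustration (the function names are not from any library), showing Gini impurity and Entropy for a categorical target and SSR for a continuous one:

```python
# Minimal sketch of the three impurity measures used by decision trees.
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over class proportions."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def ssr(values):
    """Sum of squared residuals around the subset mean (regression trees)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

print(gini(["y", "y", "n", "n"]))     # 0.5 — a maximally impure 2-class node
print(entropy(["y", "y", "n", "n"]))  # 1.0 bit — same node, in entropy terms
print(ssr([1.0, 2.0, 3.0]))           # 2.0 — residuals around the mean of 2.0
```

A pure node (all labels identical) scores 0 under all three measures, which is exactly what a split is trying to approach.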

#ai #ml #decisiontrees #algorithms #machinelearning #mlalgorithms #aicommunity
