Exploring Decision Trees: The Branching Paths of Data
A decision tree is a non-parametric (it doesn't assume that your data follows a specific shape or pattern), supervised machine learning algorithm used for classification and regression tasks. It has a tree-like structure comprising a root node, branches, internal nodes, and leaf nodes. The name 'decision tree' itself suggests its use of a flowchart-like structure to present predictions.
Think of a decision tree as a visual flowchart for decision-making. Similar to a real tree, it consists of branches and leaves. At the top, you have the 'root' node, which represents the starting point. As you move down the tree, you encounter 'internal' nodes, which serve as decision points, and finally, 'leaf' nodes, which provide the answers or predictions.
Let's say you want to decide whether to go for a picnic. Your decision tree might start with the question, "Is it sunny?" If it's sunny, you might go, but if it's not, you'd consider another factor like, "Is it windy?" If it's windy, you might change your mind, but if it's not, you decide to go on a picnic. Each question and answer guides you to the next step until you reach your final decision.
Decision trees, in layman's terms, can be thought of as a series of if-else statements. The tree works by checking a condition and, depending on whether that condition is true or false, moving on to the next connected node to make further decisions.
This way, decision trees systematically break down complex decisions into a sequence of simpler choices, making them a powerful tool for problem-solving tasks.
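To make the if-else analogy concrete, here is a minimal sketch in Python of the picnic decision described above. The function name and inputs are illustrative only, not part of any library:

```python
def should_go_for_picnic(is_sunny: bool, is_windy: bool) -> str:
    # Root node: check the weather first.
    if is_sunny:
        return "Go for the picnic"
    # Internal node: not sunny, so consider the wind instead.
    if is_windy:
        return "Stay home"
    # Leaf node: not sunny, but calm enough to go out.
    return "Go for the picnic"

print(should_go_for_picnic(is_sunny=False, is_windy=False))  # "Go for the picnic"
```

Each if statement plays the role of an internal node, and each return statement is a leaf holding the final decision.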
Let's first understand some basic terminology used in decision trees.
Example of a decision tree
In the given diagram, the decision tree begins by asking about the weather conditions: is it sunny, cloudy, or rainy? Depending on the answer, it goes on to consider factors like humidity and wind. For example, when the weather is rainy it checks whether the wind is strong or weak, and if the wind is weak during rainy weather, it recommends going out to play.
Now, you might have noticed something interesting in this flowchart. When the weather is cloudy, the decision tree doesn't ask any further questions. You might wonder why it doesn't split more. The answer lies in more advanced concepts like entropy, information gain, and Gini index used in decision tree construction.
In simpler terms, the reason the decision tree stops at "cloudy" is that for the training dataset, the answer to whether you should play is always "yes" when it's cloudy, so there's no need to ask additional questions. The decision is straightforward, and that's why the tree stops at that point.
Now let's understand what entropy is.
Entropy is a measure of the impurity or disorder in a dataset. In a decision tree, it is used to decide how to split the data at each node. It is computed as H = −Σ pᵢ·log₂(pᵢ), where pᵢ is the proportion of samples belonging to class i.
In a decision tree, the goal is to create splits (branches) that result in subsets of data that are as homogeneous as possible, meaning all elements in a subset belong to the same class.
For a two-class problem, the entropy value ranges from 0 to 1 (it can exceed 1 when there are more than two classes).
In decision tree algorithms, the goal is to minimize entropy at each split, resulting in subsets that are more pure, making the classification task easier and more accurate.
Now, let's understand entropy with the help of an example:
Imagine you're planning a picnic, and you want to check the weather forecast to decide whether to go or stay home. You look at the forecast, and it says one of three things: "Sunny," "Cloudy," or "Rainy."
Now, imagine you've been keeping track of how many times each of these forecasts turned out to be true. Here's what you find:
Suppose you have recorded 35 forecasts in total.
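The original post shows a small table of counts here; since those exact numbers came from an image, the split used below (20 sunny, 10 cloudy, 5 rainy, totalling 35) is purely hypothetical. A minimal sketch of the entropy calculation:

```python
import math

# Hypothetical forecast counts (the true figures were in the original table);
# they are chosen only so that the total is 35.
counts = {"Sunny": 20, "Cloudy": 10, "Rainy": 5}
total = sum(counts.values())  # 35

# Entropy = -sum(p_i * log2(p_i)) over all classes with p_i > 0.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)
print(f"Entropy of the forecast data: {entropy:.2f}")
# ~1.38 for this split; it exceeds 1 because there are three classes.
```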
Always remember: the higher the entropy, the lower the purity and the higher the impurity.
As mentioned earlier, the goal when building the tree is to decrease the uncertainty or impurity in the dataset. Entropy gives us the impurity of a particular node, but on its own it doesn't tell us whether the impurity has actually decreased relative to the parent node after a split.
For this, we bring in a new metric called "Information Gain", which tells us how much the parent node's entropy has decreased after splitting on a particular feature.
Information Gain:
Information Gain is a metric that quantifies how much information a feature provides about the class labels of the data, and it is used to select the best feature for making decisions. It is computed as the parent node's entropy minus the weighted average entropy of the child nodes produced by the split.
Let's understand this with an example (a sketch of the calculation follows below):
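The worked example in the original post was an image, so the numbers below (a node with 9 "Yes" and 5 "No" answers to "Play?", split by wind, as in the classic play-tennis dataset) are only illustrative. A minimal sketch of how information gain is computed:

```python
import math

def entropy(labels):
    """Entropy of a list of class labels: -sum(p * log2(p))."""
    total = len(labels)
    probs = [labels.count(c) / total for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical parent node: 9 "Yes" and 5 "No" answers to "Play?".
parent = ["Yes"] * 9 + ["No"] * 5

# Hypothetical split on the Wind feature.
weak_wind = ["Yes"] * 6 + ["No"] * 2     # 8 samples
strong_wind = ["Yes"] * 3 + ["No"] * 3   # 6 samples

# Information gain = parent entropy - weighted average of child entropies.
n = len(parent)
weighted_children = (len(weak_wind) / n) * entropy(weak_wind) \
                  + (len(strong_wind) / n) * entropy(strong_wind)
info_gain = entropy(parent) - weighted_children
print(f"Information gain of splitting on Wind: {info_gain:.3f}")  # ~0.048
```

The same calculation repeated for every candidate feature tells the algorithm which split reduces impurity the most.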
The higher the Information Gain for a feature, the more it reduces the entropy of the parent node, and therefore the better it is for making a decision in a decision tree. Features with higher Information Gain are typically selected as the best choices for node splitting during the construction of the tree.
Understanding Gini Impurity:
Gini impurity, sometimes referred to as the Gini index, is a valuable metric used in decision tree algorithms to measure impurity or disorder within a dataset. It serves as an alternative to entropy for assessing the quality of splits in decision trees. Gini impurity quantifies how mixed up or organized your data is: it ranges from 0 (perfectly pure, where all data points belong to a single class) to 0.5 for a two-class problem (completely impure, where data points are evenly spread across both classes).
Gini Impurity Formula:
Gini = 1 − Σ (pᵢ)², where pᵢ is the proportion of samples belonging to class i.
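As an illustration of the formula, here is a minimal sketch that computes Gini impurity for the same hypothetical 9-"Yes"/5-"No" node used in the information gain example:

```python
def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i^2) over all classes."""
    total = len(labels)
    return 1 - sum((labels.count(c) / total) ** 2 for c in set(labels))

# Hypothetical node: 9 "Yes" and 5 "No".
node = ["Yes"] * 9 + ["No"] * 5
print(f"Gini impurity: {gini_impurity(node):.3f}")  # ~0.459
```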
Difference between Entropy and Gini Impurity
Gini Impurity: ranges from 0 to 0.5 for a two-class problem, involves no logarithm, and is therefore cheaper to compute.
Entropy: ranges from 0 to 1 for a two-class problem, relies on a logarithmic calculation, and is slightly more expensive to compute.
In practice the two metrics usually lead to very similar trees, but Gini impurity is often preferred when efficiency is a concern, as it is quicker to compute in the context of decision tree algorithms.
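To tie everything together, here is a minimal, hypothetical scikit-learn sketch that trains a decision tree on a made-up weather dataset. The feature encoding and data are invented purely for illustration; criterion="gini" is the library default, and criterion="entropy" switches the split criterion to information gain:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy, made-up dataset: [outlook, windy] -> play?
# outlook encoded as 0 = sunny, 1 = cloudy, 2 = rainy; windy as 0/1.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
y = ["Yes",  "No",   "Yes",  "Yes",  "Yes",  "No"]

# criterion="entropy" uses information gain; the default "gini" uses Gini impurity.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

# Inspect the learned splits as text.
print(export_text(tree, feature_names=["outlook", "windy"]))
print(tree.predict([[1, 1]]))  # cloudy and windy -> ['Yes'] on this toy data
```

The printed tree makes the flowchart analogy literal: each line of the export is one of the if-else questions discussed earlier.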
In conclusion, decision trees are a powerful tool in the world of data science and machine learning. By understanding how they work and when to use them, you can make more informed decisions and create accurate predictive models. So, are you ready to start building your decision trees and unlocking the potential of your data?