Should I Write This Blog or Not? Let Data Science Decide
Image credit: Dotgovwatch


An introduction to the Decision Tree classification algorithm in machine learning

[Figure: a simple example decision tree — should I write this blog?]

Easy to grasp, right? For those wondering: yes, I am sipping tea as I write this post.

What is a Decision Tree?

A Decision Tree is a supervised machine learning algorithm with a flowchart-like structure: an upside-down tree. It is used for both classification and regression problems, and it plays an effective role in uncovering new trends, identifying relationships and making predictions on new data in healthcare organizations. The tool has proved convenient for assisting physicians in detecting diseases, extracting knowledge and information about a disease from patients' data. There are several popular algorithms for building a Decision Tree, but they all include two steps: constructing the tree and pruning the tree.

Constructing a tree:

Constructing a decision tree consists of 3 main elements -

  1. Root Node: The topmost node in a decision tree.
  2. Leaf Node: A node whose left and right child nodes are both NULL. The number of leaf nodes in a full binary tree with n nodes is (n+1)/2.
  3. Internal Node: Each internal node represents a test on an attribute; each leaf node carries a class label.
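
To make these elements concrete, here is a minimal sketch (not from the original article) that fits a small classification tree with scikit-learn and prints its structure, so the root node, internal tests and leaf labels are visible. The Iris dataset and the hyperparameter values are illustrative choices.

```python
# A minimal sketch: fitting a decision tree classifier with scikit-learn.
# The dataset (Iris) and hyperparameters are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# criterion picks the impurity measure; max_depth limits how far the tree grows
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
clf.fit(X_train, y_train)

print(clf.score(X_test, y_test))  # accuracy on held-out data
print(export_text(clf))           # root, internal tests and leaf labels as text
```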

Pruning the decision tree:

It sounds as if once the tree is built we are done, but not really. Most of the time, we have to prune the tree to avoid “overfitting”.

[Figure: the same decision tree before pruning (left) and after pruning (right)]

The tree on the left is overfitted to the training examples; after pruning, we expect it to look like the tree on the right. A decision tree is built through binary recursive partitioning (divide and conquer): the data is split into subsets, which are then split repeatedly into smaller subsets, and so on until the algorithm determines that the data within each subset is sufficiently homogeneous, or another stopping criterion is met. Commonly used algorithms are ID3, CHAID and MARS.
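
As one concrete, hedged illustration of pruning, scikit-learn implements cost-complexity post-pruning: cost_complexity_pruning_path enumerates candidate ccp_alpha values (larger values prune more aggressively), and the sketch below keeps the alpha that scores best on held-out data. The dataset is again an illustrative choice.

```python
# A hedged sketch of cost-complexity pruning with scikit-learn.
# cost_complexity_pruning_path returns candidate ccp_alpha values;
# larger alphas prune more aggressively.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

best_alpha, best_score = 0.0, 0.0
for alpha in path.ccp_alphas:
    # refit a pruned tree for each candidate alpha and score it on held-out data
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
    score = tree.score(X_test, y_test)
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best ccp_alpha={best_alpha:.5f}, test accuracy={best_score:.3f}")
```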

Data is split based on impurity measures such as those below (a small worked example follows the list):

  • Information Gain – selects the splitting attribute that minimizes the entropy of the resulting subsets, thus maximizing the information gain. It measures how much information the answer to a specific question provides.
  • Entropy – a measure of how much uncertainty is present in the given information.
  • Gini index – measures the impurity of the data at a node, computed from the class distribution of the samples that reach it.
  • Gain ratio – to reduce the bias introduced by Information Gain (which favors attributes with many distinct values), the gain ratio adjusts the information gain of each attribute to account for the breadth and uniformity of the attribute's values.
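
As promised, here is a small worked example (a hedged sketch, not from the article): entropy, Gini impurity and information gain computed in pure Python for a toy binary split.

```python
# An illustrative sketch: entropy, Gini impurity, and information gain
# for a toy binary split, in pure Python.
from collections import Counter
from math import log2

def entropy(labels):
    """H = -sum(p * log2(p)) over the class proportions."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini = 1 - sum(p^2) over the class proportions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["yes"] * 5 + ["no"] * 5                 # perfectly mixed node
left, right = ["yes"] * 4 + ["no"], ["no"] * 4 + ["yes"]

print(round(entropy(parent), 3))                  # 1.0 (maximum uncertainty)
print(round(gini(parent), 3))                     # 0.5
print(round(information_gain(parent, [left, right]), 3))  # ~0.278
```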

Use Cases in Healthcare

  1. Decision trees can support early and accurate diagnosis of myocardial infarction (MI): they can predict the probability that a patient with chest pain is having an MI based solely on the data available at presentation [Tsien, 1998].
  2. Bonner examined the application of the decision tree approach to collaborative clinical decision-making in mental health care in the United Kingdom [Bonner, 2001]. 
  3. An e-Health system focusing on chronic diseases among women in rural and remote areas was developed by implementing a decision-tree-based classifier. The pruned tree generated with C4.5 had hemoglobin at its root node, indicating that the hemoglobin value has the highest influence in predicting CKD (chronic kidney disease) [Jones, 2001].
  4. Letourneau et al. used a decision tree approach for decision making in chronic wound care. Data were collected from two groups of home care nurses in large urban centers: one group was measured after initial contact with the decision tree, the other two years after its implementation. The chronic wound management decision tree (CWMDT) was used in combination with pictorial case studies. The authors concluded that a decision tree can assist decision making by guiding the nurse through assessment and treatment options [Letourneau, 1998].
  5. Decision trees are also useful for fraud detection in the insurance sector [Babic, 2000].
  6. Prioritizing patients for emergency room treatment based on age, gender, blood pressure, temperature, heart rate, severity of pain and other vital measurements [Cantu-Paz, 2000].

Business Benefits:

  • Decision trees assign specific values to each problem, decision path and outcome. Using monetary values makes costs and benefits explicit. This approach identifies the relevant decision paths, reduces uncertainty, clears up ambiguity and clarifies the financial consequences of various courses of action.
  • One advantage of decision trees is that their outputs are easy to read and interpret without statistical knowledge. For example, when decision trees are used to present demographic information on customers, marketing staff can read the graphical representation of the data directly and use it to generate insights into the probabilities, costs and alternatives of the strategies they formulate (a short rendering sketch follows this list).
  • Another advantage is that, once the variables have been created, less data cleaning is required: missing values and outliers have less of an effect on a decision tree's results.
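
As a hedged sketch of that readability benefit, scikit-learn can render a fitted tree graphically with plot_tree; the dataset and the feature and class names below are illustrative choices.

```python
# A small sketch of rendering a fitted tree for non-specialists with
# scikit-learn's plot_tree. Dataset and names are illustrative.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

data = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

plt.figure(figsize=(10, 6))
plot_tree(clf, feature_names=data.feature_names,
          class_names=list(data.target_names),
          filled=True)  # colors each node by its majority class
plt.show()
```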

Challenges:

  • Decision trees can be unstable: small variations in the data can result in a completely different tree being generated. This instability is mitigated by combining many decision trees with ensemble machine learning techniques such as bagging and boosting (see the sketch after this list).
  • In addition, decision trees are less effective when the main goal is to predict the outcome of a continuous variable, because they tend to lose information when categorizing variables into multiple discrete ranges.
  • Decision-tree learners can create over-complex trees that do not generalize the data well; this is called overfitting. Mechanisms such as pruning, setting a minimum number of samples required at a leaf node, or setting a maximum depth for the tree are necessary to avoid this problem.
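
For the instability point above, here is a hedged sketch comparing a single tree against bagged and boosted ensembles using cross-validation; the specific models and dataset are illustrative choices, not the article's own experiment.

```python
# A hedged sketch: stabilizing a decision tree with bagging and boosting.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    # 5-fold cross-validation; the ensembles typically show lower variance
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```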

Of course, these problems won't arise every time, but they can. That is why improved algorithms such as C4.5 and CART are used to boost the performance of a basic decision tree.

References:

  1. Bonner, G., Decision making for health care professionals: use of decision trees within the community mental health setting, Journal of Advanced Nursing, vol. 35, pp. 349-356, August 2001.
  2. Kirkpatrick, S., et al., Optimization by Simulated Annealing, Science, vol. 220, num. 4598, 1983.
  3. Quinlan, J.R., Simplifying decision trees, International Journal of Man-Machine Studies, num. 27, pp. 221-234, 1987.