Decision Tree in Machine Learning

Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of observations.

Some advantages of decision trees are:

Simple to understand and to interpret. Trees can be visualized.

Requires little data preparation. Other techniques often require data normalization, the creation of dummy variables, and the removal of blank values. Some tree and algorithm combinations support missing values.

The cost of using the tree (i.e., predicting data) is logarithmic in the number of data points used to train the tree.

Able to handle both numerical and categorical data. However, the scikit-learn implementation does not currently support categorical variables. Other techniques are usually specialized in analyzing datasets that have only one type of variable.

Able to handle multi-output problems.

Uses a white box model. If a given situation is observable in a model, the condition behind it can be explained with boolean logic. By contrast, in a black box model (e.g., in an artificial neural network), results may be more difficult to interpret.

Possible to validate a model using statistical tests. That makes it possible to account for the reliability of the model.

Performs well even if its assumptions are somewhat violated by the true model from which the data were generated.
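The white box property can be made concrete with a small sketch. The tree below is hypothetical and hand-built (the feature names, thresholds and labels are invented for illustration, not learned from data); the point is only that a prediction can be reported together with the boolean conditions that produced it.

```python
# Hypothetical hand-built tree: internal nodes are dicts, leaves are plain labels.
TREE = {
    "feature": "humidity", "threshold": 70,
    "left": "play",                      # taken when humidity <= 70
    "right": {                           # taken when humidity > 70
        "feature": "wind", "threshold": 20,
        "left": "play",                  # wind <= 20
        "right": "stay home",            # wind > 20
    },
}

def predict_with_explanation(tree, sample):
    """Return the predicted label and the list of conditions that led to it."""
    conditions = []
    node = tree
    # Walk from the root until a leaf (a non-dict value) is reached.
    while isinstance(node, dict):
        feature, threshold = node["feature"], node["threshold"]
        if sample[feature] <= threshold:
            conditions.append(f"{feature} <= {threshold}")
            node = node["left"]
        else:
            conditions.append(f"{feature} > {threshold}")
            node = node["right"]
    return node, conditions

label, why = predict_with_explanation(TREE, {"humidity": 85, "wind": 30})
print(label, "because", " AND ".join(why))
```

Each prediction is a conjunction of simple comparisons, which is what makes the model easy to audit.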

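Decision tree learning employs a divide and conquer strategy: at each node it conducts a greedy search for the best split point available at that step, then repeats the process on the resulting subsets. Because the search is greedy, each split is locally optimal, and the finished tree is not guaranteed to be the best possible tree overall.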
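A minimal sketch of one greedy step is shown below. It assumes numeric features and uses Gini impurity as the split score; the article does not name a criterion, so that choice, along with the helper names `gini` and `best_split`, is an assumption made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Try every feature and every observed value as a threshold.

    Returns (weighted_gini, feature_index, threshold) for the split with the
    lowest weighted impurity, or None if no split separates the data.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue  # a split must send data to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
rows = [[2.0, 1.0], [3.0, 1.5], [6.0, 1.2], [7.0, 0.9]]
labels = ["a", "a", "b", "b"]
score, feature, threshold = best_split(rows, labels)
print(f"split on feature {feature} at <= {threshold} (weighted Gini {score})")
```

A full learner would apply `best_split` recursively to the left and right subsets until a stopping rule is met; that recursion is the "divide and conquer" part.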


Decision Tree Terminology


Specialized terms are used to describe the components of a decision tree and the steps of its decision-making procedure:

Root Node: The topmost node of a decision tree, representing the initial decision or feature from which the tree branches.

Internal Nodes (Decision Nodes): Nodes that test the value of a particular attribute. Their branches lead to other nodes.

Leaf Nodes (Terminal Nodes): The ends of the branches, where a final decision or prediction is made. Leaf nodes have no further branches.

Branches (Edges): Links between nodes, each representing the outcome of a test on a particular condition.

Splitting: The process of dividing a node into two or more sub-nodes based on a decision criterion. It involves selecting a feature and a threshold to create subsets of data.

Parent Node: A node that is split into child nodes; the node from which a split originates.

Child Node: A node created as a result of a split from a parent node.

Decision Criterion: The rule or condition used to determine how the data should be split at a decision node. It involves comparing feature values against a threshold.

Pruning: The process of removing branches or nodes from a decision tree to improve its generalisation and prevent overfitting.
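The terms above map naturally onto a small data structure. The sketch below is illustrative only: the `Node` class, its field names, and the example features are assumptions, and `prune_redundant` is a deliberately simple form of pruning (it collapses a parent whose two leaf children agree), not the cost-based pruning that libraries typically implement.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Internal (decision) nodes set a decision criterion and two children.
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # child reached when feature <= threshold
    right: Optional["Node"] = None   # child reached when feature > threshold
    # Leaf (terminal) nodes set only a prediction.
    prediction: Optional[str] = None

    def is_leaf(self):
        return self.left is None and self.right is None

def count_leaves(node):
    """Number of leaf nodes in the subtree rooted at node."""
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)

def prune_redundant(node):
    """Replace any parent whose two leaf children predict the same label with a single leaf."""
    if node.is_leaf():
        return node
    node.left = prune_redundant(node.left)
    node.right = prune_redundant(node.right)
    both_leaves = node.left.is_leaf() and node.right.is_leaf()
    if both_leaves and node.left.prediction == node.right.prediction:
        return Node(prediction=node.left.prediction)
    return node

# Root node splits on "age"; its right child is an internal node splitting on "income".
root = Node(
    "age", 30,
    left=Node(prediction="no"),
    right=Node("income", 50_000,
               left=Node(prediction="yes"),
               right=Node(prediction="yes")),
)

leaves_before = count_leaves(root)
root = prune_redundant(root)
leaves_after = count_leaves(root)
print(leaves_before, "leaves before pruning,", leaves_after, "after")
```

Here the "income" split is removed because both of its branches lead to the same prediction, so the split adds size to the tree without changing any output.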
