Role of data in healthcare - Information gain & data entropy


Chapter 6 – Role of data in healthcare

Decision Trees in Data Modelling

Ultimately, the main question about applying data in healthcare concerns practicality and the actual impact on the community, such as improved diagnostics and disease management. Because enormous effort goes into collecting, storing, preparing and communicating such databases, the benefits of the insights they generate should outweigh the resources allocated to providing valid databases.

This chapter discusses information gain in decision trees designed for data models, followed by a video that works through the calculations as an example.

Hence, before continuing with knowledge graphs and their potential applications in healthcare, I would like to pause and consider how a healthcare database (regardless of management or storage type: relational, OLAP, knowledge graph, etc.) may offer insight into a crucial question such as "predicting events of MI (myocardial infarction) or cardiac arrest for a given population".

One of the stimulating stages of data analytics, as mentioned previously, is predictive analytics and data modelling.

These models use machine learning to estimate the probability that certain events will occur in the future, given certain variables and the connections between them.

When the topic of predictive data models comes up, most people start imagining complex programming principles and knowledge, and yes, it is indeed complex.

But the functionality of a data model mainly revolves around the initial steps, in which the data are grouped based on specific shared attributes, especially when developing supervised learning algorithms.


Information Gain
Each raw data set has a certain entropy from the outset. (Entropy is an information-theory measure of the impurity or uncertainty in a group of observations.) By the entropy formula and concept, the lower the entropy of a group, the more homogeneous it is, which is what a classification aims for.
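For reference, the standard Shannon definition of entropy can be written as below. The notation is an assumption made here for illustration and is not taken from the original figures: S is a group of observations, k is the number of classes, and p_i is the proportion of observations in S that belong to class i.

```latex
H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i
```

For instance, under this definition a group split evenly between two outcomes (p_1 = p_2 = 0.5) has an entropy of 1 bit, while a group in which every observation has the same outcome has an entropy of 0.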
Thus, when the data set is classified and a decision tree is developed, the entropy is expected to decrease compared with the initial data or the previous classification step. This reduction is called information gain.
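One common way to write this reduction is shown below. The symbols are assumptions introduced here for illustration: H denotes the entropy of a group, S is the group before the split, A is the attribute used for the split, and S_v is the subgroup of S in which A takes the value v.

```latex
IG(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|} \, H(S_v)
```

In words, the information gain is the entropy before the split minus the size-weighted average entropy of the subgroups after the split.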
Before calculating the information gain of a particular grouping or decision tree, it is vital to understand the concept of entropy in a data set.
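The calculation can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the video: the function names `entropy` and `information_gain`, the toy labels, and the "smoker" attribute are all hypothetical and invented for this example. Entropy here is the Shannon entropy in bits, and information gain is the parent's entropy minus the size-weighted entropy of the subgroups.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy data, invented for illustration: 1 = MI event, 0 = no event.
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical split on a made-up attribute ("smoker" vs "non-smoker").
smokers = [1, 1, 1, 0]
non_smokers = [1, 0, 0, 0]

parent_entropy = entropy(parent)
gain = information_gain(parent, [smokers, non_smokers])

print(f"Parent entropy: {parent_entropy:.3f} bits")
print(f"Information gain: {gain:.3f} bits")
```

With this toy data the parent group is evenly mixed, so its entropy is 1 bit, and the split leaves each subgroup somewhat purer, giving a small positive information gain. A decision tree builder would typically compare such gains across candidate attributes and split on the one with the largest gain.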

Example (Informative Video):



#data #dataanalytics #datamodeling #Entropy #decisiontrees #Informationgain #healthcare #predictivemodeling #database #datanalysis #datadriven
