Data Entropy [Series#2: I am Data!]
Mustafa Qizilbash
Data & AI Practitioner | Author | CDMP Certified | Innovator of DAC Architecture & PVP Approach | 50k Followers
Data Entropy is a term most Data Scientists know, but many general data folks do not.
Let's decode it.
The term Entropy is normally used in the Data Science domain:
· Where finding unexpected outcomes is the aim
· Where surprises are welcomed
· Where results are judged by probability: the lower the probability of an outcome, the more informative it is when it happens
· Where information counts as informative only if it wasn't already known
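To make the "lower probability, more informative" point concrete, here is a minimal sketch of the standard surprisal (self-information) measure, log2(1/p), in bits. The function name `surprisal` and the sample probabilities are my own choices for illustration, not something from a specific library.

```python
import math


def surprisal(probability: float) -> float:
    """Return the information content, in bits, of an outcome with this probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the range (0, 1]")
    # Rare outcomes (small probability) give a large value: a bigger surprise.
    return math.log2(1 / probability)


if __name__ == "__main__":
    # Illustrative probabilities only: a certain event, a coin flip, a 1-in-8 event.
    for p in (1.0, 0.5, 0.125):
        print(f"probability {p}: {surprisal(p)} bits of surprise")
```

An outcome that is certain carries no surprise at all, while each halving of the probability adds one more bit.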
Types
· High Entropy means more surprises and more unexpected values, so more informative
· Low Entropy means fewer surprises and fewer unexpected values, so less informative
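The two types can be compared with a small sketch that computes Shannon entropy over the values in a column. The helper name `shannon_entropy` and the two sample columns are hypothetical, made up only to show one low-entropy and one high-entropy case.

```python
import math
from collections import Counter


def shannon_entropy(values) -> float:
    """Return the Shannon entropy, in bits, of the values in a sequence."""
    values = list(values)
    total = len(values)
    counts = Counter(values)
    # Each distinct value contributes p * log2(1/p), where p is its share of the column.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical columns: one with a single repeated value, one with all values different.
    low_entropy_column = ["A"] * 8
    high_entropy_column = ["A", "B", "C", "D", "E", "F", "G", "H"]
    print("Low entropy column: ", shannon_entropy(low_entropy_column))
    print("High entropy column:", shannon_entropy(high_entropy_column))
```

A column where every value is the same has nothing left to reveal, while a column of eight equally likely values needs three bits per value to describe.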
‘High Entropy is also described as extreme disorder of values.’
Referring to the image, all the signs at the starting point are MINUS, in the middle the signs are split 50/50 between PLUS and MINUS, and right at the end all the signs are PLUS. At the two extremes, left and right, Entropy is at its lowest: every sign is the same, so no surprises are expected and there is nothing much for Data Scientists to predict. In the mixed circles in between, it becomes difficult to say whether a sign is PLUS or MINUS if the values are not visible, so Entropy rises. It is at its highest in the middle, where the split is 50/50 and a guess is no better than a coin flip.
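The PLUS and MINUS picture can be sketched with the standard binary entropy formula. This is only an illustration: the function name `binary_entropy` and the assumption of 10 signs per circle are mine, since the exact counts in the image are not stated here.

```python
import math


def binary_entropy(plus_fraction: float) -> float:
    """Return the entropy, in bits, of a PLUS/MINUS mix with this share of PLUS signs."""
    p = plus_fraction
    if not 0 <= p <= 1:
        raise ValueError("plus_fraction must be between 0 and 1")
    if p in (0.0, 1.0):
        # All MINUS or all PLUS: no uncertainty at all.
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


if __name__ == "__main__":
    signs_per_circle = 10  # assumed group size, for illustration only
    for plus_count in range(0, signs_per_circle + 1, 2):
        fraction = plus_count / signs_per_circle
        print(f"{plus_count} PLUS / {signs_per_circle - plus_count} MINUS: "
              f"{binary_entropy(fraction):.3f} bits")
```

The values start at zero for all MINUS, climb to one bit at the 50/50 mix, and fall back to zero for all PLUS.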
Yes, it's a confusing topic, but this is how Data Scientists reason about unknown values and try to predict them.
Cheers.