What is Data Entropy?
There is a common meme that LinkedIn regulars will know well. It shows a series of pictures of Lego, one with lots of bricks all mixed up, another with the bricks separated out by colour, and perhaps more with the bricks assembled into shapes.
Captions under each image will read something like 'Data, Information, Knowledge, Wisdom', or 'Data, Sorted, Arranged, Explained'.
As a parent of small kids, I can attest that taking a big tub of random Lego pieces and sorting them into groups of similar colour or type takes a lot of effort. I can also tell you that trying to build a large Lego model without doing this sorting first is going to take many times longer.
And yet, no matter how many times Lego gets organised to make it easier for building with, within a week or two it is all mixed up again. It's like Lego has its own kind of Entropy.
Entropy
Entropy in nature is the tendency of things to become disordered over time. Objects that are finely sculpted lose definition, things that are separate become mixed, things that are extreme become moderate.
Natural forces and processes, from the thermodynamic behaviour of gas molecules to weathering and erosion, cause ordered matter to become disordered over time.
Hot objects donate heat to cooler things around them until all are the same temperature. Your latte doesn't go 'cold' and your frappe doesn't get 'warm'; they both just align with room temperature. Scientific theory tells us that everything in the whole universe will eventually arrive at the same temperature.
Tectonic forces push the earth upwards to create mountains. However, once the 'organising' tectonic forces subside, wind and water will eat away at that mountain, like a sandcastle on the beach, slowly turning rock to dust and eventually leaving the earth as flat as it was before the mountain first rose.
We can apply energy in a directed way to slow, or even reverse, entropy in controlled circumstances, e.g. by constructing erosion barriers on shorelines, or by microwaving that latte. However, without regular housekeeping, 'order' inevitably becomes 'disorder', 'sorted' becomes 'random', 'different' becomes 'same'. Entropy will ultimately have its way.
Data Entropy
Data in an organisation has its own kind of entropy too. We might start out with highly planned and structured enterprise data systems, but gradually, little by little, disorder can creep in. In the cut and thrust of the working world, solutions often need to be tactical, rather than strategic.
The attraction of tactical action over strategic action to solve problems quickly is undeniable, with the longer-term consequences often ignored. Examples of tactical behaviours that can lead to data disorder include:
Some of these behaviours happen covertly within the business, away from the gaze of IT or the data management team. This is usually inadvertent, but sometimes deliberate. The business must be facilitated and enabled by IT and Data Management to do its work, and a sure sign that things are going wrong is when the business starts implementing solutions by itself to avoid dealing with these functions.
Other behaviours may be facilitated by IT and Data Management teams, under varying levels of pressure from the business to address high-priority requirements so quickly that they decide or agree to cut corners.
To mitigate the effects of 'data entropy', active management and governance are essential:
Conclusion
For Lego, the natural force that tends towards disorder is children. All a parent can do is encourage their little ones to look after their toys and periodically help them tidy up and do some sorting. Hopefully we can prevent tears by ensuring precious pieces are not lost under the couch or in the belly of the vacuum cleaner.
Similarly, without ongoing effort to manage data and enforce good practices, data entropy will gradually erode organisational data capability and capacity. However, if IT and the business work together in common cause to 'apply energy' in the right way, data entropy can be minimised and even reversed.
Questions on Data Warehousing, Data Integration, Data Quality, Business Intelligence, Data Management or Data Governance? Click Here to begin a conversation.
John Thompson is a Director with EY's Technology Consulting practice. His primary focus for many years has been the effective design, management and optimal utilisation of large analytic data systems.