Practical Data Quality
Karteek Y.
Strategic Data/IT Leader | AI, Cloud & Data Transformation | Field CTO | CIO Advisory & Enterprise Modernization
Imagine having pristine, good-quality data for all your analytics, machine learning, and decision-making. Yes, I can’t imagine it either!
How does one define data quality? There are many ways to do it. Let us start simple: data has to be 100% accurate and complete to be considered good quality. Such a simple, utopian view of data quality seldom leads to anything but disappointment.
Let us examine this further through a series of questions.
Now each data element goes through its own journey from the point of generation to the present time where we are trying to use it for some purpose. Data elements get created, linked to other data elements, overwritten, and even deleted sometimes before they are accessible to consumers.
Let us imagine that data elements were generated as good data (no human, machine, or integration errors). When does that data change from good to bad? One thought: data elements can also have a temporal (time-sensitive) value.
For example, the price of a publicly traded stock at 10 AM is valuable when your purpose is to know the current value. But ask the same question at 12 PM, and the 10 AM price is no longer current; i.e., your good data at 10 AM has turned into bad data in the context of your question.
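To make the stock price example concrete, here is a minimal Python sketch of a freshness check. The function name `is_fresh`, the 15-minute tolerance, and the dates are all assumptions chosen for illustration; the right tolerance depends entirely on the question being asked.

```python
from datetime import datetime, timedelta


def is_fresh(observed_at, asked_at, max_age):
    """Return True if a value observed at `observed_at` is still recent
    enough to answer a question asked at `asked_at`."""
    age = asked_at - observed_at
    # A negative age means the observation is from the future relative to
    # the question, so it is treated as not usable here.
    return timedelta(0) <= age <= max_age


# Hypothetical quote captured at 10 AM, with an assumed 15-minute tolerance
# for what counts as "current".
quote_time = datetime(2024, 1, 2, 10, 0)
max_age = timedelta(minutes=15)

print(is_fresh(quote_time, datetime(2024, 1, 2, 10, 5), max_age))  # True
print(is_fresh(quote_time, datetime(2024, 1, 2, 12, 0), max_age))  # False
```

The stored value never changed between the two calls. Only the time of the question did, which is why the same record passes the check at 10:05 AM and fails it at 12 PM.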
What are some of the data quality issues you face? In the next article, let us look at some of the most common data quality problems and how we can solve them.
Here are complete links to all the posts in this series: