Eight Data Quality Attributes to Consider in Project Planning
Have you ever had this experience? You ring a call centre, and the very helpful robot that answers asks you to enter a range of details such as your date of birth, account number, etc. After the requisite time on hold, you get through to a human who then asks you for your date of birth and account number. You explain through gritted teeth that you have already provided this.
Let’s assume the system that houses your data was built properly and meets the right quality standards – it’s conformant, usable, flexible, auditable, etc. The sponsor for the project that built the system has been lauded for the success of the build.
As a customer, your care factor for this “quality” outcome is zero. Why? Because the call centre human can’t see the data you just entered. Data quality (in this case, its accessibility) matters; it’s integral to customer experience.
Great companies realise data is a key asset, and the technology that enables its use is secondary. Just like product or system quality, data quality can be defined as a range of objective attributes, which can be subjectively chosen (and paid for). In a data-centric world, intelligent choices still need to be made about which attributes are important and worth paying for. Too often in project planning, insufficient thought is given to these choices.
There is a broad range of definitions of data quality attributes, which could be debated at length. I thought I’d pick out eight key attributes to explore in project planning:
1. Institutional Environment
The institutional and organisational factors that have an influence on the effectiveness and credibility of the creation, retrieval, update and deletion of data.
Libraries are good at this.
2. Relevance
The degree to which data meets the needs of users. Assessing relevance is a subjective matter dependent upon the varying needs of users.
If I am buying a smart watch, its ability to monitor my blood pressure is not something I would pay for if this data is not relevant to me. For others, this could be very relevant, and the price point adjusts accordingly.
3. Timeliness
The period between the creation of the data and the time at which the data becomes available. It often involves a trade-off against accuracy.
Smart electricity meters were deployed across the state of Victoria in the early 2000s, and they now measure home electricity usage every 30 minutes. However, the data is only required to be transmitted to energy retailers and made available to customers on an overnight basis. By comparison, a consumer who installs solar panels or a home battery gets real-time data. Day-old data is much less effective in allowing customers to manage and reduce their energy consumption.
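Timeliness is easy to measure once both moments are recorded. Below is a minimal Python sketch, assuming each reading carries a creation time and an availability time. The function name, the timestamps and the one-hour target are illustrative assumptions, not figures from any real metering system.

```python
from datetime import datetime, timedelta


def availability_lag(created_at: datetime, available_at: datetime) -> timedelta:
    """Return how long the data took to become available after it was created."""
    if available_at < created_at:
        raise ValueError("Data cannot be available before it is created")
    return available_at - created_at


# Hypothetical reading: taken mid-afternoon, published the next morning.
reading_taken = datetime(2024, 1, 1, 14, 30)
published = datetime(2024, 1, 2, 6, 0)

lag = availability_lag(reading_taken, published)

# Hypothetical timeliness target agreed during project planning.
target = timedelta(hours=1)
meets_target = lag <= target
```

Agreeing a target like this at planning time turns "timely" from an opinion into something that can be tested.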
4. Accuracy
The degree to which the data correctly describes the phenomena it was designed to measure.
When Apollo 11 landed on the moon, the crew reportedly had 20 seconds’ worth of fuel left. The accuracy requirement for their fuel gauge was orders of magnitude higher than that of a family car, with a commensurate increase in cost.
5. Consistency
Matching the data to the required format in a repeatable way.
A simple example here is date formatting: DD-MM-YY might contain the same data as YYYY-MM-DD, but it’s not easy to combine this data into a single table.
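To make the date example concrete, here is a minimal Python sketch that converts both formats to YYYY-MM-DD before the data is combined. The list of known formats and the function name are assumptions for illustration. A real project would first need to confirm which formats each source system actually uses, since a value like 01-02-03 is ambiguous on its own.

```python
from datetime import datetime

# Hypothetical list of formats expected from the source systems, tried in order.
KNOWN_FORMATS = ("%Y-%m-%d", "%d-%m-%y")


def to_iso_date(raw: str) -> str:
    """Return the date as YYYY-MM-DD, trying each known format in turn."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Not this format; try the next one.
    raise ValueError(f"Unrecognised date format: {raw!r}")


# The same day, recorded two different ways.
mixed = ["2015-03-07", "07-03-15"]
normalised = [to_iso_date(value) for value in mixed]
```

Rejecting unrecognised values, rather than guessing, keeps inconsistent data from slipping quietly into the combined table.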
6. Accessibility
The ability to locate and read (or download) the data through the system.
In 2015, regulators looking into allegations of manipulation in the FX markets demanded large volumes of phone calls from banks. Phone records were often on digital tapes and were only ever used for occasional settling of queries around trade details. Systems were not designed for bulk retrieval of calls. Meeting the regulators’ demands required significant system upgrades and took months.
7. Uniqueness
The data needs to be clean, unique and version controlled.
How many versions of your name and address does your bank have? How many copies of the same photos of your family trip to Vietnam do you have on your PC? And don’t get me started on my music collection…
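The name-and-address problem can be sketched in a few lines of Python. This is a minimal illustration, assuming records are simple dictionaries with "name" and "address" fields. The field names, the sample customers and the matching rule (ignore case, punctuation and extra spaces) are all assumptions, and real customer matching usually needs far more careful rules.

```python
def normalise(record: dict) -> tuple:
    """Build a comparison key: lowercase, drop punctuation, collapse whitespace."""
    def clean(text: str) -> str:
        kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
        return " ".join(kept.split())

    return (clean(record["name"]), clean(record["address"]))


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each normalised key."""
    seen = {}
    for record in records:
        seen.setdefault(normalise(record), record)
    return list(seen.values())


# Hypothetical customer records: the first two describe the same person.
customers = [
    {"name": "Jane  Smith", "address": "12 High St."},
    {"name": "jane smith", "address": "12 High St"},
    {"name": "John Citizen", "address": "4 Park Rd"},
]
unique_customers = deduplicate(customers)
```

Which record "wins" when duplicates are found (here, simply the first one seen) is itself a data quality decision worth making explicitly in the project plan.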
8. Data Lineage
Data lineage refers to the origin of data and the transformations it goes through over time. It tells the story of a specific piece of data.
If you draw up a new will that invalidates all the old ones, the history may not matter too much. If you need records of land title ownership over time, you will want a record of what has changed and when.
Helpful Questions to Ask
In addition to these attributes, a range of other questions should be considered when specifying data quality requirements.
Confidence in the data: Are data governance, data protection and data security in place? What is the reputation of the data, and is it verified or verifiable? Are there controls on the creation, retrieval, update and deletion of data?
Value of the data: Is there a good cost/benefit case for the data? Is it being optimally used? Does it endanger people’s safety or privacy or the legal responsibilities of the enterprise? Does it support or contradict the corporate image or the corporate message?
Planning a Project
In the planning stages for a project, a lot of effort is put into “as is – to be” system states. Far less effort is put into this question for data. A high-level diagnostic of data quality at the outset of a project can pay huge dividends. Remediation of data quality gaps discovered during build (or worse, after go-live) can be expensive in terms of time, cost and reputation.
As examples:
Regardless of what type of company you are, data quality matters. Having a good understanding of which data quality attributes matter to you, and whether these desired attributes exist and can be used, should be an explicit stream of any good project plan. Further, budget and resources should be set aside to ensure you get what you need from your data.
Great and relevant points
1 year ago: The importance of this can’t be overstated… poor quality often leading to poor decisions.
1 year ago: Data quality really shapes the customer experience. Great points here, Mike Stockley.