Measuring master data quality for purchase parts and standard parts involves evaluating various aspects of the data to ensure its accuracy, completeness, consistency, and reliability. Here's a step-by-step guide on how to measure data quality for both types of parts:
- Define Data Quality Metrics: Start by defining specific metrics to measure data quality. These metrics may include accuracy, completeness, consistency, timeliness, validity, uniqueness, and relevancy.
- Accuracy: Assess the accuracy of the data by comparing it against a trusted source or performing validations. For purchase parts, accuracy may involve verifying supplier information, part specifications, and pricing details. For standard parts, accuracy could involve checking technical specifications against industry standards or design requirements.
- Completeness: Evaluate whether the data contains all the required information. For purchase parts, completeness may involve checking if all necessary fields such as part number, description, supplier details, and pricing are filled out. For standard parts, completeness could involve ensuring that all required attributes such as dimensions, material specifications, and usage instructions are provided (a completeness check is sketched after this list).
- Consistency: Check for consistency within the data set and across different data sources. Inconsistent data can lead to errors and discrepancies. For purchase parts, consistency may involve ensuring that part numbers, descriptions, and pricing information are consistent across various documents and systems. For standard parts, consistency could involve verifying that the same part is represented consistently across different databases or catalogs.
- Timeliness: Assess the timeliness of the data to ensure that it is up-to-date and relevant. For purchase parts, timeliness may involve monitoring lead times, delivery schedules, and inventory levels. For standard parts, timeliness could involve checking if the technical specifications are current and reflect any updates or revisions.
- Validity: Validate the data against predefined rules or standards to ensure its correctness and relevance. For purchase parts, validity may involve checking if supplier information complies with regulatory requirements or contractual agreements. For standard parts, validity could involve verifying if the technical specifications meet industry standards or engineering requirements (a rule check is sketched after this list).
- Uniqueness: Ensure that each data record is unique and free from duplicates. Duplicates can lead to confusion and inaccuracies in analysis. For purchase parts, uniqueness may involve identifying and removing duplicate supplier entries or part numbers. For standard parts, uniqueness could involve eliminating duplicate entries for the same part in different catalogs or databases (a duplicate check is sketched after this list).
- Relevancy: Evaluate the relevance of the data to the intended use or purpose. For purchase parts, relevancy may involve assessing if the parts meet the specific requirements and performance criteria. For standard parts, relevancy could involve determining if the parts are suitable for the intended application or industry standards.
- Continuous Monitoring and Improvement: Data quality is an ongoing process, so establish mechanisms for continuous monitoring and improvement. Regularly review data quality metrics, address any issues or discrepancies, and implement corrective actions to improve data quality over time.
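To make the completeness dimension measurable in practice, here is a minimal sketch in Python with pandas. The table content and the list of required fields (part_number, description, supplier, price) are assumptions for illustration, not a prescribed data model.

```python
import pandas as pd

# Hypothetical extract of purchase-part master data (column names are assumptions).
parts = pd.DataFrame({
    "part_number": ["P-1001", "P-1002", "P-1003", None],
    "description": ["Hex bolt M8", "Washer 8 mm", None, "O-ring 12x2"],
    "supplier":    ["ACME", None, "Bolt & Co", "ACME"],
    "price":       [0.12, 0.05, None, 0.30],
})

required_fields = ["part_number", "description", "supplier", "price"]

# Completeness per field: share of records where the field is filled.
field_completeness = parts[required_fields].notna().mean().mul(100).round(1)
print(field_completeness)

# Completeness per record: records where every required field is filled.
record_completeness = parts[required_fields].notna().all(axis=1).mean() * 100
print(f"Fully complete records: {record_completeness:.1f} %")
```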
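Validity can be scripted in the same way. The two rules below, a part-number pattern and a plausible weight range, are only assumed examples standing in for company-specific rules.

```python
import pandas as pd

parts = pd.DataFrame({
    "part_number": ["P-1001", "1001-P", "P-1003"],
    "weight_kg":   [0.012, 150000.0, 0.5],
})

# Assumed rules: part numbers match "P-" plus four digits,
# and weights must fall within a plausible range for the commodity.
valid_number = parts["part_number"].str.fullmatch(r"P-\d{4}")
valid_weight = parts["weight_kg"].between(0.001, 5000)

violations = parts[~(valid_number & valid_weight)]
print(violations)  # records a data steward should review
```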
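For uniqueness, duplicates can be counted directly on the key column; the key (part_number) is again an assumption and could just as well be a combination of supplier and supplier part number.

```python
import pandas as pd

parts = pd.DataFrame({
    "part_number": ["P-1001", "P-1002", "P-1001", "P-1003"],
    "supplier":    ["ACME", "Bolt & Co", "ACME", "ACME"],
})

# All records whose part number occurs more than once are duplicate candidates.
duplicates = parts[parts.duplicated(subset=["part_number"], keep=False)]
print(duplicates)

# Uniqueness as a percentage: 100 % means no duplicate keys at all.
uniqueness = 100 * (1 - parts.duplicated(subset=["part_number"]).mean())
print(f"Uniqueness: {uniqueness:.1f} %")
```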
Power BI: While primarily a business intelligence tool for data visualization and analysis, Power BI can also be used for basic data quality checks. You can create visualizations to inspect data patterns, identify outliers, and detect inconsistencies visually. Power BI's data profiling capabilities can help you understand the distribution and quality of your data.
Besides the visual view of the data, data quality can also be measured in numbers, for example:
100 parts were created; 100 parts have names, 100 have article numbers, 100 have manufacturer names, and 100 have weight entries. Of course, this can also be achieved simply with mandatory fields, but a measurement like this can additionally highlight broken or misused rules. For example, all weights entered as 1,000 or 0,001 are highlighted, because some users like to cheat mandatory fields by entering "fake" values. Another quite common cheat: if the description or name should be entered in different languages, the measurement can compare the entries. If they are different, the record seems to be OK; if they are the same, for example the same value copy-pasted into all language fields, it looks like a broken rule. The data steward can easily detect such cases and initiate a double check.
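A rough sketch of such plausibility checks in Python with pandas, assuming a table with one weight column and one name field per language; the column names and the suspicious placeholder values (0.001 and 1.0, i.e. the 0,001 and 1,000 from above in decimal-point notation) are assumptions that have to be adapted to the real system.

```python
import pandas as pd

parts = pd.DataFrame({
    "part_number": ["P-1001", "P-1002", "P-1003"],
    "weight_kg":   [0.001, 1.0, 0.35],
    "name_en":     ["Hex bolt M8", "Washer", "O-ring 12x2"],
    "name_de":     ["Sechskantschraube M8", "Washer", "O-Ring 12x2"],
})

# Weights that look like "fake" entries made only to get past a mandatory field.
suspicious_weights = parts[parts["weight_kg"].isin([0.001, 1.0])]
print(suspicious_weights)

# Identical text in both language fields points to copy-paste instead of translation.
copy_paste = parts[parts["name_en"] == parts["name_de"]]
print(copy_paste)  # candidates for a double check by the data steward
```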
The main visual should show the percentage of correct data so that the data owner can compare different time periods. If the percentage of good data goes down, action steps are required.
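As a sketch of such a trend measure, assuming each record already carries the combined result of the quality checks and a check date (both column names, passed_all_checks and checked_on, are assumptions):

```python
import pandas as pd

audit = pd.DataFrame({
    "part_number":       ["P-1001", "P-1002", "P-1003", "P-1004"],
    "checked_on":        pd.to_datetime(["2024-01-15", "2024-01-20",
                                         "2024-02-03", "2024-02-17"]),
    "passed_all_checks": [True, False, True, True],
})

# Share of good data per month; a falling value is the signal to take action.
trend = (audit
         .groupby(audit["checked_on"].dt.to_period("M"))["passed_all_checks"]
         .mean()
         .mul(100)
         .round(1))
print(trend)
```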
This article is based on my personal experience and my individual selection. It makes no claim to completeness. If something is incorrect, I kindly ask for information or feedback.