DATA GOVERNANCE NEEDS META-METRICS & METADATA
Senthil Kumar
Vice President, Principal Architect @ Northern Trust | Cloud-Native Architecture
We are used to measuring success in numbers. We look for final results expressed as runs scored, goals netted, dollars made, % of sales, and so on. We expect such scales to follow a fixed interval, and we ultimately judge success through a dichotomous view of results (a simple binary such as Yes/No, Win/Loss, or Profit/Loss). When we are presented with success measures that are more categorical, qualitative, and highly descriptive, they fail to appeal to our dichotomous reasoning, and we hesitate to accept them as measures of success.
This is the challenge many information architects face when running a data governance organization. It is questionable how many data governance teams today have created successful, actionable metrics that express their progress in easy-to-understand numbers and figures. Desperate attempts to tag business metrics and KPIs as a means of presenting data governance progress sometimes fail to excite executive sponsors, or even create conflicts between business teams.
Meta-metrics are a new means for information architects and CDO teams to create meaningful metrics expressed on an easy-to-understand scale. Unfortunately, meta-metrics is a less understood term than its peer, metadata. But the new normal in the information management space, pushed by big data and IoT adoption, is going to force organizations to rethink meta-metrics as a way forward to express data dimensions on a meaningful scale.
What are Meta-Metrics?
We all know metadata: it is data about data. In simple terms, meta-metrics are data about the data sets that conform to that metadata. Let me explain with an example. Say you feel thirsty and drive to a hypermarket to get a drink. The first thing you do is follow the signboards that lead to the type of drink you would like to buy. The signboards in the hypermarket are like your metadata (data that helps you head in the right direction to find what you need).
Once you reach the aisle of your choice, you are presented with many different drinks from many vendors, varying in color, taste, flavor, brand, and so on. (Think of this as the variation you see across the rows of your data set.)
When presented with too many choices, the easy way to decide is to go with intuition or prior experience. An intuition-based decision can mean picking a drink based on its color or appearance, or on the advertising you were exposed to. Other influencing factors might be your own prior experience of the taste, flavor, or brand, or the experience of friends and acquaintances who spoke highly of a product. In both cases the approach is not very scientific; one could argue it is neither data driven nor fact based. (You do not need any data to make such a decision, and it varies from person to person based on experience.)
A good fact-based decision in this example would be to read the contents of each drink along with its price, and weigh nutrition facts against price before settling on the final purchase. Naturally, we do not buy products off the shelf by studying every nutrition label and price, although the growing world of data everywhere might one day offer consumers such on-demand capabilities through mobile devices. Corporate business teams that have invested in improving their data capabilities, however, can already use data to that extent before making decisions. What they lack is the nutrition facts about the very data they are about to consume, and that leads to trust issues and eroding confidence in data.
Meta-metrics are like the nutrition label on your drink. They are not metadata, but they complement your metadata with trust. They add a depth dimension to data sets that reside in a vast ocean of data, and they help improve fact-based decision making across the corporation.
Metric data is critical in the big data world:
Let us take a simple invoice example. The image below helps you understand the difference between data, metadata, and metric data. As long as your data is a small subset of just 20 invoices per day, you can navigate the facts in your data set using metadata alone. What happens if your invoice volume is 20,000 per day? What about 200,000 per day? What about over a 10-year period? You would then be in big data territory, and metadata alone might not be enough to navigate the data volumes.
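The same three-way distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the invoice rows, the field names, the `metadata` dictionary, and the `compute_meta_metrics` helper are all made up for this example and do not come from any particular tool.

```python
from datetime import date

# Data: the invoice rows themselves (made-up sample values).
invoices = [
    {"invoice_id": "INV-001", "invoice_date": date(2024, 1, 5), "amount": 120.50, "customer": "Acme"},
    {"invoice_id": "INV-002", "invoice_date": date(2024, 1, 5), "amount": None, "customer": "Globex"},
    {"invoice_id": "INV-003", "invoice_date": date(2024, 1, 6), "amount": 75.00, "customer": "Acme"},
]

# Metadata: data about the data, i.e. what each field is.
metadata = {
    "invoice_id": {"type": "string", "description": "Unique invoice number"},
    "invoice_date": {"type": "date", "description": "Date the invoice was issued"},
    "amount": {"type": "decimal", "description": "Invoice total"},
    "customer": {"type": "string", "description": "Customer name"},
}


def compute_meta_metrics(rows, metadata):
    """Metric data: measurements about the rows that conform to the metadata."""
    metrics = {"record_count": len(rows), "fields": {}}
    for field in metadata:
        values = [row.get(field) for row in rows]
        non_null = [value for value in values if value is not None]
        metrics["fields"][field] = {
            "null_count": len(values) - len(non_null),
            "distinct_count": len(set(non_null)),
        }
    return metrics


meta_metrics = compute_meta_metrics(invoices, metadata)
print(meta_metrics["record_count"])
print(meta_metrics["fields"]["amount"])
```

With 20 invoices you could eyeball the missing amount yourself. At 200,000 invoices per day, a null count and a distinct count per field are the kind of nutrition label that tells a consumer what they are about to use.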
In such large-volume use cases, having metric data alongside metadata can change the entire data usage pattern and bring more trust to your data set. But such metrics cannot be collected in one go; they need to be collected incrementally, daily, as you load data into your big data environment. Unfortunately, such information today resides in data logs and in the metadata repositories of data movement tools, along with job status. It is important to collect and collate this information and generate meaningful value out of such log data.
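One way to picture incremental collection is a small ledger that is updated on every daily load rather than rebuilt by rescanning years of history. The sketch below is a minimal illustration, not a reference design: the `MetricLedger` class, its method names, and the `amount` field are hypothetical, and a real environment would persist the ledger rather than keep it in memory.

```python
from datetime import date


class MetricLedger:
    """Hypothetical in-memory ledger of per-day load metrics."""

    def __init__(self):
        self._daily = {}

    def record_load(self, load_date, rows):
        """Add the metrics of one load to the running figures for that day."""
        day = self._daily.setdefault(load_date, {"records": 0, "missing_amount": 0})
        day["records"] += len(rows)
        day["missing_amount"] += sum(1 for row in rows if row.get("amount") is None)

    def totals(self):
        """Roll the daily figures up without touching the underlying data again."""
        records = sum(day["records"] for day in self._daily.values())
        missing = sum(day["missing_amount"] for day in self._daily.values())
        pct = round(100 * missing / records, 2) if records else 0.0
        return {
            "days": len(self._daily),
            "records": records,
            "missing_amount": missing,
            "missing_amount_pct": pct,
        }


ledger = MetricLedger()
ledger.record_load(date(2024, 1, 5), [{"amount": 120.5}, {"amount": None}])
ledger.record_load(date(2024, 1, 6), [{"amount": 75.0}, {"amount": 10.0}, {"amount": 42.0}])
print(ledger.totals())
```

The design choice here is that each load only pays for the rows it carries, so ten years of metrics accumulate as a side effect of normal daily processing.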
How to Capture Metric Data:
One easy way to capture metadata metrics is to use data controls that scan and monitor your data and collate the required metrics, which can then be injected into your own metadata reports. But the tools in this space are still maturing, and you do not see many players with such capabilities. The other way is to manually build scripts, data routines, maps, and so on that scan the complex logs and job status data sets of your ETL and data ingestion tools to extract the required metrics.
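As a rough idea of what such a hand-built script might look like, the sketch below scans job log lines and collates per-job metrics. The log layout (a date, a time, then `key=value` pairs such as `rows_written`) is an assumption invented for this example; real ETL and ingestion tools each have their own formats, so the pattern and field names would need to be adapted.

```python
import re

# Assumed (hypothetical) log layout: "<date> <time> key=value key=value ..."
LOG_LINE = re.compile(r"^(?P<date>\d{4}-\d{2}-\d{2}) \S+ (?P<pairs>.*)$")

SAMPLE_LOG = [
    "2024-01-05 02:00:13 job=load_invoices status=SUCCESS rows_read=20000 rows_written=19990 rows_rejected=10",
    "2024-01-06 02:00:09 job=load_invoices status=FAILED rows_read=500 rows_written=0 rows_rejected=0",
    "this line does not follow the assumed layout",
]


def parse_log_line(line):
    """Return a dict for a recognized job line, or None for anything else."""
    match = LOG_LINE.match(line.strip())
    if not match:
        return None
    fields = dict(
        pair.split("=", 1) for pair in match.group("pairs").split() if "=" in pair
    )
    if "job" not in fields or "status" not in fields:
        return None
    return {
        "date": match.group("date"),
        "job": fields["job"],
        "status": fields["status"],
        "rows_written": int(fields.get("rows_written", 0)),
        "rows_rejected": int(fields.get("rows_rejected", 0)),
    }


def collate(lines):
    """Aggregate run counts, failures, and row counts per job."""
    summary = {}
    for line in lines:
        record = parse_log_line(line)
        if record is None:
            continue  # skip lines that are not job status entries
        job = summary.setdefault(
            record["job"],
            {"runs": 0, "failures": 0, "rows_written": 0, "rows_rejected": 0},
        )
        job["runs"] += 1
        if record["status"] != "SUCCESS":
            job["failures"] += 1
        job["rows_written"] += record["rows_written"]
        job["rows_rejected"] += record["rows_rejected"]
    return summary


print(collate(SAMPLE_LOG))
```

The collated figures are the sort of numbers that could then be attached to the corresponding entries in a metadata report.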
Whatever the approach, it is important to start thinking about a process for capturing meta-metrics and to make incremental progress by adding one metric at a time. A simple start would be to create metrics around records management in your data environment. These can provide important information about your enterprise data assets and directly help the data governance team know what it is managing. This can shift your data governance discussion from descriptive metrics to precision metrics and create more actionable insights.