What is Data Governance? And why is it necessary especially now?
Mahtab Syed
Data and AI Leader | AI Solutions | Cloud Architecture(Azure, GCP, AWS) | Data Engineering, Generative AI, Artificial Intelligence, Machine Learning and MLOps Programs | Coding and Kaggle
With the advent of Machine Learning and Artificial Intelligence for Predictions (Business metrics like Inventory, Profitability and Customer Retention) and Generative AI for Generation (human-like responses grounded on internal data), it's key to have good Quality Data available in near real time. Most organisations are struggling with this because Data Governance was never set up well. I have a real example below.
Consider a large bank with 30-40 source systems generating data, 10-20 data warehouses collecting data for analysis, and many non-integrated SaaS databases. In most cases the basic principles of Data Governance are not followed.
In this bank a new Data Analyst starts and is given the task of creating Reports of Taxable obligations for all customers (with complex business logic). She is pointed to a few Data warehouses: the 1st holds Customer master data, the 2nd holds all Transactions, and the 3rd is where Taxable income is calculated. There are also other sources, such as:
- CRM with master data of all the Bank's customers
- Loan system
- Credit card system
- Banking Transaction systems
- Tax calculation system
- A few Data warehouses which get feeds from the above at different frequencies
Within a few days of analysis it's apparent that there is no documented source of Master Data, no Data Catalog, no Metadata, no explanation of Data calculations, multiple sources of the same data, multiple columns with similar data, and poor Data Quality. As a result there is no confidence in the Reports. The Data Analyst gets frustrated and quits, and with her goes all the knowledge she gathered.
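To make the "multiple sources of the same data" problem concrete, here is a minimal Python sketch of the kind of reconciliation an analyst ends up doing by hand when there is no documented Master Data. Everything in it is an assumption for illustration: the record layouts, the field names (`customer_id`, `name`, `tax_file_number`) and the helper functions `find_conflicts` and `null_rate` are hypothetical and do not come from the bank in the example.

```python
# Hypothetical extracts of the same customers from two systems (illustrative data only).
crm_customers = [
    {"customer_id": "C001", "name": "Asha Rao", "tax_file_number": "123456789"},
    {"customer_id": "C002", "name": "Ben Lee", "tax_file_number": "987654321"},
    {"customer_id": "C003", "name": "Chen Wu", "tax_file_number": None},
]
warehouse_customers = [
    {"customer_id": "C001", "name": "Asha Rao", "tax_file_number": "123456789"},
    {"customer_id": "C002", "name": "Benjamin Lee", "tax_file_number": "987654321"},
    {"customer_id": "C004", "name": "Dana Fox", "tax_file_number": "555555555"},
]


def find_conflicts(source_a, source_b, key, fields):
    """Compare two record lists on `key` and report missing keys and differing fields."""
    a_by_key = {row[key]: row for row in source_a}
    b_by_key = {row[key]: row for row in source_b}
    report = {
        "only_in_a": sorted(a_by_key.keys() - b_by_key.keys()),
        "only_in_b": sorted(b_by_key.keys() - a_by_key.keys()),
        "mismatches": [],
    }
    # For keys present in both sources, record every field whose values differ.
    for k in sorted(a_by_key.keys() & b_by_key.keys()):
        for field in fields:
            value_a, value_b = a_by_key[k][field], b_by_key[k][field]
            if value_a != value_b:
                report["mismatches"].append((k, field, value_a, value_b))
    return report


def null_rate(rows, field):
    """Fraction of rows where `field` is missing (None); 0.0 for an empty list."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row.get(field) is None) / len(rows)


report = find_conflicts(
    crm_customers, warehouse_customers, "customer_id", ["name", "tax_file_number"]
)
print(report)
print(null_rate(crm_customers, "tax_file_number"))
```

A check like this only tells you that two systems disagree. Deciding which system is the source of truth for each field is a Data Governance decision, and it needs to be documented in a Data Catalog so the next analyst does not start from zero.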
Data Governance principles and operations can help here, and DAMA DMBOK (Data Management Body of Knowledge) https://www.dama.org/cpages/body-of-knowledge identifies the following knowledge areas:
Data Management Knowledge Areas
Apart from these terms, there is other jargon (possibly covered by the above):
For an Enterprise, moving up the Data Maturity Model and following the principles and operations of Data Governance is not easy and can’t be done quickly. This is due to many reasons:
Here’s how we can gradually clean up the Data mess described above in an enterprise, and get ROI from small investments at the same time:
A word of caution: Align Data Governance with Business initiatives. Don’t propose the value of Data Governance on its own.
Approximately 90% of Data Governance programs struggle because they make a business case for Data Governance on its own, without articulating how the program will support funded Business initiatives sponsored outside of the data team. Instead of proposing the value of Data Governance on its own, we need to work backward from Business initiatives.
Reach out if you need help with Data Governance.
Some fundamentals:
Source: Fundamentals of Data Engineering by Joe Reis and Matthew Housley
2. Data Lifecycle starts from Creation -> Storage -> Usage and Enhancement -> Archival -> Destruction
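As a rough illustration, the Data Lifecycle stages can be written down as an ordered set of states, so that a dataset's current stage is recorded explicitly rather than left implicit. This is only a sketch under stated assumptions: the names `Stage`, `next_stage` and `can_move` are hypothetical, and the strictly linear order is a simplification (in practice data often moves back and forth between Storage and Usage).

```python
from enum import Enum


class Stage(Enum):
    """Lifecycle stages, declared in the order given in the text."""
    CREATION = "creation"
    STORAGE = "storage"
    USAGE_AND_ENHANCEMENT = "usage_and_enhancement"
    ARCHIVAL = "archival"
    DESTRUCTION = "destruction"


# Enum members iterate in definition order, which gives us the lifecycle order.
_ORDER = list(Stage)


def next_stage(current):
    """Return the stage that follows `current`, or None after DESTRUCTION."""
    index = _ORDER.index(current)
    if index == len(_ORDER) - 1:
        return None
    return _ORDER[index + 1]


def can_move(current, target):
    """Assumption: only a move to the immediately following stage is allowed."""
    return next_stage(current) is target


print(next_stage(Stage.CREATION))
print(can_move(Stage.STORAGE, Stage.ARCHIVAL))
```

Recording the stage per dataset makes retention questions (what can be archived, what must be destroyed) something that can be checked, rather than something a new analyst has to guess.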
Acknowledgements:
Mahtab Syed, Melbourne, 26 Mar 2023, updated 25 Jun 2024