Maintaining Data Quality: A Guide for Modern Data Teams
This week, I tuned in to a panel discussion between three data leaders (the CEO and CTO of Monte Carlo, and the CEO of Gable). The discussion was about how modern data teams can maintain data quality without frustrating end users with overly strict governance policies. A number of points are worth discussing.
Strong Data Infrastructure
While technology cannot solve every data problem out there, it can certainly help with a great deal. Modern data infrastructure allows end users to easily access the data pockets they need, without giving data owners headaches about data integrity and data quality. Best-in-class vendors like Microsoft and AWS offer a solid basis but can also lure data administrators into vendor-locked environments. Increasingly, that vendor lock is not a technical lock but a "knowledge lock", as getting to know a particular stack requires a substantial time investment. Also look at specialist vendors like Snowflake or Databricks, as they often make it possible to manage data infrastructure in multi-cloud (or hybrid) settings.
Embrace the "data product" mindset
The idea of "data products" is becoming increasingly popular. It also implies that some of the methods for building other digital products are becoming commonplace in data teams. A number of examples:
When it comes to data quality, an important question to ask is "who owns the data?" One of the panelists referred to the use of the RACI framework. This framework defines responsibilities from "high" to "low": Responsible, Accountable, Consulted, and Informed.
The gist of the discussion was that a sound responsibility matrix, underpinned by a solid technical framework (enabled by modern data infrastructure), forms a strong backbone for corporate data efforts.
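To make the idea of a responsibility matrix more concrete, here is a minimal Python sketch of a RACI matrix for data assets. The asset names, team names, and the `assets_without_single_owner` helper are all hypothetical and were not part of the panel discussion. The check that each asset has exactly one Accountable party reflects a common RACI convention, which is assumed here rather than taken from the panelists.

```python
from enum import Enum


class Role(Enum):
    """The four RACI roles."""
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


# Hypothetical matrix: data asset -> {team: role}.
# Asset and team names are illustrative only.
RACI = {
    "orders_table": {
        "data_engineering": Role.RESPONSIBLE,
        "sales_ops": Role.ACCOUNTABLE,
        "finance": Role.CONSULTED,
        "bi_team": Role.INFORMED,
    },
    "customer_table": {
        "data_engineering": Role.RESPONSIBLE,
        "bi_team": Role.INFORMED,
    },
}


def assets_without_single_owner(matrix):
    """Return assets that do not have exactly one Accountable party.

    Assumption: "ownership" is modeled as the Accountable role.
    """
    flagged = []
    for asset, assignments in matrix.items():
        owners = [team for team, role in assignments.items()
                  if role is Role.ACCOUNTABLE]
        if len(owners) != 1:
            flagged.append(asset)
    return flagged


print(assets_without_single_owner(RACI))
```

In this sketch, `customer_table` is flagged because nobody is accountable for it, which is the kind of gap the "who owns the data?" question is meant to surface.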
In case you're wondering whether your data quality is sufficient for BI and AI purposes, do reach out. We offer both high-level and granular assessments.