You Can’t Have Data Mesh Without Governance
Lauren Maffeo
Product at the State of Maryland / Author of "Designing Data Governance from the Ground Up"
As the volume of big data keeps growing, traditional tools used to manage it can fall short. Demand for data science led many organizations to try combining their data warehouses with big data tools. The challenge is that trying to deploy data this way can cause huge backlogs, especially in large organizations. If a single team owns both the data platform and all integrations, other teams that need analytics can lose time while they wait for their results.
Even if the data team owns all infrastructure, the volume of data they work with is often larger than what many business intelligence (BI) tools can handle. That data often comes from many sources as well. This situation makes a strong case for not having one team manage all data in one platform. Unless your organization is small, this option is bound to cause silos, backlogs, and lost productivity.
Using data mesh, an architectural pattern that lets cross-functional teams manage data domains as products, can ease these risks.
Data Mesh and Data Products
Not sure what a data product is? You already work with them today. If your head of Sales keeps your company’s purchase order history in a JSON file, that JSON file is a data product. Your data team can automate the file upload process so that the data refreshes daily and the latest data lives in a specific location within your cloud architecture. Data products might also be published datasets that live on someone’s laptop, or machine learning models that predict various costs, from shipping dates to marketing campaigns. Data products are not new inventions that your stewards must make; they’re an improved way of managing the data you have now.
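To make the purchase order example concrete, here is a minimal Python sketch of what a daily refresh might look like. Everything in it is an assumption made for illustration: the DataProduct descriptor, its field names, the publish_snapshot function, and the folder layout are hypothetical, and a local directory stands in for a location in your cloud architecture.

```python
import json
import shutil
from dataclasses import dataclass
from datetime import date
from pathlib import Path


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for a data product (field names are assumptions)."""
    name: str
    domain: str
    owner: str


def publish_snapshot(product: DataProduct, source: Path, root: Path, day: date) -> Path:
    """Copy a JSON export to <root>/<domain>/<name>/<YYYY-MM-DD>.json and refresh latest.json."""
    # Fail early if the export is not valid JSON, so a broken file never replaces good data.
    with source.open(encoding="utf-8") as f:
        json.load(f)

    target_dir = root / product.domain / product.name
    target_dir.mkdir(parents=True, exist_ok=True)

    # Dated snapshots keep history; consumers always read latest.json.
    snapshot = target_dir / f"{day.isoformat()}.json"
    shutil.copyfile(source, snapshot)
    shutil.copyfile(snapshot, target_dir / "latest.json")
    return snapshot
```

A scheduler of your choice could call publish_snapshot once a day with that day's date. The point of the sketch is that the file gains an owner, a domain, and a predictable location, which is most of what turns it into a managed data product.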
Data Mesh Principles
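Zhamak Dehghani first described the data mesh concept in 2019. It promotes four principles: domain-oriented ownership of data, data as a product, a self-serve data platform, and federated computational governance.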
Moving Towards a Data-Driven Culture
This move to distributed architecture through a shared sense of ownership is one way to execute data governance. Before you can use this architectural technique, you must have clearly defined cross-functional data domains and assign each domain to stewards who own it. You also must ensure that your data platform allows domain experts to use the tools, techniques, and dashboards that serve their audiences, without depending on just one tool or team.
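The prerequisite of clearly defined domains with assigned stewards is something you can check automatically. The Python sketch below is one hedged way to do that: the DataDomain class, its fields, and the find_governance_gaps function are hypothetical names invented for this example, not part of any data mesh tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataDomain:
    """Hypothetical record of one data domain (field names are assumptions)."""
    name: str
    stewards: list[str] = field(default_factory=list)
    products: list[str] = field(default_factory=list)


def find_governance_gaps(domains: list[DataDomain]) -> list[str]:
    """Return readable problems: domains with no steward, products claimed by two domains."""
    problems: list[str] = []
    owner_of: dict[str, str] = {}  # product name -> first domain that claimed it
    for domain in domains:
        if not domain.stewards:
            problems.append(f"domain '{domain.name}' has no steward")
        for product in domain.products:
            if product in owner_of:
                problems.append(
                    f"product '{product}' is claimed by both "
                    f"'{owner_of[product]}' and '{domain.name}'"
                )
            else:
                owner_of[product] = domain.name
    return problems
```

Running a check like this whenever the domain list changes is a small example of automating a governance standard instead of relying on one team to police it by hand.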
While data mesh is an ideal way to practice data governance, it is not the right architecture for all teams. If your organization and technical team are small, data mesh might make your work more complex. But if your organization has independent business units, autonomous teams, and data and analytics needs that span those units and teams, data mesh is worth a look.
Done well, data mesh lets teams access, develop, and manage data autonomously. It also gives your data stewardship team an easier way to keep your data secure. But there’s a catch! Data mesh works only if you’ve done the hard work to create a data-driven culture and can automate that culture’s standards.
About the Author
Lauren Maffeo is a service designer at Steampunk and a member of the Technology Advisory Council at the UK Information Commissioner’s Office (ICO). Her first book, Designing Data Governance from the Ground Up, is available in beta through The Pragmatic Programmers.
Until November 28th, 2022, you can save 40 percent on the ebook when you use promo code turkeysale2022 at checkout. Promo codes are not valid on prior purchases.