Data Mesh 101: Why Federated Data Governance Is the Secret Sauce of Data Innovation
What makes the data mesh such a powerful concept is the principle of federated data governance.
The big shift that the data mesh enables is the ability to decentralise data, organising it along domain-driven lines, with each domain owning its own data and treating it as a product consumed by the rest of the organisation.
The process of decentralising, democratising and productising data is a quantum leap in enterprise data architecture that opens the door to massive experimentation and innovation.
But you can’t simply decentralise everything and wait for innovation to occur; there would be chaos.
The secret sauce is using a federated approach to strike a balance between decentralised data sources (which enable innovation at scale) and centralised data governance (which provides the basis for consistency and collaboration across the organisation).
What Is Data Federation?
Federated data governance in a data mesh describes a situation in which data governance standards are defined centrally, but local domain teams have the autonomy and resources to execute those standards in whatever way is most appropriate for their particular environment.
In this model, autonomous data domain teams and centralised data governance functions collaborate to best meet the data needs of the whole organisation.
In this way, teams can “shift left” the implementation of data governance policies and requirements in order to embed them into their data products early in the development lifecycle.
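To make "shifting left" concrete, here is a minimal sketch (the required metadata fields are invented for illustration) of a governance check that a domain team could run in its own build step, so that policy violations are caught before a data product is ever published rather than audited centrally afterwards:

```python
# Hypothetical sketch of shift-left governance: a check run in the
# domain's CI/build step before publishing a data product.
# The required fields below are assumptions, not a real standard.
REQUIRED_METADATA = {"owner", "description", "classification", "retention_days"}

def governance_check(product_metadata: dict) -> list[str]:
    """Return the centrally-defined requirements this product still fails."""
    missing = REQUIRED_METADATA - product_metadata.keys()
    return sorted(f"missing metadata: {field}" for field in missing)

# A product that is not yet compliant fails the check locally,
# long before a central team would ever need to intervene.
metadata = {"owner": "payments-team", "description": "Settled transactions"}
failures = governance_check(metadata)
```

The design point is that the rule set is defined centrally, but executed inside each domain's own pipeline.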
What might this look like in a data mesh?
The data is decentralised, with each domain taking ownership of its own data from end to end. This means that each team can scale its own processes without impacting other teams and domains.
Consumers, however, are likely to require data from multiple domains, so the different domains’ data needs a very high degree of interoperability, allowing consumers to easily incorporate a variety of datasets from across the business.
So each domain, in order to be part of the mesh, must follow a set of centrally-managed guidelines and standards that determine how their domain data will be categorised, managed, discovered and accessed. This covers things like data contracts, schemas and so on.
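As an illustration of what such a centrally-managed standard might look like in practice (the product name, fields and rules here are hypothetical), a simple data contract can be expressed and checked in code:

```python
from dataclasses import dataclass

# Hypothetical illustration: a minimal data contract for a domain's
# "orders" data product. Field names and rules are invented for the example.
@dataclass(frozen=True)
class DataContract:
    product: str             # globally unique product name
    owner: str               # accountable domain team
    schema: dict             # column name -> expected Python type
    freshness_hours: int     # maximum acceptable data age

def validate_record(contract: DataContract, record: dict) -> list[str]:
    """Return a list of contract violations for a single record."""
    errors = []
    for column, expected_type in contract.schema.items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            errors.append(f"wrong type for {column}")
    return errors

orders_contract = DataContract(
    product="sales.orders",
    owner="sales-domain-team",
    schema={"order_id": str, "amount_cents": int},
    freshness_hours=24,
)
```

Because the contract is declared centrally but validated by the producing domain, consumers can rely on every product in the mesh meeting the same minimum bar.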
This also includes a shared data infrastructure layer that domains can draw on to build their own pipelines from pre-approved templates, ensuring security and compliance (and avoiding the duplication of every domain building its own infrastructure from scratch).
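A sketch of how such pre-approved templates might work (the settings and names are assumptions for the sake of the example): the platform team bakes non-negotiable defaults into a template, and domain teams only supply their own source and destination.

```python
# Hypothetical sketch: a central platform team publishes a pre-approved
# pipeline template with security/compliance settings baked in.
# The specific settings below are invented for illustration.
APPROVED_TEMPLATE = {
    "encryption": "aes-256",   # non-negotiable platform defaults
    "audit_logging": True,
    "pii_scan": True,
}

def new_pipeline(domain: str, source: str, destination: str) -> dict:
    """Create a pipeline config from the approved template.

    Domain teams can vary source and destination, but cannot weaken
    the centrally governed defaults.
    """
    config = dict(APPROVED_TEMPLATE)  # copy, so the template itself is untouched
    config.update({"domain": domain, "source": source, "destination": destination})
    return config

pipeline = new_pipeline("marketing", "s3://raw/clicks", "warehouse.clicks")
```

This is one way to get the "paved road" effect: the fastest path for a domain team is also the compliant one.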
This is where the centralised governance comes in, establishing data management practices and processes that ensure that the data provided by each domain is of the highest quality from a consumer perspective.
Why Data Federation Is a Superpower
There are a few key reasons why data federation is so impactful.
Maintain independence, autonomy and accountability
The main benefit is that domains can operate with a high degree of autonomy.
They know their own domain far better than anyone else and are best placed to decide exactly how they should manage their data and how they can best scale.
This level of independence also ensures a high degree of accountability because a single team follows a given data product from production to consumption.
The result is high-quality data products that can be produced in a scalable and resilient fashion by teams that know their own domain intimately and are responsible for end-to-end delivery.
Enable interdependence and collaboration across domains
However, the data products that domains produce still need to be usable by the consumer.
There must be a minimum degree of interdependence between domains, which is why having centrally-governed standards is so critical.
Issues that affect all domains need to be subject to a wider authority—perhaps even a team of domain product owners—to ensure that domains are consistent in how they handle and process data.
In a data mesh, data is viewed as a product, so we can draw inspiration from how product development is done in large organisations: ideally, there are certain centrally-governed development guardrails that are baked into the architecture and into how people work, within which developers are free to innovate as they wish.
The data mesh can be set up similarly, with a team of experts responsible for curating and providing the interoperability ‘guardrails’ within which domains can operate however they see fit.
Govern and consume data wherever it is
When domains are functioning in ways that are both independent and interoperable, it is possible to govern data with great effectiveness, wherever it sits in the organisation.
Domains take care of the local processes and concerns, with a central team ensuring minimum standards for consistency and accessibility.
Data that is effectively governed in this way is a delight for consumers. They can get on with their work knowing that high-quality, highly-discoverable data is on tap and can be plugged into their projects when needed.
Running around different teams trying to find out whether a particular dataset exists, or whether it can be transformed to meet your needs, becomes a thing of the past.
Enable massive scalability
When you have a mesh of independent but interoperable nodes that can be effectively governed and are easy to consume, you have a foundational pattern that can then be scaled massively across the organisation. Not only this, but each node can scale at its own pace, depending on its level of maturity.
The federated data mesh, once set up properly, is highly scalable, which is a massive advantage of this approach.
Data Governance Federation Challenges
A federated data mesh model requires a high degree of data maturity in an organisation: compared with top-down, centralised approaches, it represents a very different and more free-flowing way for domains to interact with each other and with the data itself.
But the main challenges around data federation are not technical. The real challenge lies in federating a data mesh culture and mindset: the ways of working and thinking that must underpin this shift in how we handle data.
Federating trust
Your organisation will have to be comfortable federating not only its technology but also its trust.
A mindset shift is required to ensure that each domain has the skills, infrastructure and controls in place to allow it to act autonomously, within the guardrails of inter-domain interoperability.
There are too many domains, however, to manage them all individually (and doing so would defeat the purpose of decentralisation!). These domains need to be trusted to get on with the job however they see fit, which organisations used to more centralised control may find unsettling at first.
Encouraging good data citizenship
When each domain is entrusted with its particular piece of the data puzzle, it also takes on a huge amount of responsibility.
Organisations must make clear that the new ways of working are in place to make life easier for everybody and for the common good of the organisation.
For the data mesh to succeed, people, whether they are data producers or consumers, need to actively contribute to their corner of the data mesh.
Striking the right balance
Imagine that every domain had complete autonomy to manage its own data as it wished, with absolutely no consideration for cross-domain consistency or co-ordination. There would be carnage.
Similarly, if domains were completely reliant on a centralised data function to manage and make data available, that function would become a major bottleneck and innovation would grind to a halt.
The challenge is to find the right balance for your particular organisation between letting domains evolve and scale their own data at their own pace and ensuring that the resulting data products are consistent with those of other domains.
Critically, this balance will change over time as the organisation matures, and so it must be constantly adjusted.
Final Thoughts
Data governance federation is the secret sauce that makes the data mesh possible: highly autonomous local work, conducted within interoperability guardrails that allow a high degree of collaboration between all the local teams.
This combination of local excellence and inter-domain collaboration creates a massive web of high-quality data products that all corners of the business can draw on to enhance existing services or foster innovation.