Efficient Data Domains Organization

Alessio Cesana

Azure Cloud & AI Senior Consultant at Microsoft

Published: November 26, 2023

Intro

In a previous article, we discussed the reasons behind Data Mesh and defined some of its key elements. One of them is the decomposition of enterprise data across Data Domains.

Figure 1 – Domain Distribution across the organization

As we discussed, Data Domains should represent a container for related data products. The key question now is: what is the right way to build Data Domains?

Data Domains Theory

Data Mesh theory defines three kinds of Data Domains:

Source-aligned Data Domains: with this approach, data domains reflect the domains defined by the operational plane.
Consumer-aligned Data Domains: with this approach, data domains align with business processes that extract value from the data and serve specific use cases.
Aggregate Data Domains: with this approach, a data domain mixes the domain structure of the operational plane with that of the business processes.

Since aggregate data domains are used only in very specific scenarios, they will be discussed in a future article.

Basically, when we work with source-aligned data domains, we stay close to the source of the information. When we work with consumer-aligned data domains, we stay close to the business consumer, trying to fulfil its business needs.

Typically, when we design the analytical layer of the company application landscape, we tend to aggregate data into business entities, often trying to stay as far as possible from the specifics of our operational systems. With this in mind, source-aligned data domains may seem a weird choice. Everything will be clearer when we discuss the reference technological landscape for the data mesh pattern, but, for now, just consider:

It is common to build analytics on data coming from source applications, so having their data products managed like any other data product in the organization can be a good idea.
Operational systems can have different technological landscapes that are hard to bend to the needs of the decentralized approach of the data mesh (e.g., ERPs are typically centralized solutions, and business users will almost never be allowed to consume their data as they like).

So, given the goal of bringing the data management of operational systems in line with the way data domains and data products are defined across the organization, source-aligned data domains make sense.

Generally speaking:

Build source-aligned data domains whenever you are working with a self-consistent set of information that closely resembles your source systems.
Build consumer-aligned data domains whenever you need to aggregate data coming from different sources in order to reflect business entities and business processes.
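
The rule of thumb above can be sketched as a small decision helper. This is only an illustration, not a definitive method: the `DataProduct` fields, the `suggest_alignment` function and the sample products are all hypothetical names introduced here, and the test (more than one source, or data shaped as a business entity) is a simplifying assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical description of a data product, for illustration only."""
    name: str
    source_systems: frozenset[str]   # operational systems feeding the product
    models_business_entity: bool     # True if reshaped around a business entity


def suggest_alignment(product: DataProduct) -> str:
    """Apply the rule of thumb: aggregation across sources or a business-entity
    shape suggests consumer-aligned, otherwise stay close to the source."""
    if len(product.source_systems) > 1 or product.models_business_entity:
        return "consumer-aligned"
    return "source-aligned"


# Assumed sample products, not taken from the article.
erp_invoices = DataProduct("erp_invoices", frozenset({"ERP"}), False)
customer_360 = DataProduct("customer_360", frozenset({"ERP", "CRM"}), True)

print(erp_invoices.name, "->", suggest_alignment(erp_invoices))
print(customer_360.name, "->", suggest_alignment(customer_360))
```

In practice the decision also depends on ownership and organizational factors, which are discussed next, so a helper like this can only be a starting point.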

Data Domains and the enterprise organization

As we know, Data Domains incorporate Data Products, and each Data Product is managed by a Data Owner. Since the focus is on ownership, a possible approach is to rely on the enterprise organization to identify Data Domains. In this respect, there are two options:

Data Domains are mapped to enterprise organizational units, grouping data products by their organizational ownership.
Data Domains are mapped to logical functions, grouping data products by their business purpose.

In the first scenario, Data Domains will be designed over the enterprise organizational structure and every identifiable group of people in the organization can represent a specific data domain:

Figure 2 – Contoso Corp sample organizational tree

In the Contoso Corp example shown in the picture above, each organizational unit can become a Data Domain, so we can have, for example, “Office 1”, “Department 2” and “Division 1” data domains, according to the specific use case. With this approach, there is a direct and clear correspondence between Data Products, Data Owners, and the organizational structure. At the same time, we know that the data products belonging to a given Data Domain realise value for the organizational unit that owns them.

On the other hand, mapping Data Domains to logical functions means defining them around cross-unit functions. For example, you can map Data Domains to projects: these can involve people from different organizational units, since resources with cross-functional skills are often required.

Figure 3 – Contoso Corp Project oriented Data Domains

With this approach, Data Domains are built around the definition of project deliverables, and they have no relation to the organizational structure or to specific business functions. On the contrary, these Data Domains can be spread across different business functions, with Data Owners coming from different units:

Figure 4 – Cross functional Data Domain specification

Projects are just one example of cross-functional items: Data Domains can also be built over meaningful areas of information. “Customer management” can be a Data Domain that involves several organizations in the enterprise (commercial, accounting, delivery, product management, etc.).

A good way to distribute Data Domains could be to look at data ownership: is the ownership related to the organizational structure of the enterprise? If so, perhaps the wisest choice would be to go with the organizational design. If the data ownership is mostly related to the outcome of projects or to cross-functional knowledge, then it would be wise to go with cross-unit Data Domains.

Everything seems straightforward so far but, as always, reality is far more complex. Consider this scenario:

Data Products are built running projects managed by the IT function.
Even if most of the projects serve one organizational unit, some of them can serve more than one.
The ownership of the resulting data products can be assigned to users coming from different organizational units, and different projects can share the same data owner, depending on the information they manage.

A cross-unit approach can seem the straightforward choice here, since:

In terms of data production, a project is self-consistent and complete.
Each project team should be able to work autonomously in an isolated area, and dependencies on other projects or areas should be kept to a minimum.

So, from a project point of view, it is consistent to have “project-driven Data Domains”. Nevertheless, from a Data Owner's point of view, the picture looks different:

The data of a business area will be spread across several Data Domains, so there is no single view of the information.
The Data Owner will need to evaluate information scattered across several projects.
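
The two points above can be made concrete with a toy catalog. The sketch below is only illustrative: the product names, project names and owners are invented for the example, and `domains_per_owner` is a hypothetical helper that counts how many project-driven Data Domains each Data Owner has to look into.

```python
from collections import defaultdict

# Hypothetical catalog rows: (data product, project-driven domain, data owner).
products = [
    ("customer_master", "Project A", "Commercial"),
    ("customer_invoices", "Project B", "Commercial"),
    ("customer_tickets", "Project C", "Commercial"),
    ("warehouse_stock", "Project B", "Delivery"),
]


def domains_per_owner(rows):
    """Collect, for each Data Owner, the set of domains holding their data."""
    result = defaultdict(set)
    for _product, domain, owner in rows:
        result[owner].add(domain)
    return dict(result)


spread = domains_per_owner(products)
for owner, domains in sorted(spread.items()):
    print(f"{owner}: data spread across {len(domains)} domain(s)")
```

In this made-up catalog, the “Commercial” owner has to follow three separate project-driven domains to get a complete view of customer data, which is exactly the inefficiency described above.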

We have simplified the data production process, but data consumption has become inefficient when we look at it from the perspective of the business meaning of the data.

This is likely to be a scenario in which having both approaches in place could be wise.

One of the main advantages of combining the two approaches is that we decouple ingestion from data transformation. Data comes into the system (and we will see in a future article what exactly the “system” is) preserving its original shape and meaning. Depending on the business scenario, data is then transformed, grouped, merged and filtered to solve the specific need, and this transformation is always non-disruptive.

In the next article we will discuss this topic: is there any technological prerequisite for adopting the Data Mesh approach? In general, what is required from an organization to successfully implement the pattern?
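
A minimal sketch of this decoupling, under stated assumptions: the in-memory `raw_store`, the `ingest` and `customer_revenue` functions, and the sample CRM/ERP records are all hypothetical and stand in for whatever platform actually hosts the data. The point is only that ingestion keeps records in their original shape, while the consumer-oriented transformation reads them without altering them.

```python
from copy import deepcopy

# Hypothetical raw area: one entry per source system, kept in original shape.
raw_store = {}


def ingest(source, records):
    """Source-aligned step: store the records exactly as received."""
    raw_store[source] = deepcopy(records)


def customer_revenue(store):
    """Consumer-aligned step: merge CRM and ERP data into a business view.
    It only reads from the raw area, so the transformation is non-disruptive."""
    names = {customer["id"]: customer["name"] for customer in store["crm"]}
    totals = {}
    for invoice in store["erp"]:
        name = names[invoice["customer_id"]]
        totals[name] = totals.get(name, 0) + invoice["amount"]
    return totals


# Assumed sample data, not taken from the article.
ingest("crm", [{"id": 1, "name": "Contoso"}])
ingest("erp", [{"customer_id": 1, "amount": 100},
               {"customer_id": 1, "amount": 50}])

snapshot = deepcopy(raw_store)          # raw data before the transformation
revenue = customer_revenue(raw_store)   # derived, consumer-oriented view
print(revenue)
print("raw data unchanged:", raw_store == snapshot)
```

Because the derived view is built from the raw area rather than in place of it, new consumer-aligned views can be added later without touching the ingestion side.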
