Essential Ingredients for a Data Mesh Architecture (Part 1 – Understanding the Concepts)

This is part of a series of articles in which we dig deeper into the Data Mesh Architecture and the essential ingredients (pillars) that support a sound implementation of the concept.

What is a Data Mesh Architecture?

The Data Mesh Architecture (as originally published here) proposes a radical shift in data strategy towards a distributed, domain-driven, product-based design pattern, much like microservices and APIs. Data is designed and produced closest to where the expertise and ownership lie, and the data storage or the data pipeline is no longer the first focal point.

For an easy analogy, think of the Data Mesh Architecture as similar to continental drift, with the continents still connected via a mesh of transport and network mechanisms: data is organized by domain and produced by those most familiar with it (Product Thinking).

[Image. Source: Animated-gifs.eu]

This architecture and data strategy stem from the following challenges in data analytics today.

What are the challenges today with a Centralized Data Architecture?

1. Focus on a central data platform

a. Overview - This is an approach we have followed since the early days of data warehousing: the need for a centralized data warehouse has evolved into the need for a centralized, global data lake (or perhaps a data ocean) coupled with a data warehouse.

b. Challenge - Yet in most organizations, maintaining and standardizing such a platform, developing on it in an agile way, migrating it, and optimizing its use cases is a big challenge, to say the least.

c. Alternatives - In several other technology and software practices, the main shift is away from the monolith and towards faster adoption and distributed ownership, e.g. microservices and blockchain.

2. Focus on data management pipelines and type of storage

a. Overview - With more, and more varied, types of data now part of analytics, we are constantly moving data across different data technologies, whether data lake, database/data warehouse, NoSQL, graph database, or event streaming/pub-sub.

b. Challenge - The multiple technology types are not the challenge in themselves. The problem arises when they drive the data strategy or result in a lot of data movement and duplication. From an architectural perspective, take the Lambda and Kappa style architectures: Lambda keeps separate batch and streaming paths over the same data, and Kappa, though streaming-only, still replays and copies data from the log into serving stores. Both involve moving or copying data across technologies.

c. Alternatives - Let the business case, rather than technology choices, decide the best location for data. The integration/mesh layer should then stitch the datasets together instead of duplicating data.

3. Lack of focus on Domain-Driven Design and Product Thinking

a. Overview/Challenge - The focus is far more on current technology trends, such as cloud adoption, data science, and data lakes, than on creating self-sufficient teams with a product-thinking mindset.

b. Alternatives - Focus on distributed, domain-driven teams that create datasets which are rich and ready for consumption.
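The alternative in point 2 (stitching datasets together in the integration/mesh layer rather than duplicating them) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `read_trips` and `read_drivers` functions, the field names, and the `stitch` helper are all hypothetical and stand in for whatever stores and access layer each domain actually uses.

```python
from typing import Callable, Dict, Iterator

Row = Dict[str, object]
Source = Callable[[], Iterator[Row]]


# Hypothetical domain-owned sources: each domain keeps its data in the
# store that suits its business case and exposes only a read function.
def read_trips() -> Iterator[Row]:
    # Stands in for, e.g., an event stream owned by a "trips" domain.
    yield {"trip_id": 1, "driver_id": "d1", "fare": 12.5}
    yield {"trip_id": 2, "driver_id": "d2", "fare": 8.0}
    yield {"trip_id": 3, "driver_id": "d1", "fare": 20.0}


def read_drivers() -> Iterator[Row]:
    # Stands in for, e.g., a relational table owned by a "drivers" domain.
    yield {"driver_id": "d1", "name": "Asha"}
    yield {"driver_id": "d2", "name": "Ben"}


def stitch(left: Source, right: Source, key: str) -> Iterator[Row]:
    """Join two domain datasets on demand, at query time.

    The lookup index lives only for the duration of the call; nothing is
    written back to a central store, so no persistent copy is created.
    """
    index = {row[key]: row for row in right()}
    for row in left():
        match = index.get(row[key])
        if match is not None:
            yield {**match, **row}


trips_with_drivers = list(stitch(read_trips, read_drivers, "driver_id"))
```

Each consumer gets a joined view when it asks for one, while the trips and drivers data stay where their owners put them.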

Concept Inspiration - Microservices.

[Image. Source: nordicapis.com]

An example to understand this

Let us take the example of a taxi-riding service company (such as Uber, Lyft, or Ola). Below is a very basic example of what a Data Mesh looks like:

[Image: basic Data Mesh example for a taxi-riding service company]

The essential takeaway is the Product Thinking mindset: owners are responsible for exposing their data in the right manner and for serving multiple consumers of their datasets. This contrasts with a factory-thinking mindset, in which owners hand over requirements to a central set of teams that perform data management tasks with limited business/owner input or incentives.
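As a rough sketch of that takeaway, a domain-owned dataset can be modelled as a small "data product" object that names its owner and serves many consumers. Everything here is an assumption made for illustration: the `DataProduct` class, the `completed_rides` dataset, the team and consumer names, and the sample rows are hypothetical and not taken from the original diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class DataProduct:
    """A dataset published and maintained by the domain that owns it."""

    name: str
    owner_team: str                  # the domain team accountable for the data
    read: Callable[[], List[Row]]    # how the owner chooses to expose the data
    consumers: List[str] = field(default_factory=list)

    def serve(self, consumer: str) -> List[Row]:
        # Record who consumes the product, then return the owner's view.
        if consumer not in self.consumers:
            self.consumers.append(consumer)
        return self.read()


# Hypothetical product owned by a "rides" domain of the taxi company.
completed_rides = DataProduct(
    name="completed_rides",
    owner_team="rides-domain-team",
    read=lambda: [
        {"ride_id": 1, "city": "Pune", "fare": 9.0},
        {"ride_id": 2, "city": "Oslo", "fare": 14.0},
    ],
)

# Several consumers read the same product directly from its owner.
pricing_view = completed_rides.serve("pricing-analytics")
finance_view = completed_rides.serve("finance-reporting")
```

The point of the sketch is the direction of responsibility: the owning team decides how the data is exposed, and consumers go to the product rather than filing requirements with a central team.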

In the next part of this series, we explore one of the main components of such a Data Mesh (hint: the equivalent of the registry/discovery capabilities within a Service Mesh Architecture).

Author: Siddharth Rajagopal