Essential Ingredients for a Data Mesh Architecture (Part 1 – Understanding the Concepts)
This is part of a series of articles in which we dig deeper into the Data Mesh Architecture and the essential ingredients (pillars) that support a good implementation of the concept.
What is a Data Mesh Architecture?
The Data Mesh Architecture (as originally published here) calls for a radical shift in data strategy towards a distributed, domain-driven design pattern that treats data as a product, much like Microservices and APIs. Data is designed and produced closest to where the expertise and ownership lie, and neither the data storage nor the data pipeline is the first focal point.
For an easy analogy, think of the Data Mesh Architecture as similar to continental drift: separate land masses that stay connected through a mesh of transport and network mechanisms. In the same way, data is organized by domain and produced by those most familiar with it (Product Thinking).
Image Source (Animated-gifs.eu)
This architecture and data strategy stems from the following challenges in Data Analytics today.
What are the challenges today with a Centralized Data Architecture?
1. Focus on a central data platform
a. Overview – This is something we have carried over from the data warehousing era, where the need for a centralized Data Warehouse has evolved into the need for a centralized/global Data Lake (or perhaps a Data Ocean) coupled with a Data Warehouse.
b. Challenge – Yet in most organizations, maintaining and standardizing such a platform, developing on it in an agile way, migrating it and optimizing its use cases is, at the very least, a big challenge.
c. Alternatives – In several other technology and software practices, the main shift has been away from the monolith and towards faster adoption and distributed ownership, e.g. Microservices, Blockchain, etc.
2. Focus on Data Management Pipelines/Type of Storage
a. Overview – With more and different types of data now part of Analytics, we are constantly moving data across different data technologies, whether it is a Data Lake, Database/Data Warehouse, NoSQL, GraphDB or Event Streaming/Pub-Sub.
b. Challenge – The multiple technology types are not by themselves the challenge. The challenge arises when they drive the data strategy or result in a lot of data movement/duplication. From an architectural perspective, take the Lambda and Kappa style architectures: Lambda keeps parallel batch and streaming paths over the same data, and Kappa, while dropping the batch path, still replays and copies data from the stream into serving stores.
c. Alternatives – Let the business case, rather than technology choices, decide the best location for data. The Integration/Mesh layer should then stitch the data together instead of duplicating it.
3. Lack of focus towards Domain Driven Design and Product Thinking
a. Overview/Challenge – The focus is far more on current technology trends, such as Cloud Adoption, Data Science and Data Lakes, than on creating a set of self-sufficient teams with a Product Thinking mindset.
b. Alternatives – Focus on distributed, domain-driven teams that create datasets rich for consumption.
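To make the "stitch rather than duplicate" alternative from point 2 a little more concrete, here is a minimal Python sketch. It assumes two hypothetical domain-owned sources (rides_source and drivers_source) and a hypothetical stitch helper that joins them at read time. None of these names come from the original Data Mesh publication. They only illustrate the idea that a consumer can combine domain data without first copying it into a central store.

```python
from typing import Callable, Dict, Iterator

Row = Dict[str, object]

# Hypothetical domain-owned sources. In practice each would read from
# wherever the owning domain keeps its data (lake, database, stream, ...).
def rides_source() -> Iterator[Row]:
    yield {"ride_id": 1, "driver_id": 10, "fare": 12.5}
    yield {"ride_id": 2, "driver_id": 11, "fare": 7.0}
    yield {"ride_id": 3, "driver_id": 10, "fare": 20.0}

def drivers_source() -> Iterator[Row]:
    yield {"driver_id": 10, "name": "Asha"}
    yield {"driver_id": 11, "name": "Ben"}

def stitch(left: Callable[[], Iterator[Row]],
           right: Callable[[], Iterator[Row]],
           key: str) -> Iterator[Row]:
    """Join two sources on `key` at read time; nothing is persisted centrally."""
    lookup = {row[key]: row for row in right()}
    for row in left():
        match = lookup.get(row[key])
        if match is not None:
            yield {**row, **match}

def fare_by_driver() -> Dict[str, float]:
    """A consumer-side view built on top of the stitched rows."""
    totals: Dict[str, float] = {}
    for row in stitch(rides_source, drivers_source, "driver_id"):
        name = str(row["name"])
        totals[name] = totals.get(name, 0.0) + float(row["fare"])
    return totals
```

The in-memory lookup here is only a stand-in. A real integration layer would more likely push the join down to a federated query engine or an API, which is beyond this sketch.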
Concept Inspiration - Microservices.
Image Source (nordicapis.com)
An example to understand this
Let us take the example of a taxi ride service company (such as Uber, Lyft, Ola, etc.). Below is a very basic example of what a Data Mesh looks like.
The essential takeaway is the Product Thinking mindset: owners are responsible for exposing their data in the right manner and for serving multiple consumers of their datasets. This contrasts with a Factory Thinking mindset, in which owners hand over requirements to a central set of teams that perform data management tasks with limited business/owner input or incentives.
In the next part of this series, we explore one of the main components of such a Data Mesh (hint: the equivalent of the Registry/Discovery capabilities within a Service Mesh Architecture).
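As a rough illustration of this Product Thinking mindset for the taxi example, the Python sketch below models a dataset as a product with a named owner, a published schema and several consumers. The DataProduct class, the "completed_trips" dataset, the team names and the sample rows are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]

@dataclass
class DataProduct:
    """A dataset exposed by the domain team that owns it (hypothetical model)."""
    name: str
    owner: str
    schema: Dict[str, str]            # column name -> declared type
    read: Callable[[], List[Row]]     # how the owning team serves the data
    consumers: List[str] = field(default_factory=list)

    def serve(self, consumer: str) -> List[Row]:
        """Return rows to a consumer after checking the owner's published schema."""
        rows = self.read()
        for row in rows:
            missing = set(self.schema) - set(row)
            if missing:
                raise ValueError(f"{self.name}: row is missing {sorted(missing)}")
        if consumer not in self.consumers:
            self.consumers.append(consumer)
        return rows

# Made-up sample data owned by a hypothetical rides domain team.
def read_trips() -> List[Row]:
    return [
        {"trip_id": 1, "city": "Pune", "fare": 12.5},
        {"trip_id": 2, "city": "Pune", "fare": 7.0},
    ]

trips = DataProduct(
    name="completed_trips",
    owner="rides-domain-team",
    schema={"trip_id": "int", "city": "str", "fare": "float"},
    read=read_trips,
)

# Two different consumers read the same product directly from its owner.
pricing_rows = trips.serve("pricing-team")
finance_rows = trips.serve("finance-team")
```

The point of the sketch is the ownership boundary: the owning team publishes the schema and is accountable for what serve returns, while consumers never hand requirements to a central pipeline team.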