Design a Data Mesh Architecture in Practice
Data Mesh vs Centralized Data Model
Long-established relational databases and transactional architectures have served a wide variety of use cases well, and still do. Once organizations understood how much value lies in the data itself, however, analytical use cases brought different requirements and, consequently, different architectures.
The move from batch processing and Lambda architectures to Kappa and microservices architectures came about largely to address big data challenges for the business.
The single source of truth also emerged, centralizing data in a data lake so that everyone sees one unified source of data. In that model, one team produces the data and everyone else consumes it.
In fully centralized data lakes, however, there is a clear gap between the business areas and the IT team. IT teams and a single data engineering team build data pipelines in the hope that lines of business (LOBs) and executives can get the full benefit of the data. Because the gap is so wide, in most cases this does not happen: the people who produce the data and make it ready are not the people who actually use it.
The Data Mesh architecture concept aims to reduce this gap. Organizations look at data as a product and not merely as an asset. This is where we believe we will come closer to a democratized, data-driven business.
One of the most important things Data Mesh enables is “data autonomy”: building a self-service data infrastructure can go a long way toward democratizing data in practice.
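As a rough illustration of what “self-service” could look like, here is a minimal in-memory sketch of a catalog where any domain team registers its own data products and any consumer discovers them without going through a central team. Everything in it (the `DataProduct` fields, the `SelfServiceCatalog` class, and the sample domains) is hypothetical and invented for this example; a real platform would back this with a managed catalog service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """A dataset published by a domain team (all field names are hypothetical)."""
    name: str
    domain: str
    owner: str
    description: str
    tags: frozenset = field(default_factory=frozenset)


class SelfServiceCatalog:
    """In-memory stand-in for a self-service data catalog."""

    def __init__(self):
        self._products = {}

    def register(self, product: DataProduct) -> None:
        # Any domain team can publish directly; no central team in the loop.
        key = (product.domain, product.name)
        if key in self._products:
            raise ValueError(f"{product.domain}/{product.name} is already registered")
        self._products[key] = product

    def discover(self, tag: str) -> list:
        # Consumers search by tag; results are sorted for a stable order.
        matches = [p for p in self._products.values() if tag in p.tags]
        return sorted(matches, key=lambda p: (p.domain, p.name))


catalog = SelfServiceCatalog()
catalog.register(DataProduct(
    name="orders_daily",
    domain="sales",
    owner="sales-data-team",
    description="Daily order totals",
    tags=frozenset({"orders", "daily"}),
))
catalog.register(DataProduct(
    name="campaign_clicks",
    domain="marketing",
    owner="marketing-data-team",
    description="Clicks per campaign per day",
    tags=frozenset({"clicks", "daily"}),
))

found = [f"{p.domain}/{p.name}" for p in catalog.discover("daily")]
print(found)
```

The point of the sketch is the shape of the interaction: publishing and discovery are both direct calls that a domain team can make on its own.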
Data Mesh Architecture in Practice
I have seen plenty of scenarios where building an architecture with a domain-based, data product approach is easy in theory. For some, however, turning those theoretical concepts into reality has been a challenge.
The cloud has reduced the complexity of building flexible data architectures. That flexibility, together with its governance and security options, has been critical to building new approaches that turn these ideas into reality.
Some points to consider:
The following are user experience considerations:
Data Consumer1 and Data Builder1 (producers) are from a single domain or department. The idea is to remove the gap between those two entities. In practice, that means giving domain areas more autonomy and flexibility so that they can make the most of their data by creating their own “product”.
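A minimal sketch of that idea, assuming a simple model in which a domain owns its products and decides who may build them. The `Domain` class, its methods, and the sample data are hypothetical and only illustrate the producer and consumer sitting in the same domain; cross-domain sharing and governance are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A business domain that owns its data products (hypothetical model)."""
    name: str
    builders: set = field(default_factory=set)    # members who produce data
    consumers: set = field(default_factory=set)   # members who use data
    products: dict = field(default_factory=dict)  # product name -> rows

    def publish(self, member: str, product_name: str, rows: list) -> None:
        # Only the domain's own builders may create or update a product.
        if member not in self.builders:
            raise PermissionError(f"{member} is not a builder in {self.name}")
        self.products[product_name] = list(rows)

    def read(self, product_name: str) -> list:
        # Return a copy so consumers cannot mutate the published product.
        return list(self.products[product_name])


sales = Domain(
    name="sales",
    builders={"Data Builder1"},
    consumers={"Data Consumer1"},
)

# The builder and the consumer belong to the same domain, so no hand-off
# to a separate central team is needed to get from raw data to a product.
sales.publish("Data Builder1", "daily_orders", [{"order_id": 1, "total": 40.0}])
rows = sales.read("daily_orders")
print(rows)
```

Keeping the publish permission inside the domain is the design choice being illustrated: the people closest to the data decide what the product is.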
Many thanks!
Sr. Cloud Architect Data & AI - Microsoft
2 years ago: For more discussion, see: https://www.dhirubhai.net/posts/arvindata_cloud-data-data-activity-6873258858748366848-xuXo
CDAO at Amil Group | 2024 Global Top100 Innovators in Data & Analytics by #Corinium | 2022 Global Top 100 Leading Enterprise Data Leaders by #CDOMagazine
2 years ago: Excellent, Arvin. I still see the challenge of keeping the catalog up to date against the speed at which we ingest into the data lake. I also still see companies with heavy demand for a centralized semantic layer in order to have a single version of the number, but I see demand for self-service growing. Once again, the semantic layer and the catalog are the main actors in making this work. Kisses, Ro
Data Management, Faithlife, LLC Founder/CEO, CartersFarm.Software - a small software company with small ideas. Aspiring Cartoon Mime Voice Actor.
2 years ago: I think it is important to note that the reference to "business context" needs itself to be managed within an overarching ontology so semantic meaning can be established consistently for all of the self-service actors. In practical terms, lack of collaborative attention in this area has been the weak spot in many of the data architecture projects I've seen. Serious work using Knowledge Graphs or other semantic disciplines is essential to actually implementing data lakes, lake houses, or other approaches to data as a product. #ontology #knowledgegraph #semantic
Coordenador de Sistemas @ Vivo | DEVOPS TEAM
2 years ago: Juliana Miranda
Solutions Architect at Databricks
2 years ago: Can we start a "pagodata" band called Data Mesh and Remesh??