What do clients really want from DataMesh?
Dan O'Riordan
VP AI & Data Engineering : Agentic Engineering is where we are going so buckle up...
Let me set some context
Don't be shy or ashamed, people: who doesn't like the Spice Girls? NO, this article isn't a biography of the life, times and songs of the Spice Girls. It is an article on DataMesh.....
BOO I hear you scream, not another article on DataMesh. Be polite, stop booing so I can continue...
I've been working for more than a year with Capgemini clients as together we look to dissect the groundbreaking article by Zhamak Dehghani on DataMesh. Who would have thought that one article would give the Data & AI business such a kick?
For the past year, I have met with lots of clients, and I am talking about LOTS. They have all read the article and they can all relate to the realization that the data estate is a mess. Centralization of the governance and management of enterprise data lakes is not delivering on the promise of speedy business insights.
So the DataMesh principles make sense to everyone. Local teams who understand their domain need to work with the data engineers to prepare data for analytic workloads. Add in the required (some say desirable, I say required) SLA attributes that need to be in place before your analytic dataset goes into production: DATSIS, which stands for Discoverable, Addressable, Trustworthy, Self-Describing, Interoperable, Secure (https://martinfowler.com/articles/data-monolith-to-mesh.html). These are the basics of a Data Product.
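To make the DATSIS idea a little more tangible, here is a minimal sketch of a readiness check that a team could run before a dataset goes into production. Everything in it is an assumption for illustration: the `DataProduct` fields, the `datsis_gaps` helper and the rule that maps each field to a DATSIS attribute are hypothetical, not part of the DataMesh article or of any product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product (all field names are assumptions)."""
    name: str
    domain: str
    description: str = ""                     # Discoverable: text a catalog search can index
    address: str = ""                         # Addressable: a stable location for consumers
    owner: str = ""                           # Trustworthy: someone accountable for the data
    freshness_sla_hours: Optional[int] = None  # Trustworthy: an agreed freshness target
    column_docs: Dict[str, str] = field(default_factory=dict)  # Self-Describing
    output_format: str = ""                   # Interoperable: an agreed, shared format
    access_roles: List[str] = field(default_factory=list)      # Secure


def datsis_gaps(product: DataProduct) -> List[str]:
    """Return the DATSIS attributes that are still missing, in DATSIS order."""
    checks = {
        "Discoverable": bool(product.description),
        "Addressable": bool(product.address),
        "Trustworthy": bool(product.owner) and product.freshness_sla_hours is not None,
        "Self-Describing": bool(product.column_docs),
        "Interoperable": bool(product.output_format),
        "Secure": bool(product.access_roles),
    }
    return [attribute for attribute, ok in checks.items() if not ok]


if __name__ == "__main__":
    draft = DataProduct(name="orders_daily", domain="sales", owner="sales-data-team")
    print(datsis_gaps(draft))
```

A dataset with an empty gap list is a candidate for production. Anything else is still just a dataset, not yet a Data Product.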
So far so good!
Not really, we haven't added in : "THE FEAR FACTOR"
Every client is in absolute fear of the above picture. The fear for the client is the cost of the army of consultants running down the hills to run workshops to define a DataMesh strategy and a factory-style delivery model. How do we define our Domains? What does a DataProduct look like? What will the Organisation look like?
Don't get me wrong, you do need strategy and consultants who have domain knowledge. You also need them to work in effective and efficient delivery PODs. At Capgemini, we have the best of these experts, but we also understand we need to work with the innovators and leaders in technology.
So in the words of the Spice Girls : "I'll tell you what I want, what I really really want" I WANT DATA PRODUCTS (Not sure the Data Products part was in the song!!!)
The lightbulb moment for me was when I met the team from Starburst. One of the opening statements from one of their architects (Andy Mott MBA) was: "We can build a DataProduct in a couple of minutes!!!" I asked how. He replied, "Let me show you." He then wrote some ANSI SQL and, lo and behold, we had a DataProduct in a Catalog that was searchable and had metadata defined.
I had always considered building a DataProduct ONLY with an ingestion layer, Spark to wrangle the data, and a Cloud DWH. I still believe that there will be DataProducts that need to be built using these technologies, but I have also realised that you can build a DataProduct using a federated MPP SQL engine such as Starburst (built on Trino).
What Starburst is delivering is significantly helping clients with their ask: "Show me what good looks like in the user experience of building a DataProduct". We recognise this at Capgemini, which is why we are working with Starburst on integrating their offering into our accelerators. Our accelerators (IDEA) use IaC to build your Cloud infrastructure in a secure way, so that multifunctional PODs can build the more complicated DataProducts that need Spark and programming languages to prepare data. BUT those DataSets still need to be packaged as DataProducts. These teams can now use the Starburst Data Mesh Experience Plane to package the different prepared DataSets as DataProducts for consumption.
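For readers who want a feel for what "a DataProduct in a few lines of SQL" can mean, here is a rough sketch. It is not what Andy wrote and it is not the Starburst data product feature itself. It only renders, from Python, the kind of Trino-style SQL a federated engine accepts: a view that joins two sources in different catalogs, plus a comment that gives the view a searchable description. The catalog, schema, table and column names are all invented, and the sketch assumes the target catalog supports views.

```python
from typing import List


def render_data_product_sql(view_name: str, description: str) -> List[str]:
    """Render SQL text for a hypothetical federated data product (nothing is executed)."""
    create_view = f"""
CREATE OR REPLACE VIEW {view_name} AS
SELECT c.customer_id, c.segment, SUM(o.amount) AS total_spend
FROM crm_postgres.public.customers AS c   -- hypothetical operational source
JOIN lake_hive.sales.orders AS o          -- hypothetical data lake source
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.segment
""".strip()
    # Double any single quotes so the description is a valid SQL string literal.
    escaped = description.replace("'", "''")
    comment = f"COMMENT ON VIEW {view_name} IS '{escaped}'"
    return [create_view, comment]


if __name__ == "__main__":
    for statement in render_data_product_sql(
        "lake_hive.products.customer_spend",
        "Total spend per customer and segment, owned by the sales domain",
    ):
        print(statement + ";\n")
```

The point of the sketch is the shape, not the details: the data stays where it lives, and the product is a governed, described SQL object on top of it.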
One final point on all of this. It is about Data Estate Modernisation. Most clients want to modernise their data estates. They want to maximise the modernisation journey and feel this is the time to adopt a DataMesh approach. Once again, the fear factor is that the business will have to wait a significant amount of time before they start to get the benefit that DataProducts will deliver.
With a deployment of IDEA + Starburst (we promise 4 hours), we can start doing discovery with well-formed multifunctional teams using the Starburst federated MPP SQL engine. If, during discovery, the required datasets turn out to need ETL using Spark and programming, we add those skills to the POD.
Think of it as a pipeline with two options. The PODs will be working in an agile way, delivering the DataProducts onto the Starburst Data Mesh Experience Plane.
As the migrations happen, the existing DataProducts will continue to iterate and improve. One such improvement could be redirecting a DataProduct from an OnPrem DataSource to the same source once it has been migrated to Cloud.
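The two-option pipeline can be sketched as a simple routing decision made during discovery. This is an illustration under stated assumptions, not a description of how IDEA or Starburst work internally: the `DatasetNeed` type, the `needs_heavy_transformation` flag, the path names and the skill labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class DatasetNeed:
    """A dataset identified during discovery (hypothetical structure)."""
    name: str
    needs_heavy_transformation: bool  # assumed flag: federated SQL alone is not enough


def choose_build_path(need: DatasetNeed) -> str:
    """Option 1: federated SQL only. Option 2: Spark ETL first, then package."""
    if need.needs_heavy_transformation:
        return "spark_etl_then_package"
    return "federated_sql"


def skills_for_pod(needs: Iterable[DatasetNeed]) -> Set[str]:
    """Start with a lean POD and add Spark skills only when a dataset requires them."""
    skills = {"domain expert", "sql engineer"}
    if any(choose_build_path(n) == "spark_etl_then_package" for n in needs):
        skills.add("spark engineer")
    return skills


if __name__ == "__main__":
    discovered = [
        DatasetNeed("customer_spend", needs_heavy_transformation=False),
        DatasetNeed("sensor_sessions", needs_heavy_transformation=True),
    ]
    for need in discovered:
        print(need.name, "->", choose_build_path(need))
    print(sorted(skills_for_pod(discovered)))
```

Either way, the output lands in the same place: a packaged DataProduct that consumers can find and use.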
This blended approach to delivering DataProducts vastly reduces the time to measurable business results from DataMesh.
At Capgemini, we are incredibly excited to be partnering with Starburst, true innovators in the DataMesh space. Finally, we may be moving on from the theory to the practical!
Outcome-Obsessed Strategist | Lifelong Learner | Father, husband and passionate adventurer.
2 years ago: Great article Dan O'Riordan and you are spot on re: The Fear. We're at the very start of a very exciting and game-changing innovation and it's an absolute pleasure to be partnering with you and the team on this.
Managing Enterprise Architect - Cloud Infra Service - Project & Consulting
2 years ago: Easy like with a picture of the spice girl ;-)
Building successful partnership @Starburst Data
2 years ago: Thanks Dan O'Riordan for your passion for data. It’s what knowledge really really needs to deliver together value to customers
Tech Executive / Technology Agitator/ Cloud/ Data & AI lover/Mentor/Angel Tech Investor/GM/EMEA/LATAM/US
2 years ago: Very good post Dan O'Riordan, love the opening sense of humor. You portrait the real conversation. Data Mesh is a framework that may fit or not to fix several challenges and IDEA is a very nice piece of innovation that approach correctly the how.