Much Like Society, Data is better with Democracy.

Becoming a data-driven organisation remains one of the top strategic goals of many companies I work with.

Most are well aware of the benefits of becoming intelligently empowered: providing the best customer experience based on data and hyper-personalisation; reducing operational costs and time through data-driven optimisations; and giving employees superpowers with trend analysis and business intelligence.

They have been investing heavily in building enablers such as data and intelligence platforms. Despite increasing effort and investment in these enabling platforms, organisations find the results middling. Why is that so?

In this article I'll touch on why I think this is the case and how business leaders can solve it.

Monkey see and Monkey do

Here in Australia, organisations have been working diligently to stamp out lines of business having the control and freedom to make decisions and spend accordingly, mostly because at one point it was like the Wild West: hundreds of investments with poor integration, shadow IT popping up, siloed data, and security concerns were the norm.

This led domain architects to build or buy solutions that eventually became centralised, monolithic and domain-agnostic data platforms.

Essentially, we have moved away from data ownership that is specific to certain domains towards centralised data ownership that is domain-agnostic, and we have been very proud of creating the biggest monolith of them all: the big data platform.

This worked in the past, prior to the explosion of data and cloud adoption, but in today's world it has led to significant problems.

Centralised and monolithic "Big government" style ownership.

Unfortunately, while this centralised model can work for organisations with a small number of customer and consumer types, it fails for companies with many different types of customers and many sources of data. The more data becomes available everywhere, the harder it becomes to control all of it in one place. This is especially true for customer data: there are ever more sources of customer information, both inside and outside the organisation, and trying to store it all in one place limits our ability to use those diverse sources.


The Titanic Effect - Inability to move quickly

Organisations also need to experiment quickly, fail fast and learn from previous mistakes, which means there are ever more ways the platform's data can be used. This in turn means more transformations of the data (aggregates, projections and slices) are needed to satisfy organisations' needs. However, the long response time to satisfy data consumers has been a point of friction in the past and remains so in well-established data platform architectures such as data lakes and data warehouses.

Ironically, Siloed Ownership and Frustrated Users

Siloing data engineers away from the operational units is not sustainable. The platform's hyper-specialised teams have little understanding of their source domains, yet must serve a diverse set of needs, whether analytical or business-intelligence related. Without clear guidance on where to find domain experts, or on who grants access to consuming applications built on big data tooling like Spark, this separation only leads to suboptimal outcomes due to a lack of alignment across functions, both internally and externally.

Centralising data engineering creates disconnected source teams, consumers frustrated by fighting for a spot at the top of the data platform team's backlog, and an overstretched data platform team.


How Do We Evolve Past This?

As I have explained above, centralised, monolithic and domain-agnostic data platforms have generated a lot of learnings over the last decade or so, and from those learnings businesses are now realising how important it is to decentralise and democratise data, making it available everywhere and interconnected.

This is called Data Mesh.

Data Mesh emphasises data governance and data sharing across organisational silos. The data mesh approach encourages organisations to build data products that are relevant, meaningful, shareable, and governed by data policy.

A data mesh architecture includes a data hub, data proxies, data services, and a discovery layer. The data hub is the central catalogue for all data products. Data proxies provide access to data from disparate sources. Data services provide APIs for data access and management. The discovery layer aids in the discovery of data products and their underlying data sets. A data mesh provides a flexible, scalable way to manage data across an organisation, enabling it to better utilise its data assets and build better data products.
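To make those four components concrete, here is a minimal, purely illustrative Python sketch. The class names (`DataHub`, `DataProxy`), the addressing scheme and the methods are all hypothetical assumptions, not any real product's API; the point is only to show how a hub, proxies, service APIs and a discovery layer relate.

```python
from dataclasses import dataclass, field

@dataclass
class DataProxy:
    """Wraps access to one disparate source behind a uniform fetch() API."""
    source_name: str

    def fetch(self, query: str) -> list:
        # A real proxy would translate `query` for the underlying store;
        # here we just return a stub result.
        return [f"{self.source_name}: rows matching '{query}'"]

@dataclass
class DataHub:
    """Central catalogue of data products, searched by the discovery layer."""
    products: dict = field(default_factory=dict)  # address -> DataProxy

    def register(self, address: str, proxy: DataProxy) -> None:
        self.products[address] = proxy

    def discover(self, keyword: str) -> list:
        """Discovery layer: find product addresses matching a keyword."""
        return [addr for addr in self.products if keyword in addr]

    def query(self, address: str, q: str) -> list:
        """Data service API: route a query to the owning proxy."""
        return self.products[address].fetch(q)

hub = DataHub()
hub.register("sales/orders/v1", DataProxy("orders-db"))
hub.register("marketing/leads/v1", DataProxy("crm"))
print(hub.discover("sales"))   # ['sales/orders/v1']
print(hub.query("sales/orders/v1", "region = 'AU'"))
```

Note that even in this toy version, consumers never talk to the source systems directly; they discover an address and query through the hub's service API, which is the property that keeps the sources decoupled.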

Wait! Silos ... Isn't this full circle?

Much like some first-love couples who break up and end up together later in life, grown up and matured (hopefully happily ever after), building a data mesh takes the learnings of time and applies these principles:

  • Secure - In this world of decentralised, domain-oriented data products, access control is applied at a finer granularity: for each individual item within the product. It also means some form of global security control must be applied and standardised upon, to ensure threats can't spread across the network.
  • Addressable - Data products should have a unique address that follows a global convention, so users can find and use them easily. Different teams might use different conventions depending on how their data is stored and formatted, but to keep data easy to use, a standard for addressability should be developed. This makes it easier to find and access information in a polyglot environment.
  • Discoverable - A data product must be easily discoverable; fairly a no-brainer here. The main difference from traditional platforms is that previously a single platform collected data and used it for its own purposes; now, each domain provides its data in a way that is easily discoverable.
  • Interoperable - One of the most important things in a distributed data architecture is being able to correlate data between different domains. This lets you see how everything fits together and find insights. To do this, you need to follow certain standards so the data can be harmonised. This includes making sure fields are formatted the same way, identifying words that carry multiple meanings across domains, using the same address conventions, and having common metadata fields.
  • Self Describing - Good products don't need people to help them work: users can find them, understand them, and use them on their own. To make it easy for data engineers and data scientists to use your data, you need well-described semantics and syntax, along with sample datasets. Data schemas are a good way to provide that.
  • Trustworthy - People will not use a product they can't trust. In traditional data platforms it was acceptable to extract and onboard data containing errors; this is where the majority of centralised data pipeline effort is concentrated: cleansing data after ingestion. Instead, domain owners will need to set service level objectives (SLOs) and create some form of data quality indicators to ensure the trustworthiness and truthfulness of the data product.
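The principles above can be sketched as a simple data product descriptor that a domain fills in before publishing. This is a hypothetical illustration, one field per principle, not a real data mesh specification; field names and the validation rule are my own assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    address: str            # Addressable: unique name under a global convention
    owner_domain: str       # the domain accountable for the product
    schema: dict            # Self-describing: field name -> declared type
    tags: list              # Discoverable: keywords for the discovery layer
    access_policy: str      # Secure: who may read, at what granularity
    slo_freshness_hours: int  # Trustworthy: an SLO the domain commits to
    metadata: dict = field(default_factory=dict)  # Interoperable: shared fields

    def validate(self) -> bool:
        """Publishable only if every principle has been filled in."""
        return all([self.address, self.owner_domain, self.schema,
                    self.tags, self.access_policy,
                    self.slo_freshness_hours > 0])

orders = DataProduct(
    address="sales/orders/v1",
    owner_domain="sales",
    schema={"order_id": "string", "amount": "decimal", "placed_at": "timestamp"},
    tags=["sales", "orders"],
    access_policy="read: analytics-group",
    slo_freshness_hours=24,
    metadata={"currency": "AUD"},
)
print(orders.validate())  # True
```

The design choice worth noting is that trustworthiness and security are declared by the owning domain up front, rather than patched on by a central pipeline after ingestion.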

I don't know if I'm coining the acronym "SADIST" here; feel free to use it if you have a sense of humour.

What's important is that businesses invest in cross-functional skills and teams, implement policies and governance, and adhere to the guiding principles, to avoid going backwards.

Wrapping up, I'll show you what this paradigm shift looks like in the real world in the diagram below. What's key to note is that each domain has its own preferences and toolboxes, but all are interoperable, sharing data in one big cohesive web.

[Diagram: domain-owned data products, each with its own toolbox, interoperating and sharing data across the mesh]

Cloudera, where I work, specialises in bringing this together and enabling our customers to implement modern data architecture principles like Data Mesh.

I suggest you check out our site, or send me a message if you're interested in finding out more.

Thanks for reading. See you in the next one.
