Building Trust in Data
Data pipe... covered bridges?

Much like how the Google Data Analytics course begins, "data, data everywhere" (I chose not to take this course)... Well, you get it, data really is everywhere.

And the Modern Data Stack is confusing. Two years ago, it looked like this:

[Image: the Modern Data Stack, Firstmark 2021]

Fear not, the Modern Data Stack is gradually consolidating.

[Image: the Modern Data Stack, Vertex Ventures 2022]

Beware, there are going to be a lot of buzzwords mentioned.

  1. Data Observability revolves around the monitoring of data pipelines.
  2. Data Mesh is an approach that removes centralization and eliminates the "data silos", so that data can be accessed, managed and shared; it treats data as a product, not as a by-product.

  • The Process
  • The Content
  • The Lineage, when it is actionable.

In their practical use, both are more related to the work of the data engineer.

Data engineers spend about 50% of their time maintaining the pipelines, trying to minimize data downtime. And time is money.

1 minute of downtime on AWS costs $200,000 at the very minimum.

Having reliable data is ideal. The reality is less than ideal. And which of the data roles is responsible for that?

In comes Data Governance. "Data governance is everything you do to ensure data is secure, private, accurate, available, and usable. It includes the actions people must take, the processes they must follow, and the technology that supports them throughout the data life cycle." (Google Cloud)

So, observability means discovery, resolution, and, most important in my mind, prevention. There's no point (although there is glamour) in just resolving issues. Prevention may take some effort and won't justify itself right away, but it will save time and money in the long run.
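To make "prevention" a bit more concrete, here is a minimal sketch of a check that runs before a batch moves downstream. Everything in it is my own assumption for illustration: the function name `check_batch`, the thresholds, and the idea of looking only at volume and freshness. It is not how Monte Carlo or any other platform does it.

```python
from datetime import datetime, timedelta, timezone


def check_batch(row_count, last_updated, now,
                expected_min_rows=1000, max_age=timedelta(hours=24)):
    """Return a list of problems; an empty list means the batch may proceed.

    The thresholds are hypothetical defaults, not recommendations.
    """
    problems = []
    # Volume: far fewer rows than usual often means an upstream job failed.
    if row_count < expected_min_rows:
        problems.append(
            f"volume: {row_count} rows, expected at least {expected_min_rows}"
        )
    # Freshness: data that has not been updated recently may be stale.
    if now - last_updated > max_age:
        problems.append(f"freshness: last updated {last_updated.isoformat()}")
    return problems


# Illustrative values only.
now = datetime(2023, 2, 1, 12, 0, tzinfo=timezone.utc)
healthy = check_batch(5000, now - timedelta(hours=2), now)
stale = check_batch(120, now - timedelta(days=3), now)

print(healthy)
print(stale)
```

The point of the design is that the batch is stopped before anyone builds a dashboard on it, instead of being fixed after a stakeholder notices.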

Data engineers don't need to be responsible for data quality. Data governance brings responsibility to the data producers within the domain data teams.

And data analysts can spend about 80% of their time cleaning and processing the data. Why do they need to do that? Because it's wrong. Don't draw insights from wrong data and present them to your stakeholders.

Data Quality: the data needs to be accurate, complete, consistent, valid, timely and unique.
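Some of these dimensions can be checked mechanically. Below is a rough sketch, assuming a made-up list of order records with hypothetical fields (`order_id`, `quantity`, `order_date`) and a hypothetical 30-day cutoff. It covers only completeness, validity, timeliness and uniqueness; accuracy and consistency usually need a second source to compare against, so they are left out.

```python
from datetime import date, timedelta

# Hypothetical required fields for an order record.
REQUIRED = ("order_id", "quantity", "order_date")


def quality_report(rows, today, max_age_days=30):
    """Count rows that fail each of four data quality checks."""
    report = {"incomplete": 0, "invalid": 0, "stale": 0, "duplicate": 0}
    seen_ids = set()
    for row in rows:
        # Complete: every required field is present and not None.
        if any(row.get(field) is None for field in REQUIRED):
            report["incomplete"] += 1
            continue  # the remaining checks need all fields
        # Valid: a quantity must be a positive number.
        if row["quantity"] <= 0:
            report["invalid"] += 1
        # Timely: the record is not older than the cutoff.
        if today - row["order_date"] > timedelta(days=max_age_days):
            report["stale"] += 1
        # Unique: each order_id should appear only once.
        if row["order_id"] in seen_ids:
            report["duplicate"] += 1
        seen_ids.add(row["order_id"])
    return report


# Made-up sample data, one row per kind of problem.
rows = [
    {"order_id": 1, "quantity": 3, "order_date": date(2023, 1, 20)},
    {"order_id": 1, "quantity": 3, "order_date": date(2023, 1, 20)},     # duplicate
    {"order_id": 2, "quantity": -5, "order_date": date(2023, 1, 25)},    # invalid
    {"order_id": 3, "quantity": None, "order_date": date(2023, 1, 25)},  # incomplete
    {"order_id": 4, "quantity": 1, "order_date": date(2022, 6, 1)},      # stale
]

report = quality_report(rows, today=date(2023, 2, 1))
print(report)
```

A report like this won't clean the data for you, but it tells the analyst where the 80% is going to go before the analysis starts.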

Garbage in = Garbage out.

Not to mention the data that never gets used and just sits in storage (that's the Dark Data).

Linking the data teams to ROI is not simple, but it is the direction of data: optimizing the spend and running metrics around it. Your organization needs to be confident that the data is trustworthy and actionable, and that it has a monetary value.

Disclaimer: I am a data analytics student. This is written from my point of view and is inspired in part by Monte Carlo, the Data Observability Platform.

Keren Henninger

Data Exploration | Analyst at the Crossroads | Supply Chain @ Abbott

(Not resources, but relevant for further exploration: "The Modern Data Stack Evolution - 2023 will be the year of consolidation" by Chris Tabb, from Data Day Texas 2023, and "What is Data Observability? 5 Key Pillars To Know In 2023" from Monte Carlo.)
