Enabling data teams with DataOps Culture Code, Databricks Unity Catalog, and more

Prukalpa

Co-Founder at Atlan – Home for Data Teams | Forbes30 & Fortune40 lists | TED Speaker

Published: July 8, 2022

Welcome to this week's edition of the Metadata Weekly newsletter.

Every week I bring you my recommended reads and share my (meta?) thoughts on everything metadata! If you’re new here, subscribe to the newsletter and get the latest from the world of metadata and the modern data stack.

Now let’s dive into this week’s newsletter.

Databricks Unity Catalog and its role in making metadata more accessible!

I’m personally super excited about Databricks Unity Catalog because it tackles a hugely important problem in the industry: exposing Spark lineage at the column level. Spark lineage is an incredibly difficult problem that open-source projects like Spline had taken a shot at solving, but none of these projects had gotten it right.

With Unity Catalog, Databricks has made some major updates. The most important is automated real-time data lineage, which can track lineage across tables, columns, dashboards, notebooks, and jobs in any language inside the Databricks ecosystem. At Atlan, we’re excited to be a launch partner for Unity Catalog, which finally allows us to create an end-to-end lineage map for Databricks customers, directly from source to destination! (Learn more here)
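
To make "column-level lineage" a little more concrete, here is a minimal sketch of the idea in plain Python. It is not the Unity Catalog API or Atlan's implementation; the table and column names and both helper functions are made up for illustration. Lineage is modeled as a list of edges from an upstream column to a downstream column, and tracing a column back to its sources is a simple graph walk.

```python
from collections import defaultdict

# Hypothetical column-level lineage edges: (upstream column, downstream column).
# All table and column names here are invented for illustration.
EDGES = [
    ("raw.orders.amount", "staging.orders.amount_usd"),
    ("raw.orders.currency", "staging.orders.amount_usd"),
    ("staging.orders.amount_usd", "marts.revenue.total_usd"),
    ("raw.customers.id", "marts.revenue.customer_id"),
]


def build_upstream_index(edges):
    """Map each column to the set of columns it is directly derived from."""
    index = defaultdict(set)
    for upstream, downstream in edges:
        index[downstream].add(upstream)
    return index


def trace_upstream(column, index):
    """Return every column that feeds into `column`, directly or indirectly."""
    seen = set()
    stack = [column]
    while stack:
        current = stack.pop()
        for parent in index.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


index = build_upstream_index(EDGES)
sources = trace_upstream("marts.revenue.total_usd", index)
print(sorted(sources))
```

In a real setup the edges would come from a lineage or metadata API rather than a hard-coded list, but the "source to destination" map is the same kind of graph.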

In general, I’m super excited about more data tools opening up metadata APIs. ICYMI: Fivetran also recently announced its Metadata API for the Snowflake data cloud.

P.S. Check out more updates and all the major announcements from this year’s Data + AI Summit in our blog here.

Spotlight: The DataOps Culture Code

Earlier this week, Forrester announced their latest Wave for Data Catalogs for DataOps (and ICYMI, Atlan was the top right dot in the Wave, which was huge for us). This felt like a great time for a throwback to the DataOps culture code, the first internal “culture” document that we’d written for ourselves back in the day.

At Atlan, we started as a data team ourselves, on a quest to make ourselves as agile as we could. We borrowed the principles of Agile from product teams, DevOps from engineering teams, and Lean Manufacturing from supply chain teams. We experimented for two years across 200 data projects to create our idea of what makes data teams successful. We called this the “DataOps Culture Code” back then and outlined our core principles.

It’s a team sport, and collaboration is key

The data team is the most interdisciplinary in any organization. Data scientists, analysts, engineers, business users... diverse people, with diverse tools, skillsets, and DNA. Embrace diversity, and create mechanisms for effective collaboration.

Treat data, code, models, and dashboards as assets/products

Everything a data team produces, from code and models to data and dashboards, is an asset and should be treated as one.

Assets should be easily discoverable.
Assets should be maintained.
Assets should be easily reusable.

Optimize for agility

As business needs evolve rapidly, data teams need to be a step ahead, not deluged with three months of backlog. Constantly measure your team’s velocity, and invest in foundational initiatives to improve cycle times.

Reduce dependencies between business, analysts, and engineers.
Enable a documentation-first culture.
Automate whatever is repetitive.
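
One simple way to start measuring velocity is to track cycle time, meaning the number of days between a data request being opened and delivered. Here is a small sketch of that arithmetic; the request dates and the `cycle_times_in_days` helper are made up for illustration, and your own definition of "opened" and "delivered" may differ.

```python
from datetime import date
from statistics import median

# Hypothetical (opened, delivered) dates for three data requests.
REQUESTS = [
    (date(2022, 6, 1), date(2022, 6, 4)),
    (date(2022, 6, 2), date(2022, 6, 12)),
    (date(2022, 6, 10), date(2022, 6, 15)),
]


def cycle_times_in_days(requests):
    """Return the number of days each request took from open to delivery."""
    return [(delivered - opened).days for opened, delivered in requests]


days = cycle_times_in_days(REQUESTS)
print(f"cycle times: {days}")
print(f"median cycle time: {median(days)} days")
```

Tracking the median rather than the average keeps one unusually slow project from hiding an otherwise healthy trend.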

Create systems of trust

With the inherent diversity of data teams, it's all too easy to misunderstand other team members' roles. But that creates trust deficiencies — especially when things go wrong!

Intentionally create systems of trust in your team.

Make everyone’s work accessible and discoverable to break down "tool" silos.
Create transparency in data pipelines and lineage so everyone can see and troubleshoot issues.
Set up monitoring and alerting systems to proactively know when things break.
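
As one small example of proactive monitoring, here is a sketch of a freshness check that flags tables that have not been updated recently. The table names, timestamps, the 24-hour threshold, and the `find_stale_tables` helper are all assumptions for illustration; in practice the timestamps would come from your warehouse or metadata layer, and the alert would go to a chat channel or pager rather than `print`.

```python
from datetime import datetime, timedelta, timezone


def find_stale_tables(last_updated, max_age, now):
    """Return the names of tables whose last update is older than max_age."""
    return sorted(
        name for name, updated_at in last_updated.items()
        if now - updated_at > max_age
    )


# Hypothetical snapshot of when each table was last refreshed.
now = datetime(2022, 7, 8, 12, 0, tzinfo=timezone.utc)
last_updated = {
    "marts.revenue": now - timedelta(hours=2),
    "staging.orders": now - timedelta(hours=30),
}

stale = find_stale_tables(last_updated, timedelta(hours=24), now)
for name in stale:
    print(f"ALERT: {name} has not been refreshed in the last 24 hours")
```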

Create a plug-and-play data stack

The data ecosystem will evolve rapidly. The tools, technology, and infrastructure you use today will (and should) be different from the ones you use two years from now.

Your data stack should allow your team to experiment and innovate as technology evolves, without creating lock-ins.

Embrace tools that are open and extensible.
Leverage a strong metadata layer to tie diverse tooling together.

User experience defines adoption velocity

Employees at Airbnb famously said, "Designing the interface and user experience of a data tool should not be an afterthought."

Without a good user experience, the best tools or most thoughtful processes won't be adopted by your team. Invest in user experience, even for internal tools. It will define adoption velocity!

Invest in simple and intuitive tools.
Software shouldn't need training programs.

From my reading list

The Optimetricist by Stephen Bailey
Stakeholders: The Most Important Relationship for Analysts by Mikkel Dengsøe
Why Are We Still Struggling to Answer How Many Active Customers We Have? by Ben Rogojan
Data Product in Changing Environments: Rethinking and Updating Investments by Eric Weber
Why I Will Not Build My Next Data Platform Myself by Niels Claeys

I’ve also added some more resources to my data stack reading list. If you haven’t checked out the list yet, you can find and bookmark it here.

If you’re new here, check out the archive of this newsletter on Substack. I'll see you next week with some interesting stuff around the modern data stack.

See you next week!

P.S. Liked reading this edition of the newsletter? I would love it if you could take a moment and share it with your friends on social.

Raj Kosaraju

CIO at Maxil Technology Solutions Inc

2 years ago

Prukalpa, it is very interesting to know that you have learned the principles of Agile from product teams. You borrowed the principles of Agile from product teams, DevOps from engineering teams, and Lean Manufacturing from supply chain teams. This is very crucial and important for DevOps. It is a strategy for building, deploying, and maintaining software that creates agile methods for delivering new products, such as product features, faster than traditional delivery methods. A quick way to think about this concept is that DevOps is a combination of a development team (Dev) and an operations team (Ops).
