Enabling data teams with DataOps Culture Code, Databricks Unity Catalog, and more
Image credit: Atlan

Welcome to this week's edition of the Metadata Weekly newsletter.

Every week I bring you my recommended reads and share my (meta?) thoughts on everything metadata! If you’re new here, subscribe to the newsletter and get the latest from the world of metadata and the modern data stack.

Now let’s dive into this week’s newsletter.

Databricks Unity Catalog and its role in making metadata more accessible!

I’m personally excited about Databricks Unity Catalog because it tackles an important industry problem: exposing Spark lineage at the column level. Spark lineage is a notoriously difficult problem that open-source projects like Spline have taken a shot at solving, but none of them got it quite right.

With Unity Catalog, Databricks has made some major updates, the most important being automated real-time data lineage, which can track lineage across tables, columns, dashboards, notebooks, and jobs in any language inside the Databricks ecosystem. At Atlan, we’re excited to be a launch partner for Unity Catalog, which finally lets us create an end-to-end lineage map for Databricks customers, directly from source to destination! (Learn more here)
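To make the idea concrete, here is a minimal sketch of how a consumer might flatten a column-lineage response into upstream/downstream edges. The payload shape (`upstream_cols`, `downstream_cols`, and their fields) is an assumption modeled on Unity Catalog's lineage API; check your workspace's API reference for the exact schema.

```python
# Sketch: flatten a (hypothetical) column-lineage API response into edges.
# Field names below are assumptions, not confirmed Unity Catalog output.

def lineage_edges(response: dict, column: str) -> list:
    """Return (upstream, downstream) column edges for one fully-qualified column."""
    edges = []
    for up in response.get("upstream_cols", []):
        src = f'{up["catalog_name"]}.{up["schema_name"]}.{up["table_name"]}.{up["name"]}'
        edges.append((src, column))  # data flows from upstream into our column
    for down in response.get("downstream_cols", []):
        dst = f'{down["catalog_name"]}.{down["schema_name"]}.{down["table_name"]}.{down["name"]}'
        edges.append((column, dst))  # our column feeds each downstream column
    return edges

# Illustrative sample response for one column in a staging table.
sample = {
    "upstream_cols": [
        {"catalog_name": "main", "schema_name": "raw",
         "table_name": "orders", "name": "amount"},
    ],
    "downstream_cols": [
        {"catalog_name": "main", "schema_name": "marts",
         "table_name": "revenue", "name": "total"},
    ],
}

edges = lineage_edges(sample, "main.staging.orders_clean.amount")
```

An edge list like this is exactly what a catalog needs to stitch per-column hops into the end-to-end, source-to-destination lineage map described above.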

In general, I’m super excited about more data tools opening up metadata APIs. ICYMI: Fivetran also recently announced its Metadata API for the Snowflake data cloud.

P.S. Check out more updates and all the major announcements from this year’s Data + AI Summit in our blog here.

Spotlight: The DataOps Culture Code

Earlier this week, Forrester announced its latest Wave for Data Catalogs for DataOps (and ICYMI, Atlan was the top-right dot in the Wave, which was huge for us). This felt like a great time for a throwback to the DataOps Culture Code, the first internal “culture” document we wrote for ourselves back in the day.

At Atlan, we started as a data team ourselves, on a quest to make ourselves as agile as we could. We borrowed the principles of Agile from product teams, DevOps from engineering teams, and Lean Manufacturing from supply chain teams. We experimented for two years across 200 data projects to create our idea of what makes data teams successful. We called this the “DataOps Culture Code” back then and outlined our core principles.

It’s a team sport, and collaboration is key

The data team is the most interdisciplinary in any organization. Data scientists, analysts, engineers, business users... diverse people, with diverse tools, skillsets, and DNA. Embrace diversity, and create mechanisms for effective collaboration.

Treat data, code, models, and dashboards as assets/products

Everything a data team produces, from code and models to datasets and dashboards, is an asset and should be treated like one.

  • Assets should be easily discoverable.
  • Assets should be maintained.
  • Assets should be easily reusable.

Optimize for agility

As business needs evolve rapidly, data teams need to be a step ahead, not deluged with three months of backlog. Constantly measure your team’s velocity, and invest in foundational initiatives to improve cycle times.

  • Reduce dependencies between business, analysts, and engineers.
  • Enable a documentation-first culture.
  • Automate whatever is repetitive.

Create systems of trust

With the inherent diversity of data teams, it's all too easy to misunderstand other team members' roles, and those misunderstandings erode trust, especially when things go wrong!

Intentionally create systems of trust in your team.

  • Make everyone’s work accessible and discoverable to break down "tool" silos.
  • Create transparency in data pipelines and lineage so everyone can see and troubleshoot issues.
  • Set up monitoring and alerting systems to proactively know when things break.
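The last point above can be as simple as a freshness check that compares each table's last update time against an agreed SLA. Here is a minimal sketch; the table names and SLA windows are illustrative, not from the newsletter.

```python
from datetime import datetime, timedelta, timezone

# Illustrative freshness SLAs: how stale each table is allowed to get.
SLAS = {
    "orders": timedelta(hours=1),
    "daily_revenue": timedelta(hours=26),
}

def stale_tables(last_updated: dict, now: datetime) -> list:
    """Return the tables whose last update is older than their freshness SLA."""
    return [t for t, sla in SLAS.items() if now - last_updated[t] > sla]

# Example run with one fresh table and one stale table.
now = datetime(2022, 7, 1, 12, 0, tzinfo=timezone.utc)
updates = {
    "orders": now - timedelta(minutes=30),       # within its 1h SLA
    "daily_revenue": now - timedelta(hours=30),  # past its 26h SLA
}
alerts = stale_tables(updates, now)
```

In practice you would feed `alerts` into Slack or PagerDuty, but the point stands: a few lines of proactive checking means the team hears about breakage before the business does.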

Create a plug-and-play data stack

The data ecosystem will rapidly evolve. The tools, technology, and infrastructure you use today will (and should) be different from the tools you use two years later.

Your data stack should allow your team to experiment and innovate as technology evolves, without creating lock-ins.

  • Embrace tools that are open and extensible.
  • Leverage a strong metadata layer to tie diverse tooling together.

User experience defines adoption velocity

Employees at Airbnb famously said, "Designing the interface and user experience of a data tool should not be an afterthought."

Without a good user experience, the best tools or most thoughtful processes won't be adopted by your team. Invest in user experience, even for internal tools. It will define adoption velocity!

  • Invest in simple and intuitive tools.
  • Software shouldn't need training programs.

From my reading list

I’ve also added some more resources to my data stack reading list. If you haven’t checked out the list yet, you can find and bookmark it here.

If you’re new here, check out the archive of this newsletter on Substack.

See you next week with some interesting stuff around the modern data stack!

P.S. Liked reading this edition of the newsletter? I would love it if you could take a moment and share it with your friends on social.
