Exit Modern Data Stack, Enter Modern Data Platform
Dagster Labs
Building out Dagster, the data orchestration platform built for productivity.
The Modern Data Stack has changed how organizations manage and leverage data. With its modular and specialized tools, it offers efficiency and scalability for various data tasks. However, the increasing complexity and interconnectedness of data pipelines have exposed limitations within the Modern Data Stack, and we've been seeing a shift toward the Modern Data Platform.
Before you get excited, it's worth knowing that you can't buy a Data Platform. Instead, a data platform is the culmination of all the tools, infrastructure, and data systems that exist within your business. Whether it's a well-designed platform or something else is a different question.
The Modern Data Stack emphasizes best-of-breed tools, and it has given us improved tooling and productivity, especially when contrasted with legacy solutions. Now that it has matured, we can also see some of the challenges that come with it.
The Need for a Holistic Approach
The UNIX philosophy of many best-of-class tools operating together felt like a breath of fresh air after the stagnation of legacy data tooling. But the UNIX metaphor quickly fell apart, because the tools in the modern data stack were too complex to interface with each other easily. Beyond that, there was little incentive for deep integrations: where tools connected with each other at all, the connections were often only surface-level. As a result, additional glue code was still needed to stitch together the pieces of the stack.
Being able to build, observe, and understand not only the components that make up the stack, but also all the interfaces between them, has become a critical part of a modern data platform. It is no longer sufficient to hope that systems interoperate. As business needs evolve and business logic becomes more complex, that integration itself becomes a core benefit of what a data platform offers.
Data Reliability, Quality, and Freshness
Data reliability and quality soon became critical foundations of a data platform. Automation has become far more ingrained across the entire business, from go-to-market activities in sales and marketing to customer-support ticketing and even in-product analytics. Being able to trust that the data being delivered meets quality SLAs has often meant adding yet another tool to the stack, one that may or may not integrate with the rest of the platform.
Instead, what is needed is the ability to execute arbitrary computations that assert the quality of the data and of the systems that drive it. This computation needs to run alongside the existing infrastructure that manages pipeline execution, so that it can introspect the data and halt any downstream processes when a failure occurs.
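To make the idea concrete, here is a minimal sketch in plain Python of quality checks that run inside the same loop that executes the pipeline, so a failed check stops everything downstream. It is not the API of Dagster or any other tool; the names Step, run_pipeline, DataQualityError, not_empty, and no_null_ids are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


class DataQualityError(Exception):
    """Raised when a check fails, so that no downstream step runs."""


@dataclass
class Step:
    # Hypothetical pipeline step: a computation plus the checks on its output.
    name: str
    compute: Callable[[Any], Any]
    checks: List[Callable[[Any], bool]] = field(default_factory=list)


def run_pipeline(steps: List[Step], data: Any = None):
    """Run steps in order, asserting quality after each one."""
    completed = []
    for step in steps:
        data = step.compute(data)
        for check in step.checks:
            # The check is arbitrary code with direct access to the data.
            if not check(data):
                raise DataQualityError(f"{step.name}: {check.__name__} failed")
        completed.append(step.name)
    return data, completed


def not_empty(rows) -> bool:
    return len(rows) > 0


def no_null_ids(rows) -> bool:
    return all(row.get("id") is not None for row in rows)
```

In this sketch, a step that extracts rows with a missing id would raise DataQualityError before a later load step ever runs, which is the halting behavior described above. A real platform would also record the check results rather than only raising.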
Data Cataloging and Collaboration
A lack of observability is often the weak point of the modern data stack. It can be difficult, if not impossible, to understand the state of a system. Many tools do not expose enough detail to show which assets are being operated on, or how healthy they are. Getting that information can require complex API calls or a set of onerous click-ops practices. Searching across tools in the stack is virtually impossible, so teams fall back on documentation that is soon outdated.
What is needed is a system that can both operate the data platform and introspect and expose the assets that make up the systems it operates.
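As a rough illustration of one system that both operates assets and exposes them, the sketch below keeps a small in-memory catalog in plain Python. The same object that runs an asset also records when it last ran and whether it succeeded, and it can be searched by name or description. AssetCatalog, AssetRecord, and their methods are hypothetical names, not part of any real tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass
class AssetRecord:
    # Hypothetical catalog entry: what the asset is and how it last behaved.
    name: str
    description: str
    compute: Callable[[], object]
    last_run: Optional[datetime] = None
    last_status: str = "never_run"


class AssetCatalog:
    """Runs assets and keeps searchable metadata about them in one place."""

    def __init__(self) -> None:
        self._assets: Dict[str, AssetRecord] = {}

    def register(self, name: str, description: str):
        # Decorator that adds a function to the catalog as a named asset.
        def decorator(fn: Callable[[], object]):
            self._assets[name] = AssetRecord(name, description, fn)
            return fn
        return decorator

    def materialize(self, name: str) -> object:
        # Operating the asset updates its health as a side effect.
        record = self._assets[name]
        record.last_run = datetime.now(timezone.utc)
        try:
            result = record.compute()
        except Exception:
            record.last_status = "failed"
            raise
        record.last_status = "healthy"
        return result

    def search(self, term: str) -> List[AssetRecord]:
        # Case-insensitive match on asset name or description.
        term = term.lower()
        return [
            r for r in self._assets.values()
            if term in r.name.lower() or term in r.description.lower()
        ]
```

Because the metadata is written by the code path that executes the asset, it cannot drift from reality the way hand-maintained documentation does. A production system would persist this state and track lineage as well.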
The Modern Data Platform
The Modern Data Platform is a deliberate system, built by data platform engineers, that addresses these pain points. As companies grow in complexity and make deeper, more automated decisions, moving away from a data stack and toward a modern data platform is a critical step on the journey from siloed data tools to an integrated ecosystem, one that enables organizations to manage data as a strategic asset. It empowers data teams to build productive and trustworthy data pipelines while promoting collaboration and innovation throughout the organization.