Cloud Data Warehousing—Ware in the Cloud is best?
My goal in writing “Cloud Data Warehousing—Volume II: Implementing Data Warehouse, Lakehouse, Mesh, and Fabric” and, indeed, Volume I was to provide help and guidance to those trying to understand and implement cloud data warehousing. The audience is the managers and architects, perhaps new to data warehousing, who make the high-level decisions about their enterprise’s overall approach; the aim is to help them avoid the many pitfalls inherent in the current public discourse. With the many overlapping and contradictory definitions and terms in the market, it is exceedingly difficult to answer the seemingly simple question: “Which approach should I choose: warehouse, lakehouse, fabric, or mesh?”
The short answer is, of course, a consultant’s “it depends.” To arrive at a comprehensive answer to what it depends on, you need to weigh the wide range of architectural and technological considerations detailed in Volume I and Volume II. However, there is a shortcut that can begin to point you in a reasonable direction. Gartner’s Hype Cycle for Data Management, 2023 included some of the key considerations in its broad review of data management evolution. The figure below is adapted from their more comprehensive figure. It shows Gartner’s positioning of four ADPs: data lake, data fabric, data mesh, and data lakehouse. Data warehouse, whether cloud or otherwise, is omitted, perhaps being too far to the right on the plateau of productivity.
The categories on the time axis of the hype cycle are well known and largely self-explanatory.
Predictably, data lake is shown furthest along the evolutionary path. More surprising, perhaps, is the judgment (shown by the colour of the dot) that the plateau of productivity is still 2-5 years out, given data lake’s long history. However, the obstacles are well known: a severe lack of metadata and limitations in the design and governance of the data stores included.
Data fabric—the concept promoted by Gartner—although plunging into the trough of disillusionment, is shown as the next furthest travelled and rated as transformational. However, it is still seen as 5-10 years from maturity. The obstacles relate to active metadata management maturity, as well as the differing and immature metadata standards across the industry.
Data lakehouse and data mesh are both at the peak of inflated expectations, although Gartner’s judgments on them differ considerably. Lakehouse is seen to be 2-5 years from the plateau, and the fastest moving pattern. This reflects the relative maturity of, and vendor support for, many of the underlying technologies. Obstacles include a divergence of vendor focus areas on either the warehouse or analytics aims of the approach, as well as the challenge of complex warehouse scenarios.
Perhaps the most surprising aspect of the analysis relates to data mesh, defined by Gartner as an evolving “data management approach”. The report asserts that, despite its rapid growth in hype (moving from innovation trigger in 2022 to peak of inflated expectations in 2023), it is likely to be obsolete (shown by the red dot) before it reaches the plateau. This is based on the likelihood that its core capabilities will be subsumed into data fabric. However, data-as-a-product, a key component of data mesh, is highly rated by Gartner in the full report and clearly seen as having a life beyond data mesh.
Gartner’s Hype Cycle is only one approach to judging the viability and applicability of the different cloud data warehousing approaches. Further considerations can be found in “Cloud Data Warehousing—Volume II: Implementing Data Warehouse, Lakehouse, Mesh, and Fabric.” Of these, I find a close examination of your existing environment and current skills to be most strongly indicative of a preferred choice of approach.
This concludes my series of articles on Cloud Data Warehousing. Previous articles:
· https://www.dhirubhai.net/pulse/cloud-data-warehousinga-mist-upon-lakehouse-barry-devlin-qcgle
· https://www.dhirubhai.net/pulse/cloud-data-warehousingthe-sunny-skein-fabric-barry-devlin-vfuce
I hope you enjoyed them and found them enlightening.