What the recent Forrester Wave means for data catalogs
Prukalpa
Co-Founder at Atlan – Home for Data Teams | Forbes30 & Fortune40 lists | TED Speaker
A massive transformation — data cataloging now includes governance, quality, security, monitoring, and more
Quick announcement: Metadata Weekly now has over 11,000 subscribers across Substack and LinkedIn! I’m so thankful for all of you who read and support this newsletter, and I’m excited to keep writing about all things data and metadata.
In the last issue, I talked about why data catalogs are falling short today. Put simply, the modern data ecosystem and its users are more diverse than ever before, and metadata is itself evolving into big data. Whether they’re technical or universal, even the best data catalogs just can’t keep up, and companies still end up with widespread confusion and silos.
I’m not trying to be a Captain Negative, so what’s the solution? This is where some major news comes into play.
The Forrester Wave™: Enterprise Data Catalogs, Q3 2024 was just released. In this report, Forrester examined today’s most significant catalogs and emerged with plenty of thoughts about what it means to be a modern data catalog.
In today’s issue, let’s examine what we believe this Forrester Wave means for not just data cataloging, but also the data governance, quality, security, and observability categories.
A “massive transformation” in the data cataloging space
Most data people have firsthand experience with the “data wiki”, a data catalog that aims to inventory and document all of a company’s data. It’s expensive to buy, slow to set up, a pain to populate… and ultimately people just don’t want to use it.
For the last few years, analysts have focused on what to add to these traditional data catalogs to make them successful. Forrester talked about machine learning data catalogs in 2018 and 2020, then focused on data catalogs for DataOps in 2022. Meanwhile, Gartner moved from traditional “metadata management” in 2020 to focusing on active metadata in 2022.
And yet, none of these additions seem to have fixed the problem with data catalogs. That’s why Forrester just announced a major transformation in the way it thinks about Enterprise Data Catalogs.
“Like other data management sectors, enterprise data catalogs (EDCs) are witnessing a transformation driven by AI advancements, fragmented and complex data estates, accessibility needs, and strategic imperatives to harness data for competitive advantage. The exponential surge in the velocity, variety, veracity, and volume of data demands solutions that transcend traditional metadata repositories and technical user bases. Customers seek solutions that can bridge the gap between complex datasets, governance, business insights, and AI enablement. Vendors are offering intelligent solutions, integrated AI and ML to automate and enhance data discovery, semantics curation, impact analysis, quality assessment, among other catalog functionalities. They are also improving user experiences to cater to both technical and nontechnical users, thereby supporting the goal of data democratization and self-service.” – The Forrester Wave™: Enterprise Data Catalogs, Q3 2024 (emphasis added)
Let me highlight that: Forrester said that EDCs today need to “transcend traditional metadata repositories and technical user bases”. In other words, catalogs can’t just be data wikis for technical data people any more.
So what should a modern data catalog look like?
First, Forrester talked about how basic cataloging is no longer enough. Instead, EDCs need to automatically catalog, analyze, and govern your entire data ecosystem, from traditional databases to SaaS platforms, unstructured data, AI/ML repositories, and more.
“Advanced solutions offer features like automated metadata harvesting, cross-platform semantic mapping, policy enforcement, quality validation, and end-to-end lineage. This holistic approach ensures a complete view of all data assets, including AI/ML models, to enhance governance, compliance, and use across the organization.”
Second, this holistic approach can’t be powered by data stewards doing manual work. Instead, AI and automation are key to quickly rolling out catalogs and creating value with them. Note that this isn’t just about cataloging — it’s also about powering data governance and quality efforts, all within the catalog rather than in separate governance and quality tools.
“Modern solutions… offer advanced capabilities, including AI-assisted data discovery, generative AI (genAI) augmentation, ML-driven profiling, automated anomaly detection, predictive tagging, and proactive compliance reporting. These technologies are crucial for streamlining data governance, enhancing data quality, and unlocking actionable insights.”
Forrester then evaluated various cataloging tools based on what they deemed to be the key capabilities of a modern EDC. But instead of focusing only on the standard aspects of a data catalog (e.g. metadata management, data discovery, data lineage), they also expected capabilities from what we often think of as separate spaces and tools, such as data governance, security, and privacy. Here’s the list of evaluation criteria under “Current Offering” (emphasis is my own):
In short, Forrester is drawing a line in the sand, arguing that we are now witnessing a “transformation” in the data cataloging space, driven by GenAI, fragmented data estates, diverse user needs, and business-critical use cases. As a result, the best data catalogs can’t just be catalogs anymore. Instead, they should use AI and automation to take over other metadata-driven capabilities like data governance, security, observability, and monitoring.
This is a huge shift, but I think it’s a good one. The data space is incredibly fragmented these days, so if we can merge several different spaces and tools into one, it’s ultimately better for users.
I personally think of this new idea of the EDC, the catalog that’s more than just a catalog, as a unified control plane: a comprehensive layer that can manage context, governance, and compliance across diverse tools and for diverse users.
Recognition of the impact customers have with Atlan
Not to bury the lede but… we were named a Leader in The Forrester Wave™: Enterprise Data Catalogs, Q3 2024, with the highest scores across all vendors in the “Current Offering” and “Strategy” categories!
Atlan got the highest score possible in 15 criteria, including Data lineage; Governance, risk, and compliance (where we were the only company to score a 5/5); Adoption; and Deployment and time to value. The report recognized us as “an unparalleled partner” for organizations “aiming for democratization and AI-enhanced self-service to governed data”.
“Atlan differentiates itself with a personalized, AI-driven catalog, providing quick value… Atlan’s Third-Gen Data Catalog is quickly outpacing established players by adeptly anticipating and addressing strategic customer needs through automation. Atlan is a visionary player with a clear, ambitious goal: to become the data and AI control plane enabling complex business use cases.”
With the highest possible scores in criteria like Vision, Innovation, and Roadmap, we’re more confident than ever about our vision of building a data and AI control plane, powered by active metadata, with complete configurability, interoperability, and openness to power every data team in every industry, however unique and complex their need.
More from my reading list
Top links from last week:
Associate Director @ Capgemini Invent | Data & AI Strategy Consulting, Ex-Fractal, Infosys & Tech Mahindra
(6 months ago) Hi Prukalpa. Fantastic read. Thanks for sharing key insights in the data catalog space and on embedding GenAI to redefine or modernize how we approach data governance. I suggest also enhancing it by embedding a sustainability dimension. Are you available for a quick Zoom call to exchange ideas and share my experience in this space?
Marketing Leader | B2B SaaS GTM Advisor | GTM Dialogues Podcast | USA, Australia, India
(6 months ago) It's exciting seeing the development of a new category, or a new way of thinking about the category.
Empowering humans of data @ Atlan | Earlier: Healthcare Innovation; Political Strategy Consulting | Duke University
(6 months ago) I think the crux is clear -- the speed of change is dizzying. Modern data is at higher scales, more velocity, and the stack is increasingly complex! And AI has poured jet fuel on the fire! Governing this is a nightmare! And things have to change if EDCs are to stand a chance of truly helping data teams tame the madness. Can't keep up the old way -- not if they care about getting to actual outcomes anyway. And DEF can't do it by simply slapping *genai* on top of stuff either. Needs a fundamentally different mindset. Not slapping *automation* onto a static catalog, but an automation-first catalog. Not just claiming to integrate a disparate suite of products, but a truly integrated platform. Not bringing people in after the fact with a prettied-up UI but the same UX with 10x more clicks, but a fundamentally different UX that is integrated "in your workflows" from day 0. Excited for this time in the Enterprise Data Catalog era!
SVP Client Insights Analytics (Digital Data and Marketing) at Bank Of America, Data Driven Strategist, Innovation Advisory Council. Member at Vation Ventures. Opinions/Comments/Views stated in LinkedIn are solely mine.
(6 months ago) Thank you for sharing the insightful report. I completely agree with how AI can enhance cataloging and help find its place in the data ecosystem. Second, this needs to be part of the strategy itself and not an afterthought; those tendencies are still there, unfortunately.