The Coming Out Party for Data Observability, at Big Data LDN?
Wei Wen Chen
I write about data management, analytics, artificial intelligence and machine learning. Please connect with me and we will learn and grow together.
On a rainy Wednesday, September 20th, 2023, the Big Data LDN event held at Olympia London, UK, was brimming with energy and enthusiasm surrounding the idea of Data Observability. Big Data LDN, chaired by the brilliant Mike Ferguson, is the UK's leading free-to-attend data, analytics, and AI conference. It gave attendees the opportunity to discuss business requirements with over 180 leading technology vendors and consultants, hear from 300 expert speakers, and network with peers. Mike confirmed that 20,000 people were expected to attend over the course of 2 days, and it certainly felt that way.
Is this Data Observability's Coming Out Party?
IMHO the event marked a significant shift in the data landscape, with a dedicated "Data Observability and DataOps" theatre highlighting the growing importance and recognition of this domain. Could the Europeans be ahead of the US in embracing Data Observability?
During my presentation on "How to use Data Observability to improve your analytics and AI, while saving money and staying compliant" (DM me for the slides), I conducted an informal poll in which I asked the packed crowd,
"Who here has heard of Data Observability (DO)?"
Half of the attendees raised their hands, which I thought was very encouraging. However, when I asked who had deployed it, only one person said they had so far. I still found that fantastic, because so many people were there to gain an understanding and were interested in adopting it in the future.
Later on in the session I delved into Data Quality (DQ) implementations, with the question
How many here have implemented Data Quality?
and shockingly a mere 10% (estimated) of the audience responded affirmatively. I took away the possibilities as being:
In fact, the question we overheard most frequently at our booth was
What's the difference between Data Observability and Data Quality?
A quick aside: the evolution of Data Quality has been a fascinating journey. From its early days to its current state, ensuring accurate, consistent, and complete data has never been more important. In my previous article, "The History, Evolution and Future of Data Quality", I delved into the intricate details of this evolution and what the future holds.
But I digress. While both concepts aim to ensure the reliability and accuracy of data, they approach the challenge differently. Data Quality focuses on the correctness, consistency, and completeness of data. It's about ensuring that the data stored in databases and used in reports is accurate. Data Observability, on the other hand, is about having visibility into the entire data ecosystem. It's about understanding how data flows through systems, where it comes from, where it's going, and how it's being transformed along the way. In essence, IMHO, while Data Quality is about the "state" of the data, Data Observability is about the "journey" of the data.
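For the hands-on readers, here is a minimal Python sketch of that "state" versus "journey" distinction. It is an illustration only, not how any particular product works: the sample rows, the `completeness` function, and the `PipelineTrace` class are all names I made up for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical records; the field names are illustrative assumptions.
ROWS = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},  # incomplete value
    {"order_id": 3, "amount": 40.0},
]


def completeness(rows, column):
    """Data Quality view ("state"): share of rows with a non-null value."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


@dataclass
class PipelineTrace:
    """Data Observability view ("journey"): what happened at each step."""
    events: list = field(default_factory=list)

    def record(self, step, rows_in, rows_out):
        self.events.append({
            "step": step,
            "rows_in": rows_in,
            "rows_out": rows_out,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def rows_dropped(self):
        """Total rows lost across all recorded steps."""
        return sum(e["rows_in"] - e["rows_out"] for e in self.events)


def drop_incomplete(rows, column, trace):
    """A pipeline step that filters rows and reports what it did."""
    kept = [row for row in rows if row.get(column) is not None]
    trace.record("drop_incomplete", len(rows), len(kept))
    return kept


trace = PipelineTrace()
clean = drop_incomplete(ROWS, "amount", trace)
print("completeness of source:", round(completeness(ROWS, "amount"), 2))
print("rows dropped along the way:", trace.rows_dropped())
```

The first function answers "is this data good right now?", while the trace answers "what happened to the data on its way here?".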
Another significant difference is the velocity, variety and volume of data that can be handled by data observability compared with traditional data quality tools. In order to improve Trust in Data (see "How to Lego and Trust your Data with Data Observability"), data observability must support "shift left", whereby the monitoring and prevention of bad data entering the data pipelines occurs at the very start, well before traditional data quality and data governance even begin. This "shift left" means the volumes and velocity of the data being processed far exceed anything DQ and Governance products were architected for, since they were incubated almost 15 years ago.
Another topic in my session was the other elements of data reliability beyond traditional DQ, including schema drift, data reconciliation and more. I'll be writing more about those subjects in a future post.
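To make "shift left" a little more concrete, here is a small Python sketch of checking records at the point of ingestion and quarantining the bad ones before they enter the pipeline. The record fields, the rules, and the `validate` and `ingest` functions are assumptions for illustration, and a real deployment would run checks like these at far greater volume and velocity.

```python
def validate(record):
    """Return a list of problems; an empty list means the record may proceed."""
    problems = []
    if not isinstance(record.get("order_id"), int):
        problems.append("order_id must be an integer")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount must be a non-negative number")
    return problems


def ingest(records):
    """Split incoming records into accepted and quarantined at the source."""
    accepted, quarantined = [], []
    for record in records:
        problems = validate(record)
        if problems:
            # Keep the reasons with the record so someone can act on them.
            quarantined.append((record, problems))
        else:
            accepted.append(record)
    return accepted, quarantined


# Hypothetical incoming batch.
incoming = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": "x", "amount": 5},   # wrong type for order_id
    {"order_id": 3, "amount": -1},    # negative amount
]

accepted, quarantined = ingest(incoming)
print(len(accepted), "accepted,", len(quarantined), "quarantined")
```

The design point is simply that the check happens before the data lands anywhere downstream, rather than in a report weeks later.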
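As a small preview, here is a rough Python sketch of what schema drift detection and data reconciliation can look like in their simplest form. The schemas, the sample rows, and the `schema_drift` and `reconcile` functions are hypothetical names I chose for this example, not anything from a specific tool.

```python
def schema_drift(expected, observed):
    """Compare two {column: type} mappings and report what changed."""
    shared = set(expected) & set(observed)
    return {
        "added": sorted(set(observed) - set(expected)),
        "removed": sorted(set(expected) - set(observed)),
        "retyped": sorted(c for c in shared if expected[c] != observed[c]),
    }


def reconcile(source_rows, target_rows, amount_column):
    """Compare row counts and a column total between source and target."""
    source_total = sum(row[amount_column] for row in source_rows)
    target_total = sum(row[amount_column] for row in target_rows)
    return {
        "row_count_diff": len(source_rows) - len(target_rows),
        "amount_diff": source_total - target_total,
    }


# Hypothetical schemas: a column changed type and a new one appeared.
expected = {"order_id": "int", "amount": "float"}
observed = {"order_id": "str", "amount": "float", "currency": "str"}
print(schema_drift(expected, observed))

# Hypothetical tables: one row went missing between source and target.
source = [{"amount": 10}, {"amount": 20}, {"amount": 30}]
target = [{"amount": 10}, {"amount": 20}]
print(reconcile(source, target, "amount"))
```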
Big Data is Back, with a Vengeance
In my article, "Acceldata Blows Past 0.5 Exabytes of Data Observed Monthly as Enterprise Data Observability Accelerates", I delved deeper into the rapid adoption of Data Observability in enterprises and how the amount of data being observed is nearing Exabyte scale. That's certainly some big data.
Additionally, metadata now forms an increasingly large share of the pie, as the output and outcomes of Data Observability create significant amounts of metadata in the form of lineage, rules, alerts and events. All of this allows technologies such as AI and ML to uncover even more insight for recommendations, and eventually autonomous outcomes.
Bewgle is an AI company specializing in NLP (Natural Language Processing) and LLM (Large Language Models). Welcome to Acceldata, Bewgle team!
AI continues its Upward Momentum
Which leads nicely to the next topic I saw buzzing around the event.
Generative AI and vector databases were of course discussed ad infinitum. Just like at the Google Next 2023 event I recently attended, the hype was off the charts, fueled by the promise of futuristic business outcomes. Like others, I've written about these topics vigorously over the last 6 months (see "Data and AI form a Dynamic Duo for Futuristic Business Outcomes" and "Generative AI: Augmenting & Automating Intelligence with Human in the Loop").
But it's clear that we are only at the beginning, and Data Observability has a huge role to play. As I asked in my session, "Can Data Observability Save Us from AI Disaster?" Certainly with the Bewgle acquisition, Acceldata is now in the best position of any company to support enterprises looking to make the most out of AI, and to do it responsibly and safely.
Vector databases were also a hot topic. I met several startups looking to optimize how data is moved and loaded into these new databases, including the CEO of Superlinked at an event after-party. He described his offering as "ETL for Vector Databases".
Show me the Money!
Finally, my presentation also touched upon Spend Intelligence: uncovering waste and optimizing the use of popular tools like Snowflake and Databricks, whose usage could easily be optimized with data observability. (See "Time is Money. How Data Observability is Your Business’ 401k" to learn how.)
All this in Day One!
I'll close by saying that this event might well be the coming out party for Data Observability. The exhibit hall featured many companies offering such products, as well as vendors who were looking to partner. I originally thought of just writing a short post, but turned this into an article because I think this is a milestone in the history of data observability, and I wanted it to be officially recorded from my perspective.
Our team and I, both those attending the show and those back home in the US and Bangalore, couldn't be more excited about the space. It's why I joined Acceldata, and this is just the beginning!
Thanks for coming Ramon!