Faulty Data is Worthless: Why Data Observability is the Next Tech Breakthrough

By Oren Yunger and Glenn Solomon

Barr Moses was just 18 years old when she was appointed commander of a data analysis unit in the Israeli Air Force. Responsible for the professional and personal development of a group of soldiers at such a young age, Barr learned a lot about what it takes to support a team to produce its best results. Now, 15 years later, as co-founder of big data startup Monte Carlo, she’s putting those lessons to work to help enterprises find and fix their broken data. Monte Carlo has just closed a $25 million Series B round co-led by GGV Capital and Redpoint Ventures, bringing its total funding to $40 million, and is fast becoming the leader of the emerging data observability category.

When Barr was hatching the idea for Monte Carlo, she realized that just as the unity of a brigade will falter if any of its soldiers are unhappy or out of sync, an organization will break down if the data that fuels its tech infrastructure is faulty or unreliable. If data engineers are forced to rely on inaccurate data, their predictions and models will of course also be inaccurate, resulting in bad business decisions that in turn cost millions in lost revenue. In fact, poor-quality and inaccurate data cost companies more than $3 trillion per year.

Those who manage, process, and analyze data every day—data scientists and engineers, VPs of data, chief data officers—are well aware that “garbage in, garbage out” is the root cause of most faulty data-based decisions. They’re frustrated that they cannot do their jobs successfully simply because they aren’t provided with accurate, reliable data. Yet other technologists and leaders within a company—from software engineers to CTOs, CIOs, and CEOs—often don’t realize just how much poor data quality is hampering their businesses. And, unfortunately, though software engineers have long relied on application performance management (APM) platforms such as New Relic and Datadog to avoid downtime, data professionals have had no such tools to ensure data reliability.

That’s why Barr and her co-founder Lior Gavish (who also happens to be her husband) decided to start Monte Carlo in 2019. Monte Carlo’s mission is to accelerate the world’s adoption of data by minimizing data downtime: periods of time when a company’s data is partial, erroneous, missing, or otherwise inaccurate. Data downtime is especially damaging for data-driven organizations (which most successful companies are today) because they draw false conclusions from faulty data—and don’t even realize they’re doing so! Data downtime occurs for a variety of reasons, including unexpected schema changes and buggy code, and it is challenging to catch. It is usually spotted by an employee, or more embarrassingly, a customer or partner, who questions a finding in an analytics dashboard; data inconsistencies are often fixed far too late, after they’ve already spurred inaccurate conclusions. And even when a data engineer does realize that some data is “broken”, it’s time-consuming to understand the root causes of the inaccuracies and figure out how to fix them.
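
To make the idea concrete, here is a minimal sketch of one common data downtime signal: a freshness check that alerts when a table has stopped receiving new rows. This is purely illustrative, not Monte Carlo's implementation; the table and column names (events, loaded_at) and the 24-hour threshold are hypothetical, and the in-memory SQLite database stands in for a real warehouse connection.

```python
# A minimal freshness check: alert when a table has received no new rows
# for longer than an assumed staleness threshold.
import sqlite3
from datetime import datetime, timedelta, timezone

def hours_since_last_load(conn, table, ts_column):
    """Return hours elapsed since the most recent row landed in `table`."""
    row = conn.execute(f"SELECT MAX({ts_column}) FROM {table}").fetchone()
    if row[0] is None:
        return float("inf")  # empty table: treat as maximally stale
    last_load = datetime.fromisoformat(row[0])
    return (datetime.now(timezone.utc) - last_load).total_seconds() / 3600

# Demo against an in-memory database standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, loaded_at TEXT)")
stale = (datetime.now(timezone.utc) - timedelta(hours=30)).isoformat()
conn.execute("INSERT INTO events VALUES (1, ?)", (stale,))

staleness = hours_since_last_load(conn, "events", "loaded_at")
if staleness > 24:  # assumed threshold: no fresh data in a day is suspicious
    print(f"Possible data downtime: no new rows in {staleness:.1f} hours")
```

A check like this catches silently stalled pipelines, one of the cases described above where nobody notices until a dashboard looks wrong.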

In a world that’s becoming more and more data-driven, companies can no longer afford to base critical decisions on faulty data. Just one inaccurate data loop can result in a huge mistake, such as reporting an inaccurate number to Wall Street or mischarging a customer. Why is data often unreliable, and what can be done about it? Data is often unreliable because companies lack visibility into the architecture of their data stack. They can see, at a high level, what’s in their data lakes and databases using visualization software that runs on top of these structures. They see a crude map of upstream and downstream data and all the dependencies between these streams. But in a typical mid-sized company, more than 50 different people make changes to data on a daily basis, and visualization software can’t show all of these real-time changes or spot where discrepancies arise.

To solve this, enterprises must turn to data observability. The Monte Carlo Data Observability platform uses AI to provide a more accurate map of a company’s data. Its algorithms “listen” to data to spot inaccuracies and uncover any issues with freshness. For example, Monte Carlo could spot a schema change that doesn’t show up in visualization software because someone had accidentally or deliberately deleted a field. Monte Carlo’s software also relies on machine learning to understand how a company’s underlying data system normally operates, spotting deviations from that norm.
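
As an illustration of that last idea, here is a minimal sketch of a learn-the-norm check: it flags a table whose daily row count deviates sharply from its recent history, using a simple z-score. This is not Monte Carlo's actual algorithm; the row counts, window size, and three-sigma threshold are all assumptions, and a production system would use far richer models.

```python
# A simple "learn the norm, flag deviations" check over daily row counts.
# Illustrative z-score sketch only, not a production anomaly detector.
from statistics import mean, stdev

def is_volume_anomaly(history, today, threshold=3.0):
    """Flag today's count if it deviates more than `threshold` standard
    deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to establish a norm
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu  # flat history: any change is a deviation
    return abs(today - mu) / sigma > threshold

# Hypothetical daily row counts for one table over the past two weeks.
history = [10_120, 9_980, 10_300, 10_050, 9_870, 10_210, 10_140,
           10_060, 9_940, 10_180, 10_090, 9_910, 10_230, 10_110]
print(is_volume_anomaly(history, today=4_200))   # True: volume dropped sharply
print(is_volume_anomaly(history, today=10_075))  # False: within the normal range
```

In practice, checks of this kind would run continuously across many signals at once, such as freshness, volume, and schema, rather than on a single table's row counts.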

We invested in Monte Carlo’s Series A and Series B rounds because it solves a big problem: companies today cannot trust their data to be 100% reliable. Companies collect and store myriad types of data, from customer and sales data, to performance data on their products, to third-party data from outside sources. But simply having a ton of data won’t help companies make smart decisions; they need to be able to trust that all of it is accurate and fresh. Monte Carlo has already amassed multiple big-name customers, including some of Israel’s fastest-growing companies such as Hippo and Yotpo, as well as Eventbrite, Compass, and Mindbody in the US. And the company is growing quickly as more and more organizations realize that identifying spotty data is critical to business success.

Data is the fuel that will power the global economy in the 21st century, informing critical decisions in every industry, from retail and insurance, to real estate, healthcare, banking, government, and beyond. But you can’t make smart decisions based on faulty data. In the next few years, data reliability will become a common challenge that every organization must tackle, and data observability will hold the key to meeting it.

Erik Bloch

Super security nerd | Security Tinkerer | 30+ year vet of the cyberwars

4y

Why the startup I was at a few years ago failed. We “assumed” the data was available and actionable. Instead it was garbage in, garbage out.

Jake Makler

AI @ IBM | Advisor | Writer (jakemakler.com) | Dad

4y

Exciting company, exciting space, Oren Yunger!

Daniel Karp

General Partner @Cervin. Early-stage VC Investor in Infrastructure, Security, DevOps, OSS, Data and Cloud-native startups. Ex-Cisco M&A/VC, Microsoft Azure

4y

Congrats Lior Gavish, Barr Moses, Oren Yunger - category-defining indeed!
