A few months ago, I had the opportunity to chat with Bob Muglia, former CEO of Snowflake, about his journey building one of the most defining data companies of the 21st century.
During our conversation, Bob shared some fun stories about Snowflake’s early days, waxed poetic about values-driven companies, and revealed his predictions for the future of the modern data stack.
Trends discussed included data sharing, knowledge graphs, domain-oriented ownership, and even data as a product, but one theme reigned supreme: the need for data trust. According to Muglia, the only way to operationalize data and truly put analytics to work is if you can trust it. And the only way you can trust it is if you have end-to-end standardization and observability.
Here are 4 reasons why I think 2022 will be the year of Data Observability:
- Data is *actually* becoming more democratized. For the past 10 years, thought leaders have talked a big game about the rise of data democratization, but manual tooling and siloed approaches have made it hard to scale. The good news? In 2021, we’re finally starting to make some progress. Data sharing has emerged as a capability for data-driven organizations, analytics engineers are bridging the gap between data collection and business intelligence, and codeless analytics tools are pushing data’s cognitive load downstream.
- Data engineering is becoming a first-class citizen. In a 2016 article, Maxime Beauchemin, creator of Apache Airflow, argued that data engineering was “the worst seat” on the data team because the job was hard and the respect from business stakeholders was minimal. Fortunately, the rise of automated tooling, the introduction of DevOps and agile principles to data engineering workflows, and a more codified set of responsibilities (managing data reliability, scale, infrastructure, etc.) are giving data engineers greater ownership over their narrative by allowing them to spend more time building and less time firefighting.
- Data pipelines are becoming productized. One of the buzziest topics of 2021 was the concept of “treating data like a product”: in other words, applying the same rigor and standards around usability, trust, and performance to analytics pipelines as you would to SaaS products. Under this framework, teams treat data systems like production software, complete with contracts and service-level agreements (SLAs), to measure reliability and ensure alignment with stakeholders.
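To make the SLA idea concrete, here is a minimal, hypothetical sketch of what a freshness check against a data SLA might look like. The six-hour threshold, function name, and timestamps are illustrative assumptions, not taken from any particular tool or vendor:

```python
from datetime import datetime, timedelta

# Hypothetical SLA: the table must have been refreshed within the last 6 hours.
FRESHNESS_SLA = timedelta(hours=6)

def check_freshness(last_updated: datetime, now: datetime) -> bool:
    """Return True if the table's last refresh meets the freshness SLA."""
    return now - last_updated <= FRESHNESS_SLA

# A table refreshed 2 hours ago passes; one refreshed 8 hours ago breaches the SLA.
now = datetime(2021, 12, 1, 12, 0)
print(check_freshness(datetime(2021, 12, 1, 10, 0), now))  # True
print(check_freshness(datetime(2021, 12, 1, 4, 0), now))   # False
```

In practice, checks like this run on a schedule against warehouse metadata, and a breach pages the on-call data engineer rather than silently feeding stale numbers to a dashboard.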
- Data is becoming increasingly tied to operations. Revenue and marketing teams have long relied on analytics to inform decision making, breeding an entirely new suite of business intelligence tools like Looker and Tableau, and expediting our migration to cloud data warehouses like Snowflake, BigQuery, and Redshift. Now, functional groups across the enterprise are relying on data not just to drive insights but also to power digital services and products. This means bigger budgets, more headcount, and greater executive visibility for data teams.
All this is to say: I can’t wait to see what the future holds for data observability. Regardless of what happens, one thing is for sure: we’re just getting started.
- The Year in Data & Analytics - 2021: (7-minute read) In his annual end-of-year trends summary, James Densmore, HubSpot’s Director of Engineering, Business Intelligence, highlights his take on the biggest movements in data analytics and engineering in 2021. On his list? The evolving data org structure, orchestration, and reverse ETL. (PS - be sure to read James’ new bite-sized O’Reilly book on building data pipelines - the perfect length for busy data leaders).
- GitLab’s Trusted Data Framework (13-minute read) This isn’t a particularly new resource, but GitLab’s public Data Team Platform docs are a must-read for anyone looking to brush up on their data hygiene. Specifically, their section on how they build and maintain a trusted data framework highlights the benefits of investing in a multi-layered data reliability stack.
- How to Choose the Right Structure for Your Data Team (10-minute read) Second perhaps only to the data mesh, team reporting structure is the most contested topic in data – and for good reason. Centralization leads to bottlenecks and slowdowns, while decentralization can introduce additional complexity and duplication. What’s a data leader to do? Greg Waldman, Senior Director of Business Intelligence at Toast, a newly public provider of point-of-sale software for restaurants, discusses the evolution of his data team, from centralized to decentralized and (nearly) back again, proposing a hybrid structure that marries the best of both worlds.
- How PepsiCo leverages data to make better business decisions (18-minute watch) In his presentation for Meta’s first-ever Data Observability Day, Vaibhav Kulkarni, VP of Engineering & Data Science at PepsiCo, walks through how his team designed their modern data stack, built on Snowflake, Tableau, and Databricks, to power their ROI engine and better understand the effects of their marketing and advertising initiatives. According to Vaibhav, the success of the platform relies on one simple rule: treat data quality like a first-class citizen.
- Why Is Treating Data Like a Product So Hard? (3-minute read) Earlier this year, Eric Weber, Senior Director of Experimentation and Causal Inference at Stitch Fix and popular LinkedIn influencer, launched a Substack dedicated to answering the question: how do we apply product management methodologies to data systems? Even though we know this is best practice, it doesn’t make it any less challenging (hence, the onslaught of VC capital being thrown at the problem). In this post, Eric offers a few reasons why, outlining important differences between data and software, including the focus on internal vs. external products and the difficulty around quantifying the value of data.