When Data Quality Fires Break Out, You're Always First to Know with Acryl Observe
Acryl Data
It’s the home stretch for your company’s quarterly financial reporting, just hours before 7 AM—when earnings are due to be released.
That’s when an eagle-eyed analyst spots an anomaly in the revenue data: the deferred_revenue column is missing for the last few days of the quarter. This means your company’s earnings report probably understates future revenue obligations.
Your Monday morning ruined. Yet again, it’s all hands on deck for you and your team.
At Acryl, we don’t just sympathize, we empathize. We’ve lived this too.
That’s why we built Acryl Observe.
Introducing Acryl Observe
Observe is a complete data observability solution integrated with Acryl DataHub. It helps you detect data quality issues as soon as they happen so you can address them proactively, rather than waiting for them to impact your business’s operations and services. And it integrates with major data warehouses, including Snowflake, BigQuery, Redshift, and Databricks.
But Acryl Observe is more than just detection. When data breakages do inevitably occur, it gives you everything you need to assess impact, debug, and resolve them fast, notifying all the right people with real-time status updates along the way.
Perhaps most important of all, Acryl Observe is built on Acryl DataHub, the leading data catalog. By unifying traditionally siloed tools and capabilities (Data Observability, Data Governance, and Data Discovery), the platform helps you reduce complexity, optimize costs, and increase the accessibility and adoption of data throughout your entire organization.
“We chose Acryl because we see the value of having both a data catalog and observability capabilities in one tool. Having data owners, maintainers, and consumers in one place streamlines incident management and allows for faster time to resolution.” — Olivier, Data Engineering Manager at Depop
Be the First to Know
When I was a data product engineer at LinkedIn, it was normal for data users to discover breakages before I did. Somehow, some way, the data teams were always the last to know. With Acryl Observe, you can begin to flip the script. Data teams can be the first to know, by configuring automated data quality checks for
It offers four types of pre-built checks out-of-the-box: Freshness Assertions, Volume Assertions, Column Assertions, and Custom SQL Assertions to address a broad range of needs spanning structural and semantic integrity.
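To make the four check types concrete, here is a minimal sketch of each one as a plain Python function. This is not Acryl Observe's API: the function names, thresholds, and the in-memory SQLite revenue table are all assumptions made for illustration, loosely modeled on the deferred_revenue scenario from the introduction.

```python
import sqlite3
from datetime import datetime, timedelta


def freshness_check(last_updated: datetime, now: datetime, max_age: timedelta) -> bool:
    """Pass if the table was updated within the allowed window."""
    return now - last_updated <= max_age


def volume_check(row_count: int, min_rows: int, max_rows: int) -> bool:
    """Pass if the row count falls inside the expected range."""
    return min_rows <= row_count <= max_rows


def column_check(conn, table: str, column: str, max_null_fraction: float) -> bool:
    """Pass if the share of NULLs in a column stays at or below a limit.

    Table and column names are interpolated directly, so this sketch
    assumes they come from trusted configuration, not user input.
    """
    total, nulls = conn.execute(
        f"SELECT COUNT(*), SUM(CASE WHEN {column} IS NULL THEN 1 ELSE 0 END) FROM {table}"
    ).fetchone()
    return total > 0 and nulls / total <= max_null_fraction


def custom_sql_check(conn, sql: str, expected) -> bool:
    """Pass if the first value returned by an arbitrary query equals `expected`."""
    return conn.execute(sql).fetchone()[0] == expected


# Hypothetical table: deferred_revenue is missing for the last two days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revenue (day TEXT, deferred_revenue REAL)")
conn.executemany(
    "INSERT INTO revenue VALUES (?, ?)",
    [("2024-03-27", 120.0), ("2024-03-28", 95.5), ("2024-03-29", None), ("2024-03-30", None)],
)

results = {
    "freshness": freshness_check(
        datetime(2024, 3, 31, 1, 0), datetime(2024, 3, 31, 6, 0), timedelta(hours=6)
    ),
    "volume": volume_check(conn.execute("SELECT COUNT(*) FROM revenue").fetchone()[0], 1, 1000),
    "column": column_check(conn, "revenue", "deferred_revenue", max_null_fraction=0.0),
    "custom_sql": custom_sql_check(
        conn, "SELECT COUNT(*) FROM revenue WHERE deferred_revenue < 0", 0
    ),
}
print(results)
```

In this sketch only the column check fails, which is the kind of signal that would have caught the missing deferred_revenue values before an analyst did.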
It also includes out-of-the-box anomaly detection, Smart Assertions, that helps cover your blind spots using AI models trained on your tables’ history.
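The idea of learning what is normal from a table's history can be illustrated with a much simpler stand-in than the AI models mentioned above. The sketch below flags a daily row count that sits too many standard deviations from the historical mean. The function name, the threshold of 3.0, and the sample counts are assumptions for illustration only, and this z-score rule is not how Smart Assertions are implemented.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical row counts for the past week.
daily_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]

print(is_anomalous(daily_row_counts, 1008))  # a typical day
print(is_anomalous(daily_row_counts, 400))   # a sharp drop in volume
```

A fixed threshold like this ignores seasonality and trends, which is one reason a production system would use richer models trained on each table's history.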
When something does go wrong, your team will be notified immediately, with alerts that reach you where you work—Slack, email and more. You can configure alerts, ensuring they’re sent to the right people at the right time, giving your team a chance to get ahead of data quality problems before they become major incidents.
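Getting alerts to the right people can be thought of as matching a failed check against a set of subscriptions. The sketch below shows one way that matching could work. The Subscription fields, the numeric severity scale, and the recipients are all hypothetical, nothing here reflects Acryl Observe's actual configuration model, and no message is actually sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    recipient: str       # hypothetical Slack channel or email address
    channel: str         # "slack" or "email"
    datasets: frozenset  # datasets this recipient wants to hear about
    min_severity: int    # only notify at or above this severity


def route_alert(dataset: str, severity: int, subscriptions):
    """Return (channel, recipient) pairs that should receive this alert."""
    return [
        (s.channel, s.recipient)
        for s in subscriptions
        if dataset in s.datasets and severity >= s.min_severity
    ]


subscriptions = [
    Subscription("#finance-data", "slack", frozenset({"revenue"}), min_severity=1),
    Subscription("oncall@example.com", "email", frozenset({"revenue", "orders"}), min_severity=3),
]

low = route_alert("revenue", 1, subscriptions)   # minor issue: owning team only
high = route_alert("revenue", 3, subscriptions)  # severe issue: on-call is included too
print(low)
print(high)
```

Separating the routing decision from delivery keeps the rule easy to test and lets the same logic feed Slack, email, or any other channel.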
In other words, no more harshly worded emails. Less time coordinating expensive backfilling efforts across many tables and reports. Your team can respond to these incidents before they disrupt the business, and ultimately win your organization’s trust. Time and time again.
One Tool for Triaging and Responding to Incidents
Anytime I was debugging a data quality issue at LinkedIn, my journey usually started with the catalog—DataHub.
Most of what I needed was already there: detailed lineage metadata; documentation; dataset ownership; compliance information; and recent profiling statistics. This was all useful context for me to begin the painstaking process of fixing bad data.
Even so, DataHub was just one of several tools. For example, DataHub didn’t have automated alerting capabilities. Today, you can use its Assertions framework to track custom data quality checks, but you have to do the work of reporting the results. And there is no mechanism for getting notified when these fail. You have to build and maintain these mechanisms yourself.
It also didn’t have incident-management features. To resolve data outages, our team bounced between scattered Slack threads for communication, Azkaban logs (Azkaban was our version of Apache Airflow), and DataHub for assessing impact and finding the relevant stakeholders.
This lack of centralization often left stakeholders waiting in the dark when major data incidents arose, and left our team scrambling to pick up the pieces instead of preventing issues altogether.
It was clear that in an ideal world, the catalog would have been the source of truth: the place where incidents are raised and resolved collaboratively, Slack threads are tracked, accountability is established, and real-time table statistics and lineage are surfaced. In short, it would be a central hub for all your data activities.
We built Acryl Observe on this vision – offering an end-to-end solution for resolving data quality incidents fast.
In combination with Acryl Cloud, Observe helps you stage an effective response to data quality breakages by providing:
With Observe, you get a central command center for triaging incidents and coordinating an effective response before things get worse.
A Bird’s Eye View of Your Data’s Health
One final capability I wish I had during my time in data engineering: a visual overview of the state and health of the data stack. At LinkedIn, my team and I sometimes felt as if we were flying blind given the sheer scope of data assets we were responsible for, reactively responding to the hottest fires of the day and failing to track or improve our situation as a whole. This meant the small but still meaningful problems often went untracked and unresolved indefinitely.
We built the Data Health Dashboard into Acryl Observe to make sure that other teams have a better experience than we did. This dashboard provides a real-time overview of the data quality and health of your entire ecosystem. This way you can see where the high priority fires are and begin to build a scalable, sustainable strategy for keeping things green across the board.
My favorite part of the dashboard is that data team members, and even individual stakeholders, can filter the dashboard views down to their specific areas of interest. This makes it easy for anyone to monitor the health of their region of the ecosystem and never lose track of lingering issues.
A Unified Solution
Altogether, Acryl Cloud gives you a central control plane for your data, unifying Data Discovery, Data Governance, and Data Quality in a single platform.
By unifying these related but traditionally siloed capabilities, you can remove your data team as a central bottleneck and empower your team to do more with less. Unification can also reduce conflict in shared context like data ownership and documentation, and alleviate the operational overhead (and cost!) of maintaining several different tools. The end result is a more effective, scalable, and sustainable way to manage your organization’s mission-critical data.
Discover the Acryl Observe Difference