Building a Strong Enterprise Data Governance Strategy with OneLake & Microsoft Purview

Data governance is a critical practice to ensure your data is discoverable, trustworthy, secure, and compliant across the enterprise. Microsoft Fabric introduces the OneLake Data Catalog as a centralized place to find and govern the data you own. OneLake Data Catalog provides built-in governance insights (via a Govern tab) and seamless integration with Microsoft Purview for unified governance and compliance.

Implementing Data Governance in OneLake Data Catalog

OneLake Data Catalog is part of Microsoft Fabric and offers two main views: Explore (for browsing data items) and Govern (for governance insights). The Govern tab provides at-a-glance insights into the governance posture of your Fabric data and recommends actions to improve it. Follow these steps to leverage OneLake Data Catalog’s governance features:

1. Open OneLake Data Catalog and Navigate to Govern Tab – In your Fabric portal, click the OneLake icon on the left navigation pane to open the OneLake Data Catalog. By default, the catalog opens on the Explore tab (showing all your data items). Switch to the Govern tab at the top to access governance insights. (The first time you open the Govern tab, it may take a moment to load as it prepares the initial governance report.)


Screenshot caption: use the OneLake icon (highlighted on the left) to open the catalog. The top menu includes Explore and Govern tabs – click Govern to view governance insights. The Explore view shows a list of data items and their details; the Govern tab instead displays governance metrics and recommendations.

2. Review High-Level Governance Insights – Once on the Govern tab, you’ll see a summary of your content’s governance status. This “at a glance” overview shows metrics such as how many items you own, how they are distributed by type, how many workspaces and domains they span, and how recently the data was refreshed. For example, you can quickly see the number of items of each type (reports, lakehouses, datasets, etc.), the percentage of your items with recent refreshes (to flag stale data), and how many items have descriptions or tags. These insights help identify potential issues, such as a large number of out-of-date items or assets lacking descriptions. (Insights are personalized to the data you own in Fabric – essentially the items visible under “My items” in the catalog.)
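The overview metrics in step 2 amount to simple aggregations over item metadata. The following is a minimal Python sketch of that idea, not how Fabric computes them. The item records, their field names (`type`, `workspace`, `domain`, `last_refresh`, `description`), and the 30-day "recent" window are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical item metadata; field names are illustrative, not a Fabric schema.
ITEMS = [
    {"name": "Sales LH", "type": "Lakehouse", "workspace": "Sales", "domain": "Commercial",
     "last_refresh": date(2025, 5, 28), "description": "Curated sales data"},
    {"name": "Sales Report", "type": "Report", "workspace": "Sales", "domain": "Commercial",
     "last_refresh": date(2025, 5, 30), "description": ""},
    {"name": "HR Model", "type": "SemanticModel", "workspace": "HR", "domain": "People",
     "last_refresh": date(2024, 11, 2), "description": "Headcount model"},
    {"name": "Old Report", "type": "Report", "workspace": "HR", "domain": "People",
     "last_refresh": date(2024, 9, 15), "description": ""},
]

def summarize(items, today, recent_days=30):
    """Return at-a-glance counts similar in spirit to the Govern tab overview."""
    cutoff = today - timedelta(days=recent_days)  # assumed definition of "recent"
    total = len(items)
    recent = sum(1 for i in items if i["last_refresh"] >= cutoff)
    described = sum(1 for i in items if i["description"].strip())
    divisor = total or 1  # avoid dividing by zero for an empty estate
    return {
        "total_items": total,
        "items_by_type": dict(Counter(i["type"] for i in items)),
        "workspaces": len({i["workspace"] for i in items}),
        "domains": len({i["domain"] for i in items}),
        "pct_recently_refreshed": round(100 * recent / divisor, 1),
        "pct_with_description": round(100 * described / divisor, 1),
    }

summary = summarize(ITEMS, today=date(2025, 6, 1))
print(summary)
```

In this sample, half of the items were refreshed in the assumed window and half carry a description, which is the kind of signal the overview is meant to surface.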

3. View Detailed Governance Reports – For more in-depth analysis, click the “View more” link on the Govern tab. This opens a page with all available governance insights, including specific sections on sensitivity label coverage and item metadata completeness. Here, OneLake provides interactive visuals and charts for various governance aspects of your data such as:

a. Your Data Estate – Overview of the content you’ve created: number of domains, workspaces, and items, an interactive data hierarchy (drill down from domain to individual items), item count by type, items by last refresh date (to spot stale items), and items by last access date (to find potentially unused data).

b. Sensitivity Label Coverage – What portion of your items have sensitivity labels applied, how labels are distributed, and lists of unlabeled items (by type, or recently refreshed/visited) which might pose a security risk if not properly labeled.

c. Discover, Trust, and Reuse – Insights into metadata that drives data trust, such as how many items are endorsed (Promoted, Certified, or Master Data), how many items lack endorsements, how many have descriptions, and usage of tags. For instance, you might see that only a small percentage of your datasets are certified, indicating an opportunity to increase certification for quality assurance.
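To make the drill-down hierarchy and the unlabeled-item lists concrete, here is a small Python sketch. It assumes a hypothetical flat list of item tuples (domain, workspace, name, type, label); none of these names come from a Fabric API, and the report itself may compute these views differently.

```python
from collections import defaultdict

# Hypothetical records: (domain, workspace, item name, item type, sensitivity label or None)
ITEMS = [
    ("Commercial", "Sales", "Sales LH", "Lakehouse", "Confidential"),
    ("Commercial", "Sales", "Sales Report", "Report", None),
    ("Commercial", "Marketing", "Campaign WH", "Warehouse", "General"),
    ("People", "HR", "HR Model", "SemanticModel", None),
    ("People", "HR", "Attrition Report", "Report", None),
]

def build_hierarchy(items):
    """Nest item names under domain -> workspace to support drill-down."""
    tree = defaultdict(lambda: defaultdict(list))
    for domain, workspace, name, _item_type, _label in items:
        tree[domain][workspace].append(name)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {domain: dict(workspaces) for domain, workspaces in tree.items()}

def label_coverage(items):
    """Return (percent of items labeled, unlabeled item names grouped by type)."""
    unlabeled = defaultdict(list)
    for _domain, _workspace, name, item_type, label in items:
        if label is None:
            unlabeled[item_type].append(name)
    labeled = len(items) - sum(len(names) for names in unlabeled.values())
    pct = round(100 * labeled / len(items), 1) if items else 0.0
    return pct, dict(unlabeled)

hierarchy = build_hierarchy(ITEMS)
coverage_pct, unlabeled_by_type = label_coverage(ITEMS)
print(hierarchy)
print(coverage_pct, unlabeled_by_type)
```

Grouping unlabeled items by type mirrors the report's "unlabeled items by type" list and gives a natural work queue for whoever applies labels.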

4. Identify Governance Gaps – Use the above insights to pinpoint areas for improvement. Common governance gaps include:

  • Stale or Outdated Data – Items not refreshed in a long time. The insights highlight items that haven’t been refreshed in over 4 months, as these are likely stale and possibly unused.
  • Missing Sensitivity Labels – Unlabeled items are flagged because data without a sensitivity label lacks the protection that labels provide and could be misused or shared inappropriately.
  • Lack of Endorsement – If most data assets are not endorsed (Promoted/Certified), users may not know which data is trustworthy. For example, a very low percentage of endorsed items indicates that the organization hasn’t reviewed or marked reliable datasets, which can reduce trust and encourage duplicate or unofficial data copies.
  • Incomplete Metadata – Many assets without descriptions, owners, or tags. Missing descriptions can lead to misunderstandings or misuse of data because users lack context.
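The four gap types above can be expressed as per-item checks. The sketch below is illustrative only: the `Item` class and its fields are hypothetical, and the 120-day cutoff is an approximation of the "over 4 months" rule mentioned in the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Approximates the "not refreshed in over 4 months" rule; the exact cutoff is an assumption.
STALE_AFTER = timedelta(days=120)

@dataclass
class Item:
    """Hypothetical item metadata record, not a Fabric API type."""
    name: str
    last_refresh: date
    sensitivity_label: Optional[str] = None
    endorsement: Optional[str] = None  # "Promoted", "Certified", "Master Data", or None
    description: str = ""

def find_gaps(item, today):
    """Return the list of governance gaps that apply to one item."""
    gaps = []
    if today - item.last_refresh > STALE_AFTER:
        gaps.append("stale")
    if not item.sensitivity_label:
        gaps.append("missing_label")
    if not item.endorsement:
        gaps.append("not_endorsed")
    if not item.description.strip():
        gaps.append("missing_description")
    return gaps

items = [
    Item("Sales LH", date(2025, 5, 20), "Confidential", "Certified", "Curated sales data"),
    Item("Old Report", date(2025, 1, 1)),
]
today = date(2025, 6, 1)
report = {item.name: find_gaps(item, today) for item in items}
print(report)
```

Keeping each check independent makes it easy to add or retire a rule without touching the others.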

5. Use Recommended Actions – Scroll down to the Recommended actions section on the Govern tab. Here you’ll find cards suggesting specific actions to improve your data’s governance posture. Each card targets an issue discovered in the insights. For example, you might see a card like “Increase the percentage of endorsed items” if few of your items are endorsed, or “Apply sensitivity labels to unlabeled items” if you have many unlabeled assets. Click on a recommendation card to see details: it will explain why the issue matters and provide a step-by-step “How to fix it” guide.

Example of a recommended action tooltip in OneLake Data Catalog. In this example, only 1% of the user’s items are endorsed, triggering a recommendation to increase the percentage of endorsed items. The pop-up explains why endorsement is important (endorsed items – marked as Master Data, Certified, or Promoted – increase trust and reuse, whereas lack of endorsement can lead to duplicate and low-quality data) and outlines how to fix it (e.g. review items without endorsements and request endorsement for them). This guided approach helps data owners take actionable steps to improve governance.
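One way to picture how insights turn into recommendation cards is a set of threshold rules over the coverage metrics. This is a sketch under assumptions: the card titles come from the examples above, but the thresholds and metric names (`pct_endorsed`, `pct_labeled`) are invented for illustration, since the source doesn't state how Fabric decides which cards to show.

```python
# Thresholds are assumptions for illustration; the values Fabric uses are not documented here.
RULES = [
    # (metric name, minimum acceptable percentage, card title)
    ("pct_endorsed", 50.0, "Increase the percentage of endorsed items"),
    ("pct_labeled", 80.0, "Apply sensitivity labels to unlabeled items"),
]

def recommend(metrics):
    """Return titles of cards whose metric falls below its threshold.

    A metric missing from `metrics` is treated as 0%, so its card is shown.
    """
    return [title for key, minimum, title in RULES if metrics.get(key, 0.0) < minimum]

cards = recommend({"pct_endorsed": 1.0, "pct_labeled": 92.0})
print(cards)
```

With 1% of items endorsed, as in the screenshot described above, only the endorsement card is returned.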

6. Leverage Built-in Governance Resources – The Govern tab also provides quick links to useful tools and learning materials to assist your governance efforts. In the “Get help with your data governance efforts” section, you’ll find:

  • Top solutions – Links to relevant Microsoft Fabric solutions for governance, compliance, and security (for example, links to set up Monitor for activity tracking, or to open the Purview Hub for advanced governance)
  • Read, Watch, and Learn – Curated documentation, tutorials, or videos on data governance topics, so you can dive deeper into best practices or specific features.

Use these resources as needed. For example, the Monitor hub is useful for tracking refresh failures or events across Fabric workspaces, and the Purview Hub can be used to see broader governance insights within Fabric (more on Purview integration below).

7. Viewing and Customizing the OneLake Governance Report – Behind the scenes, Fabric generates a “OneLake catalog governance report (automatically generated)” in your My Workspace the first time you open the Govern tab. This is a Power BI report and dataset (semantic model) that powers the Govern tab visuals. Every time you open or refresh the Govern tab, this report’s data is refreshed with the latest metadata from your Fabric items. If you want to view the full report directly, you can navigate to your My Workspace in Fabric and open the OneLake catalog governance report. This allows you to see all the governance metrics in a full-screen report experience (the same content that “View more” shows).

You can even customize or extend the report on your own terms: for example, create a copy of the report and edit that copy to add new visuals or tailor it to your needs. Note: Do not edit the original auto-generated report or its dataset – modifying the original can break the Govern tab’s functionality. Instead, always make a copy or build a new report off the provided dataset if you need a custom view.

Integrating OneLake Data Catalog with Microsoft Purview for Unified Governance

Microsoft Purview is a comprehensive suite of data governance, risk, and compliance solutions that works hand-in-hand with Microsoft Fabric to govern your entire data estate. OneLake Data Catalog (the Fabric catalog) is integrated with Purview to ensure that governance and compliance are unified across platforms. This integration allows you to manage Fabric data in the context of enterprise-wide policies and oversight. Key integration points include:

  • Unified Data Catalog and Metadata: Microsoft Purview’s Unified Data Catalog automatically incorporates your Fabric (OneLake) metadata and makes Fabric items discoverable in Purview. This means that assets you create in Fabric (such as lakehouses, datasets, warehouses, Power BI reports, etc.) will automatically show up in Purview with their metadata and relationships, via a live connection.
  • Sensitivity Labels and Information Protection: Microsoft Purview’s Information Protection integration allows Fabric to use the same sensitivity labels defined in Purview across all Fabric items.
  • Data Loss Prevention (DLP) Policies: Microsoft Purview’s DLP capabilities extend to Fabric to help prevent sensitive data exfiltration or mishandling. Currently, Purview DLP integration supports Fabric’s Power BI semantic models (datasets) in particular.
  • Audit and Activity Monitoring: All user activities in Microsoft Fabric are automatically logged to Microsoft Purview Audit.
  • Purview Hub in Fabric: To make this integration even more seamless for Fabric users, Microsoft Fabric includes a Purview Hub (preview feature) within the Fabric portal.

Viewing Governance Reports in Microsoft Purview

Microsoft Purview provides a rich set of built-in reports and dashboards to help governance stakeholders monitor the health of the data estate. These reports are part of the Purview Data Estate Insights experience, which is designed for roles like data stewards, officers, and administrators. If you have a Purview account set up (often accessible via the Azure portal or a Purview Studio link), you can view these governance reports as follows:

Access the Purview Portal – Open the Microsoft Purview governance portal. This could be through the Azure portal (by navigating to your Purview resource and clicking “Open Purview Studio”) or via a direct URL. Ensure you have the necessary Purview permissions (like a Data Curator or Reader role) to view insights.

Navigate to Data Estate Insights – In the Purview portal, look for an Insights or Data Estate Insights section. In the classic Purview interface, there is a dedicated “Data Estate Insights” blade. In the new unified Purview portal experience, these reports might be accessible within the Data Catalog area (often via an Insights tab or menu). Once you open the Insights section, you will see multiple categories of dashboards, typically organized as Health, Inventory & Ownership, and Curation & Governance.


This example dashboard provides an overview of data estate health and performance. At the top, summary metrics show Asset curation (what percentage of assets are fully curated with metadata), Asset ownership (how many assets have an owner assigned vs. no owner), and Catalog usage (monthly active users of the data catalog). The table below breaks down these metrics by collection, indicating for each data collection the number of assets, percentage with sensitive classifications, curation status, ownership assignment rate, percentage of assets without classifications, etc. The bar chart at the bottom shows asset curation over time. Purview’s built-in reports like this help data officers quickly identify governance gaps (e.g., assets with no owner or missing classifications) and track improvement over time.

Use Health Reports for High-Level Metrics – Under the Health category, you’ll find dashboards like Data Stewardship and Catalog Adoption. The Data Stewardship report (as shown above) highlights key performance indicators for data governance – for example, what percentage of assets are curated with complete metadata, how many assets have owners (vs. unassigned), and classification coverage. It also shows adoption metrics like weekly/monthly active catalog users and search queries, helping you gauge how well the organization is utilizing the data catalog.

The Catalog Adoption dashboard specifically focuses on usage: number of active users, number of searches, viewed assets, and top search terms, giving insight into how actively the catalog is used. These health reports help leadership understand the overall trajectory of your data governance – e.g., whether catalog usage is increasing and whether data curation efforts are paying off.
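The per-collection breakdown in the Data Stewardship report can be approximated by grouping asset records by collection and computing rates. The sketch below assumes hypothetical asset records; the `curated` flag stands in for whatever rule Purview uses to decide that an asset is fully curated, which the source doesn't specify.

```python
from collections import defaultdict

# Hypothetical asset records; "curated" is a stand-in flag, not a Purview field.
ASSETS = [
    {"collection": "Finance", "owner": "alice", "curated": True, "classifications": ["Credit Card"]},
    {"collection": "Finance", "owner": None, "curated": False, "classifications": []},
    {"collection": "Sales", "owner": "bob", "curated": True, "classifications": []},
    {"collection": "Sales", "owner": "carol", "curated": False, "classifications": ["Email"]},
]

def pct(part, total):
    """Percentage rounded to one decimal place."""
    return round(100 * part / total, 1)

def stewardship_by_collection(assets):
    """Compute asset count, curation rate, ownership rate, and unclassified rate per collection."""
    groups = defaultdict(list)
    for asset in assets:
        groups[asset["collection"]].append(asset)
    table = {}
    for name, rows in groups.items():
        n = len(rows)  # always at least 1, since groups are built from existing rows
        table[name] = {
            "assets": n,
            "pct_curated": pct(sum(1 for r in rows if r["curated"]), n),
            "pct_with_owner": pct(sum(1 for r in rows if r["owner"]), n),
            "pct_unclassified": pct(sum(1 for r in rows if not r["classifications"]), n),
        }
    return table

table = stewardship_by_collection(ASSETS)
for collection, row in table.items():
    print(collection, row)
```

Comparing rows side by side shows at a glance which collection needs owner assignment or classification work first.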

Review Inventory and Ownership Reports – Under Inventory & Ownership, a key report is Data Assets (inventory insights). This report provides a summary of your entire data estate inventory, broken down by sources and collections. You can see total asset counts, how assets are distributed by environment or source type (Azure SQL, Amazon S3, Fabric OneLake, etc.), and track changes like new assets, deleted assets, or stale assets over the last 30 days.
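Tracking new, deleted, and stale assets is essentially a comparison of two inventory snapshots. The following Python sketch illustrates the idea with hypothetical asset IDs. The staleness rule used here (not modified for 365 days) is an assumption, because the source doesn't define what Purview counts as stale.

```python
from datetime import date, timedelta

def inventory_changes(previous, current, today, stale_days=365):
    """Compare two snapshots that map asset id -> last-modified date.

    `stale_days` is an assumed definition of staleness for illustration only.
    """
    cutoff = today - timedelta(days=stale_days)
    return {
        "new": sorted(set(current) - set(previous)),
        "deleted": sorted(set(previous) - set(current)),
        "stale": sorted(a for a, modified in current.items() if modified < cutoff),
    }

# Hypothetical snapshots taken 30 days apart.
previous = {
    "sql.orders": date(2025, 5, 1),
    "s3.logs": date(2023, 1, 10),
    "sql.legacy": date(2022, 3, 3),
}
current = {
    "sql.orders": date(2025, 5, 20),
    "s3.logs": date(2023, 1, 10),
    "onelake.sales": date(2025, 5, 25),
}

changes = inventory_changes(previous, current, today=date(2025, 6, 1))
print(changes)
```

Set difference on the asset IDs gives the new and deleted lists directly, so the only policy decision left is the staleness cutoff.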

Examine Curation and Governance Reports – Under Curation & Governance, Purview offers reports on Glossary, Classifications, and Sensitivity Labels, among others. These help you measure how well your data is curated with semantic metadata:

  • Business Glossary report: shows the health and usage of your glossary terms (e.g., how many terms exist, how many are approved vs. draft, and how many assets are linked to glossary terms) – giving insight into whether business terminology is being applied to data.
  • Classifications report: details the types of classifications found in your scans and which assets have been classified.
  • Sensitivity Labels report: summarizes the sensitivity labels applied to assets and allows you to review labeled vs. unlabeled content. It often highlights how many files or data assets carry each label (e.g., Public, General, Confidential, Highly Confidential) and can list assets that were labeled by the scans.
  • Data Governance Health (overall scorecard): Purview also provides an overall Data Governance or Data Estate Health report. This is an aggregate view that combines various metrics (like percentages of assets classified, labeled, with owners, with glossary terms, etc.) into a high-level health score or overview.
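As an illustration of how several coverage percentages might roll up into a single health score, here is a weighted-average sketch in Python. The equal weights and the metric names are assumptions; the source doesn't describe the formula Purview actually uses.

```python
# Equal weights are an assumption; Purview's actual scoring formula is not described in the source.
WEIGHTS = {"classified": 0.25, "labeled": 0.25, "owned": 0.25, "glossary_linked": 0.25}

def health_score(coverage, weights=WEIGHTS):
    """Weighted average of coverage percentages (each 0-100), rounded to one decimal."""
    total_weight = sum(weights.values())
    weighted = sum(coverage[name] * weight for name, weight in weights.items())
    return round(weighted / total_weight, 1)

# Hypothetical coverage percentages for one data estate.
coverage = {"classified": 75.0, "labeled": 60.0, "owned": 90.0, "glossary_linked": 35.0}
score = health_score(coverage)
print(score)  # 65.0
```

Dividing by the sum of the weights keeps the score on a 0-100 scale even if the weights are later changed so that they no longer add up to 1.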

Drill Down and Export Insights – In all Purview insights reports, you can typically interact with the visuals. Selecting a segment (like a particular classification type or a specific domain/collection) will filter the data. If you spot a problem (say, 25% of assets have no classifications in a certain collection), you can click to get more details on those assets. Purview often provides a way to export these details or navigate directly to the Data Catalog listing of those assets for remediation. For instance, the insights might have a “View assets” link when you see a statistic like “100 assets with no owner” – clicking that could take you to the catalog view filtered to those 100 assets so you can assign owners.
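The drill-down and export flow described above (filter to the problem assets, then hand the list to someone for remediation) can be sketched in a few lines of Python. The asset records and field names are hypothetical, and this uses the standard `csv` module rather than any Purview export feature.

```python
import csv
import io

# Hypothetical asset records; field names are illustrative.
ASSETS = [
    {"name": "ledger", "collection": "Finance", "owner": None},
    {"name": "invoices", "collection": "Finance", "owner": "alice"},
    {"name": "leads", "collection": "Sales", "owner": None},
]

def assets_without_owner(assets, collection=None):
    """Filter to assets with no owner, optionally restricted to one collection."""
    return [
        a for a in assets
        if not a.get("owner") and (collection is None or a["collection"] == collection)
    ]

def to_csv(rows, fields=("name", "collection", "owner")):
    """Serialize rows to CSV text with a header line; missing owners become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

unowned_finance = assets_without_owner(ASSETS, collection="Finance")
csv_text = to_csv(unowned_finance)
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch free of file paths; swapping `io.StringIO()` for an open file handle would write the same content to disk.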

Best Practices for Effective Data Governance in OneLake and Purview

  • Organize Data by Domains (Use Fabric Domains)
  • Ensure Clear Data Ownership and Stewardship
  • Apply Sensitivity Labels and Classifications Consistently
  • Use Endorsements to Signal Data Quality
  • Improve Metadata: Descriptions and Tags
  • Regularly Prune and Refresh Data
  • Monitor Progress and Iterate
  • Foster a Data Culture of Accountability

By following these best practices, you’ll enhance the effectiveness of your data governance program. Microsoft’s documentation emphasizes a federated, collaborative approach to data governance – using central tools but engaging data owners throughout the organization.

OneLake Data Catalog and Microsoft Purview together provide the technical foundation for this: OneLake makes it easy for individual users to do the right thing (through built-in recommendations and metadata management), while Purview gives oversight and control at the organizational level.

Conclusion

In summary, OneLake Data Catalog in Microsoft Fabric offers a user-friendly way to implement data governance on the data you work with every day, providing insights and guided actions to improve data quality, security, and compliance. Its integration with Microsoft Purview ensures that this governance scales enterprise-wide – unifying your Fabric data with the rest of your data estate under common policies and oversight. By following the step-by-step instructions to use the OneLake Govern tab, taking advantage of Purview’s unified governance features, regularly viewing governance reports, and adhering to best practices, you can establish a strong data governance framework. This will not only help protect and manage data (meeting compliance requirements), but also boost confidence in your data – enabling your organization to unlock more value from it in a responsible way. Data governance is a journey, but with the right tools and practices, it becomes a natural part of your data culture. Happy governing!

More articles by Chandan Bilvaraj