Crazy or Genius? Ingest data to Microsoft Fabric / OneLake via Import Semantic Models for better performance than Dataflows
Jon Stjernegaard Vøge
Microsoft MVP | Data Speaker, Trainer & Consultant | Founder @ Fabric Symposium
Introduction
A Fabric feature which has flown a little under the radar (and which is still in Preview as of May 2024) is the ability to sync data from imported Semantic Models back to OneLake: Learn about Microsoft OneLake Delta table integration in Power BI and Microsoft Fabric - Power BI | Microsoft Learn
The integration writes data from all Import Tables in the Semantic Model to Delta Tables in OneLake, and once stored there, they may be accessed by all the usual means, for example by creating shortcuts to the tables from inside a Lakehouse, or by browsing the files in your desktop OneLake File Explorer.
This article will cover how to set up the integration, and at the end we’ll discuss if this is genius, or if it is crazy and you should just stick to Dataflows.
How to set up the Integration
First things first. You need to enable two Tenant Settings in your Fabric Admin Portal:
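As a side note: if you would rather verify the settings programmatically than click through the Admin Portal, a minimal sketch against the Fabric Admin REST API's List Tenant Settings endpoint could look like the one below. It assumes you have an admin-scoped access token, and the keyword filter is purely illustrative, so match it against the exact setting names in your tenant.

```python
# Minimal sketch (assumptions: you have Fabric administrator permissions and an
# access token for the Fabric API). Lists tenant settings and prints the ones
# that look related to OneLake / semantic models - the keyword filter is
# illustrative only.
import requests

TOKEN = "<fabric-admin-access-token>"  # e.g. acquired with MSAL / azure-identity

resp = requests.get(
    "https://api.fabric.microsoft.com/v1/admin/tenantsettings",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

for setting in resp.json().get("tenantSettings", []):
    title = setting.get("title", "")
    if "onelake" in title.lower() or "semantic model" in title.lower():
        print(f"{title}: enabled={setting.get('enabled')}")
```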
Next, I created a table in Dataverse which I want to sync to my OneLake:
I connected to the Dataverse table from Power BI Desktop and, without doing anything else, published it to a Fabric-enabled Workspace:
Opening the Model from the Workspace, I went into its settings and enabled the OneLake Integration setting:
After applying the new setting, data will be ingested into your OneLake whenever your Semantic Model is refreshed, either manually or on a schedule.
The data can subsequently be accessed by creating a Shortcut within a Lakehouse:
Or even by browsing the files using the OneLake explorer on your desktop:
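For completeness, the shortcut does not have to be created through the UI. Below is a minimal sketch using the OneLake Shortcuts REST API, where all the GUIDs, the shortcut name, and in particular the target path into the semantic model's OneLake folder are placeholders/assumptions rather than verified values:

```python
# Minimal sketch: create a OneLake shortcut in a Lakehouse pointing at a Delta
# table exported by the semantic model. All GUIDs, names and the target path
# are placeholders / assumptions - adjust to your own workspace and items.
import requests

TOKEN = "<fabric-access-token>"
WORKSPACE_ID = "<workspace-guid>"
LAKEHOUSE_ID = "<lakehouse-item-guid>"       # Lakehouse that receives the shortcut
SEMANTIC_MODEL_ID = "<semantic-model-guid>"  # model with OneLake integration enabled

body = {
    "path": "Tables",            # create the shortcut under the Lakehouse "Tables" area
    "name": "DataverseTable",    # how the shortcut will appear in the Lakehouse
    "target": {
        "oneLake": {
            "workspaceId": WORKSPACE_ID,
            "itemId": SEMANTIC_MODEL_ID,
            "path": "Tables/DataverseTable",  # assumed location of the exported table
        }
    },
}

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/items/{LAKEHOUSE_ID}/shortcuts",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=body,
)
resp.raise_for_status()
print("Shortcut created:", resp.json().get("name"))
```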
If I go ahead and alter a few values in the table in Dataverse (in this case deleting one value and editing another):
The updated values show when I query the shortcutted Lakehouse Table, but only after a refresh of the Semantic Model has occurred:
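If you want to script that refresh rather than click the button or wait for a schedule, a minimal sketch against the standard Power BI REST API could look like this (workspace and model IDs are placeholders):

```python
# Minimal sketch: trigger a refresh of the semantic model via the Power BI REST
# API so that the exported Delta tables (and hence the Lakehouse shortcut) pick
# up the latest Dataverse changes. IDs and token are placeholders.
import requests

TOKEN = "<powerbi-access-token>"
WORKSPACE_ID = "<workspace-guid>"          # the Fabric-enabled workspace
SEMANTIC_MODEL_ID = "<semantic-model-guid>"

resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}/datasets/{SEMANTIC_MODEL_ID}/refreshes",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"notifyOption": "NoNotification"},  # fire-and-forget; API returns 202 Accepted
)
resp.raise_for_status()
print("Refresh requested:", resp.status_code)
```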
Under the hood, the update is simply performed by creating new Delta Tables and deleting the old ones. There is no time travel or retention of old files with this integration:
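You can inspect this yourself from a Fabric notebook. The sketch below lists the files behind the table and checks the Delta history; it assumes the Lakehouse holding the shortcut is attached as the notebook's default Lakehouse, and the table name is a placeholder:

```python
# Minimal sketch (Fabric notebook): peek at the files and Delta history behind
# the shortcut table. Assumes the Lakehouse holding the shortcut is attached as
# the default Lakehouse; "DataverseTable" is a placeholder table name.
from notebookutils import mssparkutils

TABLE_NAME = "DataverseTable"

# List the Parquet files and _delta_log currently backing the table
for f in mssparkutils.fs.ls(f"Tables/{TABLE_NAME}"):
    print(f.name, f.size)

# After a refresh, the history of the recreated table should only reflect the
# latest version(s), since old files are not retained by the integration.
spark.sql(f"DESCRIBE HISTORY {TABLE_NAME}").show(truncate=False)
```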
How does the approach compare to that of using Dataflows?
While the authoring experience is similar, there are the usual differences between Dataflows and Power Query Desktop to take into consideration.
In addition, I would highlight the following Pros and Cons:
Pros
Refresh Performance
For comparison, I built a simple Dataflow Gen2 which connects to the same Dataverse table, with Query Staging disabled and Query Folding enabled, using the automatic settings to write to the same Fabric Lakehouse, but into a new table:
The Semantic Model took 8 seconds for the initial refresh, which also had to create the table in OneLake, and 7 seconds for the second refresh:
The Dataflow, on the other hand, took 17 and 20 seconds for the initial load and the subsequent refresh, respectively:
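If you want to reproduce the semantic model timings without reading them off the screen, they can also be pulled from the Power BI REST API's refresh history, as in the minimal sketch below (placeholder IDs; the Dataflow Gen2 durations I simply read from its refresh history in the UI):

```python
# Minimal sketch: read the semantic model's recent refresh history and compute
# durations, to reproduce the numbers above. IDs and token are placeholders.
from datetime import datetime
import requests

TOKEN = "<powerbi-access-token>"
WORKSPACE_ID = "<workspace-guid>"
SEMANTIC_MODEL_ID = "<semantic-model-guid>"

resp = requests.get(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}/datasets/{SEMANTIC_MODEL_ID}/refreshes?$top=5",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()

for refresh in resp.json()["value"]:
    if not refresh.get("endTime"):
        continue  # refresh still running, no duration yet
    start = datetime.fromisoformat(refresh["startTime"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(refresh["endTime"].replace("Z", "+00:00"))
    print(refresh["status"], round((end - start).total_seconds(), 1), "seconds")
```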
Capacity Units (CUs) & Costs
In terms of costs, as per the documentation, while the feature is in preview it only incurs costs (in Capacity Units) for the storage and compute of the exported model, whereas the actual export operation is not billed. I suppose this also hints that the export will eventually draw on your Capacity Units as well. However, a quick look at the Fabric Capacity Metrics App, specifically the Non-Billable line for OneLake, shows that the compute for the initial export was rather small in terms of CUs, and even smaller for the refresh I did 15 minutes later:
Looking at the numbers for the Dataflow, we see much higher CU consumption for both the initial refresh and the second refresh:
Cons
On the other side, what are the cons of this ingestion method?
Besides the obvious fact that the feature is in Preview, this approach also carries the unfortunate consequence that any data load and transformation logic written in Power Query is impossible to dissect and understand from within the Fabric environment.
Further, to make changes to the ingestion (e.g. adding new tables, adding a column), you will need to download the .pbix file and make your changes before republishing, or alternatively modify it using Tabular Editor.
What’s the Verdict? Crazy or Genius?
While the loss of logical transparency is a considerable downside, I am stoked about the potential for much higher performance and lower costs for low-code ETL into Fabric. So maybe it is worth that trade-off?
I haven’t tested enough to draw any final conclusions, especially considering the feature is still in Preview. But I hope the article inspires more people to test out the functionality, including probing its limits with larger volumes of data.