What is Mirroring in MS Fabric and why do I consider it the next big thing?
Nikola Ilic
I make music from the data (Data Mozart) | MVP Data Platform | O'Reilly Author | Pluralsight Author | MCT
A few weeks ago, my good friend Tom Martens asked me what my top 5 “not-so-obvious” features in Microsoft Fabric are. You know, we are not talking about Direct Lake, lakehouses, notebooks, etc., although I already covered most of these in my introductory Microsoft Fabric article. If you are completely new to Fabric, I encourage you to check that article first.
So, I sent my list to Tom, and then he sent me his picks. Aside from the fact that our choices matched 3 out of 5 (“Great minds always think in sync”, as Tom likes to say), the thing that made me smile is that one of the three “matching” features was Mirroring.
This is something I’ve been eagerly anticipating since it was announced, so I couldn’t wait to get my hands on it and check how it works in real life. Therefore, this article examines the feature from different angles, which should help you evaluate possible use cases and real-life implementations.
DISCLAIMER: At the moment of writing, this feature is still in public preview, which means that things can change before Mirroring becomes generally available.
Understanding the context
First things first. Before I show you how to leverage this feature in Microsoft Fabric, let’s explain the feature itself. And before that, we need to go one step back and examine the key logic behind Microsoft Fabric workloads, so that you understand the full context of Mirroring’s importance.
One of Fabric’s pillars is OneLake. Explaining OneLake is out of the scope of this article, but you can think of it as “OneDrive for all your organizational data”. All your organizational data should be stored in one central location (OneLake), and even if your data resides somewhere else (say, in ADLS Gen2, GCP, Dataverse, or AWS), you can create OneLake shortcuts and have this data available for processing by all Fabric analytical engines, just as if the data were stored physically in OneLake. So, no data copying or movement to Fabric: you access it via shortcuts, while OneLake manages all permissions and credentials. How cool is that! By the way, shortcuts were also on my top 5 features list…
I believe you are starting to get the idea :) Microsoft Fabric is being sold as a “unified end-to-end analytics solution”, so how could it be unified if your data is stored somewhere else and you then need to establish complex ETL/ELT processes to bring this data physically into Fabric?
“All right, this is cool, but I have my data stored in different sources, such as Snowflake, Azure SQL DB, or Cosmos DB… As far as I can see in your illustration above, shortcuts can’t be created to these data sources, right?”
Unfortunately, you’re right (at least, at this moment)! All these databases (and many others) rely on their proprietary storage formats, and Fabric is all about Delta format. Therefore, it’s not possible to simply create shortcuts to these data sources. So, welcome back to reality and the world of creating ETL/ELT pipelines to make this data available for analytics…
What if I told you that reality can be much better? So much better that you can have your data from Snowflake or Azure SQL DB available in Fabric in near real-time, without the need to build a single ETL/ELT process?!
Welcome to “mirrored reality”!
Remember the idea of the “unified experience”? That’s exactly what the Mirroring feature provides for data sources that don’t support shortcuts! You simply provide the connection details of the “mirrored” database, and after the initial snapshot has been created, data is synchronized in near real-time! Whenever someone performs an insert/update/delete on the source database, the change is automatically propagated to Fabric, so your users will always have the latest data available for their Fabric workloads!
Before I show how this works in real life, a little more theory :) (Thanks to Idris Motiwala from the Microsoft team working on the Mirroring feature for the clarification.) Mirroring uses special change feed technology in the background to write directly to OneLake, instead of creating “changed data” tables in the source database. Data stored in the database’s proprietary format is thereby “translated” to the Delta format and stored as Delta tables in OneLake.
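To make the change-feed idea more tangible, here is a minimal, purely illustrative Python sketch. The event shape, keys, and function name are my own assumptions, not Fabric’s actual format or API; the point is just the mechanism: after an initial snapshot, each captured insert/update/delete from the source is applied to the mirrored copy, keeping it in sync without any ETL/ELT job.

```python
# Conceptual sketch of change-feed replication (NOT Fabric's real API):
# each change event captured on the source is applied to the mirror.

def apply_change(mirror: dict, event: dict) -> None:
    """Apply a single insert/update/delete event to the mirrored table."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        mirror[key] = event["row"]
    elif op == "delete":
        mirror.pop(key, None)

# Initial snapshot of the source table, keyed by primary key
mirror = {1: {"name": "Alice"}, 2: {"name": "Bob"}}

# A stream of changes captured on the source database after the snapshot
change_feed = [
    {"op": "insert", "key": 3, "row": {"name": "Carol"}},
    {"op": "update", "key": 1, "row": {"name": "Alicia"}},
    {"op": "delete", "key": 2},
]

for event in change_feed:
    apply_change(mirror, event)

print(mirror)  # {1: {'name': 'Alicia'}, 3: {'name': 'Carol'}}
```

In the real feature, of course, the “mirror” side is a set of Delta tables in OneLake rather than an in-memory dictionary, and Microsoft handles the capture and translation for you.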
Once the data is in OneLake, you can do everything you do with “regular” Fabric workloads: query the mirrored data, or even write cross-database queries that combine data from the mirrored database, an existing Fabric warehouse, or a Fabric lakehouse.
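In Fabric you would express such a cross-database query in T-SQL with three-part names. As a rough stdlib analogy (this is SQLite, not Fabric), the sketch below attaches a second database, standing in for the mirrored source, and joins its table with a local one in a single query:

```python
import sqlite3

# Analogy only: "mirrored" plays the role of a mirrored database whose
# tables can be joined with local ones in one cross-database query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20)])

# Attach a second (in-memory) database and populate it
con.execute("ATTACH DATABASE ':memory:' AS mirrored")
con.execute("CREATE TABLE mirrored.customers (id INTEGER, name TEXT)")
con.executemany("INSERT INTO mirrored.customers VALUES (?, ?)",
                [(10, "Alice"), (20, "Bob")])

# One query spanning both databases, the way a Fabric warehouse query
# can span a mirrored database and a lakehouse/warehouse
rows = con.execute("""
    SELECT o.id, c.name
    FROM orders AS o
    JOIN mirrored.customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
print(rows)  # [(1, 'Alice'), (2, 'Bob')]
```

The design point is the same in both systems: once everything is addressable from one query surface, joining “remote” and “local” data needs no data movement.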
I hear you, I hear you… ”What if we want to leverage Direct Lake mode in Power BI reports? No chance we can include mirrored database data, right?” You couldn’t be more wrong! Don’t forget, the mirrored data is now in OneLake in the Delta format, so nothing prevents Direct Lake mode from reading it exactly as it reads Fabric’s “native” Delta tables! And it’s not only about Direct Lake: all Fabric capabilities, notebooks for example, can be leveraged over mirrored data.
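Why does the origin of the data not matter to Direct Lake? Because a Delta table is just a folder of Parquet files plus a `_delta_log` of JSON commits, and an engine only replays that log to find which files are currently “live”. The stdlib sketch below (with hand-written, hypothetical commit actions, heavily simplified from the real Delta protocol) illustrates that replay:

```python
# Simplified illustration of reading a Delta transaction log.
# Each commit is a list of actions; "add" makes a data file live,
# "remove" retires it. File names here are made up for the example.
commits = [
    [{"add": {"path": "part-0001.parquet"}},
     {"add": {"path": "part-0002.parquet"}}],
    [{"remove": {"path": "part-0001.parquet"}},
     {"add": {"path": "part-0003.parquet"}}],
]

live = set()
for commit in commits:           # replay commits in order
    for action in commit:
        if "add" in action:
            live.add(action["add"]["path"])
        elif "remove" in action:
            live.discard(action["remove"]["path"])

print(sorted(live))  # ['part-0002.parquet', 'part-0003.parquet']
```

Whether those Parquet files were written by a Fabric notebook or by the Mirroring change feed, the log looks the same, which is exactly why Direct Lake (and every other Fabric engine) treats mirrored tables like native ones.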
Mirroring in action
Let’s now dive deep and examine how Mirroring works in Microsoft Fabric.
First, Mirroring must be enabled within your Fabric tenant.
Once in the Data Warehouse experience in Microsoft Fabric, you can choose to create a mirrored database for Azure SQL DB, Snowflake, or Cosmos DB.
Once you select the desired database for replication, there are certain prerequisites to complete before the mirrored data shows up within your Fabric tenant. I won’t go into the details of setting everything up, since there is a great step-by-step tutorial on Microsoft Learn. Also, my fellow MVP Gilbert Quevauvilliers created a fantastic overview of how to quickly get up and running with a mirrored Azure SQL DB, which you can check here.
Things to keep in mind…
Let me briefly introduce some of the (in my opinion) key facts when it comes to mirroring:
What about SQL Server?!
I know that most of us are interested to hear whether mirroring supports SQL Server as a data source… I got this question at least 10 times during my Ask the Experts sessions at the Microsoft Fabric Community Conference in Las Vegas this March. As of today, SQL Server is NOT supported for mirroring, but according to the official Fabric release plan, it’s planned for this year! So, fingers crossed that we can soon have our on-prem SQL Server data easily replicated into Fabric.
Conclusion
In my opinion, mirroring was the missing piece of the “One Copy for All Data” puzzle, and I’m sure it will immensely help Fabric adoption across organizations that have already implemented and developed mature data platform solutions. Once Mirroring becomes available for on-prem data sources, such as SQL Server or Oracle, Microsoft Fabric will become a no-brainer choice for organizations looking to modernize their data estate.
Thanks for reading!
Comments
Data Engineering | Business Intelligence | Data Platform
5 months ago: Great article! I have seen quite a bit of difference in performance between mirrored tables (with a shortcut to a lakehouse) and actual managed Delta tables in my lakehouse. The diagnostic output from the notebook that runs transformations using several of the mirrored tables suggests optimizing the tables because of many small Delta files. But from what I can find online or in the documentation, nothing is mentioned about table maintenance. Have you had any experience with this?
Senior Technical Specialist at CitiusTech
8 months ago: Hi Nikola Ilic, currently Fabric detects my SQL Server on a VM as an Azure SQL instance and then says that mirroring from on-prem SQL isn’t currently supported. If we were to use SQL Server on a VM, why does it first treat it as Azure SQL and then give the above error? Thoughts?
BI Analyst
9 months ago: Does the DP-600 exam include questions about database mirroring in Fabric? Thank you for your support and guidance!
Data Engineering Group Manager at Avanade
10 months ago: Big fan of mirroring as well. Nikola Ilic, are you aware if anyone has tested and published more detailed stats between mirrored data sources? What is the expected/SLA latency for near real-time for the currently supported data sources? I’m curious about what is planned for how governance will be managed when it comes to GDPR compliance and similar requirements. I’m also excited to see how Microsoft Purview will complement these features.