Microsoft Fabric End-to-End Project — with Shortcut, Data Pipeline, and Dataflow.
Musili Adebayo
Data Engineer | Fabric Analytics Engineer | Data Analyst | Azure | Power BI | Python | SQL
Are you interested in exploring Microsoft Fabric? Then this article is for you.
Microsoft Fabric is an end-to-end analytics platform that lets you perform data ingestion, transformation, processing, real-time analytics, and dashboard reporting. It brings data engineering, data science, and data analytics together into a unified solution for enterprises.
The aim of this Microsoft Fabric project is to ingest the Wide World Importers data into a Lakehouse using four different methods (Shortcut, Data Pipeline, file upload, and Dataflow Gen 2), model the data, and build a Power BI report on top of it.
So, let's get started!
Prerequisites for this project
1. You need a Microsoft Fabric Trial account. Create one here.
2. You can get the Wide World Importer data used in this project here.
3. Open a free Azure account to create an Azure SQL Database, or alternatively use any other on-premises database of your choice.
Step 1: Create a New Lakehouse
I created a new Lakehouse called WWI_ProjectLakehouse from the Synapse Data Engineering experience in Microsoft Fabric. See the Microsoft documentation on creating a Lakehouse.
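If you like to verify things in code, a Fabric notebook attached to the new Lakehouse can list its contents. This is a minimal sketch assuming the default Fabric Spark runtime, where mssparkutils is preinstalled:

```python
# Minimal sanity check from a Fabric notebook attached to WWI_ProjectLakehouse.
# Assumes the default Fabric Spark runtime, where mssparkutils is available.
from notebookutils import mssparkutils

# Paths are relative to the attached (default) lakehouse; this only lists
# the Files section and changes nothing.
for item in mssparkutils.fs.ls("Files/"):
    print(item.name, item.isDir, item.size)
```

An empty listing is expected at this point; we start filling the Files and Tables sections in Step 2.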
Step 2: Getting Data into the Lakehouse.
After creating our WWI_ProjectLakehouse, we need to start getting data into it.
We will use four different ways of ingesting data into the Lakehouse: a Shortcut, a Data Factory pipeline, direct file upload, and Dataflow Gen 2.
Let's walk through each ingestion method in turn.
Step 2.1: Using a Shortcut to the Warehouse — the Fact Table (WWI_FactOrders).
I created a Data Warehouse called FabricProjectWarehouse. We need to get the WWI_FactOrders CSV data into the warehouse using any method you prefer; I used Dataflow Gen 2, selecting Import from Text/CSV File.
Now, let's create the Shortcut to the WWI_FactOrders data in our WWI_ProjectLakehouse.
It is important to know why we are using a Shortcut (a pointer to files stored elsewhere, without moving the original data): it avoids duplication, giving the Lakehouse access to the WWI_FactOrders data in the FabricProjectWarehouse without copying the data in.
To create a Shortcut, go to the Lakehouse we created earlier and click New shortcut (as seen in Image 1 in Step 1), or use the Get Data option. Select Internal sources > Microsoft OneLake, select our FabricProjectWarehouse, navigate to Tables, check WWI_FactOrders, click Next, then Create.
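Once the shortcut exists, it behaves like any other Lakehouse table. As a quick check (a sketch assuming a Fabric notebook with WWI_ProjectLakehouse attached, where a `spark` session is predefined), you can query it directly:

```python
# The shortcut table is queryable like a native lakehouse table, even though
# the underlying data still lives in FabricProjectWarehouse.
df = spark.sql("SELECT * FROM WWI_ProjectLakehouse.WWI_FactOrders LIMIT 10")
df.show()
```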
Step 2.2: Using a Data Factory Pipeline to Get the Order Details Data (WWI_OrderDetails).
Select New Data Pipeline from the Lakehouse (as seen in Image 1) or use the Get Data option, and create a Data Pipeline called WWI_pipeline_OrderDetails.
Please note that the WWI_OrderDetails data is being copied from my Azure SQL Database.
This opens the Copy data screen below, where we select Azure SQL Database as the source, create a connection, and set the connection settings and credentials as seen below.
After successfully connecting to my Azure SQL Database, a preview pane opens where I can select the source table dbo.WWI_OrderDetails, set the destination to my WWI_ProjectLakehouse, and load to a new table, which I renamed WWI_OrderDetails.
The last stage is to Review and Save. We can see the pipeline ran successfully and copied the data into my WWI_ProjectLakehouse.
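For comparison, the same copy can be done from a notebook with Spark's built-in JDBC reader. This is only a sketch: the server, database, and credentials below are placeholders, and in practice you would pull secrets from a secure store rather than hard-coding them.

```python
# Notebook alternative to the pipeline's Copy activity (sketch only).
# <your-server>, <your-database>, <sql-user>, <sql-password> are placeholders.
jdbc_url = (
    "jdbc:sqlserver://<your-server>.database.windows.net:1433;"
    "database=<your-database>;encrypt=true"
)

order_details = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.WWI_OrderDetails")
    .option("user", "<sql-user>")          # placeholder credential
    .option("password", "<sql-password>")  # placeholder credential
    .load()
)

# Land the data as a Delta table in the lakehouse, mirroring the pipeline output.
order_details.write.mode("overwrite").saveAsTable("WWI_OrderDetails")
```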
Step 2.3: Via Upload Files — 3 Dimension Tables from CSV.
In the Lakehouse, we need to upload the Wide World Importers Date, Categories, Customers, Shippers, and Products data, which come as CSV and Excel files, to the Files section of the WWI_ProjectLakehouse.
Remember, we already have our WWI_FactOrders and WWI_OrderDetails as tables, ingested via Shortcut and Data Pipeline respectively.
We are going to use Load to Tables to turn the Wide World Importers Categories, Date, and Shippers data (in CSV format) into tables in the WWI_ProjectLakehouse; a notebook equivalent is sketched below.
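For reference, here is roughly what the Load to Tables click does, expressed as a notebook sketch. The file names are assumptions based on this project; adjust them to match what you uploaded.

```python
# Rough notebook equivalent of "Load to Tables" for the three CSV dimensions.
# File names are assumptions; match them to the files in your Files section.
for name in ["WWI_Categories", "WWI_Date", "WWI_Shippers"]:
    df = (
        spark.read.format("csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .load(f"Files/{name}.csv")
    )
    # Delta is the default table format in a Fabric lakehouse.
    df.write.mode("overwrite").saveAsTable(name)
```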
Please note that Excel files cannot be loaded into tables the way our CSV files can. That is where Dataflow Gen 2 comes in to save the day, leading to our fourth ingestion method in Step 2.4.
Step 2.4: Using Dataflows to Load 2 Dimension Tables in Excel Format.
We will use the Get Data option in our Lakehouse to bring the Customers and Products Excel data into the WWI_ProjectLakehouse using Dataflow Gen 2. In the Lakehouse, select Get Data > Dataflow Gen 2, connect to the Excel file via a OneDrive link or a regular upload from your local computer, then Publish. In my case, as seen below, the WWI_Products data was uploaded and published; the same process was repeated for the WWI_Customers data.
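If you would rather stay in a notebook than use Dataflow Gen 2, pandas can read Excel files straight from the lakehouse mount. A sketch, assuming the files were uploaded to the Files section under the names shown (and that openpyxl, which pandas needs for .xlsx, is available in your runtime):

```python
# Notebook alternative to Dataflow Gen 2 for the two Excel dimensions.
# File names and the default-lakehouse mount path are assumptions.
import pandas as pd

for name in ["WWI_Customers", "WWI_Products"]:
    # /lakehouse/default/Files is where the attached lakehouse's Files
    # section is mounted inside a Fabric notebook.
    pdf = pd.read_excel(f"/lakehouse/default/Files/{name}.xlsx")
    spark.createDataFrame(pdf).write.mode("overwrite").saveAsTable(name)
```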
Now that we have successfully ingested all our data via various tools and methods, let's do some data modelling in Microsoft Fabric.
Step 3: Data Modelling.
From the WWI_ProjectLakehouse, switch to the SQL Analytics Endpoint as seen below and click on the Model tab to start your data modeling.
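Before wiring up relationships in the model view, it can be worth confirming that the keys actually match up. Here is a sketch of one such check; the column name (ProductID) is an assumption, so verify it against your own tables first:

```python
# Orphan check before modeling: every WWI_OrderDetails row should have a
# matching product. The ProductID column name is an assumption; verify it.
orphans = spark.sql("""
    SELECT COUNT(*) AS unmatched_rows
    FROM WWI_OrderDetails od
    LEFT JOIN WWI_Products p ON od.ProductID = p.ProductID
    WHERE p.ProductID IS NULL
""")
orphans.show()  # 0 unmatched rows means the join keys are clean
```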
Step 4: Data Visualization in Power BI.
Before we start building our visualization in the Power BI service within Microsoft Fabric, we first need to create some measures using New Measure. Then we can click New Report and start building our report dashboard in Power BI online.
NB: Due to limitations of the Power BI service within Fabric compared to Power BI Desktop, I created all my measures in the OrderDetails table. If your New Measure pane is greyed out like mine (as seen in the modeling picture above), go to Reporting > Manage Default Semantic Model to make it active.
After creating my measures, below is the dashboard report analyzing the Wide World Importers data, with insights into performance over the three-year period from 1996 to 1998.
Thank you for reading.
Please remember to subscribe and follow me for more insightful articles, and you can reach me via my socials below if you'd like to talk more. LinkedIn: Musili Adebayo | Twitter: Musili_Adebayo
P.S. Enjoy this article? Support my work directly.