Microsoft Fabric End-to-End Project with Shortcut, Data Pipeline, and DataFlow
AI Generated Image


Are you interested in exploring Microsoft Fabric? Then this article is for you.

Microsoft Fabric End-to-End Project with Shortcut, Data Pipeline, and DataFlow (Image by Author)

Microsoft Fabric is an end-to-end analytics platform that allows you to perform data ingestion, transformation, real-time analytics, and dashboard reporting. It encompasses data engineering, data science, and data analytics, bringing a unified solution to enterprises.

The aim of this Microsoft Fabric Project is to:

  1. Create a New Lakehouse.
  2. Get data into the Lakehouse using various methods of data ingestion such as Shortcut, Data Pipeline, DataFlow, and File Upload.
  3. Perform Data Modelling.
  4. Finally, visualize the ingested tables (optional).

So, let's get started!

Prerequisites for this project

1. You need a Microsoft Fabric Trial account. Create one here.

2. You can get the Wide World Importers data used in this project here.

3. Open a free Azure account to create an Azure SQL Database or alternatively use any other on-prem database of your choice.

Step 1: Create a New Lakehouse

I created a new Lakehouse called WWI_ProjectLakehouse from the Synapse Data Engineering experience in Microsoft Fabric. See the Microsoft documentation on creating a Lakehouse.

Image 1:

Step 2: Getting Data into the Lakehouse.

After creating our WWI_ProjectLakehouse we need to start getting data into the Lakehouse.

We will be using 4 different ways of ingesting the data into the Lakehouse:

  1. Using a Shortcut to Warehouse — The Fact Table.
  2. Using a Data Factory Pipeline to get the OrderDetails Data.
  3. Via Upload Files — 3 Dimension tables in CSV Format.
  4. Using DataFlow Gen 2 for 2 Dimension tables in Excel Format.

Now, let's walk through each of the ingestion processes above.

Step 2.1: Using a Shortcut to Warehouse — The Fact Table (WWI_FactOrders Data).

I created a Data Warehouse called FabricProjectWarehouse. We need to get the WWI_FactOrders CSV data into the warehouse via any method you prefer. I used DataFlow Gen 2 to get the WWI_FactOrders CSV into the data warehouse, selecting Import from Text/CSV file as the source.

Creating a Warehouse

Now, let's create the Shortcut to the WWI_FactOrders data in our WWI_ProjectLakehouse.

It is important to know why we are making a Shortcut into our Lakehouse: a Shortcut points to data stored in another location without moving the original data. This avoids duplication while still giving us access to the WWI_FactOrders data in the FabricProjectWarehouse, without having to copy it into the Lakehouse.

To create a Shortcut, go to the Lakehouse we created earlier and click on New shortcut, as seen in Image 1 in Step 1, or use the Get Data option. Select Internal sources > Microsoft OneLake, select the FabricProjectWarehouse as the source, navigate to Tables, check WWI_FactOrders, click Next, then Create.
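
If you want to confirm the shortcut worked, you can query it from a Fabric notebook like any other Lakehouse table. This is a minimal sketch, assuming the notebook has WWI_ProjectLakehouse attached as its default Lakehouse and the table names used in this article; spark is the session the notebook provides.

```python
# Read the shortcut table from a Fabric notebook attached to WWI_ProjectLakehouse.
# The two-part name below assumes the Lakehouse and table names used in this article.
fact_orders = spark.read.table("WWI_ProjectLakehouse.WWI_FactOrders")

# The shortcut behaves like any other Lakehouse table, so a row count and schema
# confirm the warehouse data is reachable without having been copied.
print(fact_orders.count())
fact_orders.printSchema()
```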


Step 2.2: Using Data Factory Pipeline to get the OrderDetails Data (WWI_OrderDetails).

Select New Data Pipeline from the Lakehouse, as seen in Image 1, or use the Get Data option, and create a Data Pipeline called WWI_pipeline_OrderDetails.

Please note that the WWI_OrderDetails data is being copied from my Azure SQL Database.

This opens the Copy data screen below, where we select Azure SQL Database, create a connection, and set the connection settings and credentials as seen below.

Creating a Connection to Azure SQL Database and Setting the Connections and Credentials

After successfully connecting to my Azure SQL Database, a preview pane opens where I can select the source table dbo.WWI_OrderDetails, set the data destination to my WWI_ProjectLakehouse, and load to a new table, which I renamed WWI_OrderDetails.

Selecting the Database and Connecting to the Data Destination in the WWI_ProjectLakehouse.

The last stage is to Review and Save. We can see that the Data Pipeline Copy activity ran successfully and the data landed in my WWI_ProjectLakehouse.

Please look over the settings and connections and validate the Data Pipeline Copy activity.
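
The Copy activity needs no code, but if you prefer working in a notebook, roughly the same result can be achieved with a Spark JDBC read. This is only a sketch: the server, database, and credential placeholders are assumptions, and it relies on the SQL Server JDBC driver being available in the Fabric Spark runtime.

```python
# Hypothetical connection details -- replace with your own Azure SQL server,
# database, and credentials (ideally pulled from a secret store, not hard-coded).
jdbc_url = "jdbc:sqlserver://<your-server>.database.windows.net:1433;database=<your-database>"

# Read dbo.WWI_OrderDetails from Azure SQL over JDBC into a Spark DataFrame.
order_details = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.WWI_OrderDetails")
    .option("user", "<sql-user>")
    .option("password", "<sql-password>")
    .load()
)

# Persist it as a Delta table in the Lakehouse, mirroring what the Copy activity does.
order_details.write.mode("overwrite").saveAsTable("WWI_OrderDetails")
```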

Step 2.3: Via Upload Files — 3 Dimension Tables from CSV.

In the Lakehouse, we need to upload the WorldWideImporters Date, Categories, Customers, Shippers, and Products data, which are in CSV and Excel files, to the Files section of the WWI_ProjectLakehouse.

Remember that we already have our WWI_FactOrders and WWI_OrderDetails in Tables, which were ingested via the Shortcut and the Data Pipeline.

We are going to use Load to Tables to load the WorldWideImporters Categories, Date, and Shippers data, which are in CSV format, into Tables in the WWI_ProjectLakehouse.

Loading Data Uploaded via Files in Lakehouse to Tables.

Please note that Excel files cannot be loaded into Tables this way, unlike our CSV files. That is where DataFlow Gen 2 comes in to save the day, leading to our Step 2.4 method of data ingestion.
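
Before moving on to the Excel files, here is roughly what Load to Tables does for one of the CSV files, expressed as a Fabric notebook sketch. The file name is an assumption; adjust it to whatever you uploaded to the Files section.

```python
# Read one of the uploaded CSV files from the Files section of the Lakehouse.
# The file name is an assumption -- use the name of the file you uploaded.
categories = (
    spark.read
    .option("header", True)
    .option("inferSchema", True)
    .csv("Files/WWI_Categories.csv")
)

# Save it as a Delta table so it shows up under Tables, just like Load to Tables does.
categories.write.mode("overwrite").saveAsTable("WWI_Categories")
```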

Step 2.4: Using Dataflows to Load 2 Dimension Tables in Excel Format.

We will use the Get Data option in our Lakehouse to get the Customers and Products Excel data into the WWI_ProjectLakehouse using DataFlow Gen 2. For this, in the Lakehouse, select Get Data > DataFlow Gen 2, connect to the Excel file via a OneDrive link or a regular upload from your local computer, then Publish. In my case, as seen below, the WWI_Products data was uploaded and published. The same process was repeated for the WWI_Customers data.

Using Dataflows to Load Products and Customers Dimension Tables in Excel Format to the Lakehouse.
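
If you would rather script this step than use DataFlow Gen 2, a notebook can read the Excel file with pandas and persist it as a Lakehouse table. This is a rough sketch under a few assumptions: the Excel file has been uploaded to the Files section, the path and sheet shown are placeholders, and the openpyxl engine that pandas needs for .xlsx files is available in the runtime.

```python
import pandas as pd

# Read the Excel file from the Files section via the local Lakehouse mount.
# The path and sheet are assumptions -- adjust them to your own upload.
products_pd = pd.read_excel("/lakehouse/default/Files/WWI_Products.xlsx", sheet_name=0)

# Convert to a Spark DataFrame and save it as a Delta table in the Lakehouse,
# which is what the published DataFlow Gen 2 achieves through the UI.
products = spark.createDataFrame(products_pd)
products.write.mode("overwrite").saveAsTable("WWI_Products")
```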


Now that we have successfully ingested all our data via various tools and methods, let's do some Data Modelling in Microsoft Fabric.

Step 3: Data Modelling.

From the WWI_ProjectLakehouse, switch to the SQL Analytics Endpoint as seen below and click on the Model tab to start your data modeling.

A snowflake data model in the SQL Analytics Endpoint of the Lakehouse
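
Once the relationships are in place, it is worth sanity-checking that the fact and dimension tables actually join on the keys you modelled. The query below is only an illustration: the column names are assumptions, so swap in the actual key columns from your WideWorldImporters files.

```python
# Sanity-check the model: join the order details data to the Products dimension.
# Column names (ProductID, ProductName, Quantity) are assumptions -- use your own keys.
top_products = spark.sql("""
    SELECT p.ProductName, SUM(od.Quantity) AS TotalQuantity
    FROM WWI_OrderDetails AS od
    JOIN WWI_Products AS p
      ON od.ProductID = p.ProductID
    GROUP BY p.ProductName
    ORDER BY TotalQuantity DESC
    LIMIT 10
""")
top_products.show()
```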

Step 4: Data Visualization in Power BI.

Before we start building our visualization within the Power BI service online in Microsoft Fabric, we first need to create some measures using the New Measure option. Then we can click on New Report and start building our report dashboard in Power BI online.

NB: Due to the limitations of the Power BI service within Fabric compared to Power BI Desktop, I have created all my measures in the OrderDetails table. If your New Measure option is greyed out like mine, as seen in the modeling picture above, go to Reporting > Manage Default Semantic Model to make it active.
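
The measures themselves are written in the Power BI measure editor, but you can cross-check the figures they should return from a notebook. The aggregation below is a sketch of typical order-value logic; the column names (Quantity, UnitPrice, OrderDate) and the join key are assumptions, so align them with your own tables before trusting the numbers.

```python
# Cross-check the dashboard's yearly sales figures straight from the Lakehouse tables.
# Column names and the join key are assumptions -- match them to your own schema.
sales_by_year = spark.sql("""
    SELECT YEAR(o.OrderDate) AS OrderYear,
           SUM(od.Quantity * od.UnitPrice) AS TotalSales
    FROM WWI_OrderDetails AS od
    JOIN WWI_FactOrders AS o
      ON od.OrderID = o.OrderID
    GROUP BY YEAR(o.OrderDate)
    ORDER BY OrderYear
""")
sales_by_year.show()
```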

After creating my measures, below is the dashboard report analysis of the Wide World Importers data, with insights into performance over the three-year period from 1996 to 1998.

Wide World Importers Dashboard Analysis (By Author)

Thank you for reading.

Please remember to subscribe and follow me for more insightful articles, and you can reach me via my socials below if you'd like to talk more. LinkedIn: Musili Adebayo | Twitter: Musili_Adebayo

P.S. Enjoy this article? Support my work directly.