Data Integration from Fabric Lakehouse to Snowflake Database using Data Pipeline
Abiola A. David, MSc, MVP
Microsoft Fabric & Excel MVP [5X] | Senior Fabric Solutions Architect | Microsoft Fabric, Azure, Power BI, Databricks, SQL, Excel, Snowflake, GCP | MSc, Big Data & BI | DP700 & DP600 Certified | C# Corner MVP [2X]
In this article, I am going to walk you through how to perform scalable data integration from a Microsoft Fabric Lakehouse to a Snowflake data warehouse using a Data Pipeline.
In the screenshot below, I have data in the salesdatatocopytable table in the Fabric Lakehouse. The data is, of course, stored as a Delta table.
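Before building the pipeline, you can optionally preview the source data from a Fabric notebook attached to the lakehouse. The Spark SQL below is a minimal sketch; it assumes the table name used in my lakehouse, so substitute your own table name if it differs:

-- Run in a Fabric notebook (Spark SQL) attached to the lakehouse.
-- Preview a few rows of the source Delta table and confirm the row count.
SELECT * FROM salesdatatocopytable LIMIT 10;
SELECT COUNT(*) AS source_row_count FROM salesdatatocopytable;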
Create a Snowflake Account and Get the Server Name
To create a free 30-day trial account, go to https://www.snowflake.com/, follow the prompts to create an account, and specify the cloud host: Microsoft Azure, AWS, or Google Cloud Platform.
You will be required to provide a username and password during signup. You will also be asked to confirm your email address, after which you will receive an email with the details of your server.
Create Warehouse, Database, Schema, and Table
We want to ingest the data into the Snowflake warehouse. Before we can achieve that, we need the warehouse, database, schema, and table created in the Snowflake account.
In the screenshot below, I have the following SQL script:
CREATE WAREHOUSE FabricWH;
CREATE DATABASE FabricDB;
USE DATABASE FabricDB;
CREATE SCHEMA fabric_schema;
USE SCHEMA fabric_schema;
CREATE TABLE DataFromFabric
(
    OrderDate   STRING,
    Products    VARCHAR(20),
    PaymentType VARCHAR(15),
    Units       INT,
    Price       INT,
    SalesAmount DECIMAL(10,2)
);
Proceed to run the SQL script. After the script runs successfully, as shown in the screenshot below, the DataFromFabric table has been created without any records in it.
We are going to head back to the Fabric Lakehouse to initiate the process of ingesting the data.
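If you prefer to confirm this from the worksheet rather than the UI, a quick check is to describe the table and count its rows. This is a minimal sketch against the objects created above:

-- Confirm the table exists and is currently empty.
USE DATABASE FabricDB;
USE SCHEMA fabric_schema;
DESCRIBE TABLE DataFromFabric;
SELECT COUNT(*) AS row_count FROM DataFromFabric;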
Switch to Fabric Data Engineering Experience
In the screenshot below, at the bottom left, switch to the Data Engineering experience. Note that the copy_to_s3 lakehouse is in the A to Z of Warehouse workspace.
In the Data Engineering home page as seen in the screenshot below, select Data Pipeline.
Provide a descriptive name for the pipeline. In this case, DataIngestionToSnowflake is provided.
Click on Create
In the Data Pipeline Home as seen below, we can start building our data pipeline.
Select Copy data.
In the Choose data source window, scroll down and select Lakehouse
Select Next
In the next window, select the Lakehouse to copy data from. In this article, copy_to_s3 lakehouse is selected.
Click on Next
In the Connect to data source window, select the table that matches the structure of the table created earlier in the Snowflake database. The salesdatatocopytos3 table is selected.
Click on Next
In the Choose data destination window, select Snowflake as seen below.
Click on Next
In the next window, provide the Server, which you can copy from the Welcome to Snowflake email you received.
Next, provide the Warehouse name created earlier. In this article, FabricWH is provided
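If the welcome email is no longer handy, you can look up the account and warehouse details from a Snowflake worksheet. This is a minimal sketch using standard Snowflake context functions; the exact server hostname format can vary by cloud and region, so treat the output as a starting point:

-- Look up the current account context; the server name generally follows
-- the pattern <account_identifier>.snowflakecomputing.com.
SELECT CURRENT_ORGANIZATION_NAME() AS org_name,
       CURRENT_ACCOUNT_NAME()      AS account_name,
       CURRENT_REGION()            AS region;

-- Confirm the warehouse created earlier exists and check its state.
SHOW WAREHOUSES LIKE 'FabricWH';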
Under Connection credentials, you can optionally provide a connection name or proceed with the one generated automatically.
Next, scroll down and provide the username and password you created during account registration on the Snowflake website.
Proceed by clicking on Next
In the intermediate window, click on Test connection to confirm that a connection to the Snowflake data warehouse has been established. In this article, the connection is successful, as seen in the screenshot below. Select the target database created earlier; the FabricDB database is selected.
Click on Next
In the next window, from the Table dropdown, select the target table. The Fabric_Schema.DataFromFabric table is selected. We can also inspect the source and destination column data types and map them as required.
Click on Next
In the Settings, Enable staging is checked automatically and the Data store type is set to Workspace. These defaults are fine.
Click on Next
In the Review + Save stage, click on Save + Run to execute the data transfer.
In the screenshot below, the data transfer using the pipeline was successful, with the activity status showing a green checkmark.
To verify the data, head to Snowflake and run a SELECT * FROM DataFromFabric query in the worksheet, as seen below.
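For reference, this is the query I run; the USE statements simply set the context to the database and schema created earlier:

-- Set context to the database and schema created earlier.
USE DATABASE FabricDB;
USE SCHEMA fabric_schema;

-- Inspect the ingested rows and confirm the row count.
SELECT * FROM DataFromFabric;
SELECT COUNT(*) AS row_count FROM DataFromFabric;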
There we go! The data ingestion worked as expected. If you enjoyed this data engineering tutorial, share the article with your connections, leave a comment, and give it a thumbs up. See you in the next article.