Data Integration from Fabric Lakehouse to Snowflake Database using Data Pipeline
Abiola A. David, MSc, MVP
Microsoft Fabric & Excel MVP [5X] | Senior Fabric Solutions Architect | Microsoft Fabric, Azure, Power BI, Databricks, SQL, Excel, Snowflake, GCP | MSc, Big Data & BI | DP700 & DP600 Certified | C# Corner MVP [2X]
In this article, I am going to walk you through how to perform scalable data integration from a Microsoft Fabric Lakehouse to a Snowflake data warehouse using a Data Pipeline.
In the screenshot below, I have data in the salesdatatocopytable table in the Fabric Lakehouse. The data is, of course, stored as a Delta table.
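Before building the pipeline, you can optionally preview the source data from a Fabric notebook attached to the lakehouse. The Spark SQL below is a minimal sketch; it assumes the table name used in my lakehouse, so substitute your own table name if it differs:

-- Run in a Fabric notebook (Spark SQL) attached to the lakehouse.
-- Preview a few rows of the source Delta table and confirm the row count.
SELECT * FROM salesdatatocopytable LIMIT 10;
SELECT COUNT(*) AS source_row_count FROM salesdatatocopytable;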
Create a Snowflake Account and Get the Server Name
To create a free 30-day trial account, go to https://www.snowflake.com/, follow the prompts to create an account, and specify the cloud host: Microsoft Azure, AWS, or Google Cloud Platform.
You will be required to provide a username and password during signup. You will also be asked to confirm your email address, after which you will receive an email with the details of your server.
Create Warehouse, Database, Schema, and Table
We want to ingest the data into the Snowflake warehouse. Before we can achieve that, we need the warehouse, database, schema, and table created in the Snowflake account.
In the screenshot below, I have the following SQL script:
CREATE WAREHOUSE FabricWH;
CREATE DATABASE FabricDB;
USE DATABASE FabricDB;
CREATE SCHEMA fabric_schema;
USE SCHEMA fabric_schema;
CREATE TABLE DataFromFabric
(
    OrderDate   STRING,
    Products    VARCHAR(20),
    PaymentType VARCHAR(15),
    Units       INT,
    Price       INT,
    SalesAmount DECIMAL(10,2)
);
Proceed to run the SQL script. After the script runs successfully, as shown in the screenshot below, the DataFromFabric table has been created without any records in it.
We are going to head back to the Fabric Lakehouse to initiate the process of ingesting the data.
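If you prefer to confirm this from the worksheet rather than the UI, a quick check is to describe the table and count its rows. This is a minimal sketch against the objects created above:

-- Confirm the table exists and is currently empty.
USE DATABASE FabricDB;
USE SCHEMA fabric_schema;
DESCRIBE TABLE DataFromFabric;
SELECT COUNT(*) AS row_count FROM DataFromFabric;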
Switch to Fabric Data Engineering Experience
In the screenshot below, at the bottom left, switch to the Data Engineering experience. Note that the copy_to_s3 lakehouse is in the A to Z of Warehouse workspace.
In the Data Engineering home page as seen in the screenshot below, select Data Pipeline.
Provide a descriptive name for the pipeline. In this case, DataIngestionToSnowflake is provided.
Click on Create
In the Data Pipeline Home as seen below, we can start building our data pipeline.
Select Copy data.
In the Choose data source window, scroll down and select Lakehouse
Select Next
In the next window, select the Lakehouse to copy data from. In this article, copy_to_s3 lakehouse is selected.
Click on Next
In the Connect to data source window, select the table that matches the structure of the table created earlier in the Snowflake database. The salesdatatocopytos3 table is selected.
Click on Next
In the Choose data destination window, select Snowflake as seen below.
Click on Next
In the next window, provide the Server, which you can copy from the Welcome to Snowflake email you received.
Next, provide the Warehouse name created earlier. In this article, FabricWH is provided
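If the welcome email is no longer handy, you can look up the account and warehouse details from a Snowflake worksheet. This is a minimal sketch using standard Snowflake context functions; the exact server hostname format can vary by cloud and region, so treat the output as a starting point:

-- Look up the current account context; the server name generally follows
-- the pattern <account_identifier>.snowflakecomputing.com.
SELECT CURRENT_ORGANIZATION_NAME() AS org_name,
       CURRENT_ACCOUNT_NAME()      AS account_name,
       CURRENT_REGION()            AS region;

-- Confirm the warehouse created earlier exists and check its state.
SHOW WAREHOUSES LIKE 'FabricWH';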
Under Connection credentials, you can optionally provide a connection name or proceed with the one generated automatically.
Next, scroll down and provide the username and password you created during account registration on the Snowflake website.
Proceed by clicking on Next
In the intermediate window, click on Test connection to confirm that a connection to the Snowflake data warehouse has been established. In this article, the connection is successful, as seen in the screenshot below. Select the target database created earlier; the FabricDB database is selected.
Click on Next
In the next window, from the Table dropdown, select the target table. The Fabric_Schema.DataFromFabric table is selected. We can also inspect the source and destination column data types and map them as required.
Click on Next
In the Settings, Enable staging is checked automatically and the Data store type is set to Workspace. These defaults are fine.
Click on Next
In the Review + Save stage, click on Save + Run to execute the data transfer.
In the screenshot below, the data transfer using the pipeline was successful, with the activity status showing a green checkmark.
To verify the data, head to Snowflake and run a SELECT * FROM DataFromFabric query in the worksheet, as seen below.
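For reference, this is the query I run; the USE statements simply set the context to the database and schema created earlier:

-- Set context to the database and schema created earlier.
USE DATABASE FabricDB;
USE SCHEMA fabric_schema;

-- Inspect the ingested rows and confirm the row count.
SELECT * FROM DataFromFabric;
SELECT COUNT(*) AS row_count FROM DataFromFabric;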
There we go! The data ingestion worked as expected. If you enjoyed this data engineering tutorial, share the article with your connections, leave a comment, and give it a thumbs up. See you in the next article.