ADF Copy Data: Copy Data From Azure Blob Storage To A SQL Database Using Azure Data Factory
Atul Kumar
Founder at K21Academy | Cloud Computing, AWS, Azure, Generative AI & ML Expert | Helping You Succeed in Your Cloud & Tech Journey | IT & Services Specialist | Transforming Careers Through Training
Today I am going to share a case study of copying data from Blob storage to a SQL database with Azure Data Factory (an ETL service), which I will be discussing in detail in my Azure Data Engineering free class.
Haven't registered for it yet? There's still time to join my class! In this session, I'll dive deep into Azure Synapse Analytics, the integrated analytics platform, and also share insights into current job opportunities and market trends.
Register Now: https://bit.ly/3uQxVUu
Overview Of Azure Blob Storage
Azure Blob Storage is Microsoft Azure’s object storage solution for the cloud. It is optimized for storing massive amounts of unstructured data.
It is used for streaming video and audio, writing to log files, storing data for backup and restore, disaster recovery, and archiving.
Azure Blob storage offers three types of resources: the storage account, containers in the storage account, and blobs in a container.
Objects in Azure Blob storage are accessible via Azure PowerShell, the Azure Storage REST API, the Azure CLI, or an Azure Storage client library.
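As a quick illustration of the client-library route, here is a minimal Python sketch (assuming the azure-storage-blob package and a placeholder connection string) that creates the adftutorial container used later in this walkthrough and uploads a local emp.txt file into its input folder:

# Minimal sketch using the azure-storage-blob client library (pip install azure-storage-blob).
# The connection string is a placeholder; copy yours from the storage account's Access keys blade.
from azure.storage.blob import BlobServiceClient

connection_string = "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.windows.net"
service = BlobServiceClient.from_connection_string(connection_string)

# Create (or reuse) the container and upload a local text file as a block blob.
container = service.get_container_client("adftutorial")
if not container.exists():
    container.create_container()

with open("emp.txt", "rb") as data:
    container.upload_blob(name="input/emp.txt", data=data, overwrite=True)

# List the blobs in the container to confirm the upload.
for blob in container.list_blobs():
    print(blob.name)

The same operations can of course be done in the portal, which is the route the steps below follow.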
Overview Of Azure SQL Database
Azure SQL Database is a fully managed platform as a service (PaaS): the platform handles database software upgrades, patching, backups, and monitoring. Using Azure SQL Database, we can provision a highly available and performant storage layer for our applications.
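To make that concrete, here is a minimal sketch of an application connecting to an Azure SQL Database with the pyodbc driver; the server, database, and credential values are placeholders for your own, and the query is just a connectivity check:

# Minimal connectivity sketch for Azure SQL Database using pyodbc (pip install pyodbc).
# Server, database, user, and password are placeholders for your own values.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<yourserver>.database.windows.net;"
    "DATABASE=<yourdb>;UID=<user>;PWD=<password>;"
    "Encrypt=yes;TrustServerCertificate=no;"
)
cursor = conn.cursor()
cursor.execute("SELECT @@VERSION")   # simple query to confirm the connection works
print(cursor.fetchone()[0])
conn.close()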
Types of deployment options for the SQL Database: a single database, or an elastic pool in which a group of databases shares a set of resources.
Azure SQL Database offers three service tiers in the vCore purchasing model: General Purpose, Business Critical, and Hyperscale.
ADF Copy Data From Blob Storage To SQL Database
STEP 1: Create a blob and a SQL table
1) Create a source blob: launch Notepad on your desktop, copy the following text, and save it in a file named emp.txt on your disk.
FirstName|LastName
John|Doe
Jane|Doe
2) Create a container named adftutorial in your Blob storage.
3) Upload the emp.txt file to the input folder of the adftutorial container.
4) Create a sink SQL table: use the following SQL script to create a table named dbo.emp in your SQL Database.
CREATE TABLE dbo.emp
(
    ID int IDENTITY(1,1) NOT NULL,
    FirstName varchar(50),
    LastName varchar(50)
)
GO

CREATE CLUSTERED INDEX IX_emp_ID ON dbo.emp (ID);
Note: Ensure that Allow access to Azure services is turned ON for your SQL server so that Data Factory can write data to it. To verify and turn on this setting, go to your logical SQL server > Overview > Set server firewall > set the Allow access to Azure services option to ON.
STEP 2: Create a data factory
1) Sign in to the Azure portal. Select Create a resource > Analytics > Data Factory.
2) On the New data factory page, select Create.
3) On the Basics tab, enter the required details (subscription, resource group, region, and a name for the data factory). Then select Git configuration.
4) On the Git configuration tab, select the Configure Git later check box, and then go to Networking. Then select Review + create.
5) After the creation is finished, the Data Factory home page is displayed. Select the Author & Monitor tile.
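If you prefer scripting this step instead of using the portal, a data factory can also be created with the azure-mgmt-datafactory Python SDK. The sketch below assumes you already have a service principal and an existing resource group; every value in angle brackets is a placeholder:

# Sketch of creating a data factory with the azure-mgmt-datafactory SDK
# (pip install azure-identity azure-mgmt-datafactory). All <...> values are placeholders.
from azure.identity import ClientSecretCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

credential = ClientSecretCredential(
    tenant_id="<tenant-id>", client_id="<app-id>", client_secret="<secret>"
)
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Create (or update) the factory in an existing resource group.
factory = adf_client.factories.create_or_update(
    "<resource-group>", "<factory-name>", Factory(location="eastus")
)
print(factory.provisioning_state)

The portal route above is all this tutorial needs; the SDK is handy when the factory has to be created repeatedly, for example from a CI/CD pipeline.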
STEP 3: Use the ADF Copy Data tool to create a pipeline
1) Select the + (plus) button, and then select Pipeline.
2) In the General panel under Properties, specify CopyPipeline for Name. Then collapse the panel by clicking the Properties icon in the top-right corner.
3) In the Activities toolbox, expand Move & Transform. Drag the Copy Data activity from the Activities toolbox to the pipeline designer surface. You can also search for activities in the Activities toolbox. Specify CopyFromBlobToSql for Name.
4) Go to the Source tab. Select + New to create a source dataset.
5) In the New Dataset dialog box, select Azure Blob Storage to copy data from Azure Blob Storage, and then select Continue.
6) In the Select Format dialog box, choose the format type of your data, and then select Continue.
7) In the Set Properties dialog box, enter SourceBlobDataset for Name. Select the First row as header checkbox. Under the Linked service text box, select + New.
8) In the New Linked Service (Azure Blob Storage) dialog box, enter AzureStorageLinkedService for Name and select your storage account from the Storage account name list. Test the connection, then select Create to deploy the linked service.
9) After the linked service is created, you are navigated back to the Set properties page. Next to File path, select Browse. Navigate to the adftutorial/input folder, select the emp.txt file, and then select OK.
10) Select OK. You are taken back to the pipeline page. On the Source tab, confirm that SourceBlobDataset is selected. To preview the data on this page, select Preview data.
11) Go to the Sink tab, and select + New to create a sink dataset. In the New Dataset dialog box, enter "SQL" in the search box to filter the connectors, select Azure SQL Database, and then select Continue.
12) In the Set Properties dialog box, enter OutputSqlDataset for Name. From the Linked service dropdown list, select + New.
13) In the New Linked Service (Azure SQL Database) dialog box, fill in the details: a name for the linked service, your server and database names, and the authentication credentials.
14) The test connection may fail. If it does, go to your Azure SQL database in the portal, open the Set server firewall page, select Yes for Allow Azure services and resources to access this server, and then save the settings.
15) On the New Linked Service (Azure SQL Database) page, select Test connection to test the connection. Then select Create to deploy the linked service.
16) It automatically navigates back to the Set Properties dialog box. In Table, select [dbo].[emp], then select OK.
17) To validate the pipeline, select Validate from the toolbar.
18) Once the pipeline validates successfully, select Publish all in the top toolbar. This publishes the entities (datasets and pipelines) you created to Data Factory. Select Publish.
19) Select Trigger on the toolbar, and then select Trigger Now. On the Pipeline Run page, select OK. (A programmatic way to trigger and monitor the run is sketched after this step list.)
20) Go to the Monitor tab on the left. You will see a pipeline run that was triggered manually. You can use the links under the PIPELINE NAME column to view activity details and to rerun the pipeline.
21) To see activity runs associated with the pipeline run, select the CopyPipeline link under the PIPELINE NAME column.
22) Select All pipeline runs at the top to go back to the Pipeline Runs view. To refresh the view, select Refresh.
23) Verify that the run of the copy from Azure Blob storage to the database in Azure SQL Database shows a status of Succeeded.
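For reference, steps 19 to 21 can also be done programmatically: the published pipeline can be triggered and monitored with the same azure-mgmt-datafactory client shown after STEP 2. This is only a sketch; it reuses that adf_client object and the placeholder resource names, and polls until the run is no longer queued or in progress:

# Sketch of triggering the published CopyPipeline and polling its status.
# adf_client is the DataFactoryManagementClient from the earlier sketch; <...> values are placeholders.
import time

run = adf_client.pipelines.create_run(
    "<resource-group>", "<factory-name>", "CopyPipeline", parameters={}
)

# Poll the run until it is no longer queued or in progress.
while True:
    pipeline_run = adf_client.pipeline_runs.get(
        "<resource-group>", "<factory-name>", run.run_id
    )
    if pipeline_run.status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print("Pipeline run finished with status:", pipeline_run.status)

Once the run reports Succeeded, a quick SELECT * FROM dbo.emp against the database should show the two rows copied from emp.txt.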
Congratulations! You just used the Copy Data tool to create a pipeline and monitored the pipeline and activity runs successfully.
If you found value in this content, please drop an "Informative" comment below and follow me for more insightful articles like this.
Also, don't miss out on joining my complimentary class on Azure Cloud; I look forward to seeing you there!