Azure Data Factory: Data Pipeline to Move Files from Azure Data Lake Storage Gen2 to Azure SQL Database


In this article, I walk through moving data from Azure Data Lake Storage Gen2 to an Azure SQL Database using Azure Data Factory.

Below are the Azure resources required to perform the task.

1. Azure Data Lake Gen2 storage ---> Source

2. Azure Data Factory ---> Data movement orchestration using linked services

3. Azure SQL Database ---> Destination


Source: Created an Azure Data Lake Gen2 storage account.



Destination: Created an Azure SQL Database.


Created the cloud ETL orchestration resource: Azure Data Factory.


After creating the data factory, open ADF Studio and click the Ingest option.


Data ingestion is the process of obtaining and importing data for immediate use or storage in a database.

Now, I am using a sample Log.csv file that contains raw data. I have stored this file in the Azure Data Lake Gen2 storage account.
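In this walkthrough I uploaded the file through the portal, but for reference, the same upload can be scripted with the azure-storage-file-datalake Python SDK. This is only a minimal sketch; the account name, file-system (container) name, and key below are hypothetical placeholders.

```python
# Minimal sketch: upload Log.csv to an ADLS Gen2 file system (container).
# Account name, file-system name, and key are hypothetical placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_NAME = "mydatalakegen2acct"      # placeholder storage account
ACCOUNT_KEY = "<storage-account-key>"    # placeholder credential
FILE_SYSTEM = "raw"                      # placeholder container / file system

service = DataLakeServiceClient(
    account_url=f"https://{ACCOUNT_NAME}.dfs.core.windows.net",
    credential=ACCOUNT_KEY,
)
fs_client = service.get_file_system_client(FILE_SYSTEM)
file_client = fs_client.get_file_client("Log.csv")

# Read the local raw file and upload it, overwriting any existing copy.
with open("Log.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)
```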


Next, go to the ADF tool, open the Ingest option, and click the built-in copy task, as follows.


Next, we need to select the source. In our case it is Azure Data Lake Storage Gen2, where our source data file (Log.csv) resides in a container.


Next, we need to choose the subscription and storage account, and create the linked service, as follows.


What is a linked service? Linked services are much like connection strings: they define the connection information Data Factory needs to connect to external resources.
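As a rough illustration of what the Copy Data tool creates behind the scenes, here is a hedged sketch of defining the ADLS Gen2 linked service with the azure-mgmt-datafactory Python SDK. The subscription, resource group, factory, and account names are placeholders, and model or parameter names may differ slightly between SDK versions.

```python
# Sketch: create an ADLS Gen2 linked service programmatically.
# All names and IDs below are hypothetical placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import AzureBlobFSLinkedService, LinkedServiceResource

SUBSCRIPTION_ID = "<subscription-id>"
RG_NAME = "my-rg"                 # placeholder resource group
DF_NAME = "my-adf"                # placeholder data factory name

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# ADLS Gen2 is exposed through the "AzureBlobFS" linked service type.
adls_ls = LinkedServiceResource(
    properties=AzureBlobFSLinkedService(
        url="https://mydatalakegen2acct.dfs.core.windows.net",
        account_key="<storage-account-key>",  # in practice prefer Key Vault / managed identity
    )
)
adf_client.linked_services.create_or_update(
    RG_NAME, DF_NAME, "AzureDataLakeGen2LinkedService", adls_ls
)
```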

After clicking Create, we need to browse to the source location from ADF.


Now we have the dataset for the source file from the Data Lake Gen2 storage.

Dataset: A dataset is a named view of data that simply points to or references the data you want to use in your activities as inputs and outputs.

Next, we need to select the file format. In this example I am using a .csv file, so I choose the DelimitedText format, as follows.
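Continuing the hedged SDK sketch started above (it reuses the adf_client, RG_NAME, and DF_NAME set up there), the source dataset would look roughly like this; DelimitedTextDataset and AzureBlobFSLocation are the model names I believe the SDK uses for a CSV file in ADLS Gen2, and the container and file names are placeholders.

```python
# Sketch (continues the earlier adf_client setup): a delimited-text dataset
# pointing at Log.csv in the ADLS Gen2 container.
from azure.mgmt.datafactory.models import (
    AzureBlobFSLocation, DatasetResource, DelimitedTextDataset, LinkedServiceReference
)

source_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference",
            reference_name="AzureDataLakeGen2LinkedService",
        ),
        location=AzureBlobFSLocation(
            file_system="raw",     # placeholder container
            file_name="Log.csv",
        ),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf_client.datasets.create_or_update(RG_NAME, DF_NAME, "SourceLogCsv", source_ds)
```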


Next, we have to create the destination data store (Azure SQL DB), as follows.


Next, I have created the dataset for the destination (Azure SQL Database) and, in that database, a table named datalake2azuresql. Once the pipeline runs from ADF, the data will be stored in the datalake2azuresql table in the Azure SQL database. In the dataset settings I selected the database created earlier.
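In the same hedged SDK style (again reusing adf_client, RG_NAME, and DF_NAME from above), the destination side would be an Azure SQL Database linked service plus a table dataset pointing at datalake2azuresql; the server, database, and credentials below are placeholders.

```python
# Sketch (continues the earlier adf_client setup): Azure SQL Database linked
# service and the sink dataset for the datalake2azuresql table.
from azure.mgmt.datafactory.models import (
    AzureSqlDatabaseLinkedService, AzureSqlTableDataset, DatasetResource,
    LinkedServiceReference, LinkedServiceResource, SecureString
)

sql_ls = LinkedServiceResource(
    properties=AzureSqlDatabaseLinkedService(
        connection_string=SecureString(
            value="Server=tcp:myserver.database.windows.net,1433;"
                  "Database=mydb;User ID=sqladmin;Password=<password>;"  # placeholders
        )
    )
)
adf_client.linked_services.create_or_update(RG_NAME, DF_NAME, "AzureSqlDbLinkedService", sql_ls)

sink_ds = DatasetResource(
    properties=AzureSqlTableDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="AzureSqlDbLinkedService"
        ),
        table_name="[dbo].[datalake2azuresql]",  # table created separately in the database
    )
)
adf_client.datasets.create_or_update(RG_NAME, DF_NAME, "SinkSqlTable", sink_ds)
```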


Next, we need to enable staging.

Staging: This zone is usually a database, and/or a schema within it, that is used to hold a copy of the data from the source systems.
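In the Copy Data tool this is just a checkbox plus a staging store. In the SDK sketch we have been building, staging would be expressed roughly as a StagingSettings object that points at an interim Blob storage linked service; the "StagingBlobLinkedService" name below is a hypothetical placeholder assumed to have been defined like the other linked services.

```python
# Sketch (continues the earlier sketches): staged-copy settings for the Copy activity.
# "StagingBlobLinkedService" is a hypothetical interim Blob storage linked service.
from azure.mgmt.datafactory.models import LinkedServiceReference, StagingSettings

staging_settings = StagingSettings(
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="StagingBlobLinkedService"
    ),
    path="staging",  # placeholder container/folder that holds the interim copy
)
```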


Finally, we have to deploy the pipeline.
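Deploying through the wizard is roughly equivalent to creating a pipeline with a single Copy activity and triggering a run. Here is a hedged continuation of the SDK sketch, reusing the placeholder dataset names and the staging settings defined above; model and parameter names may vary slightly by SDK version.

```python
# Sketch (continues the earlier sketches): a pipeline with one Copy activity that
# reads the delimited-text source and writes to the Azure SQL sink, with staging enabled.
from azure.mgmt.datafactory.models import (
    AzureSqlSink, CopyActivity, DatasetReference, DelimitedTextSource, PipelineResource
)

copy_activity = CopyActivity(
    name="CopyLogCsvToAzureSql",
    inputs=[DatasetReference(type="DatasetReference", reference_name="SourceLogCsv")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="SinkSqlTable")],
    source=DelimitedTextSource(),
    sink=AzureSqlSink(),
    enable_staging=True,
    staging_settings=staging_settings,   # from the staging sketch above
)

pipeline = PipelineResource(activities=[copy_activity])
adf_client.pipelines.create_or_update(RG_NAME, DF_NAME, "DataLakeToSqlPipeline", pipeline)

# Trigger a run immediately; the wizard's "deploy and run" step does the same thing.
run_response = adf_client.pipelines.create_run(RG_NAME, DF_NAME, "DataLakeToSqlPipeline")
```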


We can monitor the pipeline by using the ADF Monitor tab.
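The same run status shown in the Monitor tab can also be polled programmatically; a minimal sketch continuing from the run started above:

```python
# Sketch (continues the earlier run): poll the pipeline run status that the Monitor tab shows.
import time

time.sleep(30)  # give the run a moment before the first status check
pipeline_run = adf_client.pipeline_runs.get(RG_NAME, DF_NAME, run_response.run_id)
print(f"Pipeline run status: {pipeline_run.status}")  # e.g. Queued / InProgress / Succeeded
```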


Now the raw Log.csv data file that we uploaded to the Azure Data Lake Gen2 storage has been moved to the Azure SQL database with the help of ADF.

I can see the datalake2azuresql table populated in the Azure SQL Database.
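To double-check the load outside the portal, you could query the table directly. This is a minimal sketch with pyodbc, assuming the placeholder server, database, and credentials used above and the dbo.datalake2azuresql table name.

```python
# Sketch: verify the copied rows in Azure SQL Database with pyodbc.
# Server, database, and credentials are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:myserver.database.windows.net,1433;"
    "Database=mydb;Uid=sqladmin;Pwd=<password>;Encrypt=yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM dbo.datalake2azuresql")
print("Rows loaded:", cursor.fetchone()[0])
conn.close()
```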


Conclusion: In this article I created an ADF pipeline to move raw data from Azure Data Lake Gen2 storage to an Azure SQL Database. This automates the work of moving data from ADLS Gen2 to Azure SQL, and the pipeline can also be scheduled as per requirement.
