Azure Data Factory: Data Pipeline to Move Files from Azure Data Lake Storage Gen2 to Azure SQL Database
Manaswin Sakamuri
Azure Machine Learning & Artificial Intelligence | Azure AI | Azure Data Engineer | Azure Databricks | Azure Data Factory | Azure SQL Database
In this article, I move data from Azure Data Lake Storage Gen2 to Azure SQL Database using Azure Data Factory.
Below are the resources required in Azure to perform the task:
1. Azure Data Lake Storage Gen2 ---> source
2. Azure Data Factory ---> orchestrates the data movement using linked services
3. Azure SQL Database ---> destination
Source: created the Azure Data Lake Storage Gen2 account.
Destination: created the Azure SQL Database.
Created the cloud ETL tool: Azure Data Factory.
After creating the data factory, open ADF Studio and click on Ingest.
Data ingestion is the process of obtaining and importing data for immediate use or storage in a database.
Now, I am using a sample Log.csv file that contains raw data. I have stored this file in the Azure Data Lake Storage Gen2 account.
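As an aside, the upload itself can also be scripted instead of done through the portal. Below is a minimal sketch using the azure-storage-file-datalake SDK; the account URL, account key, and container name ("raw") are placeholder assumptions, not values from this walkthrough.

```python
# A minimal sketch of uploading Log.csv with the azure-storage-file-datalake
# SDK. The account URL, key, and container name ("raw") are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<storage-account-key>",
)
container = service.get_file_system_client(file_system="raw")

# Upload the raw file, overwriting any previous copy.
with open("Log.csv", "rb") as data:
    container.get_file_client("Log.csv").upload_data(data, overwrite=True)
```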
Next, go to the ADF tool, open Ingest, and click on the Built-in copy task, as follows.
Next, we need to select the source; in our case it is Azure Data Lake Storage Gen2, where our source data file (Log.csv) resides in a container.
Next, we need to choose the subscription and storage account, and create the linked service, as follows.
What is a linked service? Linked services are much like connection strings: they define the connection information needed for the service to connect to external resources.
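For reference, here is a sketch of how the same two linked services could be defined with the azure-mgmt-datafactory Python SDK rather than the Studio UI. The subscription, resource group, factory, server, and credential values are all placeholders; this is an illustrative equivalent of the UI step, not the exact configuration used in the article.

```python
# A sketch of the two linked services, created with the azure-mgmt-datafactory
# SDK instead of the Studio UI. All names and secrets below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobFSLinkedService,       # the ADLS Gen2 connector type
    AzureSqlDatabaseLinkedService,
    LinkedServiceResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

# Source: Azure Data Lake Storage Gen2, authenticated here with an account key.
adls = AzureBlobFSLinkedService(
    url="https://<storage-account>.dfs.core.windows.net",
    account_key="<storage-account-key>",
)
adf.linked_services.create_or_update(
    RG, DF, "LS_ADLSGen2", LinkedServiceResource(properties=adls)
)

# Destination: Azure SQL Database, defined by its connection string.
sql = AzureSqlDatabaseLinkedService(
    connection_string=(
        "Server=tcp:<server>.database.windows.net,1433;"
        "Database=<database>;User ID=<user>;Password=<password>;"
    )
)
adf.linked_services.create_or_update(
    RG, DF, "LS_AzureSqlDb", LinkedServiceResource(properties=sql)
)
```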
After clicking Create, we need to browse to the source location from ADF.
Now we have the dataset for the source file in the Data Lake Gen2 storage.
Dataset: a dataset is a named view of data that simply points to or references the data you want to use in your activities as inputs and outputs.
Next, we need to select the file format. In this example the source is a .csv file, so I am using DelimitedText, as follows.
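As a code equivalent of this step, the sketch below defines the DelimitedText source dataset with the same SDK, assuming the LS_ADLSGen2 linked service and the placeholder "raw" container from the earlier sketch; the dataset name DS_LogCsv is also an illustrative choice.

```python
# A sketch of the DelimitedText source dataset, assuming the LS_ADLSGen2
# linked service and "raw" container from the earlier sketches.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobFSLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

# Source dataset: Log.csv in the lake, parsed as comma-delimited text.
src = DelimitedTextDataset(
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="LS_ADLSGen2"
    ),
    location=AzureBlobFSLocation(file_system="raw", file_name="Log.csv"),
    column_delimiter=",",
    first_row_as_header=True,
)
adf.datasets.create_or_update(RG, DF, "DS_LogCsv", DatasetResource(properties=src))
```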
Next, we have to create the destination data store (Azure SQL DB), as follows.
Next, I created the dataset for the destination (Azure SQL Database), along with a table named datalake2azuresql. Once the pipeline runs from ADF, the data will be stored in the datalake2azuresql table in the Azure SQL database. From the snapshot above you can see the database name I selected.
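The sink dataset can be sketched the same way. The dbo schema and the dataset name DS_SqlTable are assumptions; the table must already exist in the database (or the Copy task's auto-create option can be used).

```python
# A sketch of the sink dataset pointing at the datalake2azuresql table,
# assuming the LS_AzureSqlDb linked service from earlier and the dbo schema.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlTableDataset,
    DatasetResource,
    LinkedServiceReference,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

sink = AzureSqlTableDataset(
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference", reference_name="LS_AzureSqlDb"
    ),
    table_name="dbo.datalake2azuresql",
)
adf.datasets.create_or_update(RG, DF, "DS_SqlTable", DatasetResource(properties=sink))
```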
Next, we need to enable staging.
Staging: this zone is usually a database, and/or a schema within it, used to hold an interim copy of the data from the source systems (see the pipeline sketch below, where staging is enabled on the Copy activity).
Finally, we have to deploy the pipeline.
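Putting the pieces together, here is a sketch of what the deployed pipeline looks like in code: a single Copy activity wired to the two datasets, with staging enabled. The staging linked service LS_StagingBlob, its path, and the pipeline and activity names are assumptions for illustration.

```python
# A sketch of the pipeline: one Copy activity from the DelimitedText source to
# the Azure SQL sink, with staging enabled. LS_StagingBlob is an assumed Blob
# storage linked service used to hold the interim copy of the data.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlSink,
    CopyActivity,
    DatasetReference,
    DelimitedTextSource,
    LinkedServiceReference,
    PipelineResource,
    StagingSettings,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

copy = CopyActivity(
    name="CopyLogCsvToSql",
    inputs=[DatasetReference(type="DatasetReference", reference_name="DS_LogCsv")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="DS_SqlTable")],
    source=DelimitedTextSource(),
    sink=AzureSqlSink(),
    enable_staging=True,
    staging_settings=StagingSettings(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_StagingBlob"
        ),
        path="staging",
    ),
)

adf.pipelines.create_or_update(
    RG, DF, "datalake2azuresql_pipeline", PipelineResource(activities=[copy])
)
```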
We can monitor the pipeline by using the ADF Monitor view.
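The same monitoring can be done from code: trigger a run and poll its status until it leaves the Queued/InProgress states. The pipeline name matches the sketch above.

```python
# A sketch of triggering the pipeline and polling its run status,
# reusing the placeholder factory names from the earlier sketches.
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

run = adf.pipelines.create_run(RG, DF, "datalake2azuresql_pipeline")

while True:
    status = adf.pipeline_runs.get(RG, DF, run.run_id)
    if status.status not in ("Queued", "InProgress"):
        break
    time.sleep(15)

print("Pipeline run finished with status:", status.status)
```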
Now the raw Log.csv data file that we uploaded to the Azure Data Lake Gen2 storage has been moved to the Azure SQL database with the help of ADF.
I can see the datalake2azuresql.log table in the Azure SQL database.
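To double-check from code rather than from the portal, a quick row count with pyodbc works; the driver name, server, and credentials below are placeholders, and the schema/table name should match whatever the Copy task actually created.

```python
# A quick verification query with pyodbc. Connection values are placeholders;
# adjust the schema/table name to match the table the pipeline loaded.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:<server>.database.windows.net,1433;"
    "Database=<database>;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM dbo.datalake2azuresql")
print("Rows loaded:", cursor.fetchone()[0])
conn.close()
```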
Conclusion: In this article I created an ADF pipeline to move raw data from Azure Data Lake Gen2 storage to Azure SQL Database. This automates the work of moving data from ADLS Gen2 to Azure SQL. We can also schedule the pipeline as per the requirement; one way to do that is sketched below.
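As an example of scheduling, the sketch below attaches a daily schedule trigger to the pipeline with the same SDK. The trigger name, start time, and recurrence are illustrative assumptions, not requirements from the article.

```python
# A sketch of a daily schedule trigger for the pipeline. The trigger name and
# recurrence are illustrative; all resource names are placeholders as before.
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference,
    ScheduleTrigger,
    ScheduleTriggerRecurrence,
    TriggerPipelineReference,
    TriggerResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<data-factory-name>"

trigger = ScheduleTrigger(
    recurrence=ScheduleTriggerRecurrence(
        frequency="Day",
        interval=1,
        start_time=datetime.now(timezone.utc) + timedelta(minutes=5),
        time_zone="UTC",
    ),
    pipelines=[
        TriggerPipelineReference(
            pipeline_reference=PipelineReference(
                type="PipelineReference",
                reference_name="datalake2azuresql_pipeline",
            )
        )
    ],
)

adf.triggers.create_or_update(RG, DF, "DailyTrigger", TriggerResource(properties=trigger))
adf.triggers.begin_start(RG, DF, "DailyTrigger").result()  # triggers start out stopped
```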