Azure Data Factory – CI/CD [Part 1]
Azure DevOps is a set of tools for collaboration, continuous integration, and continuous delivery. Azure Repos provides free Git repositories with pull requests and code reviews for collaborative code development. Azure Pipelines helps you create pipelines for building, testing, and deploying any app.
In this article, you will learn how to set up CI/CD for your data analytics solutions in Azure Data Factory using Azure DevOps. You'll start by creating an Azure DevOps account, organization, and project, and then linking it to your ADF. You'll then learn how to publish Git changes to ADF, deploy new features with Azure Repos, and set up the CI/CD processes for Data Factory pipelines using Azure Pipelines.
Set up Azure DevOps
2. Log in with your Azure account and select your Country/region to continue.
3. Enter your organization name and select the location for hosting your project. It's recommended to choose the same location where your ADF is hosted to avoid syncing issues.
4. Enter a name for your project and create it.
5. Go to Organization settings, select the Default Directory option in the Azure Active Directory field, and click Connect.
6. Sign out and sign in again to see that your organization is connected to Default Directory.
7. In your ADF, click on Data Factory and then "Set up code repository." In the dialog box that appears, select the following settings:
8. Click "Apply" and select "Use existing" when prompted to select the working branch.
9. Your ADF is now connected to Azure DevOps Git, and the master branch is selected.
That's it! You are now ready to use Azure DevOps for your project.
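If you prefer to see the repository link as configuration rather than dialog fields, here is a minimal sketch of the `repoConfiguration` object that a Data Factory resource carries in its ARM definition when it is linked to Azure DevOps Git. The helper name `build_repo_configuration` and every value passed to it (`my-devops-org`, `my-adf-project`, `my-adf-repo`) are hypothetical placeholders, not the settings used in this article.

```python
import json


def build_repo_configuration(account, project, repository,
                             collaboration_branch="master", root_folder="/"):
    """Return a dict shaped like the ARM `repoConfiguration` for Azure DevOps Git.

    All argument values are supplied by the caller; nothing here talks to Azure.
    """
    return {
        "type": "FactoryVSTSConfiguration",  # repository type for Azure DevOps Git
        "accountName": account,              # Azure DevOps organization
        "projectName": project,              # Azure DevOps project
        "repositoryName": repository,        # Git repository inside the project
        "collaborationBranch": collaboration_branch,
        "rootFolder": root_folder,           # folder that holds the ADF JSON files
    }


if __name__ == "__main__":
    # Placeholder names for illustration only.
    config = build_repo_configuration("my-devops-org", "my-adf-project", "my-adf-repo")
    print(json.dumps(config, indent=2))
```

The collaboration branch defaults to `master` here only because that is the working branch used in the steps above.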
Publishing Changes to ADF
Collaborating on code development typically involves using Git. In this section, you'll learn how to create an ADF pipeline in Azure DevOps Git and publish changes from your master branch to ADF.
1. To begin, create a new ADF pipeline with the Wait activity in the master branch and click Save all. This will save your changes in the master branch of Azure DevOps Git.
2. Next, switch from Azure DevOps Git to Data Factory mode by clicking the button in the top-left corner of the screen. You'll notice that the newly created pipeline does not appear there, because saved changes exist only in the Git repository until they are published.
3. To see your newly created pipeline, go to Azure DevOps > Repos > your repository > Files > pipeline. Here, you'll see your pipeline created in the master branch. It is saved as a JSON file in DevOps. You'll also notice that only the master branch has been created in the current repository.
5. To continue working with your changes, navigate to your ADF and select Azure DevOps Git mode. You'll see that your changes have been saved in the master branch and can be used to continue working on your pipeline. To publish your pipeline, click the Publish button. ADF will create a new branch called adf_publish inside your repository and publish the changes to the live Data Factory. You'll see a message about the publish branch in the Pending changes dialog box.
Once the publish is completed, click OK. Then, switch to Data Factory mode to see that the pipeline has been successfully deployed.
6. When you publish your ADF pipeline from the master branch to Data Factory, a new branch called adf_publish is automatically created in the repository. This branch contains the ARM template, which is a code representation of your ADF and its resources. You can find it in the adf_publish branch of your project in DevOps. The template files, ARMTemplateForFactory.json and ARMTemplateParametersForFactory.json, are saved only in the adf_publish branch.
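To make the difference between the two branches concrete, here is a minimal Python sketch of the two kinds of files described above: a pipeline definition with a single Wait activity, as saved under the pipeline folder of the master branch, and an ARM parameters document like the one in adf_publish. The function names, the pipeline name `pipeline1`, and the factory names `adf-dev` and `adf-test` are hypothetical. The parameters document is a trimmed-down stand-in, not the file generated for this article's factory.

```python
import copy
import json


def build_wait_pipeline(name, wait_seconds=1):
    """Build a pipeline definition with one Wait activity, in the JSON shape ADF saves to Git."""
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "Wait1",
                    "type": "Wait",
                    "dependsOn": [],
                    "userProperties": [],
                    "typeProperties": {"waitTimeInSeconds": wait_seconds},
                }
            ],
            "annotations": [],
        },
    }


def override_parameters(parameters_doc, overrides):
    """Return a copy of an ARM parameters document with selected values replaced.

    Raises KeyError for a parameter the document does not declare, so a typo
    cannot silently add a new parameter.
    """
    result = copy.deepcopy(parameters_doc)
    for key, value in overrides.items():
        if key not in result["parameters"]:
            raise KeyError(f"unknown parameter: {key}")
        result["parameters"][key]["value"] = value
    return result


if __name__ == "__main__":
    # What the master branch holds: one JSON file per pipeline.
    print(json.dumps(build_wait_pipeline("pipeline1"), indent=2))

    # What adf_publish holds alongside the template: a parameters document.
    # This sample is an illustrative stand-in with a single parameter.
    sample_parameters = {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {"factoryName": {"value": "adf-dev"}},
    }
    print(json.dumps(override_parameters(sample_parameters, {"factoryName": "adf-test"}), indent=2))
```

Overriding `factoryName` this way illustrates why the parameters file is kept separate from the template: the same template can describe a factory in another environment with different parameter values.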
Conclusion
In the first part of the "Azure Data Factory – CI/CD" series, I've provided a comprehensive guide on setting up Azure DevOps and publishing changes to ADF. In the upcoming part of this series, I'll be delving into more advanced topics, such as deploying features into the master branch, preparing for the CI/CD of ADF, and creating an Azure pipeline for CD. Be sure to stay tuned for the upcoming article, and I hope you found the first part informative and enjoyable.