Monitoring Azure Pipelines
Ankit Ranjan (DevOps Engineer)
This post builds on my earlier posts exploring the core components of Azure Pipelines for CI/CD. By the end of it, you will have the practical knowledge needed to manage build and release pipelines efficiently. We'll focus on leveraging built-in features to assess agent health and ensure prompt job execution.
Key topics include:
1. Understanding essential monitoring principles
2. Analyzing pipeline task execution and performance metrics
Understanding monitoring concepts
When using Azure Pipelines, several key concepts are crucial for effective monitoring:
1. Pipeline Status: Ensures continuous, error-free pipeline operation by checking for build failures, test failures, and deployment issues.
2. Code Quality Metrics: Examines factors like code coverage, complexity, and code smells to identify potential performance or functionality problems before deployment.
3. Security Vulnerabilities: Assesses security risks in application code, dependencies, and pipeline configuration to maintain pipeline and application security.
4. Resource Utilization: Monitors CPU and memory usage of agents to prevent pipeline execution from impacting other jobs due to high resource consumption or extended run times.
5. Deployment Health: Oversees the deployed application to ensure proper functionality, connectivity, and availability.
6. Release Cycle Time: Tracks the duration from initial development to production deployment, identifying and addressing delays promptly.
These concepts are vital for optimizing Time to Detect (TTD), Time to Mitigate (TTM), and Time to Remediate (TTR) metrics, which measure an organization's ability to deliver applications efficiently and resolve issues quickly.
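As a rough illustration, the three metrics can be derived from a handful of timestamps per incident. The definitions used here (TTD from introduction to detection, TTM from detection to mitigation, TTR from detection to full remediation) are one common convention and an assumption of this sketch, as are the `Incident` record and its field names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record; field names are illustrative only.
    introduced: datetime   # when the faulty change was deployed
    detected: datetime     # when monitoring or a person noticed it
    mitigated: datetime    # when user impact stopped (for example, a rollback)
    remediated: datetime   # when the root cause was fixed


def incident_metrics(incident: Incident) -> dict:
    """Return TTD, TTM and TTR for one incident under the assumed definitions."""
    return {
        "TTD": incident.detected - incident.introduced,
        "TTM": incident.mitigated - incident.detected,
        "TTR": incident.remediated - incident.detected,
    }


example = Incident(
    introduced=datetime(2024, 5, 1, 10, 0),
    detected=datetime(2024, 5, 1, 10, 20),
    mitigated=datetime(2024, 5, 1, 10, 50),
    remediated=datetime(2024, 5, 2, 9, 0),
)
metrics = incident_metrics(example)
print(metrics["TTD"])  # 0:20:00
```

Averaging these values across incidents gives a trend you can track alongside the pipeline metrics discussed below.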
This post will mainly focus on select concepts, starting with pipeline task monitoring and performance analysis.
Monitoring Pipeline Tasks and Performance
In this section, we will cover two different approaches to monitoring tasks and performance:
1. Using the Pipeline's User Interface
2. Utilizing Dashboards
Using the Pipeline's User Interface
The UI prominently displays metrics on pipeline, job, and task durations, highlighting the importance of execution time efficiency.
The displayed duration metrics offer immediate insight into the overall pipeline execution time and the runtime of individual jobs within it.
By clicking on each job, you can access a detailed breakdown of step durations. This granular view allows you to identify specific tasks that may require review and optimization, as illustrated in the following image:
Minimizing the total execution time of both build and release pipelines is crucial. Faster pipeline completion enables quicker software shipping and deployment across environments.
Any increase in pipeline duration may signal the introduction of issues through recent changes. In such cases, it's important to review individual task execution times to determine if the increase is expected and justified or if corrective action is needed.
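One simple way to decide whether an increase deserves a closer look is to compare the latest run against the median of recent runs. The sketch below is only an illustration: the 25% tolerance and the sample durations are made-up values, not something Azure Pipelines prescribes.

```python
from statistics import median


def duration_regressed(history_s, latest_s, tolerance=0.25):
    """Return True when the latest run is more than `tolerance` slower than the median of history."""
    if not history_s:
        return False  # nothing to compare against
    baseline = median(history_s)
    return latest_s > baseline * (1 + tolerance)


recent_runs = [310.0, 295.0, 305.0, 320.0, 300.0]  # seconds, made-up values
print(duration_regressed(recent_runs, 410.0))  # True
print(duration_regressed(recent_runs, 330.0))  # False
```

The median is used instead of the mean so that a single unusually slow run in the history does not hide a genuine regression.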
To get an overview of total execution times for all your pipelines:
This view provides a comprehensive list of pipeline runs and their durations. For more specific analysis, you can use the filter options (highlighted in the accompanying image) to locate particular runs.
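The same list of runs and durations can also be derived from data rather than read off the UI. The sketch below assumes a payload shaped like the response of the Azure DevOps Builds list REST endpoint (a `value` array whose items carry `buildNumber`, `result`, `startTime`, and `finishTime`); treat that shape as an assumption and check it against the API version you use. The sample payload is invented, and no network call is made.

```python
import re
from datetime import datetime, timedelta


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp, trimming fractions beyond microseconds."""
    match = re.fullmatch(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z", value)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {value}")
    base, fraction = match.groups()
    micro = (fraction or "0")[:6].ljust(6, "0")
    return datetime.fromisoformat(f"{base}.{micro}+00:00")


def run_durations(payload):
    """Return (buildNumber, result, duration) for every finished run in the payload."""
    rows = []
    for run in payload.get("value", []):
        if not run.get("startTime") or not run.get("finishTime"):
            continue  # still queued or in progress
        duration = parse_timestamp(run["finishTime"]) - parse_timestamp(run["startTime"])
        rows.append((run["buildNumber"], run.get("result", "unknown"), duration))
    return rows


# Invented sample data in the assumed shape.
sample = {
    "value": [
        {"buildNumber": "20240501.1", "result": "succeeded",
         "startTime": "2024-05-01T10:00:00.1234567Z",
         "finishTime": "2024-05-01T10:05:30.1234567Z"},
        {"buildNumber": "20240501.2", "startTime": "2024-05-01T11:00:00Z"},
    ]
}
rows = run_durations(sample)
for number, result, duration in rows:
    print(number, result, duration)
```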
Continuously monitoring these metrics manually can be time-consuming. To address this, Azure Pipelines offers an Analytics view for each pipeline. To access this feature:
These steps will guide you to the Analytics view, which provides a more efficient way to track pipeline performance over time.
Next, open the Analytics tab:
Once the Analytics view loads, as shown in the following screenshot, you have three different reports that provide insights into the pipeline:
The Analytics view offers various reports that aggregate data over time. These reports can be filtered to display information for the last 7, 14, 30, or 180 days. The available reports include:
Each of these reports provides valuable insights into different aspects of your pipeline's performance and reliability.
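To make the idea of aggregating over a time window concrete, here is a minimal sketch that computes a pass rate over the trailing 7, 14, 30, and 180 days. It is not how the Analytics view is implemented; the run records and the `pass_rate` helper are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def pass_rate(runs, now, days) -> Optional[float]:
    """Share of runs finished in the last `days` days whose result is 'succeeded'."""
    cutoff = now - timedelta(days=days)
    window = [result for finished, result in runs if finished >= cutoff]
    if not window:
        return None  # no runs in the window
    return sum(1 for result in window if result == "succeeded") / len(window)


now = datetime(2024, 5, 31)
runs = [  # (finish time, result), made-up values
    (datetime(2024, 5, 30), "succeeded"),
    (datetime(2024, 5, 28), "failed"),
    (datetime(2024, 5, 20), "succeeded"),
    (datetime(2024, 4, 1), "failed"),
]
for days in (7, 14, 30, 180):
    print(days, pass_rate(runs, now, days))
```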
Using Dashboards
An alternative method for monitoring your pipelines is through the Dashboards feature, located in the Azure DevOps Overview section of each project. This feature allows you to create custom dashboards using various widgets, each displaying different data points. These dashboards provide a macro-level view that's easily accessible to all team members.
Azure DevOps offers three built-in widgets specifically for Azure Pipelines:
1. Build History: A histogram of builds, indicating success or failure, with links to individual builds.
2. Deployment Status: Presents a combined view of deployment status and test pass rates across multiple environments.
3. Release Pipeline Overview: Allows tracking and viewing of a release pipeline's status.
The image below showcases a custom dashboard named "Pipelines," incorporating all the aforementioned widgets. This dashboard presents information from various pipelines:
NOTE: Another effective way to monitor your pipelines is the Azure Pipelines app for Microsoft Teams, which is listed in the marketplace catalog. Clicking the listing redirects you to the Microsoft AppSource store. The app is installed in your Teams tenant, and that installation is beyond the scope of this post. After installation, you can configure subscriptions to receive notifications about pipeline status or pending approvals for the pipelines you monitor.
Monitoring pipeline agents
In Azure DevOps, pipeline agents provide some general reporting capabilities. To access them, click Organization settings.
Once you are inside Organization settings, you will have access to the Agent Pools option in the navigation menu under the Pipelines section:
Let's go through each of the available reports.
Job Runs
The job runs report for each agent pool provides a summary of the jobs being executed. This report includes details such as the job ID, pipeline name, project, agent specification, queue time, wait time, and duration:
One of the most crucial metrics in this report is Wait Time. This metric deserves close attention. An increasing wait time between jobs may indicate a need for additional concurrency and more agents.
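A small sketch of how a rising wait time could be spotted from exported job data. Wait time is taken here as the gap between a job being queued and starting; the window size and the sample values are assumptions chosen for illustration.

```python
from statistics import mean


def wait_time_trend(wait_seconds, window=5):
    """Mean wait of the newest `window` jobs minus the mean of the `window` jobs before them."""
    if len(wait_seconds) < 2 * window:
        raise ValueError("need at least two full windows of samples")
    previous = wait_seconds[-2 * window:-window]
    latest = wait_seconds[-window:]
    return mean(latest) - mean(previous)


waits = [5, 8, 6, 7, 4, 30, 45, 60, 52, 48]  # seconds, oldest first, made-up values
trend = wait_time_trend(waits)
print(f"wait time changed by {trend} seconds")
```

A clearly positive value that persists across several checks is the kind of signal that suggests the pool needs more agents or more parallel jobs.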
We will discuss how to acquire concurrency and approaches for increasing the number of agents later in this section. For now, let’s continue reviewing the available reports.
Agent Status
In the agent pool details, the Agents tab provides information about each agent, including its name, availability, last run, current status, version, and options to enable or disable it.
The following screenshot shows an unavailable or Offline agent:
The following screenshot shows an available or Online agent:
You need to ensure that self-hosted agents are online and enabled when the agent pool is active in your projects. Otherwise, pipeline jobs will be queued and remain unexecuted if no agents are available.
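The check described above can be automated against agent data. This sketch assumes each agent record exposes `name`, `status` (`"online"` or `"offline"`), and `enabled`, which is how I understand the Azure DevOps agents list REST response to look; verify the field names against your API version. The sample payload is invented.

```python
def unusable_agents(payload):
    """Names of agents that are offline or disabled and so cannot pick up jobs."""
    names = []
    for agent in payload.get("value", []):
        online = agent.get("status") == "online"
        enabled = agent.get("enabled", False)
        if not (online and enabled):
            names.append(agent["name"])
    return names


# Invented sample data in the assumed shape.
pool = {
    "value": [
        {"name": "agent-01", "status": "online", "enabled": True},
        {"name": "agent-02", "status": "offline", "enabled": True},
        {"name": "agent-03", "status": "online", "enabled": False},
    ]
}
print(unusable_agents(pool))  # ['agent-02', 'agent-03']
```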
Agent jobs
From the previous agent status report, you also have access to the Jobs report for a specific agent, as shown in the following screenshot:
This information is valuable when determining if a specific agent is behaving erratically or experiencing intermittent failures while running jobs. In such cases, an upgrade to the agent version might be necessary, dependencies or tooling installed on the agent might need attention, or, as a last resort, the agent along with its infrastructure might need to be removed and replaced with a new one.
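To compare agents rather than eyeball each Jobs report, you could compute a failure rate per agent. The sketch below works on hypothetical `(agent name, result)` pairs; how you export them is up to you.

```python
from collections import Counter


def failure_rates(jobs):
    """Map each agent name to the fraction of its jobs that failed."""
    total = Counter(agent for agent, _ in jobs)
    failed = Counter(agent for agent, result in jobs if result == "failed")
    return {agent: failed[agent] / count for agent, count in total.items()}


jobs = [  # made-up values
    ("agent-01", "succeeded"), ("agent-01", "succeeded"),
    ("agent-01", "succeeded"), ("agent-01", "failed"),
    ("agent-02", "failed"), ("agent-02", "failed"),
    ("agent-02", "succeeded"), ("agent-02", "failed"),
]
rates = failure_rates(jobs)
print(rates)  # {'agent-01': 0.25, 'agent-02': 0.75}
```

An agent whose rate stands well above the rest of the pool, while running the same pipelines, is a reasonable candidate for the upgrade or replacement steps mentioned above.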
Next, let’s examine the most important report available for agent pools: analytics.
The Analytics report allows us to understand the aggregated usage of the agents in the pool over time with histograms indicating concurrency, queued jobs, and running jobs.
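The concurrency figure in such a report boils down to counting how many jobs overlap in time. As an illustration only (this is not how Azure DevOps computes its histograms), the peak can be found with a sweep over job start and end events; the intervals below are made-up minutes since midnight.

```python
def peak_concurrency(intervals):
    """Maximum number of jobs running at the same instant."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a job starts
        events.append((end, -1))    # a job ends
    # At equal timestamps, ends sort before starts, so back-to-back jobs do not count as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


job_intervals = [(0, 10), (5, 15), (10, 20), (12, 14)]
print(peak_concurrency(job_intervals))  # 3
```

If the peak regularly equals the number of parallel jobs you own while jobs are still queued, the pool is saturated.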
Purchasing Concurrency
Adding concurrency to your agent pools applies to both Microsoft-hosted and self-hosted agents. The decision to increase concurrency depends on your business's need to eliminate wait times between job executions. To do this, follow these steps:
1. Begin by setting up billing at the organization level, as illustrated in the following figure:
2. Clicking the "Set up billing" button will open a dialog where you can link your Azure DevOps organization to an Azure subscription, which is used to pay for services in Azure DevOps. If you have access to an Azure subscription, select it and click "Save":
It's important to note that the Active Directory tenant you are logged into is used to locate Azure subscriptions you have access to and link them to the organization. You must be a member of the Project Collection Administrators group to complete this step.
3. If you do not have an available Azure subscription, you will see a message similar to the following:
4. You can click the "New Azure subscription" button to set up a new subscription and provide your credit card details for billing. Once billing is configured, you can purchase concurrency by navigating to the "Parallel jobs" option under Project Settings. From there, you can adjust the number of parallel jobs as needed.
Code Quality Metrics
Creating a comprehensive unit test pipeline involves analyzing every data point generated by the unit test runner framework and the tasks used to execute them. However, there are often limitations in the metrics provided by the task for identifying issues beyond the actual unit tests being run.
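One data point that is easy to extract yourself is line coverage. The sketch below assumes your test runner emits a Cobertura-style XML report, whose root `coverage` element carries a `line-rate` attribute; the 80% threshold is an arbitrary example value.

```python
import xml.etree.ElementTree as ET


def line_coverage(cobertura_xml):
    """Return the overall line coverage (0.0 to 1.0) from a Cobertura-style report."""
    root = ET.fromstring(cobertura_xml)
    if root.tag != "coverage" or "line-rate" not in root.attrib:
        raise ValueError("not a Cobertura-style coverage report")
    return float(root.attrib["line-rate"])


def coverage_ok(cobertura_xml, minimum=0.8):
    """True when overall line coverage meets the minimum threshold."""
    return line_coverage(cobertura_xml) >= minimum


report = '<coverage line-rate="0.85" branch-rate="0.70"><packages/></coverage>'
print(line_coverage(report))      # 0.85
print(coverage_ok(report))        # True
print(coverage_ok(report, 0.9))   # False
```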
Deployment Health
CI/CD enables you to automate every aspect of the deployment process, including validating the application in the target environment after deployment. This automation ensures that no human intervention is needed to verify that a new application version is functioning as expected and that no new errors or bugs have been introduced by developers or the environment configuration.
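A minimal sketch of such a post-deployment check: poll a health endpoint until it answers with HTTP 200 or the attempts run out. The endpoint URL, the retry counts, and the idea that your application exposes a health endpoint at all are assumptions of this example.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or 0 when the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def wait_until_healthy(url, probe=http_status, attempts=5, delay_s=10.0, sleep=time.sleep):
    """Probe `url` up to `attempts` times; return True as soon as it reports 200."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # give the application time to warm up
    return False
```

In a pipeline step you might call `wait_until_healthy("https://<your-app>/health")` and fail the step when it returns `False`. The `probe` and `sleep` parameters exist so the logic can be exercised without a real network call.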
Integration with Azure Monitor
Azure Monitor is a comprehensive monitoring solution for collecting, analyzing, and responding to logs and metrics from both cloud and on-premises environments. It helps you understand the performance of your applications and services and provides tools to manually and programmatically address conditions that need attention, ensuring that applications continue to function as expected.
In Azure Pipelines, integration with Azure Monitor is facilitated through the AzureMonitor task. This task allows you to query rules for active alerts and determine if the deployment of a new application version has triggered any new alerts.
To get started, follow these steps:
1. Create a new release pipeline:
2. Locate the Azure App Service deployment with the continuous monitoring template by searching for "monitor" in the search field. Then, click on the "Apply" button.
3. You will be presented with a stage, where you need to provide the App Service name, Resource Group name for Application Insights, and the Application Insights resource name fields:
4. The key step is the "Configure Application Insights Alerts" task, which utilizes the Azure CLI to create four different metrics-based alerts.
5. Once this stage is configured, switch to the pipeline view and click on "Post-deployment conditions" to set up the gates. In this case, the "Query Azure Monitor alerts" option should already be enabled.
6. Next, adjust the deployment gate settings as needed, including specifying the required Azure subscription and Resource Group name. Pay particular attention to the "Delay before evaluation" setting.
7. With this setup, Azure Pipelines will execute the gate after the deployment steps are completed. This verifies that the monitoring alerts have been configured and provides a visual indicator.
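Conceptually, a gate like this samples the alert state after the delay and lets the release continue only if nothing relevant is firing. The sketch below models that decision in isolation; it is not the gate's actual implementation. The alert records, with `monitorCondition` and `severity` keys, are loosely modeled on Azure Monitor alert properties, and both the shape and the blocking severities are assumptions.

```python
def gate_passes(alerts, blocking_severities=frozenset({"Sev0", "Sev1", "Sev2"})):
    """Return False if any alert is currently fired at a blocking severity."""
    for alert in alerts:
        fired = alert.get("monitorCondition") == "Fired"
        if fired and alert.get("severity") in blocking_severities:
            return False
    return True


# Invented sample data in the assumed shape.
after_deploy = [
    {"name": "Failed requests", "severity": "Sev1", "monitorCondition": "Resolved"},
    {"name": "Server response time", "severity": "Sev3", "monitorCondition": "Fired"},
]
print(gate_passes(after_deploy))  # True

after_bad_deploy = after_deploy + [
    {"name": "Server exceptions", "severity": "Sev1", "monitorCondition": "Fired"},
]
print(gate_passes(after_bad_deploy))  # False
```

The "Delay before evaluation" setting matters for the same reason it would here: alerts need time to fire after a deployment, so sampling too early can let a bad release through.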
In this post, we explored the monitoring concepts that are crucial for your CI/CD projects and various methods for tracking your pipeline tasks and their performance. You learned how to build dashboards with graphical widgets to visualize behavior over time and how to integrate with collaboration tools for real-time notifications.
Additionally, we covered how to monitor job runs, task performance, and agents, including when to purchase concurrency and options for increasing the number of agents to ensure prompt pipeline execution. Lastly, you discovered how to measure pipeline quality using code quality metrics, application runtime checks, and monitoring tools.
In the next post, we will look into Provisioning Infrastructure Using Infrastructure as Code.