How to Implement DevSecOps on Azure Kubernetes Service (AKS)

Solution Brief:

Secure DevOps, also known as DevSecOps, is an extension of the DevOps approach that integrates security measures into various stages of the traditional DevOps lifecycle. Incorporating security practices into DevOps processes provides numerous benefits, such as:

  1. Enhancing the security of your applications and systems by identifying potential security threats and preventing vulnerabilities from reaching deployed environments.
  2. Boosting security awareness within your development and operations teams.
  3. Integrating automated security procedures into your software development lifecycle.
  4. Decreasing the cost of remediation by detecting security issues early in the design and development stages.

When implementing DevSecOps on Azure Kubernetes Service (AKS), different organizational roles may have unique considerations for security implementation. Examples of these roles include:

  1. Developers who create secure applications to run on AKS.
  2. Cloud Engineers who build secure AKS infrastructure.
  3. Various Operations teams responsible for managing clusters or monitoring security issues.

I have structured this article to cover various stages of the DevOps lifecycle and provide suggestions and considerations for integrating security controls and best practices. My guide encompasses typical procedures and resources to include in continuous integration and continuous delivery (CI/CD) pipelines, favoring user-friendly built-in tools whenever possible.

In my DevOps environment, I have implemented the following security measures:

  1. I have configured Azure Active Directory (Azure AD) as the identity provider for GitHub and enabled multi-factor authentication (MFA) for extra authentication security.
  2. Developers are using Visual Studio Code or Visual Studio with security extensions to proactively analyze their code for security vulnerabilities.
  3. Application code is committed to a GitHub Enterprise repository, which is corporate-owned and governed.
  4. GitHub Enterprise is integrated with automatic security and dependency scanning through GitHub Advanced Security.
  5. Pull requests trigger continuous integration (CI) builds and automated testing via GitHub Actions.
  6. The CI build workflow in GitHub Actions generates a Docker container image that is stored in Azure Container Registry.
  7. As part of the continuous delivery (CD) workflow in GitHub Actions, manual approvals for deployments to specific environments, like production, can be introduced.
  8. GitHub Actions enables CD to AKS, and GitHub Advanced Security detects secrets, credentials, and other sensitive information in application source and configuration files.
  9. I use Microsoft Defender to scan Azure Container Registry, the AKS cluster, and Azure Key Vault for security vulnerabilities. Microsoft Defender for Containers scans each container image for known security vulnerabilities when it is uploaded to Container Registry. Defender for Containers also scans the AKS environment and provides run-time threat protection for AKS clusters. Microsoft Defender for Key Vault detects harmful, unusual, or suspicious attempts to access key vault accounts.
  10. I apply Azure Policy to Container Registry and AKS for policy compliance and enforcement. Common security policies for Container Registry and AKS are built-in for quick enablement.
  11. I use Azure Key Vault to securely inject secrets and credentials into an application at runtime, separating sensitive information from developers.
  12. The AKS network policy engine is configured to help secure traffic between application pods by using Kubernetes network policies.
  13. I set up continuous monitoring of the AKS cluster by using Azure Monitor and Container insights to ingest performance metrics and analyze application and security logs. Container insights retrieves performance metrics and application and cluster logs, and diagnostic and application logs are pulled into an Azure Log Analytics workspace, where I can run log queries.
  14. I use Microsoft Sentinel, which is a security information and event management (SIEM) solution, to ingest and further analyze the AKS cluster logs for any security threats based on defined patterns and rules.
  15. For penetration testing of web applications and services, I use open-source tools such as OWASP Zed Attack Proxy (ZAP).
  16. To manage DevOps security across multi-pipeline environments, including GitHub and Azure DevOps, I use Defender for DevOps, a service available in Defender for Cloud.
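As an illustration of item 12, here is a minimal sketch of the kind of Kubernetes network policies I mean: a default-deny rule for a namespace plus one explicit allow rule between two application pods. The namespace, labels, and port passed in are hypothetical, and the manifests are built as plain dictionaries so they can be rendered to JSON, which kubectl accepts alongside YAML.

```python
import json


def default_deny_ingress(namespace):
    """Build a NetworkPolicy that blocks all ingress to every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            "podSelector": {},           # an empty selector matches all pods
            "policyTypes": ["Ingress"],  # no ingress rules listed, so none is allowed
        },
    }


def allow_from_app(namespace, target_app, source_app, port):
    """Allow TCP traffic to target_app pods only from source_app pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_app}-to-{target_app}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


def render(policy):
    """Serialize a policy to JSON text that can be saved and applied with kubectl."""
    return json.dumps(policy, indent=2)
```

For example, render(default_deny_ingress("shop")) followed by render(allow_from_app("shop", "api", "frontend", 8080)) would give two manifests that lock down a hypothetical "shop" namespace and then reopen a single path.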

Team members overview and responsibilities:

To manage the complexity of DevSecOps for Kubernetes-based solution deployments, we need to consider a separation of concerns. We should determine which team in our enterprise environment is responsible for each aspect of the deployment and identify the tools and processes that best achieve our objectives. In this section, we will cover the common roles of developers, application operators (site reliability engineers), cluster operators, and security teams.

Developers:

Developers are responsible for writing the application code and committing it to the designated repository. They also need to write and run scripts for automated testing to ensure their code functions as intended and integrates well with the rest of the application. As part of the automation pipeline, they also need to define and script the building of container images.

Application operators (site reliability engineers):

Application operators or site reliability engineers (SREs) build solutions to automate the oversight of large software systems. They play a crucial role as a bridge between development and cluster operator teams. SREs help establish and monitor service-level objectives and error budgets, manage application deployments, and often write Kubernetes manifest (YAML) files.

Cluster operators:

Cluster operators are responsible for configuring and managing the cluster infrastructure. They use infrastructure as code (IaC) best practices and frameworks like GitOps to provision and maintain their clusters. They use various monitoring tools like Azure Monitor Container insights and Prometheus/Grafana to monitor overall cluster health. They are also responsible for patching, cluster upgrades, permissions, and role-based access control on the cluster. In DevSecOps teams, they ensure that the clusters meet the security requirements of the team and work with the security team to establish those standards.

Security team:

The security team is responsible for developing security standards and enforcing them. They create and select the Azure Policy definitions that are enforced in the subscriptions and resource groups holding the clusters. They monitor security issues and work with the other teams to ensure that security is at the forefront of every step of the DevSecOps process.


DevSecOps lifecycle stages:

I implement security controls in each phase of the software development lifecycle (SDLC). This implementation is a key piece of my DevSecOps strategy and of my shift-left approach.

Plan Phase:

In the Plan phase of the software development lifecycle, the security, development, and operations teams need to collaborate to ensure that security requirements are appropriately addressed.

As we design and plan, it's important to consider building a more secure platform for our AKS-hosted system. This includes incorporating both internal and external components, such as runtime security and network firewalls.

To help identify potential security risks, I recommend building threat modeling into our process, using the STRIDE threat model methodology to identify, mitigate, and validate risks.
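To make the STRIDE step concrete, the following minimal sketch expands a list of system components into a checklist with one entry per STRIDE category. The component names in the usage note are hypothetical, and a real threat model would also record mitigations and validation results for each entry.

```python
# The six STRIDE threat categories and the security property each one violates.
STRIDE = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information disclosure": "Confidentiality",
    "Denial of service": "Availability",
    "Elevation of privilege": "Authorization",
}


def threat_checklist(components):
    """Return one (component, threat, property) row per component and STRIDE category."""
    return [
        (component, threat, security_property)
        for component in components
        for threat, security_property in STRIDE.items()
    ]
```

Calling threat_checklist(["ingress controller", "orders API"]) yields twelve rows that the teams can walk through together during planning.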

Additionally, we can apply the security and operational best practices of the Azure Well-Architected Framework, which provide guidance for identity management, application security, infrastructure protection, data security, and DevOps monitoring in cloud-native environments.

By implementing these best practices, we can help ensure that security is built into the system at every layer, starting with the platform itself, and reduce the potential cost of dealing with security issues found in later SDLC stages.

Develop Phase:

In the Develop phase, "shifting left" is a key tenet of my DevSecOps mindset. I adopt secure coding best practices and use IDE tools and plugins for code analysis so that security issues are addressed early in the development lifecycle, when they're easier to fix.

I enforce secure coding standards by using established secure coding best practices and checklists to protect my code from common vulnerabilities like injection and insecure design. I adopt the OWASP Foundation's industry-standard secure coding recommendations when writing code, especially when developing public-facing web applications or services. I also look at secure coding practices for my specific programming language runtimes, such as Java and .NET. I enforce logging standards that keep sensitive information from leaking into application logs, using the filters and plugins provided by most popular logging frameworks, such as Log4j and log4net.
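The logging standard above can be sketched with a redacting filter. This example uses Python's standard logging module rather than Log4j or log4net, purely to show the idea; the key names in the pattern are my own assumption about what counts as sensitive and would need tuning for a real application.

```python
import logging
import re

# Assumed naming convention: key=value or key: value pairs whose key contains one of these words.
SENSITIVE = re.compile(r"(?i)(password|token|secret|api[_-]?key)\s*[=:]\s*\S+")


class RedactingFilter(logging.Filter):
    """Mask the values of sensitive-looking key/value pairs before a record is emitted."""

    def filter(self, record):
        # Render the final message first so values passed as arguments are also covered.
        message = record.getMessage()
        record.msg = SENSITIVE.sub(lambda match: match.group(1) + "=***", message)
        record.args = ()  # the message is already formatted
        return True       # keep the record, only its content changes
```

The filter is most reliable when attached to a handler, for example handler.addFilter(RedactingFilter()), because handler-level filters see records from every logger that writes to that handler.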

To automate security checks, I use IDE tools and plugins. Most popular IDEs like Visual Studio, Visual Studio Code, IntelliJ IDEA, and Eclipse support extensions that give me immediate feedback and recommendations for potential security issues I might have introduced while writing application code. For example, SonarLint is an IDE plugin available for most popular languages and developer environments that provides valuable feedback and automatically scans my code for common programming errors and potential security issues.

I also establish controls on my source code repositories. I adopt a branching methodology, such as Release flow or GitHub flow, so that branching is used consistently across the enterprise and supports team and parallel development. I ensure there are established merge policies for certain branches, like main, before changes can be merged or committed into them. I prevent developers from committing code directly to the main branch and establish a peer review process that requires a minimum number of approvals before changes can be merged. I use pre-commit hooks to check for sensitive information within my application source code and block the commit if a security issue is found. I also establish role-based access control within my version control system and create well-defined roles based on the principle of least privilege.
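A pre-commit hook of the kind described above could look roughly like the sketch below. The two patterns are illustrative assumptions, not a complete rule set, and dedicated secret scanners cover far more cases. The script only defines the functions; a hook file would call main() and exit with its return value.

```python
import re
import subprocess

# Illustrative patterns only; real scanners ship much larger, provider-specific rule sets.
PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded credential": re.compile(
        r"(?i)(password|secret|api[_-]?key)\s*[=:]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, label) for every line that matches a pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


def staged_files():
    """List files that are added, copied, or modified in the staging area."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [path for path in result.stdout.splitlines() if path]


def main():
    """Return 1 (which blocks the commit) if any staged file looks like it holds a secret."""
    failed = False
    for path in staged_files():
        try:
            with open(path, encoding="utf-8", errors="ignore") as handle:
                findings = find_secrets(handle.read())
        except OSError:
            continue  # unreadable paths are skipped
        for number, label in findings:
            print(f"{path}:{number}: possible {label}")
            failed = True
    return 1 if failed else 0
```

Saved as an executable .git/hooks/pre-commit script with a final line of raise SystemExit(main()), a non-zero result stops the commit.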

To secure my container images, I use lightweight images with a minimal OS footprint, such as Alpine or even distroless images, that contain only my application and its associated runtime. I use only trusted base images retrieved from a private registry that is frequently scanned for vulnerabilities. I evaluate image vulnerabilities locally using developer tools like Trivy, an open-source tool that analyzes security vulnerabilities within my container images. I prevent containers from running as the root user and consider using an AppArmor profile within my Kubernetes cluster to further help enforce security for my running containers.
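One of the image checks above, making sure a container does not run as root, can be approximated before the image is even built. This sketch reads a Dockerfile and reports whether the final stage switches to a non-root user. It assumes that a stage without a USER instruction runs as root, which is not true for every base image (some distroless variants default to a non-root user).

```python
def final_user(dockerfile_text):
    """Return the USER set in the last build stage, or None if that stage sets none."""
    user = None
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        if instruction == "FROM":
            user = None  # a new stage does not inherit USER from the previous stage
        elif instruction == "USER" and len(parts) == 2:
            user = parts[1].strip()
    return user


def runs_as_root(dockerfile_text):
    """Assumption: no USER instruction means the image default, treated here as root."""
    user = final_user(dockerfile_text)
    if user is None:
        return True
    name = user.split(":", 1)[0]  # drop an optional ":group" suffix
    return name in ("root", "0")
```

A check like this can run locally or in CI and fail fast when runs_as_root returns True.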

Build Phase:

During my build phase, I work with my site reliability engineers and security team to integrate automated scans of my application source within my CI build pipelines. I configure my pipelines to enable security practices such as SAST, SCA, and secrets scanning by using the CI/CD platform's security tools and extensions.

To find potential vulnerabilities in my application source code, I perform static application security testing (SAST) and use the GitHub Advanced Security capabilities for code scanning with CodeQL.

Code scanning is a feature that I use to analyze the code in my GitHub repository to find security vulnerabilities and coding errors. Any problems identified by the analysis are shown in GitHub Enterprise Cloud. If code scanning finds a potential vulnerability or error in my code, GitHub displays an alert in the repository.

I can also configure branch rules for required status checks to ensure that my branch has always been tested with the latest code before merging any new code.

To analyze my Kubernetes deployment objects, I use tools like kube-score, which performs static analysis of my Kubernetes object definitions. The output is a list of recommendations for making my application more secure and resilient.
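To show what static analysis of a Kubernetes object means in practice, here is a small sketch that reviews a Deployment manifest loaded as a dictionary. The four checks are my own illustrative picks in the spirit of such tools; this is not kube-score's implementation or its rule set.

```python
def review_deployment(manifest):
    """Return human-readable recommendations for each container in a Deployment."""
    findings = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")

        limits = container.get("resources", {}).get("limits", {})
        if "cpu" not in limits or "memory" not in limits:
            findings.append(f"{name}: set CPU and memory limits")

        context = container.get("securityContext", {})
        if not context.get("runAsNonRoot", False):
            findings.append(f"{name}: set securityContext.runAsNonRoot to true")
        if context.get("privileged", False):
            findings.append(f"{name}: avoid privileged mode")

        # Only the last path segment can carry a tag; the registry host may contain ":port".
        last_segment = container.get("image", "").rsplit("/", 1)[-1]
        if ":" not in last_segment or last_segment.endswith(":latest"):
            findings.append(f"{name}: pin the image to a specific tag or digest")
    return findings
```

An empty list means the manifest passed these particular checks, not that it is secure in general.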

To prevent the fraudulent use of secrets that were committed accidentally to a repository, I perform secret scanning. When secret scanning is enabled for a repository, GitHub scans the code for patterns that match secrets used by many service providers. GitHub also periodically runs a full git history scan of existing content in repositories and sends alert notifications.

For Azure DevOps, Defender for Cloud uses secret scanning to detect credentials, secrets, certificates, and other sensitive content in my source code and my build output. Secret scanning can be run as part of the Microsoft Security DevOps for Azure DevOps extension.

To track open-source components in my codebase and detect any vulnerabilities in dependencies, I use software composition analysis (SCA) tools.

Dependency review lets me catch insecure dependencies before I introduce them to my environment and provides information on license, dependents, and age of dependencies. It provides an easily understandable visualization of dependency changes with a rich diff on the "Files Changed" tab of a pull request.

Dependabot performs a scan to detect insecure dependencies and sends Dependabot alerts when a new advisory is added to the GitHub Advisory Database or when the dependency graph for a repository changes.
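The core idea behind this kind of dependency scanning can be sketched as a comparison between pinned versions and advisory data. The advisory format, the advisory ID, and the package name used in the usage note are all hypothetical, and the version parser only handles plain dotted numbers; real tools work from the GitHub Advisory Database and full version-range semantics.

```python
def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def parse_requirements(text):
    """Read 'name==version' lines from requirements-style text into a dict."""
    pinned = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip().lower()] = version.strip()
    return pinned


def vulnerable_dependencies(pinned, advisories):
    """Return (package, pinned_version, advisory_id) for versions below the fixed release."""
    hits = []
    for advisory in advisories:
        version = pinned.get(advisory["package"])
        if version is not None and parse_version(version) < parse_version(advisory["fixed_in"]):
            hits.append((advisory["package"], version, advisory["id"]))
    return hits
```

With a made-up advisory such as {"id": "EXAMPLE-0001", "package": "examplelib", "fixed_in": "2.4.1"}, a project pinned to examplelib==2.3.0 would be flagged, while one pinned to 2.10.0 would not.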

To minimize cloud misconfigurations reaching production environments, I enable security scans of Infrastructure as Code (IaC) templates. I proactively monitor cloud resource configurations throughout the development lifecycle. Microsoft Defender for DevOps supports both GitHub and Azure DevOps repositories.

To identify known vulnerabilities in my workload images in container registries, I scan them with Defender for Containers.

Defender for Containers scans the containers in Container Registry and Amazon Elastic Container Registry (ECR) to notify me if there are known vulnerabilities in my images. Azure Policy can be enabled to do a vulnerability assessment on all images stored in Container Registry and provide detailed information on each finding.

To automatically build new images on base image update, I use Azure Container Registry Tasks. Azure Container Registry Tasks dynamically discovers base image dependencies when it builds a container image.

As a result, it can detect when an application image's base image is updated. With one preconfigured build task, Azure Container Registry Tasks can automatically rebuild every application image that references the base image.
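The dependency discovery described above can be pictured with a simplified sketch: find the base image of each application's final Dockerfile stage, then select the applications to rebuild when that base changes. This is my own approximation for illustration, not how Azure Container Registry Tasks is implemented, and the image names in the usage note are hypothetical.

```python
def runtime_base_image(dockerfile_text):
    """Return the base image of the final stage, resolving references to earlier stages."""
    stages = {}  # stage alias -> base image it was built from
    last = None
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        args = [part for part in parts[1:] if not part.startswith("--")]  # skip flags such as --platform
        if not args:
            continue
        image = stages.get(args[0].lower(), args[0])
        if len(args) >= 3 and args[1].upper() == "AS":
            stages[args[2].lower()] = image
        last = image
    return last


def images_to_rebuild(app_dockerfiles, updated_base):
    """Return the names of applications whose runtime base image was updated."""
    return sorted(
        app for app, text in app_dockerfiles.items()
        if runtime_base_image(text) == updated_base
    )
```

Given Dockerfiles for several applications, images_to_rebuild(dockerfiles, "contoso.azurecr.io/base/node:18") would return only those whose final stage builds on that image.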

To digitally sign my container images and configure my AKS cluster to only allow validated images, I use Container Registry, Azure Key Vault, and notation.

Azure Key Vault stores a signing key that can be used by notation with the notation Key Vault plugin (azure-kv) to sign and verify container images and other artifacts. Container Registry lets me attach these signatures by using the Azure CLI commands.

The signed containers let users make sure that deployments are built from a trusted entity and verify that an artifact hasn't been tampered with since it was signed.
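The sign-then-verify flow can be illustrated with a deliberately simplified sketch. It uses an HMAC over the artifact's SHA-256 digest as a stand-in, because Python's standard library has no asymmetric signing. Notation with Azure Key Vault uses certificate-based signatures instead, so treat this only as a picture of why tampering is detected, not of the actual mechanism.

```python
import hashlib
import hmac


def artifact_digest(content):
    """Content-addressed digest, in the same 'sha256:<hex>' style registries use."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def sign(digest, key):
    """Stand-in signature: an HMAC over the digest using a shared secret key."""
    return hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()


def verify(content, signature, key):
    """Recompute the digest and signature; any change to the content breaks the match."""
    expected = sign(artifact_digest(content), key)
    return hmac.compare_digest(expected, signature)
```

If even one byte of the artifact changes after signing, verify returns False, which is the property an admission check relies on when it only allows validated images.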

Deploy Phase:

During the deployment phase, I, as a developer, work with my application operators and cluster operator teams to establish the right security controls for the continuous deployment (CD) pipelines, so that we can deploy code to a production environment in a more secure and automated manner.

To control the access and workflow of the deployment pipeline, I can protect important branches by setting branch protection rules.

These rules define whether collaborators can delete or force push to the branch, and also set requirements for any pushes to the branch, such as passing status checks or a linear commit history. By using environments for deployment, I can configure environments with protection rules and secrets.

I can take advantage of the Approvals and Gates feature to control the workflow of the deployment pipeline. For example, I can require manual approvals from a security or operations team before a deployment to a production environment.

To secure deployment credentials, I can use OpenID Connect (OIDC) to allow my GitHub Actions workflows to access resources in Azure without needing to store the Azure credentials as long-lived GitHub secrets.
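To show what the OIDC trust relationship checks, this sketch decodes the claims of a workflow token and compares the issuer, subject, and audience with the values a federated credential for a specific repository and environment would expect. It does not verify the token signature, which the identity platform does in the real flow, and the repository and environment names in the usage note are hypothetical.

```python
import base64
import json

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"
AZURE_AUDIENCE = "api://AzureADTokenExchange"


def decode_claims(token):
    """Decode the payload segment of a JWT. No signature verification is performed here."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def expected_subject(repo, environment):
    """Subject claim format GitHub issues for a job that targets an environment."""
    return f"repo:{repo}:environment:{environment}"


def claims_match(claims, repo, environment):
    """Check issuer, subject, and audience against the expected federated credential."""
    audience = claims.get("aud")
    audiences = audience if isinstance(audience, list) else [audience]
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("sub") == expected_subject(repo, environment)
        and AZURE_AUDIENCE in audiences
    )
```

A token issued to a job in contoso/shop that targets the production environment would match claims_match(claims, "contoso/shop", "production") and fail for any other repository or environment, which is what keeps the credential scoped.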

I can shift security credentials to my Kubernetes cluster by using a pull-based approach to CI/CD with GitOps, which reduces the attack surface because credentials no longer need to be stored in external CI tooling. I can also reduce allowed inbound connections and limit admin-level access to my Kubernetes clusters.

To find vulnerabilities in my running application, I can run dynamic application security testing (DAST) using GitHub Actions in deployment workflows. I can also use open-source tools such as OWASP ZAP to do penetration testing for common web application vulnerabilities.

To deploy container images from trusted registries only, I can use Defender for Containers to enable the Azure Policy add-on for Kubernetes, and then assign a policy so that container images can only be deployed from trusted registries.
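The effect of a trusted-registry policy can be sketched as a pattern match on every image reference in a pod spec. The registry name contoso.azurecr.io is a placeholder, and this is only an illustration of the kind of rule such a policy evaluates, not the policy definition itself.

```python
import re

# Hypothetical trusted registry; anything that does not start with it is rejected.
ALLOWED_IMAGES = re.compile(r"^contoso\.azurecr\.io/.+$")


def is_allowed(image):
    """True if the image reference comes from the trusted registry."""
    return bool(ALLOWED_IMAGES.match(image))


def disallowed_images(pod_spec):
    """Return every image in the pod spec, init containers included, that would be denied."""
    containers = pod_spec.get("initContainers", []) + pod_spec.get("containers", [])
    return [
        container.get("image", "")
        for container in containers
        if not is_allowed(container.get("image", ""))
    ]
```

Running the same check in the CI pipeline gives developers the denial message before the cluster does.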

Operate Phase:

During the Operate phase, I perform tasks to proactively monitor, analyze, and alert on potential security incidents. I use production observability tools like Azure Monitor and Microsoft Sentinel to ensure compliance with enterprise security standards.

To enable automated scanning and monitoring of my production configurations, I use Microsoft Defender for Cloud. I run continual scanning to detect drift in the vulnerability state of my application and implement a process to patch and replace the vulnerable images. I also implement automated configuration monitoring for operating systems.

I follow Microsoft Defender for Cloud container recommendations to perform baseline scans for my AKS clusters and get notified in the Microsoft Defender for Cloud dashboard when configuration issues or vulnerabilities are found. Additionally, I follow its network protection recommendations to help secure the network resources being used by my AKS clusters.

To keep my Kubernetes clusters updated, I have a lifecycle management strategy in place and use the AKS platform’s planned maintenance features to have more control over maintenance windows and upgrades. I upgrade AKS worker nodes more frequently than the cluster itself and apply the weekly OS and runtime updates.

To secure and govern my AKS clusters, I use Azure Policy. After installing the Azure Policy Add-on for AKS, I apply individual policy definitions, or groups of policy definitions called initiatives, to my cluster. I use built-in Azure policies for common scenarios and create custom policies for specific use cases. I then verify that those assignments are being enforced.

I use Azure Monitor for continuous monitoring and alerting. It collects logs and metrics from AKS, gives me insight into the availability and performance of my application and infrastructure, and helps me monitor my solution's health and spot abnormal activity early. I also use it to gate or roll back releases based on monitoring data, to ingest security logs, and to alert on suspicious activity.
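Gating a release on monitoring data comes down to comparing a new version's metrics with a baseline. The sketch below assumes the request and error counts have already been exported from monitoring into simple dictionaries; the field names, the 2 percentage point threshold, and the minimum sample size are illustrative assumptions, not recommended values.

```python
def error_rate(samples):
    """Fraction of failed requests across all samples (0.0 when there is no traffic)."""
    total = sum(sample["requests"] for sample in samples)
    errors = sum(sample["errors"] for sample in samples)
    return errors / total if total else 0.0


def gate_decision(baseline, canary, max_increase=0.02, min_requests=100):
    """Return 'wait', 'rollback', or 'promote' for a new release.

    baseline and canary are lists of {"requests": int, "errors": int} samples.
    """
    if sum(sample["requests"] for sample in canary) < min_requests:
        return "wait"      # not enough traffic to judge the new version
    if error_rate(canary) - error_rate(baseline) > max_increase:
        return "rollback"  # the new version fails noticeably more often
    return "promote"
```

A deployment workflow could call gate_decision after a soak period and trigger the rollback job when it returns "rollback".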

For active threat monitoring, I use Microsoft Defender for Cloud on AKS, both at the node level and for cluster internals. I also use Defender for DevOps for comprehensive visibility and Defender for Key Vault to detect unusual or suspicious attempts to access key vault accounts and alert administrators based on configuration. Additionally, I use Defender for Containers to alert on vulnerabilities found within my container images stored in Container Registry.

To monitor for real-time security threats, I enable centralized log monitoring and use SIEM products. I connect AKS diagnostics logs to Microsoft Sentinel for centralized security monitoring based on patterns and rules. I also enable audit logging to monitor activity on my production clusters and integrate user authentication for AKS with Azure Active Directory. Finally, I enable diagnostics on my Azure resources to have access to platform logs that provide detailed diagnostic and auditing information for my Azure resources.
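Pattern-based detection over cluster logs can be pictured with a small sketch that scans Kubernetes audit events for users who are repeatedly denied access. In Microsoft Sentinel this kind of rule would be written as a query over the ingested logs; Python is used here only to show the logic, and the threshold of five is an arbitrary assumption.

```python
from collections import Counter


def repeated_forbidden(events, threshold=5):
    """Return {username: count} for users with at least `threshold` HTTP 403 responses.

    Each event is a Kubernetes audit event parsed into a dictionary.
    """
    counts = Counter(
        event.get("user", {}).get("username", "<unknown>")
        for event in events
        if event.get("responseStatus", {}).get("code") == 403
    )
    return {user: count for user, count in counts.items() if count >= threshold}
```

A burst of forbidden responses for one identity often points to a misconfigured workload or to someone probing permissions, so it is a reasonable first rule to alert on.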


Credit to the actual authors from Microsoft: Alessandro Segala, Adnan Khan, John Poole, Bahram Rushenas, Ayobami Ayodeji, PMP, Abed Sau, Ahmed Bham, Chad Kittel
