
PlatformOps in a Microsoft Enterprise-scale landing zone

Anders Bonde

WE GPS CTO at Microsoft

Published: July 1, 2020


The main goals of this blog are to illustrate:

PlatformOps: How you build a datacenter in Azure with the new opinionated, prescriptive and code-based Enterprise-scale Landing Zone option(s) in Microsoft's Cloud Adoption Framework (CAF).
"Compliance-as-Code": How you can use Enterprise-scale to implement a compliant Azure platform with guardrails and policies in "code".

The blog will also illustrate how you can implement an advanced "DevOps" pipeline with GitHub and GitHub Actions to build, test, and deploy your Enterprise-scale platform and application landing zones to Azure.

However, it is NOT a goal to make this look easy!

Building a foundation for a full datacenter in Azure is a very complex task, but the goal here is to make it "as simple as possible, but not simpler".

"... as simple as possible, but not simpler!"

Expectations of the reader

This blog assumes that you have a broad understanding of generic Cloud and DevOps terms like Infrastructure-as-Code, landing zones, pipelines etc.

The first part focuses on compliance, and it is an advantage to have knowledge of and/or an interest in Cloud compliance, possibly including security and risk assessment standards like FedRAMP, NIST, ISO 27001 etc.

The second part is more technical, and you should be familiar with key Azure constructs and services and have a basic understanding of and/or interest in GitHub.

Finally, it is assumed that you are either already familiar with Microsoft's new Enterprise-scale architecture or are prepared to use the many links in this blog to learn more about the Enterprise-scale platform while you are reading.

Enterprise-scale Landing Zones: The North Star

The June edition of Microsoft's Cloud Adoption Framework (CAF) includes a new recommended implementation option - the Enterprise-scale Landing Zone (ESLZ).

In short, the goal of Enterprise-scale is to provide detailed guidance on how to build a complete datacenter in Azure, not 'just' an individual landing zone.

Note that an Azure 'landing zone' is never 'done', as the underlying Azure platform will keep changing by adding new capabilities to drive the innovation that you and your business want to benefit from. In other words, you should see Enterprise-scale as a direction - a "North Star" - not the end-state.

You will find additional details about Enterprise-scale Architecture as well as reference implementations in GitHub here.

As you can see in the picture, Enterprise-scale is part of the "Ready" phase in CAF, but Enterprise-scale will also have a huge impact on how you handle "Adopt", "Govern" and "Manage".

Enterprise-scale is …

Prescriptive - with very detailed guidance and recommendations in a "Where do I start on Monday"-style.
Opinionated - with recommendations based on experiences from numerous customer engagements over the last couple of years.
Code-based - with ARM (Azure Resource Manager) as "Management & Control Plane" and with reference implementations, ready to deploy in your environment. Refer to Part 2b below for many more details.

As with all architectural decisions, there are trade-offs.

In order to be Prescriptive, you can't boil the ocean, but have to focus on what you think is most important. You should expect that Enterprise-scale will get smarter over time as we learn from real-world experiences. See how you can contribute here and the roadmap here.

Being Opinionated is simply giving your best recommendations for the most important architectural decisions, and I know that many have been looking for exactly that. It is one of the Enterprise-scale principles to suggest an Azure native approach, where possible. Some will have strong arguments to take a different architectural decision - and it is absolutely fine - but it comes with a price: The more you deviate, the less value you will get from current and future versions of Enterprise-scale.

Last, it is one of the Enterprise-scale principles to be "Cloud Native" and to use ARM in our Code-based approach. Some may decide to use Terraform or other "Management/Control Planes", typically to support use of several Public Clouds, and again, it is absolutely fine.

Important: I am convinced that most major organizations over time will benefit from using the innovation from several Public Clouds, but I strongly recommend that you start with ONE and "go native" with each Cloud. I will argue that very few organizations will have the capability (or resources) to implement more than one Cloud at the same time. Please see my Cloud Strategy blog for more details about Multi-Cloud considerations.

Start with ONE and Go Native!

Metropolis

Using an analogy, an Enterprise-scale platform is similar to how city utilities such as water, gas, and electricity are accessible before new houses are constructed.

In this context, the network, IAM, policies, management, and monitoring are shared 'utility' services that must be readily available to help streamline the application migration and innovation process.

An Enterprise-scale platform consists of two areas:

The Platform: The general "plumbing"; e.g. the general identity, security, governance, networking, monitoring … services, to be used by all workloads
The Landing Zones (*): The application specific "plumbing"; e.g. everything needed by the specific application archetype on top of what is already provided by the Platform

(*) In Enterprise-scale the landing zone is implemented as an "Azure subscription".

Compliance-as-Code

All organizations today have very high compliance and security requirements. In regulated industries like Public Sector, Finance, Pharma etc., you often have to document the compliance formally. However, this documentation process is often manual and very time-consuming, with a significant "after-the-fact" focus; e.g. to describe what you have done to avoid non-compliance and how you will identify potential breaches if/when they happen.

In this blog, I will illustrate how you can use Enterprise-scale to implement most (all?) of these requirements "as code"; i.e. "Compliance-as-Code". See also here.

In other words, instead of sending documents to your auditors, you can now not only document what you wanted to do, but also that it is actually implemented, by sending your "code" - or by giving read-only rights to your GitHub :)

Azure policies

"Compliance-as-Code" is based on the core Azure policy concept.

As Azure is software-defined, you can code your compliance requirements, as in the simple "Allowed locations" example here, which will restrict ("Deny") any use of resources outside a specific location/Azure region, here "westus2". You will see further examples of "Compliance-as-Code" later in this blog.
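To make the "Allowed locations" idea concrete, here is a minimal sketch in Python. The dictionary mirrors the general "if"/"then" shape of an Azure Policy rule, but it is simplified: the region list is hard-coded rather than passed as a parameter, and the `matches` and `evaluate` helpers are hypothetical toy functions written for this illustration, not part of Azure or of Enterprise-scale.

```python
import json

# Simplified policy rule in the shape Azure Policy uses ("if" / "then").
# Assumption: a real "Allowed locations" policy would take the list of
# regions as a parameter; here it is hard-coded to keep the sketch short.
ALLOWED_LOCATIONS_RULE = {
    "if": {"not": {"field": "location", "in": ["westus2"]}},
    "then": {"effect": "deny"},
}


def matches(condition: dict, resource: dict) -> bool:
    """Toy matcher that understands only 'not' and 'field'/'in'."""
    if "not" in condition:
        return not matches(condition["not"], resource)
    return resource.get(condition["field"]) in condition["in"]


def evaluate(rule: dict, resource: dict) -> str:
    """Return the rule's effect if the 'if' block matches, else 'allow'."""
    return rule["then"]["effect"] if matches(rule["if"], resource) else "allow"


if __name__ == "__main__":
    print(json.dumps(ALLOWED_LOCATIONS_RULE, indent=2))
    print(evaluate(ALLOWED_LOCATIONS_RULE, {"location": "westus2"}))
    print(evaluate(ALLOWED_LOCATIONS_RULE, {"location": "northeurope"}))
```

In this toy model a resource in "westus2" falls through to "allow", while any other region triggers the "deny" effect. The point is only that the compliance requirement is now data that can be stored, reviewed and versioned like any other code.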

Subscription democratization & Policy-driven governance

The Enterprise-scale architecture is based on five design principles. The first two (Subscription democratization and Policy-driven governance) represent a fundamental shift in how we (IT/Ops) today offer services to the "business".

With Subscription (or Cloud) democratization, we want to make it as simple as possible for the "business" to use Cloud; e.g. no back-level "golden images" or ticket/approval systems to get access to "shared services". It should all be "self-service" and app teams can create the necessary services (through portal or Infrastructure-as-Code) on demand, maybe as part of a DevOps process.

However, you can still be in full control and be compliant, for example by using Azure Policy effects to:

Deny non-approved services or applications (Public IP, locations ...) and force encryption, BYOK etc.
Deploy components that we require (patching, backup, monitoring etc.)
Create Audit logs if workloads are non-compliant, as especially legacy workloads may not always run in a fully compliant environment
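The difference between the three effects can be sketched as follows. This is a toy model under stated assumptions, not Azure's policy engine: the `Outcome` class, the `apply_effects` function and the policy names in the usage example are all hypothetical, and only the effect names "Deny", "Audit" and "DeployIfNotExists" are real Azure Policy effects.

```python
from dataclasses import dataclass, field


@dataclass
class Outcome:
    """Result of a resource request in this toy model."""
    allowed: bool
    audit_log: list = field(default_factory=list)
    deployments: list = field(default_factory=list)


def apply_effects(violations: dict) -> Outcome:
    """violations maps an effect name to the policies that matched the request."""
    outcome = Outcome(allowed=True)
    # Deny: the request is rejected before the resource is created.
    if violations.get("Deny"):
        outcome.allowed = False
        return outcome
    # Audit: the request goes through, but the non-compliance is recorded.
    outcome.audit_log.extend(violations.get("Audit", []))
    # DeployIfNotExists: required companion resources are deployed afterwards.
    outcome.deployments.extend(violations.get("DeployIfNotExists", []))
    return outcome


if __name__ == "__main__":
    # Hypothetical policy names, used only for illustration.
    print(apply_effects({"Deny": ["Deny-PublicIP"]}))
    print(apply_effects({"Audit": ["Audit-Encryption"],
                         "DeployIfNotExists": ["Deploy-Backup"]}))
```

The design choice worth noticing is that only "Deny" stops the app team; the other two effects let the self-service request succeed while the platform records or repairs the gap.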

Azure Blueprints

Azure has already defined a number of "blueprints" for the guardrails/policies needed to comply with international standards like FedRAMP, NIST, ISO 27001 etc.

In Enterprise-scale we will not use Azure Blueprints directly because, as a guiding principle, we want to use Azure Resource Manager (ARM) as our "Management & Control Plane".

However, you can absolutely benefit from using the Azure Blueprints as inspiration for the Azure policies that you will need to implement and assign in "code".

Part 1: From Portal to PlatformOps & Compliance-as-Code

In this section, I will walk you through a typical Azure maturity journey and illustrate how you can transform your Azure foundation from using the Azure Portal to building "Compliance-as-Code" in an Open Source community.

Step 1: The Azure Portal

Almost everyone will start their Azure journey by using the Azure Portal, typically by building a specific LZ for a specific workload and setting up AAD, RBAC, network etc. from scratch in their subscription.

This may work for your first workload, but experience shows it does not scale if later you want to add additional workloads. It is very different to build LZ's for a simple sandbox environment, a "Lift & Shift" migration or an SAP implementation.

Step 2: Infrastructure-as-Code (IaC)

The next natural step is to reuse ARM templates, generated by the Portal.

This obviously speeds up the process of generating a 'copy' of existing Azure resources, but it is still a very manual process with lots of potential human errors.

As an example, you can execute your modified ARM template from a VS Code PowerShell terminal once you have connected to your Azure tenant using the "Connect-AzAccount" command.
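The "modify an exported template" step can be sketched like this. It is a minimal illustration, assuming a template with a single "location" parameter: the top-level keys follow the ARM template structure, while the parameter, the output file name and the `with_parameter_default` helper are hypothetical. The deployment itself would still happen from the PowerShell terminal, for instance with `New-AzResourceGroupDeployment` after `Connect-AzAccount`.

```python
import copy
import json

# Stand-in for a template exported from the Portal. Only the top-level keys
# reflect the ARM template structure; the parameter is a hypothetical example.
EXPORTED_TEMPLATE = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "location": {"type": "string", "defaultValue": "westus2"},
    },
    "resources": [],  # the exported resource definitions would be listed here
}


def with_parameter_default(template: dict, name: str, value: str) -> dict:
    """Return a copy of the template with one parameter default replaced."""
    modified = copy.deepcopy(template)
    modified["parameters"][name]["defaultValue"] = value
    return modified


if __name__ == "__main__":
    modified = with_parameter_default(EXPORTED_TEMPLATE, "location", "northeurope")
    # Hypothetical file name; this is the file you would deploy afterwards.
    with open("template.modified.json", "w", encoding="utf-8") as handle:
        json.dump(modified, handle, indent=2)
```

Even in this small form, the weakness of the approach is visible: every copy is a hand-edited file, and nothing checks or records what was changed.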

Step 3: Introduce Platform development teams - and GitHub

This is a major step, to go from reusing templates to introducing a Platform development team.

In the Enterprise-scale reference implementation, we will use GitHub to build, test, and deploy the Platform code.

We will use GitHub Actions to deploy to Azure.

As many will still continue to use the Portal for some deployments, we can also use GitHub Actions to synchronize current Azure state to GitHub.

This allows the Platform developers - the PlatformOps team - to be in full control, including being able to manage and code review the platform (including version control), to commit or rollback changes, and with the option to make branches to test new functionality.

Step 4: Compliance-as-Code

With this setup, you are now ready to bring in the compliance experts to define the requirements you will need to meet to be compliant.

You will obviously need people who can turn the requirements into code; e.g. xOps people who know "Infrastructure-as-Code" (IaC), but trust me: It is the easy part!

IaC is "easy", defining compliance is hard

The real challenge is to define what you need to be compliant, including:

Which people need to be involved (Compliance, Legal, Security, Ops …)?
Which standards do you want to lean on (ISO, NIST, FedRAMP, Government …)?
What specific compliance requirements does your organization have?

Step 5: Adding the Application LZ's

When the Platform "plumbing" is ready, it is time for the application LZ work to start.

The application teams will naturally be able to use the same development environment, including GitHub.

As you see in the ESLZ Roadmap, it is planned to build LZ's for specific application archetypes, initially AKS, Windows Virtual Desktop (WVD), SAP, HPC and Analytics.

Step 6: "Compliance-as-Code" as Open Source

GitHub is today hosting the largest Open Source community in the world and it is a natural step to build on that to extend Enterprise-scale into communities, both the platform itself and over time even application archetype LZ's.

Microsoft already invites individuals and organizations to contribute directly to the development of Enterprise-scale.

However, we also know that many organizations can benefit from collaborating directly. It could be Public Sector government organizations in countries with similar compliance requirements, co-developing a common "Compliance-as-Code" Platform, still allowing the individual organizations to implement their own functionality where needed.

This is exactly what GitHub is built for and what it does every day!

Step 7: SaaS-as-Code

Many organizations are today using - or expecting to use - the innovation in the growing SaaS market without the overhead of local management and governance.

However, I see two areas where traditional SaaS vendors may be challenged in the near future:

Compliance requirements keep getting more and more advanced and demanding, including who has access to which data in which situations and from where. As a Hyperscale Cloud provider, we have these complex dialogs every day and I expect SaaS vendors to have to meet the same expectations, if not today, then soon.
Exclusive ownership of your own data seems like a natural ask, but it is far from simple in a (traditional) SaaS world. In the new data-driven world, all your data must be in your own data lake, including raw/telemetry data. "Data has gravity" and for latency reasons your new innovative apps and AI models must be very close to the data you want to use for testing, training and production.

I foresee that innovative SaaS vendors will be able to build on the mature Enterprise-scale platform and on the advanced community features in GitHub and give their customers a "SaaS-like" experience with both an evergreen platform and application within the customer's own environment.

And next: It is all about the "pipeline"!

Part 2a: The Enterprise-scale pipeline - overview

Here you will get a high-level overview of how you can use Enterprise-scale to develop and maintain your Azure platform in a PlatformOps team, utilizing GitHub and GitHub Actions.

As you can see below, you will be able to ...

Import the current state from your Azure environment, including management groups and assigned policies
Deploy changes from the PlatformOps team to Azure.

Step 1: From Azure - import current Azure state

Initiate the workflow using GitHub CLI.
This will spin up a Pull Request and, using GitHub Actions, export the current state to a new GitHub branch, "System".
Merge "System" into your "main" branch in your "Org" GitHub repo.
Synchronize the changes back to your local "main" branch, using "git pull".

Step 2: To Azure - deploy changes to Azure

Create a new branch in your local environment, make your changes and commit the branch.
Push the changes to GitHub Org level.
Create a Pull Request and kick off a GitHub Actions workflow to deploy your changes to Azure.
If the deployment is successful, merge the new Azure state into your "main" branch.
Synchronize the changes back to your local "main" branch, using "git pull".
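The local side of the "To Azure" flow can be written down as a plain list of git commands. The sketch below only builds and prints the commands (a dry run); the `deploy_change_commands` helper, the branch name and the commit message are hypothetical, and the remote name "origin" is an assumption about your local clone. The Pull Request, the GitHub Actions run and the merge take place on GitHub and are therefore only noted as a comment.

```python
import shlex


def deploy_change_commands(branch: str, message: str) -> list:
    """Build the local git commands for 'Step 2: To Azure' as argv lists."""
    return [
        ["git", "checkout", "-b", branch],   # create the new local branch
        ["git", "add", "--all"],             # stage the edited files
        ["git", "commit", "-m", message],    # commit the branch
        ["git", "push", "--set-upstream", "origin", branch],  # push to the Org repo
        # The Pull Request, the workflow run and the merge happen on GitHub.
        ["git", "checkout", "main"],         # back to the local "main" branch
        ["git", "pull"],                     # synchronize the merged state
    ]


if __name__ == "__main__":
    for argv in deploy_change_commands("my-change", "Describe the change"):
        print(shlex.join(argv))
```

Because each entry is an argv list, the same data could be handed to `subprocess.run` if you wanted to execute the steps instead of printing them.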

Part 2b: The Enterprise-scale pipeline - under the hood

In this section, you will see how you can use the Enterprise-scale reference implementation as a starting point to get to a production ready Enterprise-scale Azure platform.

Note: Please refer to the "Getting Started" section for more details on how to set this up in your own environment.

I will go through these 4 steps:

Deploy the Enterprise-scale reference implementation to an Azure environment, here based on two MSDN subscriptions.
Synchronize the state of this Azure environment to the Org GitHub repo (2a) and then "pull" to the local platform team, with local developer GitHub repos (2b).
Make changes to the Azure state in the Azure portal (3a) and then repeat Step 2; i.e. synchronize Azure state to the Org GitHub repo (3b) and "pull" to the local environment (3c).
Make changes to the Azure state in "code" in the local environment, commit and push these changes to the Org GitHub (4a). Now create a new PR for the change to kick off a workflow to deploy to Azure (4b) and finally merge changes back to Org GitHub (4c) and "pull" to the local environment (4d).

Note: This process will utilize an Enterprise-scale provided script ("AzOps") to process the integration between GitHub and Azure.

Step 1: The Enterprise-scale reference implementation

The best way of learning how to benefit from Enterprise-scale is to see it in action. As mentioned earlier, Enterprise-scale contains a "Contoso" reference implementation that can be deployed directly to your Azure environment in a One-Click experience.

The reference implementation will deploy ...

The management group hierarchy - see the picture below.
Approx. 100 "custom" policies

In the Azure Portal, you can now see both management groups and the new custom policies.

Step 2: Extract current Azure state to GitHub

We will start this process by using GitHub CLI from a command line to kick off a GitHub Actions workflow (2a), utilizing the AzOps script.

You can follow the progress in the "Actions" part of your GitHub website (2b).

The workflow will eventually create a Pull Request (PR) and a new "System" branch that you can merge into your "main" branch (2c).

Last, you can use "git pull" to synchronize the changes to your local GitHub repo and you now have a synchronized representation of the state of your Azure management groups and policies in your local environment (2d).

Step 3: Change Azure State using the Azure Portal

In this step, we will start by using the Azure portal to make a number of changes to the Azure state:

Assign policies to management groups

Two policies ("DenyPublic-IP" and "Allowed locations", set to "northeurope" and "westeurope") to management group "AB-management"
One policy ("Allowed locations", set to "northeurope") to management group "AB-sandboxes"

Assign subscriptions (LZs) to management groups

The subscription "Platform LZ" to management group "AB-management"
The subscription "Sandbox LZ" to management group "AB-sandboxes"

The management group hierarchy now looks like this …

The state should be like this in the Azure portal.

If we synchronize the Azure state again by repeating Step 2 above, we can now see the changes in our local code …

Compliance-as-Code in action

As expected, you will now be able to create a Public IP in "Sandbox LZ", but not in "Platform LZ".

However, if you try to deploy the public IP in "westeurope" in the "Sandbox LZ", it will be denied, as the only allowed location in management group "AB-sandboxes" is "northeurope".
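The behaviour described above can be reproduced with a small toy model of the Step 3 state. This is a sketch under stated assumptions, not the Azure Policy engine: the data structures and the `check` function are hypothetical, and inheritance from parent management groups is deliberately left out. The management group names, subscription names and regions are the ones used in this example.

```python
# Toy model of the state set up in Step 3; not the Azure Policy engine.
# Each management group lists its allowed regions and whether Public IPs
# are denied, and each subscription (landing zone) belongs to one group.
ASSIGNMENTS = {
    "AB-management": {
        "allowed_locations": {"northeurope", "westeurope"},
        "deny_public_ip": True,
    },
    "AB-sandboxes": {
        "allowed_locations": {"northeurope"},
        "deny_public_ip": False,
    },
}
SUBSCRIPTIONS = {"Platform LZ": "AB-management", "Sandbox LZ": "AB-sandboxes"}

PUBLIC_IP = "Microsoft.Network/publicIPAddresses"


def check(subscription: str, resource_type: str, location: str) -> str:
    """Return 'allowed' or a short denial reason for a deployment request."""
    policies = ASSIGNMENTS[SUBSCRIPTIONS[subscription]]
    if location not in policies["allowed_locations"]:
        return "denied: location not allowed"
    if policies["deny_public_ip"] and resource_type == PUBLIC_IP:
        return "denied: public IP not allowed"
    return "allowed"


if __name__ == "__main__":
    print(check("Sandbox LZ", PUBLIC_IP, "northeurope"))   # allowed
    print(check("Platform LZ", PUBLIC_IP, "northeurope"))  # denied: public IP not allowed
    print(check("Sandbox LZ", PUBLIC_IP, "westeurope"))    # denied: location not allowed
```

The three calls correspond to the three cases in the text: a Public IP is accepted in "Sandbox LZ", rejected in "Platform LZ", and rejected in "Sandbox LZ" when the region is "westeurope".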

Step 4: Change my Azure State from code

The last step is to demonstrate how you deploy your own code into Azure from your local VS Code with a local GitHub repo. You can read more about how artifacts are deployed in Enterprise-scale here.

In this example, we will assign a new policy "Deploy-Log-Analytics" to the "AB-management" management group.

Note that the "Deploy-Log-Analytics" policy has a "DeployIfNotExists" effect; i.e. it will deploy a Log Analytics workspace to any subscription in the "AB-management" management group if it does not already exist. In order to do this, a "DeployIfNotExists" policy will need a "managed identity" and, as you will see, this will be created automatically through this process. The two sample policies we have used so far, DenyPublic-IP and Allowed locations, both have a "Deny" effect and do not need a managed identity!

We will start the process by creating a new GitHub branch - "deployLA" (4a), adding the sample code to the "policyAssignments" section in the "Management-managementGroups_AB-management.parameters.json" file (4b), and committing and pushing the new branch to the "origin" GitHub (4c).
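For readers who want a feel for what such a policy assignment contains, here is a hedged sketch written as a Python dictionary and printed as JSON. It follows the general shape of an ARM policy assignment (name, identity, location, and properties with displayName, scope and policyDefinitionId), but it is an assumption that the AzOps parameters file uses exactly this layout. The region, the location of the policy definition and the omission of policy parameters are all simplifications made for illustration.

```python
import json

MANAGEMENT_GROUP = "AB-management"
# Assumption: the custom definition is stored at this management group;
# in a real deployment it may live higher up in the hierarchy.
SCOPE = f"/providers/Microsoft.Management/managementGroups/{MANAGEMENT_GROUP}"
DEFINITION_ID = (
    f"{SCOPE}/providers/Microsoft.Authorization/policyDefinitions/Deploy-Log-Analytics"
)

policy_assignment = {
    "name": "Deploy-Log-Analytics",
    "type": "Microsoft.Authorization/policyAssignments",
    # A DeployIfNotExists policy needs an identity to deploy on its behalf.
    "identity": {"type": "SystemAssigned"},
    "location": "northeurope",  # assumed region for the managed identity
    "properties": {
        "displayName": "Deploy-Log-Analytics",
        "scope": SCOPE,
        "policyDefinitionId": DEFINITION_ID,
        # Any parameters the policy requires are omitted in this sketch.
    },
}

if __name__ == "__main__":
    print(json.dumps(policy_assignment, indent=2))
```

The "identity" block is the part to watch: it is what distinguishes this assignment from the two "Deny" assignments made earlier in the Portal.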

We can now create a new PR from the new branch on the GitHub "origin" website (4d). This will kick off a "GitHub Actions" workflow that will deploy our changes to Azure, again using the "AzOps" script (4e). When the AzOps script is finished, we can merge the potential changes into our "main" branch (4f).

In our local environment (here VS Code), we can now "git pull" the changes to our local GitHub repo (4g). As you can see here (4h), this process has actually changed part of the code we provided, here adding information to the "Identity" section; i.e. the managed identity.

Last, but not least, we can also see the new policy assignment (4i) and the managed identity (4j) in the Azure portal.

A little history ...

Three years ago, in May 2017, I published an article with high-level guidance on how to build a production-ready Azure platform - Azure Onboarding AKA 'The House'.

I hope you will agree that we have come far since then :)

Want to hear more - or have feedback/suggestions?

As always, I am very interested in your feedback. Please feel free to add a comment to this blog or reach out to me ([email protected]).

Other Cloud related blogs

Service Manager - Identity & Cloud services at Clas Ohlson

4 years ago

Anders Bonde Hi Anders. Are there any plans to incorporate or make use of Project Bicep (https://github.com/Azure/bicep) in the Cloud Adoption Framework? Have a nice day!

Philippe Devantier Becker

4 years ago

Great writeup Anders, thanks for sharing!

Hans Bjerner

Service Manager - Identity & Cloud services at Clas Ohlson

4 years ago

This is good stuff regarding Azure governance!

Ilja Summala

4 years ago

Good stuff, though I would say there are not many compliance schemes that require posture controls (e.g. ensure that x is not done) alone, e.g. you still need the operative mitigations and the ability to evidence both posture and operative controls.

Kamil Wiecek

CCoE Azure Platform Engineer

4 years ago

I am also enthusiastic about the democratization of subscriptions. Managing budgets, permissions, and policy assignments is simpler and scales better when the resource container in a given application is a subscription rather than a resource group. What's more, you can automatically create 2000 subscriptions from one EA account. It would be even better if you didn't have to wait 48 hours to set your budget. I am planning to share the workaround with the community that we used in one of the projects. The concept of 'Step 2: Extract the current Azure state to GitHub' is interesting. I will check it for sure.


