Building Enterprise Cloud Strategies for Digital Transformations

Overview

Digital transformation is a high priority for many enterprise organizations seeking to optimize business processes, improve customer experiences, and enrich employee engagement. The cloud plays a key role in any digital transformation initiative because it provides opportunities for modernizing the organization’s digital infrastructure, increasing business agility and speeding up innovation. At the same time, as enterprises embark on and continue their cloud journey of migrating, optimizing, modernizing or building workloads, they can face significant challenges in assuring security, controlling costs, achieving regulatory compliance, governing organizational policies, and acquiring and retaining talent. It is therefore vitally important for an organization to have a well-defined enterprise cloud strategy that provides the guidance needed to maximize cloud benefits while minimizing or avoiding the pitfalls, regardless of whether the organization has just started migrating workloads to the cloud or already has many workloads running there.

The enterprise cloud strategy of an organization should define the vision for the roles of cloud and clear paths to realize the cloud vision. Specifically, a well-defined enterprise cloud strategy should include the following three facets:

  • Strategic cloud drivers
  • Organization level cloud strategies
  • Workload level cloud strategies

Each of the above facets should be articulated by considering the following five perspectives:

  • Business perspective
  • Technical perspective
  • Security perspective
  • Operational perspective
  • Financial perspective

Strategic Cloud Drivers

The strategic cloud drivers define the shared understanding of what drives the adoption of cloud within the organization. While each organization’s cloud drivers may differ at various points in time, the following examples may serve as references for defining your own.

  • Business drivers: Increase business agility, reduce time to market
  • Technical drivers: Leverage cloud elasticity to better support development and testing, keep pace with latest computing infrastructure and technologies, leverage latest cloud AI/ML advancements for innovation
  • Security drivers: Leverage cloud security infrastructure to achieve better governance, meet compliance requirements and protect from attacks
  • Operational drivers: Use infrastructure-as-code and managed services to reduce IT operations complexity, help attract and retain technical talents
  • Financial drivers: Reduce IT capital expense budget, improve cost efficiencies, reduce infrastructure waste

Organization Level Cloud Strategies

The organization level cloud strategies include the following aspects:

  • Overall cloud strategy
  • Public cloud strategy
  • Cloud positioning strategy
  • New workload strategy
  • Cloud governance strategy

Overall Cloud Strategy

The overall cloud strategy defines which of the following strategies will be adopted:

  • Public cloud only
  • Private cloud only
  • Hybrid

In general, a private cloud is of limited utility to most enterprise organizations since it does not provide the cost efficiencies that a public cloud can provide due to significant capital expense and low economies of scale. For this reason, a private cloud only strategy is generally not recommended unless there is a compelling reason to do so.

For most enterprise organizations with legacy IT footprints, hybrid is probably the best strategy due to business, technical, security, operational and/or financial constraints, such as existing partnerships (business), legacy mainframe applications (technical), special compliance requirements (security), insufficient cloud skills (operational), or existing multi-year data center leases (financial).

If your organization does not have these kinds of constraints, the public cloud only strategy may be a good choice.

Public Cloud Strategy

The public cloud strategy defines how you intend to leverage the public clouds for running your workloads. Given the availability of multiple public clouds, there can be many options for such a strategy. The following are some strategy options, arranged from the most restrictive to the most flexible:

  • Single cloud - A single public cloud for all workloads.
  • Preferred cloud - One public cloud is designated as a preferred cloud for all workloads, but other clouds may also be considered for special use cases.
  • Multi-cloud split - Workloads are split among two or more public clouds, each designated for certain types of workloads.
  • Multi-cloud best fit - A target cloud is selected for each workload based on best fit from a select pool of public clouds.
  • Multi-cloud agnostic - Any workload should be freely movable among a select pool of public clouds.

Each option has its advantages and disadvantages. For example, from business and financial perspectives, multi-cloud best fit is a good option since it offers flexibility for business expansion and has the potential to optimize cost. But from technical, security and operational perspectives, single cloud and preferred cloud may be the preferable strategies since they offer simplicity in cloud architecture and security control and are less demanding on cloud skills. With the multi-cloud agnostic strategy, the organization will need a way to enforce rules and policies to ensure that workloads can be easily moved between different clouds. As such, this strategy probably only makes sense for large organizations with highly skilled engineering teams. You will need to weigh all the factors to determine the best public cloud strategy for your organization, and assess each of the cloud candidates by asking the business, technical, security, operational and financial questions that are relevant to your use cases.
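
One way to make this weighing explicit is a simple weighted decision matrix. The sketch below is illustrative only: the weights and the 1-5 scores are made-up placeholders rather than recommendations, and `rank_strategies` is a hypothetical helper name.

```python
# Hypothetical weights for the five perspectives (chosen to sum to 1.0).
WEIGHTS = {
    "business": 0.25,
    "technical": 0.20,
    "security": 0.25,
    "operational": 0.15,
    "financial": 0.15,
}

# Placeholder 1-5 scores per strategy option; replace with your own assessment.
SCORES = {
    "single cloud": {
        "business": 2, "technical": 5, "security": 5, "operational": 5, "financial": 3,
    },
    "preferred cloud": {
        "business": 3, "technical": 4, "security": 4, "operational": 4, "financial": 3,
    },
    "multi-cloud best fit": {
        "business": 5, "technical": 3, "security": 3, "operational": 2, "financial": 5,
    },
}


def rank_strategies(scores, weights):
    """Return (option, weighted score) pairs, highest score first."""
    totals = {
        option: sum(weights[p] * per_perspective[p] for p in weights)
        for option, per_perspective in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for option, total in rank_strategies(SCORES, WEIGHTS):
    print(f"{option}: {total:.2f}")
```

Changing the weights to reflect your organization's priorities can change the ranking, which is the point of the exercise: the matrix documents why an option was chosen, not just which one.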

Cloud Positioning Strategy

The cloud positioning strategy defines what you will use each cloud for. There are many different ways to use each selected cloud. The following are some possible options:

  • Backup and/or DR only
  • Non-production workloads only
  • Data analytics and AI/ML workloads
  • Internet facing workloads
  • Internal facing workloads
  • Any workload

For example, many Google Cloud customers choose to host all their workloads with GCP. There are also customers who choose GCP as their cloud of choice for data analytics and AI/ML workloads. For whatever choice you make, the key reasons should be documented from business, technical, security, operational and/or financial perspectives to ensure that the choices are properly thought through and justified.

New Workload Strategy

The new workload strategy defines the preferred target destination for building new workloads. The possible options are as follows:

  • Public cloud first
  • Private cloud first
  • On-prem first
  • Case by case

If your overall cloud strategy involves public cloud, then a public cloud first strategy is generally a best practice as it aligns with the overall cloud strategy and avoids future migrations for the new workloads. Public cloud may also provide services and/or features that are not available in on-prem or private clouds. For example, many Google Cloud customers choose to build their new workloads with BigQuery, GKE, Cloud Run, Cloud Functions, Vertex AI, and/or other managed services that offer scalability, no/low-ops, and ease of use.

Cloud Governance Strategy

The cloud governance strategy defines how you plan to use rules and policies to control the provisioning of cloud services and resources for various projects. In general, there are four strategy options you can choose from:

  • Centralized governance - All services and resources are controlled by a centralized IT department.
  • Decentralized governance - Cloud services and resources can be provisioned by anyone in the organization with no centralized control.
  • Hybrid governance - A combination of centralized and decentralized governance where rules and policies are centrally controlled while services and resources can be provisioned by anyone as long as the rules and policies are not violated.
  • Hierarchical governance - A governance structure where rules and policies can be controlled at multiple levels of an organizational hierarchy.

From security and financial perspectives, centralized governance may be preferred as it is easier to control security and cost. However, centralized governance may not be a good choice from business, technical and operational perspectives since it is not scalable for any enterprise organization with a large number of teams and projects.

Decentralized governance offers the most flexibility from business, technical and operational perspectives, but it can easily get out of control from security and financial perspectives.

Hybrid governance is a better strategy as it maintains centralized control on rules and policies but still offers scalability and flexibility in supporting multiple projects.

The best practice is to use a hierarchical governance strategy, which optimally balances simplicity, scalability, flexibility and proper controls. Google Cloud’s Resource Hierarchy and Shared VPC constructs together provide great support for such a hierarchical governance strategy. With the resource hierarchy, all cloud resources are always provisioned at the project level, but organizational policies and IAM policies can be applied at any level of the hierarchy to control what can or cannot be done for any project, groups of projects, or all projects within the organization by any user. Shared VPC provides an additional mechanism for effectively sharing and managing virtual cloud networks and firewalls above the individual projects.
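
The idea of policies set at one level and inherited by everything below it can be illustrated with a toy model. This is not the Google Cloud API and it simplifies real policy evaluation: here the nearest level that sets a constraint simply wins. The `Node` class, the `effective_policy` function, the constraint name `allow_external_ip`, and all resource names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A node in a toy resource hierarchy: organization, folder, or project."""
    name: str
    parent: Optional["Node"] = None
    policies: Dict[str, bool] = field(default_factory=dict)


def effective_policy(node: Node, constraint: str, default: bool = True) -> bool:
    """Walk up from the node; the nearest level that sets the constraint wins."""
    current: Optional[Node] = node
    while current is not None:
        if constraint in current.policies:
            return current.policies[constraint]
        current = current.parent
    return default


# Organization-wide rule, with one folder-level exception for sandboxes.
org = Node("example-org", policies={"allow_external_ip": False})
prod = Node("prod-folder", parent=org)
sandbox = Node("sandbox-folder", parent=org, policies={"allow_external_ip": True})
billing_app = Node("billing-app-project", parent=prod)
experiments = Node("experiments-project", parent=sandbox)

print(effective_policy(billing_app, "allow_external_ip"))
print(effective_policy(experiments, "allow_external_ip"))
```

The production project inherits the organization-wide restriction without any project-level configuration, while the sandbox folder overrides it for everything beneath it.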

Workload Level Cloud Strategies

Workload level cloud strategies define what you intend to do next for each workload from its current state as part of its cloud journey. This could be migrating from on-prem to cloud, migrating from one cloud to another cloud, optimizing operations and/or cost within the same cloud, modernizing application architecture within the same cloud, or building something brand new in a target cloud.

Defining a complete workload cloud strategy includes the following six tasks or steps:

  • What - Define the current and desired states of the workload with key attributes
  • Who - Identify the stakeholders of the workload and talent/training needs
  • Why - Align the key drivers for moving the workload from its current state to the desired state
  • Where - Map the desired state of the workload to a target cloud, region(s) and cloud services
  • How - Define the approach for moving the workload from its current state to the desired state
  • When - Define the roadmap for moving the workload from its current state to the desired state

Each of these tasks should be carried out by considering the business, technical, security, operational and financial perspectives.
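
The six steps can be tracked in a simple record per workload. The following is a minimal sketch, assuming a hypothetical `WorkloadStrategy` structure and a made-up workload named `hr-portal`; the field contents are free-form placeholders.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class WorkloadStrategy:
    """One record per workload, with one field per step of the strategy."""
    name: str
    what: Dict[str, str] = field(default_factory=dict)   # current/desired attributes
    who: List[str] = field(default_factory=list)         # stakeholders, talent needs
    why: List[str] = field(default_factory=list)         # drivers
    where: Dict[str, str] = field(default_factory=dict)  # target cloud, regions, services
    how: str = ""                                        # approach
    when: str = ""                                       # roadmap


def missing_steps(strategy: WorkloadStrategy) -> List[str]:
    """List the steps that have not been filled in yet, in definition order."""
    return [
        f.name
        for f in fields(strategy)
        if f.name != "name" and not getattr(strategy, f.name)
    ]


draft = WorkloadStrategy(
    name="hr-portal",
    what={"compute": "current: VMware, desired: cloud native compute"},
    why=["reduce infrastructure and operational costs"],
)
print(missing_steps(draft))
```

A check like this makes it easy to see which workloads have a complete strategy and which still need input from stakeholders.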

What Is a Workload?

For the purpose of this framework, a workload is defined as a collection of related IT resources (infrastructure/software/data) that together deliver a well defined business or functional value.

A workload may exist in many forms. Following are a few examples of workloads:

  • A single VM running one or more applications
  • A business application composed of multiple tiers (e.g., web tier, app tier, DB tier)
  • A proxy layer handling external traffic for one or more business applications
  • An API management layer for one or more business applications
  • A data analytics application
  • A centralized data warehouse serving multiple analytics applications
  • A centralized database system hosting databases for multiple applications
  • A centralized backup system
  • A standby environment for disaster recovery
  • A non-production (e.g., dev, test, training, staging, etc.) environment
  • A virtual desktop infrastructure
  • A VMware cluster
  • A Kubernetes cluster

Details on each of the six tasks/steps are outlined below.

What – Current and Desired States

For each workload, you start by identifying the workload with a name, defining its current state with key business, technical, security, operational and financial attributes, and the desired states that help eliminate or reduce current pain points. A workload can be an existing workload on-prem, already running in a cloud, or a new workload yet to be built in the cloud.

Business attributes may include things like workload state (existing or new), user type (Internet facing, partner facing or internal), location (e.g., Equinix DC in Manhattan, GCP US East 1 Region), user base (e.g., consumers, business customers, employees, HR employees, internal applications), user distribution (e.g., US, North America, global, etc.), business criticality (e.g., low, medium, high), etc.

Technical attributes may include type of environment (e.g., dev, test, staging, production, DR), workload architecture (e.g., three-tier: web/app/DB), compute platform (e.g., current: VMware, desired: cloud native compute), DB platform (e.g., current: Oracle on bare metal, desired: managed PostgreSQL), OS platform (e.g., current: Linux, desired: no change), storage platform (e.g., current: SAN, desired: cloud block storage), app scalability (e.g., current: horizontally scalable, desired: auto-scalable), DB HA (e.g., current: 2-node failover, desired: 2-zone failover), etc.

Security attributes may include IdP (e.g., current: Active Directory, desired: no change), security zone placement (e.g., current: web tier in DMZ, app and DB tiers in secure zones, desired: cloud equivalent), compliance (e.g., current: no compliance requirements, desired: PCI DSS and GDPR compliant), etc.

Operational attributes may include operational model (e.g., current: IT managed, desired: cloud provider managed), backup and recovery (e.g., current: appliance, desired: cloud object storage), RPO (e.g., current: 15 min, desired: 0 for system failure, 5 min for regional disaster), RTO (e.g., current: undefined, desired: 15 min for system failure, 1 hour for regional disaster), logging/monitoring (e.g., current: centralized with Splunk, desired: no change), CI/CD (e.g., current: GitHub, Chef and Jenkins, desired: cloud native tools), etc.

Financial attributes may include infrastructure costs, software licenses, support contracts, personnel costs, etc. One desired state financial attribute could be to dynamically manage non-production environments to minimize infrastructure costs.
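
To see why dynamically managing non-production environments can matter, consider a rough calculation. This sketch assumes purely hourly billing, a hypothetical rate of 2.00 per hour, and an environment that only needs to run 12 hours a day on 5 days a week; it ignores costs that persist while an environment is stopped, such as storage.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def weekly_cost(hourly_rate: float, hours_on: float) -> float:
    """Cost of one environment for a week under simple hourly billing."""
    return hourly_rate * hours_on


rate = 2.00                                     # hypothetical hourly rate
always_on = weekly_cost(rate, HOURS_PER_WEEK)   # runs around the clock
scheduled = weekly_cost(rate, 12 * 5)           # runs 12 h/day, 5 days/week
savings_fraction = 1 - scheduled / always_on

print(f"always on: {always_on:.2f}")
print(f"scheduled: {scheduled:.2f}")
print(f"savings:   {savings_fraction:.0%}")
```

Under these assumptions the scheduled environment runs 60 of 168 hours, so roughly 64% of the always-on compute cost is avoided. Actual savings depend on the provider's pricing model and on which resources keep accruing charges when stopped.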

The outcome of this exercise is a workload with well defined business, technical, security, operational and financial attributes for its current and desired states.
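
Once both states are written down as attributes, the gaps between them fall out mechanically. The sketch below reuses the technical examples given earlier in this section; the dictionary keys and the `state_gaps` helper are hypothetical, and "no change" is represented by giving the desired state the same value as the current one.

```python
current_state = {
    "compute": "VMware",
    "database": "Oracle on bare metal",
    "os": "Linux",
    "storage": "SAN",
}

desired_state = {
    "compute": "cloud native compute",
    "database": "managed PostgreSQL",
    "os": "Linux",  # no change
    "storage": "cloud block storage",
}


def state_gaps(current, desired):
    """Return {attribute: (current value, desired value)} for attributes that change."""
    return {
        attribute: (current.get(attribute), target)
        for attribute, target in desired.items()
        if current.get(attribute) != target
    }


for attribute, (before, after) in state_gaps(current_state, desired_state).items():
    print(f"{attribute}: {before} -> {after}")
```

The resulting list of changed attributes is a useful input to the later steps, since each gap needs a target service and an approach.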

Who – Stakeholders and Talent Needs

Identifying the stakeholders should be done in parallel with the previous task as soon as the workload is identified. All key stakeholders for a workload should participate in defining the cloud strategies for the workload to ensure proper alignment.

Business stakeholders should include a workload business owner who represents the user community of the workload and an executive sponsor who can provide guidance and help to resolve management and cross-organizational issues.

Technical stakeholders may include a software owner, a data owner, and an infrastructure owner for the workload. Depending on the organization’s team structure, additional technical stakeholders may also include an enterprise architect, a network architect, a cloud architect and/or the CTO.

Security stakeholders may include a workload security owner, a regulatory compliance owner, an internal compliance owner, a security architect and/or CISO.

Operational stakeholders may include an IT operations owner and a business operations owner.

Financial stakeholders may include someone with budgeting and cost control responsibilities for the workload.

Each stakeholder should identify if and what additional talents or training are needed for their teams to successfully support moving the workload from its current state to the desired state.

The outcome of this exercise should be a list of clearly defined business, technical, security, operational and financial stakeholders and talent needs.

Why – Drivers

This is where you document the specific business, technical, security, operational and/or financial drivers for moving the workload from its current state to the desired state.

For example, if you want to move a customer-facing application to the cloud, one important business driver could be to leverage the public cloud to expand to new markets quickly. If the workload is a VMware cluster hosting non-production environments for development, testing and training, a key operational driver for moving it to cloud native compute could be to allow the engineering teams to spin up non-production environments quickly via self-service without waiting for IT to provision the needed resources. If your goal is to optimize a workload already running in the cloud, the driver could be a financial one of reducing infrastructure and operational costs.

Where - Cloud Mappings

The objective for cloud mappings is to identify the target cloud, region(s) and cloud services for moving the workload from its current state to the desired state.

At the business level, you identify the target cloud, if not already defined by the organization level public cloud strategy, and determine if the cloud’s regions provide the geographic coverage that can effectively support the user base of the workload.

For each cloud candidate, you map the desired technical, security, operational and financial attributes to cloud services in the target region(s) and determine if the desired state can be supported.

If the public cloud strategy dictates a particular public cloud for the workload but all desired key attributes cannot be properly mapped to the selected cloud, alternative clouds should be identified and evaluated to determine the target cloud.

If the public cloud strategy is multi-cloud best fit, you should evaluate and compare the results from the cloud mappings to select a cloud that is best suited for the workload by considering all factors from the five perspectives.

If the public cloud strategy is multi-cloud agnostic, then technical differentiations will be less important than security, operational and financial considerations.

In any case, the outcome should be one target cloud with one or more target regions and a list of cloud services for moving the workload from its current state to the desired state.
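
The mapping exercise can be sketched as a coverage check: for each candidate cloud and region, does the catalog offer everything the desired state needs? The cloud names, region names, and catalog contents below are invented placeholders, not real provider data, and the helper names are hypothetical.

```python
# Capabilities required by the desired state of the workload.
desired_capabilities = {
    "managed PostgreSQL",
    "cloud block storage",
    "auto-scaling compute",
}

# Hypothetical catalogs: cloud -> region -> capabilities offered there.
catalogs = {
    "cloud-a": {
        "us-east": {"managed PostgreSQL", "cloud block storage", "auto-scaling compute"},
        "eu-west": {"cloud block storage", "auto-scaling compute"},
    },
    "cloud-b": {
        "us-east": {"cloud block storage", "auto-scaling compute"},
    },
}


def unmet_capabilities(required, catalogs, cloud, region):
    """Capabilities the desired state needs that this cloud region lacks."""
    return required - catalogs[cloud][region]


def viable_targets(required, catalogs):
    """All (cloud, region) pairs whose catalog covers every required capability."""
    return sorted(
        (cloud, region)
        for cloud, regions in catalogs.items()
        for region, offered in regions.items()
        if required <= offered
    )


print(viable_targets(desired_capabilities, catalogs))
print(unmet_capabilities(desired_capabilities, catalogs, "cloud-b", "us-east"))
```

A check like this only establishes technical feasibility. Choosing among viable targets still requires weighing the security, operational and financial considerations described above.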

How - Approach

In this step, you define how you plan to move the workload from its current state to the desired state with the target cloud, region(s) and services as defined in the last step.

From a business perspective, if this is the first workload of its kind, a good practice is to conduct a proof of concept (PoC) or a pilot before starting the migration or build to ensure that there are no unforeseen issues. The objective of a PoC is to ensure that the key elements of the target cloud architecture work as expected. The objective of a pilot is to build a production-like environment and make it available to a small portion of the user base (e.g., friendly users) or a small portion of traffic (e.g., 10% of total user traffic) to test out key functional and non-functional configurations before full scale deployment.
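
One common way to send a fixed share of users to a pilot environment is deterministic bucketing on a stable identifier. The sketch below is an assumption about how such a split could be implemented, not something prescribed by this framework; `in_pilot` is a hypothetical helper, and note that it splits by user rather than by request, so the share of traffic only approximates the share of users.

```python
import hashlib


def in_pilot(user_id: str, pilot_percent: int = 10) -> bool:
    """Deterministically assign a user to the pilot based on a hash of their ID.

    The same user always lands in the same bucket, so their experience is
    consistent across requests.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < pilot_percent


# Rough check of the split over synthetic user IDs.
users = [f"user-{i}" for i in range(10_000)]
share = sum(in_pilot(u) for u in users) / len(users)
print(f"pilot share: {share:.1%}")
```

Raising `pilot_percent` gradually keeps existing pilot users in the pilot while adding new ones, which supports a staged rollout toward full scale deployment.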

Depending on the situation, there are several technical approaches to choose from for moving the workload from its current state to the desired state.

  1. Lift & shift - Typically used to migrate a workload from on-prem or another cloud to the target cloud.
  2. Rip & replace - Typically used to replace a legacy workload on-prem with a COTS, SaaS, or custom built solution in the target cloud.
  3. Build - Build a new workload in the target cloud.
  4. Enhance - Typically used to add new capabilities to an existing workload or replace legacy components of an existing workload with cloud services or APIs.
  5. Extend - Leverage cloud for scaling out, backup or DR of an existing workload.
  6. Optimize - Typically used to improve operations and/or reduce the cost of a workload already running in the cloud, without changing the application architecture, by leveraging cloud capabilities such as moving a DB from a VM to a managed DB service, enabling auto-scaling, automating infrastructure provisioning, enabling automatic backup, etc.
  7. Modernize - Typically used to improve workload application and/or deployment architectures by leveraging cloud native services and tooling (e.g., containers, orchestration, serverless platforms, CI/CD tools, etc.) to achieve better availability, scalability, performance, maintainability, agility, etc.
  8. Combinations of the above, e.g., Lift & shift first and then optimize/modernize.

From the security perspective, the task is to identify security measures to minimize exposure and risk during the process of moving the workload from its current state to the desired state. The following are some examples:

  • Set up proper IAM roles for development, testing and deployment engineers to have access to cloud resources with just enough privileges for performing their jobs during this process.
  • Leverage zero trust services (e.g., Google Cloud IAP) to enable access to private VMs from the Internet for the engineers to do their job but without exposing the environment to people not involved in the project.
  • Keep clear separation of different environments (e.g., dev, test, prod) to avoid inadvertent misconfigurations.
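
The first and third points can be pictured as a small permission table in which each role is granted only the environment and action pairs it needs. This is a conceptual sketch with made-up role and permission names, not the IAM model or API of any particular cloud.

```python
# Hypothetical roles mapped to the "environment:action" pairs they need.
ROLE_PERMISSIONS = {
    "developer": {"dev:deploy", "dev:read", "test:read"},
    "tester": {"test:deploy", "test:read"},
    "deployer": {"prod:deploy", "prod:read"},
}


def is_allowed(role: str, environment: str, action: str) -> bool:
    """Allow only what is explicitly granted; unknown roles get nothing."""
    return f"{environment}:{action}" in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("developer", "dev", "deploy"))   # granted explicitly
print(is_allowed("developer", "prod", "deploy"))  # not granted, so denied
```

Because access is denied by default, a developer cannot change production by accident, which reinforces the separation between environments.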

From the operational perspective, you should identify the automation tools that can be used to automate migration, build, deployment, etc. Such tools may include but are not limited to

From the financial perspective, migrating to cloud means that you may have new monthly cloud bills to pay and you will need to make sure that your payment system is set up properly to ensure on-time payments.

When - Roadmap

The final step of your workload cloud strategy is to create a roadmap for moving the workload from its current state to the desired state. This is not intended to be a detailed plan but just the planned start and completion time frames. If the strategy for moving the workload from its current state to the desired state includes multiple phases, the roadmap should include planned start and completion times for all phases.

You can start with a desired timeline from the business perspective and validate it for technical feasibility, security and operational readiness, and financial soundness. Adjustments should be made as needed to determine a realistic roadmap.
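
A roadmap at this level of detail is just a list of phases with planned start and completion dates, which makes basic sanity checks easy to automate. The sketch below uses invented dates and a hypothetical `roadmap_issues` helper, and it assumes the phases are meant to run one after another; overlapping phases may be perfectly valid in your plan.

```python
from datetime import date

# (phase name, planned start, planned completion); dates are placeholders.
phases = [
    ("lift & shift", date(2025, 1, 6), date(2025, 3, 28)),
    ("optimize", date(2025, 4, 7), date(2025, 6, 27)),
]


def roadmap_issues(phases):
    """Flag phases that end before they start or begin before the previous one completes."""
    issues = []
    for index, (name, start, end) in enumerate(phases):
        if end < start:
            issues.append(f"{name}: ends before it starts")
        if index > 0 and start < phases[index - 1][2]:
            issues.append(f"{name}: starts before {phases[index - 1][0]} completes")
    return issues


print(roadmap_issues(phases) or "no issues found")
```

Checks like these do not replace the feasibility and readiness validation described above, but they catch simple inconsistencies as the timeline is adjusted.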

Summary

Building a comprehensive enterprise cloud strategy is an important exercise that should not be ignored or trivialized by any organization that is serious about leveraging cloud for its digital transformations. On the other hand, there is no need to include all workloads when you get started on building your enterprise cloud strategy. A good practice is to start with those workloads that are ready for the next moves and gradually add other workloads over time. The key is to get started sooner rather than later. An enterprise cloud strategy is not something static and should be revisited as the organization’s digital transformation journey evolves over time.

Ryan Mulholland

AI Specialist at Google Cloud

2 years ago

Great article, Michael!

Mike Rhodes

Director, Customer Engineering @ Google Cloud

2 years ago

Awesome article and layout Michael!!

great analysis Michael!
