Building a Scalable and Sustainable Strategy for Cloud Adoption — Part Two — Process

Having laid the groundwork for culture and mindset in the previous post, the next focus is process.

But first, let's talk about "Process". Processes can be rigid and prescriptive, outlining step-by-step guidelines to complete tasks, or defining engagement models for external teams. More often than not, processes aren't revisited on a regular basis. However, the cloud is highly dynamic and rapidly evolving, and it requires something more flexible and adaptive.

I recommend that you think in terms of "Practices" rather than "Processes" for the cloud. Practices are more adaptable and can be tailored to suit different contexts or situations. Practices evolve based on experience, knowledge, and insights, and enable the teams to leverage new learnings and information for continual enhancement of their best practices.

With this mindset, changing the triad from "People", "Process", "Technology" to "People", "Practice", "Technology" seems more appropriate for cloud.

If you are convinced, then let's rethink and convert existing processes into "practices", or create new ones. While building practices, we should take in the best of what "processes" offer - consistency, repeatability, and measurability. Our objective should be to build structured practices, and wherever required, mini-processes that can help mitigate unnecessary risks and complexities while providing flexibility and adaptability.

The following list is not exhaustive, but it highlights critical areas to consider during your transition to the cloud. Each of these topics is vast, and entire book series have been dedicated to just one of them. The aim here is to get you started with a few basics.

SDLC

Cloud brings changes in autonomy and decentralization, which, in turn, impact the Software Development Life Cycle (SDLC). If you intend to use specific cloud services like serverless, you'll find that while they help to reduce maintenance overhead, they also introduce new challenges for local development. Furthermore, the cloud offers various deployment methods such as blue-green and canary deployments. As such, your existing SDLC needs to be revisited and revised to accommodate these new elements and effectively pave the way for frictionless cloud development.
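
To make the canary idea concrete, here is a minimal sketch of the decision a canary rollout automates: shift a small share of traffic to the new version, check its health, and either continue or roll back. The `CanaryStep` structure, the traffic percentages, and the 1% error budget are all assumptions made for illustration, not part of any specific cloud provider's tooling.

```python
from dataclasses import dataclass


@dataclass
class CanaryStep:
    """One stage of a canary rollout (hypothetical structure)."""
    traffic_percent: int         # share of traffic sent to the new version
    observed_error_rate: float   # error rate measured on the new version


def evaluate_canary(steps, max_error_rate=0.01):
    """Walk the rollout stages in order and stop at the first unhealthy one.

    Returns ("promote", 100) if every stage stays within the error budget,
    otherwise ("rollback", traffic_percent_of_the_failed_stage).
    """
    for step in steps:
        if step.observed_error_rate > max_error_rate:
            return ("rollback", step.traffic_percent)
    return ("promote", 100)


# Illustrative measurements, invented for this example.
healthy = [CanaryStep(5, 0.002), CanaryStep(25, 0.004), CanaryStep(100, 0.003)]
unhealthy = [CanaryStep(5, 0.002), CanaryStep(25, 0.030)]

print(evaluate_canary(healthy))    # ('promote', 100)
print(evaluate_canary(unhealthy))  # ('rollback', 25)
```

The useful SDLC question is who owns the error budget and the stage sizes. With decentralized teams, those values likely belong in each team's deployment configuration rather than in a central approval step.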

Some guiding questions you should be able to answer on this section:

  1. What aspects of your existing SDLC might be challenged or enhanced by the decentralization and autonomy provided by the cloud?
  2. How does your team currently handle local development and how might this be affected if you were to utilize serverless cloud services?
  3. How familiar is your team with different cloud deployment methods like blue-green and canary deployments, and how could these methods be incorporated into your revised SDLC?
  4. Have you identified specific areas of your SDLC that need revising to fully exploit the benefits of the cloud, and what steps are you taking towards this revision?

Architectural Decision Records (ADRs)

Following AWS's recommendation (see ADR Process), incorporating Architectural Decision Records (ADRs) into your practice can streamline decision-making for cloud projects. ADRs decentralize decision-making, accelerate solution development, and keep projects on track, while ensuring that design decisions still receive thorough guidance and review. They also leave a traceable documentation trail that captures the critical decisions made throughout a project's lifecycle.
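
As a rough illustration of how small an ADR can be, the sketch below models one as a plain record and renders it to text that can live in version control next to the code. The field names (status, context, decision, consequences) follow a commonly used ADR layout and are an assumption here, not a format prescribed by this article; the example record is invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DecisionRecord:
    """A lightweight ADR; field names follow a common template, not a standard."""
    number: int
    title: str
    status: str          # e.g. "proposed", "accepted", "superseded"
    decided_on: date
    context: str         # the forces and constraints behind the decision
    decision: str        # what was chosen
    consequences: str    # trade-offs the team accepts

    def render(self) -> str:
        """Render the record as plain text suitable for version control."""
        return "\n".join([
            f"ADR-{self.number:04d}: {self.title}",
            f"Status: {self.status}",
            f"Date: {self.decided_on.isoformat()}",
            "",
            f"Context: {self.context}",
            f"Decision: {self.decision}",
            f"Consequences: {self.consequences}",
        ])


# Hypothetical example record, invented for illustration.
adr = DecisionRecord(
    number=7,
    title="Use managed queues between order and billing services",
    status="accepted",
    decided_on=date(2024, 1, 15),
    context="Teams deploy independently and need loose coupling.",
    decision="Publish order events to a managed queue.",
    consequences="Adds eventual consistency; removes direct service calls.",
)
print(adr.render())
```

Keeping the record this short is deliberate: the lower the cost of writing one, the more likely teams are to record decisions as they make them.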

Some guiding questions you should be able to answer on this section:

  1. Have you considered incorporating Architectural Decision Records (ADRs) into your project management process? If not, what holds you back?
  2. How might the use of ADRs streamline decision-making for your cloud projects and accelerate solution development?
  3. What mechanisms do you currently have in place to ensure critical design decisions undergo thorough guidance and review? Could these be enhanced by implementing ADRs?

Change Management

As your applications grow and evolve, an adaptable yet clearly defined change management practice is vital. This ensures quality and encourages collaboration and communication amongst teams. In the cloud, change management should not be limited to controlling infrastructure or code deployments in isolation; it should cover a wider spectrum, including communication and a decoupled publisher/subscriber model for announcing changes. The distributed nature of the cloud often means your systems are spread across different regions, and possibly multiple accounts or subscriptions. Because a large change is hard to coordinate and roll back across all of them, your focus should be on small, incremental changes.

Some guiding questions you should be able to answer on this section:

  1. How adaptable is your current change management practice to the evolving needs of your cloud applications?
  2. Given the distributed nature of the cloud, how are you managing changes across different regions, accounts, or subscriptions?
  3. How have you optimized your change management practice to support incremental and small changes?

Testing

Distributed development in the cloud means end-to-end testing will be slow, and it might not even be possible or comprehensive. Adopt a modern approach to testing in the cloud: pivot towards techniques like contract testing, API testing, automated regression, and chaos engineering. These strategies better fit the expansive and dynamic nature of the cloud. Aim to cultivate Site Reliability Engineering (SRE) capabilities within your team as opposed to relying on Change Approval Boards as the only control. This forward-thinking approach will ensure your team is equipped to effectively manage the ongoing evolution of your cloud setup.
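
Contract testing is the least familiar of these techniques for many teams, so here is a minimal sketch of the idea. The consumer states which fields and types it depends on, and the provider's response is checked against that expectation without running the whole system end to end. The `ORDER_CONTRACT` fields and the sample responses are hypothetical, and dedicated contract-testing tools do considerably more than this.

```python
def check_contract(contract, response):
    """Return a list of violations; an empty list means the contract holds.

    `contract` maps each field the consumer relies on to its expected type.
    Extra fields in the response are allowed, so providers can add data
    without breaking consumers.
    """
    violations = []
    for field_name, expected_type in contract.items():
        if field_name not in response:
            violations.append(f"missing field: {field_name}")
        elif not isinstance(response[field_name], expected_type):
            actual = type(response[field_name]).__name__
            violations.append(
                f"wrong type for {field_name}: "
                f"expected {expected_type.__name__}, got {actual}"
            )
    return violations


# Hypothetical contract owned by the consuming team.
ORDER_CONTRACT = {"order_id": str, "total_cents": int, "status": str}

good_response = {"order_id": "A-1", "total_cents": 1299, "status": "PAID", "note": "gift"}
bad_response = {"order_id": "A-1", "total_cents": "12.99"}

print(check_contract(ORDER_CONTRACT, good_response))  # []
print(check_contract(ORDER_CONTRACT, bad_response))
```

Because each check runs against a single service boundary, it can run in that service's own pipeline, which is what makes it a better fit than end-to-end tests for independently deployed teams.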

Some guiding questions you should be able to answer on this section:

  1. What strategies are you currently using to ensure the comprehensive testing of your distributed systems, and how are they adapting to the demands of the cloud environment?
  2. How are you incorporating modern testing techniques like contract testing, API testing, automated regression, and chaos engineering into your testing practices?
  3. How is your team developing Site Reliability Engineering (SRE) capabilities, and how is this influencing your approach to change management in your cloud setup?

Security & Compliance

Security must not be an afterthought; it should be built into your applications and reviewed regularly. The landscape of vulnerabilities and security exposures changes at a rapid pace. Establishing strong security and compliance management to regularly review the security posture of applications and take proactive actions is vital to preempt any incident. Proper patching practice is a no-brainer.

Think more broadly: an effective security practice requires a robust asset management process, which entails a clear understanding of the services used and their deployment regions and accounts. Implement proactive monitoring to identify potential security incidents before they escalate. Additionally, incorporate command and control into your design to ensure robust control of any security incidents.

Some guiding questions you should be able to answer on this section:

  1. How is security integrated into your applications during the development process?
  2. How comprehensive is your patch management process, and how does it contribute to overall security?
  3. How effectively do your current systems identify potential security incidents before they escalate?
  4. How is command and control incorporated into your application designs, and how does it enhance security control?

Incident Management & Response

In the cloud, you can automate virtually any type of incident response, taking advantage of enhanced elasticity and auto-healing capabilities. Start by defining known failure modes and creating playbooks for appropriate actions. Continual retrospective sessions are essential to review manual responses, and to create and adjust your standard operating procedures. This practice not only maintains your system's resilience but also creates a backlog for further automation efforts.
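
One way to picture this loop is a lookup from known failure modes to playbook actions, with anything unknown falling back to a manual response and landing on a backlog for the next retrospective. The sketch below assumes hypothetical failure modes, actions, and a `respond` helper; a real setup would trigger cloud automation rather than return strings.

```python
# Hypothetical failure modes and actions, invented for illustration.
PLAYBOOKS = {
    "instance_unhealthy": ["remove from load balancer", "replace instance"],
    "queue_backlog": ["scale out consumers"],
}


def respond(failure_mode, automation_backlog):
    """Return the actions for a failure mode.

    Unknown modes fall back to a manual response and are recorded once,
    so a retrospective can decide whether to write a playbook for them.
    """
    actions = PLAYBOOKS.get(failure_mode)
    if actions is None:
        if failure_mode not in automation_backlog:
            automation_backlog.append(failure_mode)
        return ["page on-call engineer"]
    return actions


backlog = []
print(respond("queue_backlog", backlog))        # ['scale out consumers']
print(respond("certificate_expired", backlog))  # ['page on-call engineer']
print(backlog)                                  # ['certificate_expired']
```

The backlog is the important part: every manual response becomes a candidate for the next playbook, which is how the practice improves over time.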

Some guiding questions you should be able to answer on this section:

  1. How are known failure modes defined and documented within your system?
  2. How are playbooks developed to guide responses to different incident types?
  3. How often are retrospective sessions conducted to review and improve incident responses?
  4. How are manual responses assessed and improved based on retrospective sessions?
  5. What steps are being taken to incorporate cloud capabilities for automated incident response?

Capacity, Cost, and Performance Management

The cloud's rapid scalability means you'll be paying just as swiftly. If you reserve more capacity than you need, you waste money. On the other hand, misaligned scaling patterns can fall short of demand, since auto-scaling for most services is reactive. Take, for example, Amazon Aurora Serverless, which scales its capacity units based on usage. Scaling occurs only after a threshold is hit, and it takes time. If your traffic pattern isn't well-aligned, your application's performance can degrade. It's crucial to match capacity with your application's performance and establish a practice for regular review and adjustment. Stay updated with the latest capabilities to manage your capacity and performance more effectively.
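
The effect of reactive scaling is easier to see with numbers. The toy simulation below is not a model of Aurora Serverless or any other real service: the doubling rule, the 80% threshold, and the two-tick delay are assumptions chosen only to show how a sudden spike outruns capacity that scales after the fact.

```python
def simulate_reactive_scaling(demand, start_capacity, threshold=0.8, lag=2):
    """Return the unmet demand per tick under a simple reactive policy.

    When load exceeds `threshold * capacity`, a scale-up to double the
    current capacity is requested and takes effect `lag` ticks later.
    Only one scale-up can be in flight at a time.
    """
    capacity = start_capacity
    pending = {}      # tick at which a new capacity takes effect -> capacity
    shortfall = []
    for tick, load in enumerate(demand):
        if tick in pending:
            capacity = pending.pop(tick)
        shortfall.append(max(0, load - capacity))
        if load > threshold * capacity and not pending:
            pending[tick + lag] = capacity * 2
    return shortfall


# Illustrative traffic: steady at 50 units, then a jump to 150.
spike = [50, 50, 150, 150, 150, 150]
print(simulate_reactive_scaling(spike, start_capacity=100))
# [0, 0, 50, 50, 0, 0]
```

In this run the application is short by 50 units for two ticks before the new capacity arrives. Reviewing traffic patterns against that kind of gap is what tells you whether to scale ahead of known peaks or accept a higher baseline cost.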

Some guiding questions you should be able to answer on this section:

  1. How do you match provisioned or reserved capacity to actual demand, and how often do you review it?
  2. How well do the scaling patterns of the services you use align with your traffic patterns?
  3. How do you detect performance degradation caused by scaling delays, and how do you respond?
  4. How do you stay up to date with new cloud capabilities for managing capacity, cost, and performance?


Next, we will take a look at how both People and Practice impact the Technology choices available to us, and how they help us build a strong platform and make sound architectural decisions.
