Building a Scalable and Sustainable Strategy for Cloud Adoption — Part Two — Process
Brinthan Yoganathan
Engineering Leader focused on Cloud, Serverless, Data, and AI/ML
With the groundwork for culture and mindset laid in the previous blog, the next focus is process.
But first, let's talk about "Process". Processes can be rigid and prescriptive, outlining step-by-step guidelines to complete tasks, or defining engagement models for external teams. More often than not, processes aren't revisited on a regular basis. However, the cloud is highly dynamic and rapidly evolving, and it requires something more flexible and adaptive.
I recommend that you think in terms of "Practices" rather than "Processes" for the cloud. Practices are more adaptable and can be tailored to suit different contexts or situations. Practices evolve based on experience, knowledge, and insights, and enable the teams to leverage new learnings and information for continual enhancement of their best practices.
With this mindset, changing the triad from "People", "Process", "Technology" to "People", "Practice", "Technology" seems more appropriate for cloud.
If you are convinced, then let's rethink and convert existing processes into "practices", or create new ones. While building practices, we should take in the best of what "processes" offer - consistency, repeatability, and measurability. Our objective should be to build structured practices.
The following list is not exhaustive, but it highlights critical areas to consider during your transition to the cloud. Each of these topics is quite vast, and entire book series have been dedicated to just one of them. The aim here is to get you started with a few basics.
SDLC
Cloud brings changes in autonomy and decentralization, which, in turn, impact the Software Development Life Cycle (SDLC). If you intend to use specific cloud services like serverless, you'll find that while they help to reduce maintenance overhead, they also introduce new challenges for local development. Furthermore, the cloud offers various deployment methods such as blue-green and canary deployments. As such, your existing SDLC needs to be revisited and revised to accommodate these new elements and effectively pave the way for frictionless cloud development.
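To make the canary idea concrete, here is a minimal sketch of how a small share of traffic could be routed to a new version and widened only while it stays healthy. The function names, the 1% error threshold, and the 20-point step are illustrative assumptions, not the behaviour of any particular cloud service; in practice your load balancer or deployment tool would do this for you.

```python
import hashlib


def route_version(request_id: str, canary_percent: int) -> str:
    """Return "canary" or "stable" for a request, deterministically."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the request id into a stable bucket from 0 to 99, so the same
    # caller keeps landing on the same version during a rollout.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"


def next_canary_percent(current: int, error_rate: float,
                        max_error_rate: float = 0.01, step: int = 20) -> int:
    """Widen the canary while it is healthy; roll back to 0 otherwise."""
    if error_rate > max_error_rate:
        return 0  # hypothetical rollback rule: send all traffic to stable
    return min(100, current + step)
```

A blue-green switch is the special case where the percentage jumps from 0 to 100 in one step, with the old environment kept around for rollback.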
Some guiding questions you should be able to answer on this section:
Following AWS's recommendation (see ADR Process), incorporating Architectural Decision Records (ADRs) into your practice can streamline decision-making for cloud projects. ADRs decentralize decision-making, accelerate solution development, and keep projects on track, while still ensuring that design decisions receive proper guidance and review. As a result, ADRs produce a traceable documentation trail that captures the critical decisions made throughout a project's lifecycle.
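As a rough illustration of what such a record might hold, the sketch below models an ADR as a small data structure that renders to plain text. The field names (context, decision, consequences, status) follow a commonly used ADR layout; the class name, the numbering format, and the example values are assumptions made for this sketch and should be adapted to whatever template your teams agree on.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    number: int
    title: str
    status: str          # e.g. "proposed", "accepted", "superseded"
    context: str         # the forces and constraints behind the decision
    decision: str        # what was decided
    consequences: str    # trade-offs the team accepts as a result
    decided_on: date
    supersedes: list[int] = field(default_factory=list)

    def render(self) -> str:
        """Render the record as plain text suitable for a repository."""
        lines = [
            f"ADR-{self.number:04d}: {self.title}",
            f"Status: {self.status}",
            f"Date: {self.decided_on.isoformat()}",
        ]
        if self.supersedes:
            refs = ", ".join(f"ADR-{n:04d}" for n in self.supersedes)
            lines.append(f"Supersedes: {refs}")
        lines += ["", "Context", self.context,
                  "", "Decision", self.decision,
                  "", "Consequences", self.consequences]
        return "\n".join(lines)


# Hypothetical example record.
adr = DecisionRecord(
    number=7,
    title="Use managed queues for order events",
    status="accepted",
    context="Teams deploy independently and need loose coupling.",
    decision="Publish order events to a managed queue.",
    consequences="Consumers must tolerate duplicate delivery.",
    decided_on=date(2024, 1, 15),
    supersedes=[3],
)
print(adr.render())
```

Keeping records like this next to the code, and linking each one to the record it supersedes, is what gives you the traceable trail.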
Some guiding questions you should be able to answer on this section:
Change Management
As your applications grow and evolve, an adaptable yet clearly defined change management practice
Some guiding questions you should be able to answer on this section:
Testing
Distributed development in the cloud means end-to-end testing will be slow, and might not even be possible or comprehensive. Take a modern approach to testing in the cloud: pivot towards techniques like contract testing, API testing, automated regression, and chaos engineering. These strategies better fit the expansive and dynamic nature of the cloud. Aim to cultivate Site Reliability Engineering (SRE) capabilities within your team as opposed to relying on Change Approval Boards as the only control. This forward-thinking approach will ensure your team is equipped to effectively manage the ongoing evolution of your cloud setup.
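Contract testing can be sketched very simply: the consuming team writes down the fields it depends on, and the providing team checks its responses against that list in its own pipeline, without a full end-to-end environment. The example below is a hand-rolled illustration with a hypothetical order contract; dedicated contract-testing tools offer much more, and nothing here reflects a specific tool's API.

```python
def contract_violations(contract: dict[str, type], response: dict) -> list[str]:
    """List the ways a provider response breaks a consumer's contract."""
    problems = []
    for field_name, expected_type in contract.items():
        if field_name not in response:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(response[field_name], expected_type):
            actual = type(response[field_name]).__name__
            problems.append(
                f"wrong type for {field_name}: "
                f"expected {expected_type.__name__}, got {actual}"
            )
    # Extra fields are deliberately ignored, so providers can add data
    # without breaking existing consumers.
    return problems


# Hypothetical contract owned by the consuming team.
ORDER_CONTRACT = {"order_id": str, "total_cents": int, "status": str}

good = {"order_id": "A-1", "total_cents": 1250, "status": "paid", "note": "gift"}
bad = {"order_id": "A-2", "total_cents": "12.50"}

print(contract_violations(ORDER_CONTRACT, good))  # []
print(contract_violations(ORDER_CONTRACT, bad))
```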
Some guiding questions you should be able to answer on this section:
Security & Compliance
Security must not be an afterthought; it should be built into your applications and reviewed regularly. The landscape of vulnerabilities and security exposures changes at a rapid pace. Establishing strong security and compliance management to regularly review the security posture of applications and take proactive actions is vital to preempt any incident. Proper patching practice is a no-brainer.
Think more broadly: defining an effective security practice requires a robust asset management process, which entails a clear understanding of the services used and the regions and accounts they are deployed in. Implement proactive monitoring
Some guiding questions you should be able to answer on this section:
Incident Management & Response
In the cloud, you can automate virtually any type of incident response, taking advantage of enhanced elasticity and auto-healing capabilities. Start by defining known failure modes and creating playbooks for appropriate actions. Continual retrospective sessions are essential to review manual responses, and to create and adjust your standard operating procedures. This practice not only maintains your system's resilience but also creates a backlog for further automation efforts.
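One way to picture this practice is a lookup from known failure modes to playbooks, where anything without a playbook is escalated and recorded as a candidate for future automation. The failure-mode names and the two actions below are made up for illustration; real playbooks would call your cloud provider's APIs or your runbook tooling.

```python
from typing import Callable

Playbook = Callable[[dict], str]


def restart_service(incident: dict) -> str:
    # Placeholder action: a real playbook would trigger the restart.
    return f"restarted {incident['resource']}"


def scale_out(incident: dict) -> str:
    # Placeholder action: a real playbook would add capacity.
    return f"added capacity to {incident['resource']}"


# Known failure modes mapped to automated responses (hypothetical names).
PLAYBOOKS: dict[str, Playbook] = {
    "health_check_failed": restart_service,
    "cpu_saturated": scale_out,
}


def respond(incident: dict, automation_backlog: list[str]) -> str:
    """Run the matching playbook, or escalate and note the gap."""
    playbook = PLAYBOOKS.get(incident["failure_mode"])
    if playbook is None:
        # No playbook yet: handle manually and remember it for the retro.
        automation_backlog.append(incident["failure_mode"])
        return "escalated to on-call"
    return playbook(incident)


backlog: list[str] = []
print(respond({"failure_mode": "cpu_saturated", "resource": "api"}, backlog))
print(respond({"failure_mode": "disk_full", "resource": "db"}, backlog))
print(backlog)  # ['disk_full']
```

The backlog list is the part that feeds your retrospectives: every manual response is a prompt to ask whether a playbook should exist.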
Some guiding questions you should be able to answer on this section:
Capacity, Cost, and Performance Management
The cloud's rapid scalability means your costs can grow just as swiftly. If you reserve more capacity than you need, you waste money. On the other hand, misaligned scaling patterns can fall short of demand, since auto-scaling for most services is reactive. Take, for example, Amazon Aurora Serverless, which scales its capacity units based on usage. Scaling occurs only after a threshold is hit, and it takes time to complete. If your traffic pattern isn't well-aligned, your application's performance can degrade. It's crucial to match capacity with your application's performance needs and establish a practice for regular review and adjustment. Stay updated with the latest capabilities to manage your capacity and performance more effectively.
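The effect of reactive scaling is easy to see in a toy simulation. The model below is a deliberate simplification and not how Aurora Serverless or any other service actually scales: capacity doubles a fixed number of steps after load crosses a utilisation threshold, and every parameter value is an assumption chosen for illustration.

```python
def simulate_reactive_scaling(demand, start_capacity, threshold=0.8,
                              lag_steps=2, growth=2.0):
    """Return the unmet demand at each step under delayed, reactive scaling."""
    capacity = float(start_capacity)
    ready_at = None  # step at which an in-flight scale-up completes
    shortfall = []
    for step, load in enumerate(demand):
        # A scale-up requested earlier takes effect once the lag has passed.
        if ready_at is not None and step >= ready_at:
            capacity *= growth
            ready_at = None
        shortfall.append(max(0.0, load - capacity))
        # Scaling is only requested after the threshold is crossed.
        if ready_at is None and load > threshold * capacity:
            ready_at = step + lag_steps
    return shortfall


spike = [50, 50, 200, 200, 200, 200]
print(simulate_reactive_scaling(spike, start_capacity=100))
# [0.0, 0.0, 100.0, 100.0, 0.0, 0.0]
print(simulate_reactive_scaling(spike, start_capacity=200))
# [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

In the first run the spike goes unserved for two steps while capacity catches up; in the second, capacity provisioned ahead of a known spike absorbs it, at the price of paying for headroom during the quiet steps. That trade-off is what a regular capacity review should examine.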
Some guiding questions you should be able to answer on this section:
Next, we will take a look at how both People and Practice shape the Technology choices available to us, and how they help us build a strong platform and make sound architectural decisions.