Shorticle 954 – Deployment architecture in Site Reliability Engineering (SRE)
When you talk about SRE with high availability and scalability in mind, you need to understand both the application architecture (how it is built) and the deployment architecture (how it is set up) before you can decide where to innovate in your availability and scalability solution.
Operations teams are moving toward an engineering approach with infrastructure as code and Site Reliability Engineering (SRE), where a deployment architecture applies to applications, data platforms, services (APIs) and more. A deployment architecture combines infrastructure provisioning, packaging, deployment and runbook automation into a unified package for building infrastructure, platform and application components.
For example, a canary deployment first deploys to a subset of servers, called the 'canary'. The name comes from coal mining, where canaries were taken into the mine to reveal toxic gas before it could harm the miners. A canary deployment avoids wasting time deploying to many servers without validation, and it suits clustered deployments. It is similar to staging, except that the canary is a production server and stays in use as part of the live system once the deployment is complete.
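The canary flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the function name canary_rollout and the deploy and is_healthy callbacks are hypothetical placeholders for whatever deployment and health-check tooling you actually use.

```python
from typing import Callable, List, Tuple


def canary_rollout(
    servers: List[str],
    deploy: Callable[[str], None],      # hypothetical: pushes the new version to one server
    is_healthy: Callable[[str], bool],  # hypothetical: validates one server after deployment
    canary_count: int = 1,
) -> Tuple[bool, List[str]]:
    """Deploy to a small canary subset first, then to the rest only if the canaries pass.

    Returns (success, servers_that_received_the_new_version).
    """
    canaries = servers[:canary_count]
    remaining = servers[canary_count:]

    # Step 1: deploy only to the canary servers, which are real production nodes.
    for server in canaries:
        deploy(server)

    # Step 2: validate the canaries before touching the rest of the cluster.
    if not all(is_healthy(server) for server in canaries):
        return False, canaries  # stop here; only the canaries need a rollback

    # Step 3: the canaries passed, so roll out to the remaining servers.
    for server in remaining:
        deploy(server)
    return True, canaries + remaining
```

If validation fails, only the canary servers carry the new version, so the rollback is limited to that subset.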
Blue-green deployment, on the other hand, is another architecture that is popular in cloud-based application development. It is often the preferred approach on the target platform because of the flexibility and benefits it brings to the deployment architecture. Typically, there are two sets of deployment instances, one called Blue and the other called Green, used as follows:
· Blue serves as the ACTIVE node and Green serves as the STANDBY. At any one time, Blue runs one version (n) and Green runs the previous version (n-1). After version n is deployed to Blue, it is tested; if there is any issue, the Blue services are isolated and the Green services become active. This gives you a faster turnaround when you need to roll back after a failure.
· If the Blue installation is stable, Green can be brought to the same version and used as a DR standby, so that if the Blue service goes down, Green can act as the disaster recovery site.
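The two cases above can be modelled as a small state change. The sketch below follows the flow as this article describes it (new version goes to Blue, Green holds the previous version until Blue is proven stable). The BlueGreen class, its field names and the passes_tests callback are assumptions made for illustration, not part of any specific platform.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreen:
    blue_version: str
    green_version: str
    active: str = "blue"  # Blue is ACTIVE, Green is STANDBY

    def release(self, new_version: str, passes_tests: Callable[[str], bool]) -> str:
        """Deploy new_version (n) to Blue and return the name of the active set afterwards."""
        # Blue receives version n; Green still holds the previous version (n-1).
        self.blue_version = new_version

        if not passes_tests(new_version):
            # Issue found: isolate Blue and let Green (n-1) take the traffic.
            self.active = "green"
            return self.active

        # Blue is stable: bring Green to the same version so it can act as DR standby.
        self.green_version = new_version
        self.active = "blue"
        return self.active
```

For example, BlueGreen("1.0", "1.0").release("2.0", passes_tests) leaves Blue active with both sets on 2.0 when the tests pass, and switches traffic to Green on 1.0 when they fail.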
In the modern cloud lifecycle, SRE and cloud operations aim at operational excellence, moving from timed-wait activities in operations, monitoring and management to self-healing services that manage cloud operations autonomously. This will lead CloudOps toward intelligent operations through NoOps.
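As a rough idea of what self-healing means in practice, the sketch below shows one pass of a remediation loop: unhealthy services are restarted automatically, and a service that keeps failing is escalated to a human. The function name heal_once, the callbacks and the max_restarts threshold are all hypothetical; real platforms provide their own health probes and restart policies.

```python
from typing import Callable, Dict, List


def heal_once(
    services: List[str],
    is_healthy: Callable[[str], bool],  # hypothetical health probe
    restart: Callable[[str], None],     # hypothetical remediation action
    restart_counts: Dict[str, int],     # consecutive restarts per service, kept between passes
    max_restarts: int = 3,
) -> Dict[str, List[str]]:
    """Run one self-healing pass and report what was restarted or escalated."""
    restarted: List[str] = []
    escalated: List[str] = []

    for name in services:
        if is_healthy(name):
            restart_counts[name] = 0  # healthy again, so reset the counter
            continue

        if restart_counts.get(name, 0) >= max_restarts:
            # Automatic remediation has not worked; hand over to an operator.
            escalated.append(name)
            continue

        restart(name)
        restart_counts[name] = restart_counts.get(name, 0) + 1
        restarted.append(name)

    return {"restarted": restarted, "escalated": escalated}
```

Running this pass on a schedule, or in response to monitoring events, replaces the manual wait-and-check cycle while still leaving a path to human escalation.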