Thoughts on SRE
Vamshi Yemula
Senior DevOps Engineer | SRE | Kubernetes, Docker, Terraform, AWS and CI/CD Specialist | Driving Reliability & Performance
What exactly is SRE?
Site reliability engineering (SRE) empowers software developers to own the ongoing daily operation of their applications in production. The goal is to bridge the gap between the development team that wants to ship things as fast as possible and the operations team that doesn't want anything to blow up in production.
SRE is a prescriptive way of doing DevOps.
Roles and responsibilities of SRE
The main roles of a site reliability engineer would be:
SLI, SLO and SLA?
Service level indicators (SLIs) are the quantitative measures defined for a system, also known as "what we are measuring".
The most important SLIs would be:
Service level objective (SLO): a target value or range of values for a service level, as measured by an SLI.
Service level agreement (SLA): the promise you make to your customers about your service's health.
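To make the relationship between the three terms concrete, here is a minimal sketch in Python. The function names, the request counts and the target values are all hypothetical, and it assumes an availability SLI (good requests divided by total requests) and an SLA that is looser than the internal SLO, which is a common arrangement but not a rule.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of requests that were served successfully."""
    if total_requests == 0:
        # Assumption: with no traffic there is nothing to count as a failure.
        return 1.0
    return good_requests / total_requests


def service_status(sli: float, slo: float, sla: float) -> str:
    """Compare the measured SLI with the internal target (SLO)
    and with the customer-facing promise (SLA)."""
    if sli >= slo:
        return "meeting SLO"
    if sli >= sla:
        return "SLO missed, SLA still met"
    return "SLA breached"


# Hypothetical targets: 99.9% internal objective, 99.5% promised to customers.
SLO = 0.999
SLA = 0.995

sli = availability_sli(good_requests=997_000, total_requests=1_000_000)
print(f"SLI = {sli:.4f}: {service_status(sli, SLO, SLA)}")
```

The SLI is what gets measured, the SLO is the bar the team sets for itself, and the SLA is the bar customers are promised. Keeping the SLO stricter than the SLA leaves room to react before a customer-facing promise is broken.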
However, once you dive into the details, SLAs, SLOs, and SLIs are clearly different types of entities.
What is toil?
"Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows." - Vivek Rau, Google
Examples of toil are tasks that need manual intervention: manual releases, connecting to infrastructure by hand to check something, manual resets, on-call response, extracting data, manual scaling of infrastructure, and so on. We need to eliminate toil because manual work is error-prone and reduces quality.
What is an error budget?
“100% is the wrong reliability target for basically everything” — Ben Treynor
An error budget is the amount of time, over a given period, during which the service is allowed to be degraded or unavailable. It is what remains once the SLO is subtracted from 100%. This budget is what we spend on bringing in new features or making architectural changes. If we spend more than the budget, there has to be a consequence. One such consequence is to stop releasing new features and focus on getting the system stable.
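The arithmetic behind an error budget can be sketched as follows. This is an illustration only: the `ErrorBudget` class, the 99.9% SLO, the 30-day window and the downtime figures are assumptions chosen for the example, and real policies often count failed requests rather than minutes.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass
class ErrorBudget:
    slo: float        # target availability, e.g. 0.999 for 99.9%
    window_days: int  # period over which the budget is measured

    @property
    def total_minutes(self) -> float:
        # Budget = (100% - SLO) of the time in the window.
        return (1 - self.slo) * self.window_days * MINUTES_PER_DAY

    def remaining_minutes(self, downtime_minutes: float) -> float:
        # Negative when the budget has been overspent.
        return self.total_minutes - downtime_minutes

    def should_freeze_features(self, downtime_minutes: float) -> bool:
        # Hypothetical policy: stop feature releases once the budget is used up.
        return self.remaining_minutes(downtime_minutes) <= 0


budget = ErrorBudget(slo=0.999, window_days=30)
print(f"Total budget: {budget.total_minutes:.1f} minutes")
print(f"Left after 20 min of downtime: {budget.remaining_minutes(20):.1f} minutes")
print(f"Freeze after 50 min of downtime? {budget.should_freeze_features(50)}")
```

With a 99.9% SLO over 30 days the budget comes to roughly 43 minutes. Twenty minutes of downtime leaves room for further releases, while fifty minutes overspends the budget and, under the assumed policy, triggers a feature freeze.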