Is SRE right for you?
I remember when I was first approached for a role in SRE. Back then I had no real idea what SRE was. If you are a software developer wondering whether you should make the transition to SRE, this post is for you.
SRE stands for Site Reliability Engineering, but there seems to be a lot of confusion about what that means. Some people and companies seem to think it's just glorified Ops or DevOps. Others may describe it as the lords of production. Before I give my take, I want to first mention an old story from Jewish tradition. A person who wishes to convert to Judaism asks some rabbis to teach him all of the Torah (the Hebrew Bible, or Old Testament) in just a few minutes (originally "while standing on one foot"). After the first rabbi throws him out, Rabbi Hillel, a known moderate, tells him: "What is hateful to you, do not do to your neighbor. That is the whole Torah; the rest is the explanation of this. Go and study it!" Similarly, at the core of SRE is a simple concept: take software engineers and put them in charge of Ops work, with the expectation that they use their engineering skills to make themselves redundant. Everything else is just an explanation of this.
While the basic description is truly that simple, there is of course a lot of detail derived from it, detail you may need in order to figure out whether this is something you're interested in. So let's dive in a little deeper.
I believe it may be easier to start with who, IMHO, would not be a good match for SRE. If all you care about is writing code that solves hard problems, but you don't care how that code gets in front of users or whether it keeps running smoothly for months or even years, then SRE is probably not for you. If, however, you find seeing your code running in production thrilling, if you enjoy tracking its health and keeping it healthy, and especially if you enjoy the forethought needed to make it run well over time, you should consider SRE. SRE is about the full lifecycle of software systems: how functionality gets into users' hands, how to make sure it runs reliably over time, and how to fix things when they go wrong. All of these are, of course, hard problems that take a lot of coding to solve. It's an engineering discipline built on practices that should be familiar to any good software engineer: source control, CI/CD, monitoring and alerting, and large-scale system design, just to get started.
One more aspect, one that is not a classic engineering discipline, is incident management: the structured approach to dealing with production emergencies when they happen. It's this aspect of SRE work that tends to attract people who run towards the fire, people who enjoy addressing problems even under pressure, knowing that critical systems are in their hands. Note that this is a learned skill, and practices like blameless postmortems help develop it without causing nervous breakdowns. Also note that unless you have been in that kind of situation, you probably don't yet know whether you enjoy it. The only way to find out is to try.
If you want to learn more about SRE, I recommend the Google SRE book. And of course you are welcome to reach out to me.