Making On-call fun

In my previous article on the culture of DevOps, I made a passing reference to on-call. If you have been in a DevOps culture long enough, you'll know that "on-call" is a phrase that rarely excites engineers. So how do you make it exciting and build a culture where the on-call routine is fun?

While our on-call routine is not perfect, we are on a continuous journey toward operational excellence. We are genuinely trying to make on-call fun, exciting, incentivized, disciplined, and rewarding.

  • Have structured tiers around your on-call routine. Tiers don't necessarily mean more layers; they mean established practices (standard operating procedures) that determine when and how you move up the escalation ladder.
  • Establish a graduation program for your tiered on-call system.
The graduation concept is a simple way to incentivize engineers in their on-call journey and let them aspire to the next tier.
  • Everyone on the team is on call, irrespective of job title or function. For example, engineering leaders on my team are on the hook for all kinds of escalations, any time, any day.
A good leader will take ownership and drive outcomes. 
A good leader will deal head-on with the politics during an incident and strive for outcomes.
  • Lead by example and have a strong bias for action. On-call is fun when you demonstrate that you care and put yourself on the front line when incidents happen.
In my current org, I take pleasure in doing what's called the preliminary assessment, collecting the facts when a P1 or P2 incident is triggered. This is not an expectation of my role, but I find it fun and rewarding because I get to learn more about our application ecosystem. It also lets my engineers work on the real problems while they empower me with data and dashboards, so I can complete my preliminary assessment quickly.
  • A runbook is a living document that captures your standard operating procedures, links to dashboards, and other relevant system documentation. If you don't have runbooks, start somewhere or start here. They will never be perfect, but your entire team should know this:
If you see something, say something and write it down.
  • Have a comprehensive on-call calendar that is the single source of truth for the schedule across your tiers.
Empower engineers to work directly with their peers when the schedule needs adjusting, and keep managers informed.
  • At the end of an on-call rotation, encourage the engineer to hand off the torch to the next person in line.
A simple practice could be a quick huddle where the team reviews what went well last week, what was annoying, and what to do better.
  • Celebrate individuals who took the time to improve existing documentation based on their observations.
  • Figure out an incentive strategy to make on-call interesting. On-call takes a toll on the individual, and one way to motivate people is to let them use any slack time during their on-call week to work on a pet project or creative idea.
  • Recognize someone who has gone above and beyond.
These people are your unsung heroes, so the least you can do is give them comp time off, where applicable, so they can take a mental break.
  • When engineers bring you problems about on-call, ask them how they would fix them. Often the result of this collaborative conversation is innovation that leads to operational excellence.
  • Celebrate failures. As mentioned in a different article, failures are successes in many ways because of the learning they bring, so encourage your team to fail fast and fail forward. You can make it fun by having a face-palm idol that's awarded for the "oh sh*t" moments, but use common sense and understand the employee's emotions in context before doing so.
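
For the first tip, one way to make the escalation ladder unambiguous is to write it down as data. The sketch below is only an illustration: the tier names, the acknowledgement timeouts, and the `current_tier` helper are all assumptions made for this example, not a description of any real setup.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    name: str                # hypothetical tier label
    ack_timeout: timedelta   # how long this tier has to acknowledge a page


# Hypothetical three-tier ladder; names and timeouts are placeholders.
LADDER = [
    Tier("tier-1 primary", timedelta(minutes=15)),
    Tier("tier-2 secondary", timedelta(minutes=30)),
    Tier("tier-3 engineering leader", timedelta(minutes=60)),
]


def current_tier(unacknowledged_for: timedelta, ladder=LADDER) -> Tier:
    """Return the tier that should own a page left unacknowledged this long."""
    elapsed = timedelta(0)
    for tier in ladder:
        # Each tier's window starts when the previous tier's window ends.
        elapsed += tier.ack_timeout
        if unacknowledged_for < elapsed:
            return tier
    # Past every timeout: the page stays with the top of the ladder.
    return ladder[-1]


if __name__ == "__main__":
    for minutes in (5, 20, 50, 200):
        owner = current_tier(timedelta(minutes=minutes))
        print(f"{minutes} min unacknowledged: {owner.name}")
```

Keeping the ladder in one small table like this means the rule for moving up a tier is something the team can read, question, and change in a review, rather than something that lives in one person's head.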

If you have other practical tips, I genuinely welcome them as feedback or comments.

Subodh Verma

Engineering Leader | Software Quality, Reliability & Performance | Champion Rapid Innovation | Achieve Clarity from Chaos | Incremental Growth

4 years ago

Very well said Ramesh.... “A good leader will take ownership and drive outcomes”
