Thoughtful Leadership Series - SRE and DevOps Excellence Center Management

After a two-year break from blogging and an amazing experience with the Microsoft Defender SRE team, I am starting a new series of blogs. I would like to begin with Thoughtful Leadership in Infrastructure and Services Management, which may also relate to your own specialized domain in IT. SRE and DevOps are mature practices in a few organizations, while many others are still in a very early phase, struggling in the 2nd or 3rd phase of the DevOps Magic Quadrant Maturity model.

I have worked in teams ranging from 1 to 100+ members and can visualize what it takes to evangelize engineering teams and culture. This blog will help you organize yourself around the areas defined below so that you can lead each one independently, as its scope requires. Most of the time, people mix these areas, which creates confusion during delivery and turns deliverable management into chaos.

I have not found a single company, small or big, that does only SRE; it is always a mix of Sysadmin, DevOps, and SRE. I know these are all subsets of each other, and many organizations are still in the DevOps implementation phase, where discussions of SLI, SLO, and MTTx are far away.

This blog aims to create clarity for Engineering Leaders who drive engineering efforts for the sake of a better customer experience, and it touches on some broader points they should keep in mind.

From Team Management to Work Management Practices and Building an Excellence Center

What is Work Management, and how is it different from Team Management?

a. Work Management - I categorize Monitoring, Security, Reliability, DevOps practices, SRE practices, Costing, and all similar items as Work Management items. The leadership team (LT) should drive these as OKRs (Objectives and Key Results), and team formation should be aligned with them. Otherwise, multiple service teams under the same Engineering org will duplicate the same efforts and struggle to adopt common standards.

b. Team Management - This is the people management of those working on Engineering, Operations, and Project work. The mindset in such a structure is totally different from that of Work Management.

Work Management == Excellence Center

c. Engineering Excellence Center - Many service-based companies like HCL, TCS, Wipro, and others have developed and used such excellence centers to provide mature services. The same practice can be leveraged when you are struggling with engineering deliverable management. That is what we mean by Work Management; the two terms are synonymous for our discussion, so you can drive Work items under an Excellence delivery center. For example, one expert team in Monitoring practices will consult on and embed observability in new services, while another expert team in DevOps will embed CI/CD in your services. Through this you can build strong teams with a good learning curve, where engineers can move from team A to team B to satisfy their personal growth needs.

Setting up an excellence center for SRE and DevOps practices needs a dedicated blog to define it properly, and I will cover it in more detail later in this series. Take a Monitoring Maturity excellence center as an example. The team would focus on KPIs, deployment techniques for observability agents, instrumentation library selection, consolidation of monitoring solutions, data retention policy to save cost, scrubbing of data to avoid security issues, techniques for managing monitoring data at scale, and creating awareness about available resources and best practices. These examples show how deep you can go in each specific Work Management area as part of your excellence center.

CI/CD techniques and Security techniques are just some of the other centers that your leadership can build as part of the excellence center. The benefits of such centers are well known, and they have helped service-based companies grow.

Driving Engineering deliverables for a team of 5 vs. a team of 100: Leaders should not apply to big teams the same practices they used on small teams. For example, a startup can leverage 3rd party products to accelerate the adoption of some tools and practices, whereas a big team needs practices that join forces to achieve a common goal. Take Monitoring Maturity as an example: if you have a big org with 100+ Infra folks supporting different services, each team should not duplicate the same efforts. Your leadership should think in terms of Work Management practices, where Monitoring maturity is handled by a specific v-team (Virtual Team) within the group. Applying the Team Management model here leads to inefficient use of management resources, while molding the structure toward Work Management can bring large gains in efficiency and standardization. I know this block itself needs more clarity and examples, so I am only highlighting the point briefly here and will cover it in another dedicated blog.

1. Categorization of Work - I see the whole of Infrastructure Management work as 3 big categories that any leader can easily identify and use to set up strong practices in each area. It becomes very clear what you need to manage and drive in your team. Depending on the state of your project, you will either drive all 3 blocks or focus on one. Managers and LT folks handle many other aspects of engineering team management and should not be restricted to these 3 categories, but these 3 blocks are present at the core, and you should keep yourself balanced while managing them.

1.1 Operations: This is a well-known area for infrastructure folks and it is self-explanatory.

  1. Incident
  2. Post Mortems
  3. Change Management

1.2 Project Work: This can be any new request to have something on your infrastructure, such as Set up Service A, Deploy a DR Region, Patch the complete Windows and Linux environment, etc.

1.3 Engineering Work: This is the main area of focus for leaders and Architects, as it requires more focus, thoughtfulness, planning, and management, and the organization should really work toward pushing these Big Rocks. Here your team is your strength, and your Architects come into the picture to give you the direction that hits the right area for maximum impact with less effort. That does not mean this area takes little effort; it will take the effort it usually takes to get done. You should not experiment with your management techniques here unless you have plenty of bandwidth, so leave this area to your heavy lifters to manage and stand with them in support of their vision.

"We can't do this because the XYZ engineering foundation is not present"

Many times you will find that you can't do a specific task because some prerequisite work is missing, and it completely blocks your progress. Most of the time, such work falls within the scope of engineering work. For example, you can't deploy your second DR region within 1 day because Automation or Infrastructure as Code is not present, so you can't provision resources in a short time. Another small example is password or certificate rotation after a security breach, where you may have missed identifying and implementing the tools and technologies needed for automatic rotation.
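To make the rotation example concrete, here is a minimal Python sketch of the kind of check an auto-rotation foundation might start from. The `Credential` model, the `MAX_AGE_DAYS` policy values, and the `due_for_rotation` function are all hypothetical names and numbers chosen for illustration, not a reference to any specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Credential:
    """A password or certificate tracked for rotation (hypothetical model)."""
    name: str
    kind: str          # "password" or "certificate"
    last_rotated: date


# Assumed policy: maximum allowed age per credential kind, in days.
MAX_AGE_DAYS = {"password": 90, "certificate": 365}


def due_for_rotation(creds, today, breach=False):
    """Return the names of credentials that should be rotated now.

    After a breach, every credential is due regardless of its age.
    """
    due = []
    for cred in creds:
        age = today - cred.last_rotated
        if breach or age > timedelta(days=MAX_AGE_DAYS[cred.kind]):
            due.append(cred.name)
    return due


if __name__ == "__main__":
    inventory = [
        Credential("db-admin", "password", date(2024, 1, 1)),
        Credential("api-tls", "certificate", date(2024, 1, 1)),
    ]
    print(due_for_rotation(inventory, today=date(2024, 6, 1)))
    print(due_for_rotation(inventory, today=date(2024, 6, 1), breach=True))
```

The point of the sketch is that even this small check needs an inventory of credentials and an agreed policy, and that is exactly the foundational work that is painful to build in the middle of an incident.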

2. Early Identification of Engineering Foundational Work - Identifying your foundational work on time is very important because your whole infrastructure will stand on these pillars. Here you and your heavy lifters need to be thoughtful; otherwise, you know what happens to weak buildings. The image below is just an example of how you can lay out your engineering deliverables for visualization and discussion, and the same technique can be applied to identify your operations and project deliverables. I am not focusing on operations here: it is one of the smallest but most important areas, where you mainly need to define some processes with a little help from tools, such as how you handle incidents and the criteria under which you start a Post Mortem.
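As an illustration of "defining the criteria under which you start a Post Mortem", here is a small Python sketch. The `Incident` fields, the severity scale, and the default thresholds in `needs_post_mortem` are assumptions made up for this example; your own process would define its own.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Minimal incident record (hypothetical fields)."""
    severity: int                 # assumed scale: 1 is the most severe
    customer_impact: bool
    duration_minutes: int
    repeat_of_known_issue: bool = False


def needs_post_mortem(incident, max_severity=2, max_duration_minutes=60):
    """Decide whether an incident should trigger a Post Mortem.

    Assumed policy: a Post Mortem is required when the incident is severe,
    impacted customers, ran long, or repeats a known issue.
    """
    return (
        incident.severity <= max_severity
        or incident.customer_impact
        or incident.duration_minutes > max_duration_minutes
        or incident.repeat_of_known_issue
    )


if __name__ == "__main__":
    minor = Incident(severity=4, customer_impact=False, duration_minutes=10)
    outage = Incident(severity=1, customer_impact=True, duration_minutes=120)
    print(needs_post_mortem(minor))
    print(needs_post_mortem(outage))
```

Writing the policy down, even in this simple form, removes the case-by-case debate about whether an incident "deserves" a Post Mortem.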

SRE and DevOps engineering deliverables are constant; they are not N!

[Image: example visualization of engineering deliverables]

Believe me, SRE and DevOps are no longer gray areas. They are well-established practices with tools and techniques available, and the problems you have to solve for your project are already well defined. Many people have solved them and shared best practices with the world, so be confident, get it done with your team, move toward excellence, and achieve a High or Elite performance bar on the maturity front.

3. Stand with your Heavy Weight Lifters as Coach and Mentor - SRE, DevOps, and newer technologies at scale need dedication to implement and to reach excellence, and you need heavy weight lifters in your team to attain your engineering goals, so bet on the right horses to win the excellence game. Facing unknowns and becoming vulnerable are some of the challenges your heavy lifters might encounter, but you should stand with them and help them achieve your engineering goals.

Painting a good picture vs. becoming vulnerable by heavy lifting and facing the technical challenges!

4. Ask questions with the right Intent about quick wins and shortcuts - When your team reports quick success on core engineering deliverables, ask them, in a positive way, what they have done and how they have done it. Painting a good picture and taking shortcuts on core engineering items can badly damage your team's image and business execution if the work is not done properly. A definition of done (DoD), combined with extra questions about how the work was done, will help build the right engineering culture.

I am forcibly stopping my thoughts here. I will continue on similar and related topics in my "Thoughtful Leadership Series" blogs so that we can build a community of the right leaders in the engineering domain. Please stay tuned for upcoming blogs, where we will touch on other engineering items and create more clarity.

Read more such topics here.
