Blameless Feature Reviews

Have you ever wondered whether what you build has the right impact on customers? Engineering is often expected to be on time and cost-effective. Bugs and incidents are known for disrupting the customer experience. The fewer bugs, the better; the fewer incidents, the better. So how do we reduce bugs and incidents? DevOps has an interesting practice called Blameless Incident Reviews (BIR). Considering the DevOps culture and movement, blameless incident reviews are great because they drive the right culture shift, from fear and blame to sharing and understanding. BIR is often pull-based: it happens when we have a number of production bugs that are worth sharing so the whole org can learn from them. DevOps is all about better ways of building and operating software. We cannot do better if we are stuck with the same practices all the time; practices need to keep changing and evolving, as a way to keep us fresh and learning at all times.

How does Blameless Incident Review work?

Usually, there is a severity classification; imagine a scale from 1 to 5, for instance. S1 is the most severe class of bugs and incidents, the ones that should be reviewed. S5 is the least severe, and maybe we don't need to worry so much about it. How can we differentiate between S1 and S5? Customer disruption, financial loss, and Mean Time to Recover (MTTR) are good criteria.
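To make the scale concrete, here is a minimal sketch of how the three criteria could be turned into a severity level. Everything in it is an assumption for illustration: the `Incident` fields, the `classify_severity` function, and especially the threshold numbers, which each org would need to pick for itself.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    name: str
    customers_affected: int      # customer disruption
    financial_loss_usd: float    # financial loss
    minutes_to_recover: float    # recovery time for this one incident


def _score(value: float, thresholds: list[float]) -> int:
    """Return how many thresholds the value meets or exceeds (0..len)."""
    return sum(1 for t in thresholds if value >= t)


def classify_severity(incident: Incident) -> int:
    """Map an incident to 1 (S1, most severe) .. 5 (S5, least severe).

    All thresholds below are made-up numbers for illustration only.
    """
    scores = [
        _score(incident.customers_affected, [10, 100, 1_000, 10_000]),
        _score(incident.financial_loss_usd, [100, 1_000, 10_000, 100_000]),
        _score(incident.minutes_to_recover, [5, 30, 120, 480]),
    ]
    # The worst criterion decides: a score of 4 maps to S1, 0 maps to S5.
    return 5 - max(scores)
```

Letting the worst criterion decide is one possible design choice; a weighted sum would be another. The point is only that the criteria are written down and applied the same way every time.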

Once there is some number of incidents or critical bugs, let’s say 3–5, for instance, we can have a meeting where we talk about those issues. Now imagine that this meeting does not need to be on a fixed schedule (push model). It could be pull-based (it happens on demand).
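The pull-based trigger can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `should_schedule_review`, the threshold of 3 (the low end of the 3–5 range above), and the idea that only S1 and S2 count as critical are all hypothetical choices.

```python
REVIEW_THRESHOLD = 3   # assumption: low end of the 3-5 range
CRITICAL_CUTOFF = 2    # assumption: S1 and S2 count as critical


def should_schedule_review(
    unreviewed_severities: list[int],
    threshold: int = REVIEW_THRESHOLD,
    cutoff: int = CRITICAL_CUTOFF,
) -> bool:
    """Return True once enough critical, not-yet-reviewed incidents pile up."""
    critical = [s for s in unreviewed_severities if s <= cutoff]
    return len(critical) >= threshold
```

With this in place, nobody books a recurring meeting; the review is called when the backlog of critical incidents says it is needed.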

Then, someone presents what happened, usually in the form of a timeline of events. The most important part of this practice is twofold. First, we need to drive lessons learned. Second, we need to take action to improve things.

Let me say this very clearly to get it out of the way. Blameless incident reviews are RETROSPECTIVES. You cannot have effective retrospectives without good facilitation, and lessons learned need to be driven. What did we learn from incident X? BIR can go bad if we just create tickets and do not drive lessons learned. Plus, we need actions, and we need to review previous incidents to see what we did wrong and whether we made it better or not. Otherwise, you just have a meeting to present tickets.

Done right, BIR can be amazing. It helps the company learn and really improves the product and the process. This is one of the best practices that the DevOps movement gifted us. Now think about this: if BIR is great for us to learn from bugs and incidents and make sure they do not happen again, what about features? What do we do with features?

What about Features?

Features are massive sources of investment in software. People are expensive, engineers are expensive. Product cares a lot about features. Now, how do we know we are being effective with features? There are several things we can do in modern product development:

  • Proper Discovery: via discovery process, personas, user interviews, user research, prototypes, and rapid experimentation.
  • Metrics: what we call observability in engineering. How much growth are we having, and what percentage of users are we retaining? Net Promoter Score (NPS) and other proxy metrics. Counters are a great metric: how many times does the user click on a screen or button?
  • Feedback: for mobile apps, app store reviews, and review websites like Yelp, Google, and others. Direct customer feedback via support and many other sources.

Why does this matter? Because it costs a lot of money to make software. So, we need to know if we are going in the right direction or not. When we produce software, there is a chance it works, and there is a chance it does not work and turns out to be a bug. Bugs can cause incidents, and incidents can happen because of bugs or by some other mistake.

Incidents and bugs affect the user experience and need to be minimized. That’s why we use blameless incident reviews to learn and improve. Now, we can do a lot of things for features, from discovery to metrics and feedback, but is that enough?

Blameless Feature Review (BFR)

Think about this idea. What if, on a regular cadence, say once per month or after every 10 features delivered, we got together in person or virtually and reviewed how the features are going? By definition, product management is doing that, or is supposed to be. However, are engineers involved?

IMHO, they should be because some features are also sources of technical debt. Being able to delete features means deleting technical debt. Software needs a counterforce, where we reduce complexity. Adding features all the time means adding complexity all the time. The easiest way to reduce technical debt is by decommissioning software, but you cannot decommission software that is being used.

But what if the software is not being used? What if the software does not drive the results we want? Would that be low-hanging fruit for future cleanup? Plus, why did we build something that the users did not want? Should the builders know about it, or should only product know? I would argue that the product is everybody’s responsibility. Imagine someone presents feature X and says:

  • This is feature X
  • It cost $$$
  • It took X months to be done
  • It has Y bugs associated with it
  • It has Z many page views per day
  • It requires services A, B, and C
  • Customers are saying XYZ on the Apple App Store, and XYZ on the Google Play store
  • We applied this discovery process: bla bla bla
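The presentation above could be captured as a small record, so every feature is reviewed with the same fields. This is a sketch only: the `FeatureReview` class, its field names, and the `cost_per_daily_view` ratio are hypothetical and simply mirror the bullets.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FeatureReview:
    """One feature's entry in a review; all field names are hypothetical."""
    name: str
    cost_usd: float                 # "It cost $$$"
    months_to_build: float          # "It took X months to be done"
    bug_count: int                  # "It has Y bugs associated with it"
    page_views_per_day: int         # "It has Z many page views per day"
    required_services: list[str] = field(default_factory=list)
    store_feedback: dict[str, str] = field(default_factory=dict)  # store -> summary
    discovery_process: str = ""

    def cost_per_daily_view(self) -> Optional[float]:
        """Rough cost/usage ratio; None when nobody views the feature."""
        if self.page_views_per_day == 0:
            return None
        return self.cost_usd / self.page_views_per_day
```

A feature whose ratio comes back as `None`, or very high, is a natural candidate for the decommissioning conversation.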

Product view and engineering view, all together. Imagine there are two buckets: the top 5 best features and the 5 least used features. It would be great to compare them and see what we can learn. Building better products is a cross-functional sport and requires engineering to know what works and what does not work for the users. So we need a review process; why not start doing Blameless Feature Reviews?
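As a starting point, the two buckets can be built from a simple usage counter. This sketch assumes daily page views are the ranking signal and that the input is a plain dict of feature name to views; the `usage_buckets` name and the sample data are hypothetical.

```python
def usage_buckets(views_per_day: dict[str, int], n: int = 5) -> tuple[list[str], list[str]]:
    """Return (most used, least used) feature names, n of each.

    Assumes page views per day is the usage signal; with fewer than
    2 * n features the two buckets will overlap.
    """
    ranked = sorted(views_per_day, key=views_per_day.get, reverse=True)
    most_used = ranked[:n]
    least_used = ranked[::-1][:n]   # least used first
    return most_used, least_used


# Hypothetical sample data.
views = {"search": 500, "export": 10, "dark-mode": 0, "share": 75}
most, least = usage_buckets(views, n=2)
```

Usage alone does not say a feature is good or bad; it only tells the review where to look first.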

Originally published at https://diego-pacheco.blogspot.com on March 30, 2024.
