Blameless Feature Reviews

Diego Pacheco

Head Of Architecture at ilegra

Published: March 30, 2024

Have you ever wondered whether what you build has the right impact on customers? Engineering is often pressed to be on time and cost-effective. Bugs and incidents are known for disrupting the customer experience. The fewer bugs, the better; the fewer incidents, the better. So how do we reduce bugs and incidents? DevOps has an interesting practice called Blameless Incident Reviews (BIR). Considering the DevOps culture and movement, blameless incident reviews are great because they drive the right culture shift from fear and blame to sharing and understanding. BIR is often pull-based: it happens when we have a number of production bugs that are worth sharing, so that the lessons learned reach the whole org. DevOps is all about better ways of building and operating software. We cannot do better if we are stuck with the same practices all the time; practices need to keep changing and evolving, which keeps us fresh and learning at all times.

How does Blameless Incident Review work?

Usually, there is a severity classification; imagine a scale from 1 to 5, for instance. S1 covers the most severe bugs and incidents, which should be reviewed. S5 is the least severe, and maybe we don’t need to worry so much about it. How can we tell S1 from S5? Customer disruption, financial loss, and Mean Time to Recover (MTTR) are good criteria.
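To make the scale concrete, here is a minimal sketch of how the three criteria could be turned into a severity from S1 to S5. The `Incident` fields, the `classify_severity` function, and every threshold are hypothetical; they only illustrate the idea and would need tuning for a real product.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    title: str
    customers_affected: int     # proxy for customer disruption
    financial_loss_usd: float   # proxy for financial loss
    minutes_to_recover: float   # recovery time for this incident


def classify_severity(incident: Incident) -> int:
    """Return a severity from 1 (most severe) to 5 (least severe).

    All thresholds are assumptions made for this example.
    """
    score = 0
    if incident.customers_affected >= 1000:
        score += 2
    elif incident.customers_affected >= 100:
        score += 1

    if incident.financial_loss_usd >= 50_000:
        score += 2
    elif incident.financial_loss_usd >= 5_000:
        score += 1

    if incident.minutes_to_recover >= 240:
        score += 2
    elif incident.minutes_to_recover >= 60:
        score += 1

    # score is between 0 and 6; map it onto S5..S1 and cap at S1.
    return max(1, 5 - score)
```

A large outage that hits all three criteria lands at S1, while a cosmetic bug that hits none stays at S5.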

Once there is some number of incidents or critical bugs, let’s say 3–5, for instance, then we can have a meeting where we talk about such issues. Now imagine that this meeting does not need to be on a fixed schedule (push model). It could be pull-based (it happens on demand).
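The pull-based trigger can be sketched as a simple check over the incidents that have not been reviewed yet. The function name `review_is_due`, the default threshold of 3 (the low end of the 3–5 range above), and the choice to count only S1 and S2 as critical are assumptions for illustration.

```python
def review_is_due(
    unreviewed_severities: list[int],
    threshold: int = 3,
    max_severity: int = 2,
) -> bool:
    """Return True when enough critical, unreviewed incidents have piled up.

    Severities use the 1 (most severe) to 5 (least severe) scale, so an
    incident counts as critical when its severity is <= max_severity.
    """
    critical = [s for s in unreviewed_severities if s <= max_severity]
    return len(critical) >= threshold
```

With this shape, nobody books a recurring meeting; the review gets scheduled only when the check turns true.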

Then, someone presents what happened, usually in the form of a timeline of events. The most important part of this practice is twofold. First, we need to drive lessons learned. Second, we need to take action to improve things.

Let me say this very clearly to get it out of the way. Blameless incident reviews are RETROSPECTIVES. You cannot have effective retrospectives without good facilitation, and lessons learned need to be driven. What did we learn from incident X? BIR can go bad if we just create tickets and do not drive lessons learned. Plus, we need actions, and we need to revisit previous incidents to see what we did wrong and whether we made things better or not. Otherwise, you just have a meeting to present tickets.

Done right, BIR can be amazing. It helps the company learn and really improves the product and the process. This is one of the best practices that the DevOps movement gifted us. Now think about this: if BIR is great for us to learn from bugs and incidents and make sure they do not happen again, what about features? What do we do with features?

What about Features?

Features are massive sources of investment in software. People are expensive, and engineers are expensive. Product cares a lot about features. Now, how do we know we are being effective with features? There are several things we can do in modern product development:

Proper Discovery: via a discovery process, personas, user interviews, user research, prototypes, and rapid experimentation.
Metrics: what we call observability in engineering. How much growth are we having, and what percentage of users are we retaining? Net Promoter Score (NPS) and other proxy metrics. Counters are a great metric: how many times does the user click on a screen or button?
Feedback: for mobile apps, app store reviews, and review websites like Yelp, Google, and others. Direct customer feedback via support and many other sources.

Why does this matter? Because it costs a lot of money to make software. So, we need to know if we are going in the right direction or not. When we produce software, there is a chance it works, and there is a chance it does not work and turns out to be a bug. Bugs can cause incidents, and incidents can happen because of bugs or by some other mistake.

Incidents and bugs affect the user experience and need to be minimized. That’s why we use blameless incident reviews to learn and improve. Now, we can do a lot of things for features, from discovery to metrics and feedback, but is that enough?

Blameless Feature Review (BFR)

Think about this idea. What if, on a regular cadence, once per month or after 10 features are delivered, we all sit together, in person or virtually, and review how the features are doing? By definition, product management is or is supposed to be doing that. However, are engineers involved?

IMHO, they should be because some features are also sources of technical debt. Being able to delete features means deleting technical debt. Software needs a counterforce, where we reduce complexity. Adding features all the time means adding complexity all the time. The easiest way to reduce technical debt is by decommissioning software, but you cannot decommission software that is being used.

But what if software is not being used? What if the software does not drive the results we want? Would that be a low-hanging fruit for future cleanup? Plus, why did we build something that the users did not want? Should the builders know about it, or should only the product know? I would argue that the product is everybody’s responsibility. Imagine someone presents feature X and says:

This is feature X
It cost $$$
It took X months to be done
It has Y bugs associated with it
It has Z many page views per day
It requires services A, B, and C
Customers are saying XYZ on the Apple App Store, and XYZ on the Google Play store
We applied this discovery process: bla bla bla

Product view and engineering view all together. Imagine if there are two buckets: the top 5 best features and the top 5 least used features. It would be great to compare them and see what we can learn. Building better products is a cross-functional sport and requires engineering to know what works and what does not work for the users. So we need a review process; why not start doing Blameless Feature Reviews?
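As a rough sketch, the feature sheet and the two buckets could be captured like this. The `FeatureReview` fields mirror the items listed earlier, but the field names, the `usage_buckets` helper, and the decision to rank "best" by page views per day are all assumptions made for this example; a real review might rank by revenue, retention, or NPS instead.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureReview:
    name: str
    cost_usd: float
    months_to_build: float
    bug_count: int
    page_views_per_day: int
    required_services: list[str] = field(default_factory=list)
    store_feedback: list[str] = field(default_factory=list)
    discovery_notes: str = ""


def usage_buckets(features: list[FeatureReview], size: int = 5):
    """Return (most_used, least_used), ranked by page views per day.

    If there are fewer than 2 * size features, the two buckets overlap.
    """
    ranked = sorted(features, key=lambda f: f.page_views_per_day, reverse=True)
    most_used = ranked[:size]
    least_used = ranked[-size:][::-1]  # least used feature first
    return most_used, least_used
```

Putting the two buckets side by side, with cost, bugs, and required services next to usage, gives product and engineering the same picture to discuss.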

Originally published at https://diego-pacheco.blogspot.com on March 30, 2024.
