Piled Higher and Deeper? ... Increasing Operational Faults and Technical Debt

Let’s take a short pause to think about the amount of operational and technical problems your team (and your enterprise) is accumulating. I’m going to posit that you’re generating “problems” at a much greater rate than you are solving them.

This should be alarming, but these problems (operational faults and technical debt) build up gradually over time. And unless we are looking at the broader context of our production baseline, we’ll probably even be comfortable with the mounting unknown risk within our production operations.

Let’s use an example to show what’s going on in the production baseline of most software enterprises.

1. A production model

1.1 Each team

Say that each team works with two-week sprints and produces one feature per sprint. Also, let’s say the average feature has ten stories and the average story has ten test points. So, the annual output of the team looks like this.

  • 26 features per year.
  • 260 stories per year.
  • 2,600 test points per year.

Further, let’s say that features from this team will be combined with features from other teams to form capabilities. (Note: please adjust this model if your experience is different.)

1.2 An enterprise

Now, let’s say we’re working in an enterprise with 100 development teams. So, now the combined output of the enterprise is as follows.

  • 2,600 features per year
  • 26,000 stories per year
  • 260,000 test points per year

All this effort goes to produce 260 end-to-end capabilities per year (assuming that each capability is composed of ten features on average). Impressive.
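If you’d like to tune the production model to your own shop, here is a minimal sketch of the arithmetic above. The numbers are the ones used in this article; the constant and function names are just illustrative assumptions.

```python
# Values come from the production model above; the names are
# illustrative assumptions, not part of any real tool.
SPRINTS_PER_YEAR = 26          # two-week sprints
FEATURES_PER_SPRINT = 1
STORIES_PER_FEATURE = 10
TEST_POINTS_PER_STORY = 10
FEATURES_PER_CAPABILITY = 10


def annual_output(teams: int) -> dict[str, int]:
    """Return yearly features, stories, and test points for `teams` teams."""
    features = teams * SPRINTS_PER_YEAR * FEATURES_PER_SPRINT
    stories = features * STORIES_PER_FEATURE
    test_points = stories * TEST_POINTS_PER_STORY
    return {"features": features, "stories": stories, "test_points": test_points}


def capabilities_per_year(features: int) -> int:
    """Whole capabilities that can be assembled from `features` features."""
    return features // FEATURES_PER_CAPABILITY


team = annual_output(teams=1)
enterprise = annual_output(teams=100)
print(team)        # {'features': 26, 'stories': 260, 'test_points': 2600}
print(enterprise)  # {'features': 2600, 'stories': 26000, 'test_points': 260000}
print(capabilities_per_year(enterprise["features"]))  # 260
```

Change the constants at the top to match your own sprint length, team count, and feature size, and the rest follows.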

2. A defect model

2.1 Defect definition

We’ll use a definition of defect severity that’s pretty standard across the software industry. It has five levels of defects, each with varying severity of operational and developmental impact.

  • Sev-1 - Critical operational impact and no operational workaround.
  • Sev-2 - Major operational impact with manual workaround.
  • Sev-3 - Minor operational impact with manual workaround.
  • Sev-4 - Technical debt/issue needs to be resolved at some point.
  • Sev-5 - Trivial technical nonconformance.

2.2 Defect generation per capability

Working with the severity levels above, we’ll propose a conservative model for the number of defects generated per capability during development. (See if you agree; if not, tune it up or down.)

  • 1 sev-1 defect is generated, but it is usually caught during testing before release.
  • 2 sev-2 defects are generated, but these are also usually caught and corrected before release.
  • 4 sev-3 defects are generated and may be detected and mitigated before release.
  • 16 sev-4 defects -- Technical debt/issues.
  • 32 sev-5 defects -- Technical nonconformance issues.

2.3 Defect generation per year

Now, multiplying the defect numbers per capability (above) by the number of enterprise capabilities (260) gives the following model for problems/defects generated in development for the enterprise each year.

  • 260 sev-1 operational defects per year.
  • 520 sev-2 operational defects per year.
  • 1,040 sev-3 operational defects per year.
  • 4,160 sev-4 technical debt/issues per year.
  • 8,320 sev-5 technical nonconformance issues per year.
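The yearly numbers above are a straight multiplication, and a short sketch makes it easy to plug in your own per-capability estimates. The dictionary layout and function name here are assumptions made for illustration; the values are the ones from this model.

```python
# Defects generated per capability, keyed by severity level (from section 2.2).
DEFECTS_PER_CAPABILITY = {1: 1, 2: 2, 3: 4, 4: 16, 5: 32}
CAPABILITIES_PER_YEAR = 260


def defects_per_year(
    capabilities: int = CAPABILITIES_PER_YEAR,
    per_capability: dict[int, int] = DEFECTS_PER_CAPABILITY,
) -> dict[int, int]:
    """Scale the per-capability defect counts up to a full year."""
    return {sev: count * capabilities for sev, count in per_capability.items()}


yearly = defects_per_year()
print(yearly)                # {1: 260, 2: 520, 3: 1040, 4: 4160, 5: 8320}
print(sum(yearly.values()))  # 14300 problems generated per year, all levels
```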

3. Compounding problems

The magnitude of the defects described above is just the first-order model of complexity within our system of capabilities. As multiple defects accumulate at one level, they can cause higher-level problems at other defect levels within the system. We’ll model this second-order behavior as follows.

  • Every 10 sev-2 defects generate a new sev-1 defect.
  • Every 10 sev-3 defects generate a new sev-2 defect.
  • Every 15 sev-4 defects generate a new sev-3 defect.
  • Every 20 sev-5 defects generate a new sev-4 defect.

And interestingly, defects introduced by the combination of lower-level effects within a system can be some of the most difficult to test, detect, isolate, and repair.
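Here is one way to sketch the second-order rules in code. The article doesn’t say whether newly induced defects should themselves count toward the next level up, so the `cascade` flag below is an assumption that lets you try both readings. The sketch also assumes the lower-level defects stay in the baseline after they induce a higher-level one.

```python
# First-order defects generated per year (from section 2.3).
FIRST_ORDER = {1: 260, 2: 520, 3: 1040, 4: 4160, 5: 8320}

# How many defects at a level produce one new defect at the next,
# more severe level (from the compounding rules above).
COMPOUNDING = {5: 20, 4: 15, 3: 10, 2: 10}


def compound(first_order: dict[int, int], cascade: bool = False) -> dict[int, int]:
    """Add second-order (induced) defects to the first-order counts.

    cascade=False: only first-order defects induce higher-level ones.
    cascade=True:  induced defects also count toward the next level up
                   (an assumption; the article leaves this open).
    """
    totals = dict(first_order)
    for sev in (5, 4, 3, 2):  # work from least to most severe
        source = totals[sev] if cascade else first_order[sev]
        totals[sev - 1] += source // COMPOUNDING[sev]
    return totals


print(compound(FIRST_ORDER))
# {1: 312, 2: 624, 3: 1317, 4: 4576, 5: 8320}
print(compound(FIRST_ORDER, cascade=True))
# {1: 325, 2: 654, 3: 1345, 4: 4576, 5: 8320}
```

Under either reading, the compounding rules add dozens of sev-1 defects per year that nobody wrote directly.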

4. What gets fixed… and what doesn’t

Clearly, based on the definitions above, an enterprise doesn’t release software with known sev-1 or sev-2 defects, and only rarely (and temporarily) releases with sev-3 defects. The expectation is that we’ll catch these faults in the test process and correct them before release.

But some sev-1/2/3’s and most sev-4/5’s are never exercised by testing and are therefore not corrected or mitigated during development. They will all move into the production baseline. Of course, the more critical faults will be detected during operations and corrected. Still others will remain dormant, waiting to be combined with new capabilities (and problems) in the next release.

5. What’s happening over time

All this brings us to the discussion of what’s happening to the enterprise production baseline over time. The figure above shows this effect over a five-year period.

The graph depicts the number of problems generated as compared to the number that are corrected or mitigated. Note the order-of-magnitude difference between these two curves, and that the gap between them is widening (not closing) over time.
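To see why the gap widens, here is a rough sketch of the accumulation. The generation rate is the first-order yearly total from section 2.3 (14,300). The fix rate is a purely hypothetical placeholder of one tenth of that, chosen only to echo the order-of-magnitude difference described above; it is not a number taken from the graph, so substitute your own.

```python
GENERATED_PER_YEAR = 14_300  # sum of first-order defects per year (section 2.3)
FIXED_PER_YEAR = 1_430       # hypothetical placeholder: one tenth of generated


def backlog_over_time(generated_per_year: int, fixed_per_year: int, years: int) -> list[int]:
    """Cumulative unresolved problems at the end of each year.

    Assumes constant yearly rates and ignores compounding, so this is
    a lower bound on the trend rather than a forecast.
    """
    backlog = 0
    history = []
    for _ in range(years):
        # The backlog can shrink but never drop below zero.
        backlog = max(backlog + generated_per_year - fixed_per_year, 0)
        history.append(backlog)
    return history


print(backlog_over_time(GENERATED_PER_YEAR, FIXED_PER_YEAR, years=5))
# [12870, 25740, 38610, 51480, 64350]
```

Even with flat rates, the unresolved total grows every year, and adding the compounding rules from section 3 would only steepen the curve.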

6. Leadership Lessons

As a leader in this situation, your take-home points are these:

  • The time you spend each year fixing problems is steadily increasing, leaving less time to develop new capabilities.
  • And the new capabilities you are producing are built on a foundation with increasing technical debt.

It’s essential that you model your enterprise and understand your risk… and your risk tolerance. What are the trade-offs you can make to improve first-time quality and avoid costly operational failures?

Only you can determine if you have a steaming pile of non-conformance… or a pending enterprise baseline collapse.

--------------

Thanks for reading. If you found this helpful in any way, please ‘share’ with your network.
