Notes on technical debt analysis

This article focuses on just one element in the graphic above: technical debt (TD), which is usually addressed at the enterprise/portfolio level.

  • Paying off a TD = paying a premium for insurance against a technical risk.

Generally, you incur TD when you don't do something (A) that would make a possible future (Z) cheaper, safer or easier. But there are infinitely many possible futures, and Z may never happen. Updating software more often than you need to can be as costly as never updating it. So doing A now is a judgement call, and its benefit is often not convincingly quantifiable.

TD is badly named. TD is a subtype of technical risk. Like other risks, and unlike a debt, TD can go down as well as up. And sometimes, like an insurance premium you chose not to pay, TD can evaporate. If you forget about TD, or take the risk and survive, you may breathe a sigh of relief and feel thankful for the money you saved.

E.g. A few years ago we assigned a TD value to our failure to upgrade from Windows 7 to Windows 8. Later, it turned out that people didn't like Windows 8. And when Windows 10 came along, we decided to skip the 7 to 8 migration entirely. Lo! The debt shrank as Windows 10 got near, and evaporated when we migrated to it.

This Medium article presents a simple formula:

  • Technical Debt Ratio = (Remediation Cost / Development Cost) x 100%

The article says the two costs may be measured in hours or money, and that a ratio of more than (say) 5% to 10% is unacceptable. The formula may look simple, but its result can be far from accurate. Estimating is a weak science. Both numbers are guesses. To quantify remediation costs implies knowledge of a) future requirements and b) what will be needed to address them.
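To see how much the guesswork matters, here is a minimal Python sketch of the formula. The function names and all the figures are hypothetical, chosen only for illustration. It calculates the ratio for a point estimate, and then for a low-to-high range on each guess.

```python
def technical_debt_ratio(remediation_cost: float, development_cost: float) -> float:
    """Technical Debt Ratio = (Remediation Cost / Development Cost) x 100%."""
    if development_cost <= 0:
        raise ValueError("development_cost must be positive")
    return remediation_cost * 100.0 / development_cost


def ratio_range(remediation: tuple[float, float],
                development: tuple[float, float]) -> tuple[float, float]:
    """Best and worst case ratio, given (low, high) guesses for both costs."""
    rem_low, rem_high = remediation
    dev_low, dev_high = development
    best = technical_debt_ratio(rem_low, dev_high)    # cheap fix, costly system
    worst = technical_debt_ratio(rem_high, dev_low)   # costly fix, cheap system
    return best, worst


# Hypothetical figures, all in hours.
point = technical_debt_ratio(remediation_cost=120, development_cost=2000)
best, worst = ratio_range(remediation=(80, 200), development=(1800, 2400))
print(f"Point estimate: {point:.1f}%")
print(f"Plausible range: {best:.1f}% to {worst:.1f}%")
```

With these made-up inputs, the point estimate sits inside the 5% to 10% band, while the range stretches from below it to above it. So the same system could be judged acceptable or unacceptable depending on whose guesses you use.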

This article is about technical debt analysis in EA, rather than in software architecture. If the remediation costs are package upgrade costs, how does TD account for the savings made by skipping an upgrade?

Technical debt analysis in EA

TD management is a variety of risk management. To pay off some TD is to pay an insurance premium to mitigate a risk - that a current technology or system will fail, or will be impossible to amend, extend or integrate when needed - meaning that it will have to be recovered, replaced or refactored.

The risk may relate to a technology becoming out of support, or relate to older systems being less secure than newer ones. But what if the likelihood and impact of that unsupported technology failing remain low?

And what if those older systems are behind modern firewalls, and/or publishing the data they hold would do little or no harm? Where the likelihood and impact of a risk are low, then the urgency of and budget for work to mitigate that risk should be correspondingly low.

How to calculate technical debt?

TD is best expressed in business terms to get executive buy-in. E.g. by giving numbers to:

  • Ongoing extra costs of operating, supporting or changing an old technology/system, compared with what might replace it. (Bear in mind that, for example, where an old system ran happily on 1 data server, its replacement coded in Java needed a more complex client-server stack with a cluster of 4 app servers above the data server).
  • Ongoing difficulties created by this technology/system, say, in changing or enhancing a related technology/system. (Bear in mind that various loosely-coupled application integration design patterns might be used.)
  • Costs of failure = business lost + cost of recovery. Cost of business lost = lost income and/or profit. Cost of recovery = cost of repair or replacement and catch-up operations.
  • The likelihood of failure in the next time period(s).
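The numbers in the list above can be combined in a rough Python sketch. It assumes the usual risk-management convention of expected loss = likelihood x impact, which is an assumption added here, not something the list prescribes. The class name, field names and figures are hypothetical. The second bullet (ongoing difficulties) is left out because it rarely reduces to one number.

```python
from dataclasses import dataclass


@dataclass
class SystemRisk:
    name: str
    extra_running_cost: float   # ongoing extra cost per year vs. a replacement
    business_lost: float        # lost income and/or profit if the system fails
    recovery_cost: float        # repair or replacement plus catch-up operations
    failure_likelihood: float   # guessed probability of failure in the year (0 to 1)

    def cost_of_failure(self) -> float:
        # Costs of failure = business lost + cost of recovery
        return self.business_lost + self.recovery_cost

    def expected_failure_loss(self) -> float:
        # Assumed convention: likelihood x impact
        return self.failure_likelihood * self.cost_of_failure()

    def annual_exposure(self) -> float:
        return self.extra_running_cost + self.expected_failure_loss()


# Hypothetical figures in any one currency.
legacy = SystemRisk(
    name="legacy order system",
    extra_running_cost=40_000,
    business_lost=200_000,
    recovery_cost=50_000,
    failure_likelihood=0.05,
)
print(f"{legacy.name}: cost of failure {legacy.cost_of_failure():,.0f}")
print(f"{legacy.name}: annual exposure about {legacy.annual_exposure():,.0f}")
```

The annual exposure is the figure to set against the "premium", meaning the cost of paying off the TD. Every input is still a guess, so treat the output as a talking point rather than a measurement.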

A scientist may demand the numbers be expressed as ranges, with a spread of different likelihoods across the range. However, risk management often falls short of what is scientifically provable. Sometimes the cost is not a loss of sales; rather it is doing business less well, to some difficult-to-measure degree.
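One way to express the numbers as ranges is a small Monte Carlo simulation, sketched below in Python. The choice of distributions is an assumption made for illustration: a triangular distribution for the cost of failure and a uniform one for its likelihood. The function name and all figures are hypothetical.

```python
import random
import statistics


def simulate_expected_loss(
    cost_low: float,
    cost_likely: float,
    cost_high: float,
    likelihood_low: float,
    likelihood_high: float,
    trials: int = 10_000,
    seed: int = 42,
) -> list[float]:
    """Sample likelihood x cost of failure from the given ranges."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    samples = []
    for _ in range(trials):
        # random.triangular takes (low, high, mode)
        cost = rng.triangular(cost_low, cost_high, cost_likely)
        likelihood = rng.uniform(likelihood_low, likelihood_high)
        samples.append(likelihood * cost)
    return samples


# Hypothetical guesses: cost of failure between 100k and 900k (most likely 250k),
# likelihood of failure in the period between 2% and 10%.
samples = simulate_expected_loss(100_000, 250_000, 900_000, 0.02, 0.10)
cuts = statistics.quantiles(samples, n=20)  # 19 cut points: 5%, 10%, ..., 95%
print(f"5th percentile:  {cuts[0]:,.0f}")
print(f"Median:          {statistics.median(samples):,.0f}")
print(f"95th percentile: {cuts[-1]:,.0f}")
```

The output is a spread rather than a single figure. It is more honest about the uncertainty, though the ranges fed in are themselves guesses.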

It is always possible to make up numbers, but often impossible to convince a skeptical manager the numbers are accurate. E.g. What is the cost of your email system failing - for a day, for a month, forever?

The question is: where, when and how much should we spend to reduce TD? There is rarely a provably correct answer. Where likelihood and impact numbers are questionable guesses, you have to make a judgement call.

How to calculate the business case for paying off the debt?

Business case development is a black art in itself, not addressed here. The cost of paying off TD can be considerable. The benefit often comes down to reduced costs (see above) or greater security. Add to that increased business profit or volume, or more efficient business and ease of working, or better quality of service and customer relationship management.

How to minimize technical debt build up?

Use stable, proven design patterns and technologies. Generally, established technologies build up TD more slowly than newer ones. Given the ever-evolving IT domain (design patterns, languages, class libraries, middleware products, execution environments, web/app servers and data storage technologies), using the very latest design fashions, software frameworks and technologies is likely to build up TD faster.

How far to design for extendability?

The concept of TD may be applied to the difficulty and cost of reshaping a current system to meet new requirements. If there is not enough time and budget to amend or extend an existing system, it may become out of step with how you want to do business.

You can mitigate the risk either by paying more up front, to design a system that is more readily extendable later, or by setting aside some time and budget to refactor the system now and then.

Designing to anticipate future requirements adds more cost up front. And designing to anticipate one kind of requirement may inhibit other kinds. E.g.

  • Designing for flexibility increases complexity and often slows down the response or cycle time.
  • Designing for scalability often comes at the expense of data integrity.

And what if designed-for requirements never arise? You must weigh extra cost now against the cost of not being able to make future changes. Neither is likely to be easily measurable.

When to pay off technical debt?

TD repayment rarely amounts to a mandatory requirement. It is even possible that delaying the payment will reduce the amount. In practice, what is perceived to be a technical “debt” may shrink. E.g. an out-of-support OS may run fine for years. There may be no increase in operational cost or failure risk - for longer than you first think. Skipping an OS version or package release might prove cheaper than upgrading every time.

The result - big up front plans or agile pipeline?

The concept is badly named. You incur a "debt" of this kind the moment you do anything that makes some future action more difficult than it would have been had you done something else. And since there are infinitely many possible futures, there are infinitely many debt estimations you could make.

TD isn’t debt; it is risk. A better name might be technical insurance premium. As with all insurance, you should do some risk analysis and make a judgement before paying the premium. The likelihood of failure is uncertain; the need to amend or extend a system in future is uncertain. And bear in mind that some residual risk will remain after you pay off some TD.

Whatever TD calculations you do and judgments you make, the end result is decisions about which systems to spend money on, and when to do it. These decisions may be reflected in application/technology road maps and/or specific requests for architecture work (as indicated in the graphic above).

Agile technical risk management? If you can't commit to long-term plans and budgets, you can take a more agile approach: create a backlog of TD items, prioritize them and process them in Kanban style. Beware that time-boxing or cash-boxing what is done may prevent you from tackling the "big ticket" items.
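The backlog approach, and its "big ticket" pitfall, can be sketched in Python as follows. The ranking rule (expected loss avoided per unit of remediation cost) is an assumption chosen for illustration, and the item names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TDItem:
    name: str
    expected_loss_avoided: float  # guessed yearly benefit of paying this item off
    remediation_cost: float       # guessed cost of paying it off

    @property
    def benefit_per_cost(self) -> float:
        return self.expected_loss_avoided / self.remediation_cost


def prioritize(backlog: list[TDItem]) -> list[TDItem]:
    """Order the backlog by benefit per unit of cost, best first."""
    return sorted(backlog, key=lambda item: item.benefit_per_cost, reverse=True)


def fill_cash_box(backlog: list[TDItem], budget: float) -> list[TDItem]:
    """Take items in priority order while they still fit in the budget."""
    selected = []
    remaining = budget
    for item in prioritize(backlog):
        if item.remediation_cost <= remaining:
            selected.append(item)
            remaining -= item.remediation_cost
    return selected


backlog = [
    TDItem("Replace legacy billing platform", 400_000, 500_000),
    TDItem("Patch web server", 30_000, 10_000),
    TDItem("Upgrade database driver", 25_000, 20_000),
]
for item in fill_cash_box(backlog, budget=50_000):
    print(item.name)
```

With a cash box of 50,000 per cycle, the two small items are selected and the billing platform never fits, however many cycles you run. A big-ticket item like that needs its own decision outside the box.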

If you don't pay TD down, it can be catastrophic for your organisation. On the other hand, since every system becomes legacy the moment it goes live, if you pay it all down as soon as you detect it, you'll go bankrupt. As Daryl Carr said to me, TD is not intrinsically good or bad. It is only a concept that helps us think about how to optimize where and when we spend money to replace or refactor business systems.

Further reading

If you want more analysis of enterprise and solution architecture concepts and principles, read articles you can find here.

If you want training towards Professional Certificates in Enterprise and Solution Architecture, you can book on one of the ESA courses advertised here.

Great article Graham, thanks for sharing

Martin P.

Director Enterprise Architect at UCL | Making Business & Technology Solutions Work

5 years ago

I'd go as far as saying: "the very latest design fashions, frameworks and technologies DO build up technical debt faster.". It's pretty much a consequence of the hype cycle. What starts out today as the bright, unicorn like, future capability to cure all ills, becomes at best, a reliable old workhorse with a bit more pulling power. At worst it's the chronically ill nag that incurs enormous vet bills even to put it down let alone sustain it. Which begs the question "Why are businesses so keen to believe in unicorns?". Is it perhaps because so long as we can blame some technology fad or other we can avoid the hard questions that we should direct at ourselves and our processes?

Craig Imlach

Scrum Master - NAB

5 years ago

A good start but I believe it is missing a few items: 1) Cost of change - negative impact on other changes 2) Cost of support and maintenance 3) Cost of Operations

I see two factors that should also be considered. Technical debt is incurred as the result of a trade-off decision to do none, or only part, of a required investment to address a business need whilst achieving, or moving towards, a defined target state. As a result there will be an anticipated cost of remediation at some future point to address this, and in addition there may be residual needs that have not been met. The deferred cost of remediation towards target state can be estimated as an increase in technical debt, and secondly, there is the unrealised revenue/profit of the unmet needs.
