Creating and Using Quality Metrics in Software Delivery - Part III
In the previous articles, I wrote about some of the challenges associated with poorly chosen metrics and an over-reliance on outcome metrics, which tend to provide feedback far too late in the development cycle. Ideally, we want metrics that tell us what is happening well enough in advance of a release to give us the opportunity to take corrective action. The advantage is simple: if we can identify potential problems in CI or testing, we can hopefully avoid deploying defective software to our customers. Finding issues in the development stage not only allows us to fix individual problems but also gives us the chance to identify and address weaknesses in our architecture. These two benefits are reason enough to adopt a meaningful, if not comprehensive, set of input metrics.
Technical Debt
The primary reason we want to monitor our development processes is to discover activities that may impact the quality of our released product or service: high-risk commits, rising code or architectural complexity, or emerging hotspots. Further, we want to find these things as early as possible - preferably in real time. The above are all examples of accumulating technical debt. This is our real enemy. Long-term deferment of paying down technical debt is the scourge of modern software development. Dan Radigan of Atlassian called it a black hole.
Techopedia defines technical debt as "the implied cost of additional rework caused by choosing an easy solution now instead of using a better approach that would take longer." Technical debt is an actual cost, and deferring repayment accumulates interest. It is not some abstract concept which may or may not impact your product. It is real. In some ways, you can think of it like a bank loan: the longer it takes to repay, the more interest accrues. At some point, the interest payments, in terms of effort to repair, become so large that they endanger your business. Also, like a bank loan, no single individual in a company can take one out without coordinating with the board. Here, though, the similarity ends. Unlike a bank loan, with technical debt, almost anyone involved in the development process can, and usually does, assume debt on behalf of the company.
But not all technical debt is equal. Having technical debt is, by itself, not particularly problematic. Think about buying something on your credit card: if you pay it off in 30 days, all is good. What is dangerous, however, is to put off fixing something in your code. If you leave problematic or risky code in your codebase, there is a good chance you will build new code (a module or service, for example) that relies on the risky code. You will be building good code on top of bad. That may be a risk you are willing to take. But, over time, you might forget the risk is there, and debugging your code will become a nightmare. Historian Michael Howard wrote that "all we believe about the present depends on what we believe about the past." If we apply this idea to software and ignore the accumulated technical debt in our code base, we will probably believe our code is more stable and secure than it actually is. Better to find and address the risks early, before we forget them.
So, how does one go about choosing and tracking meaningful input metrics? The correct ones for your organisation will vary depending on your application, infrastructure and architecture. Luckily, we can track some fundamental metrics to help us, and if we act on them, we will pay down our most dangerous technical debt.
Monitoring the development processes
Input metrics can be used to find and prioritise work that reduces risks to the delivered product or service. They can also be used to identify weak areas - code that works fine for now but has a chance of becoming a severe problem later - and thus to pay down the above-mentioned technical debt. For the former, it should be evident that there are real benefits to knowing exactly where you stand on software quality and delivery before the end of a project or delivery to production. For the latter, knowing which refactoring to prioritise is the key.
There are myriad ways you might go about monitoring the quality of your code, but here are a few suggestions of things to track:
Trends over Time
Once you decide what metrics to watch, you must understand what is happening over time. It is important to remember that input metrics are not always good indicators of quality or risk at a single point in time; instead, the results should be analysed over a period to determine the trend.
A good example is code complexity. One can plot a complexity trend over time. The graph below clearly shows that this code's complexity began to grow rapidly in October. It is also growing non-linearly as the lines of new code increase. This often means that the code will become harder and harder to understand over time and could become high-risk technical debt. It should be watched carefully and identified as a good candidate for refactoring.
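To make this concrete, here is a minimal sketch of how you could sample such a trend yourself, assuming a local git repository. The branch-keyword count is only a crude stand-in for real cyclomatic complexity, and the file path is a placeholder; dedicated tools such as radon, lizard or Codescene measure this properly.

```python
# A minimal sketch: sample a complexity trend for one file across its
# git history. The keyword count below is a crude proxy for cyclomatic
# complexity, not a real measurement.
import re
import subprocess

BRANCH_KEYWORDS = re.compile(r"\b(if|elif|else|for|while|case|catch|except|and|or)\b")

def complexity_proxy(source: str) -> int:
    """Crude complexity proxy: 1 + number of branching keywords."""
    return 1 + len(BRANCH_KEYWORDS.findall(source))

def complexity_trend(path: str, repo: str = ".") -> list[tuple[str, int]]:
    """Return (date, complexity) samples for `path` across its history."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H %as", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    trend = []
    for line in reversed(log):  # oldest commit first
        sha, date = line.split()
        source = subprocess.run(
            ["git", "-C", repo, "show", f"{sha}:{path}"],
            capture_output=True, text=True, check=True,
        ).stdout
        trend.append((date, complexity_proxy(source)))
    return trend

if __name__ == "__main__":
    # "src/billing.py" is a placeholder path for illustration only.
    for date, score in complexity_trend("src/billing.py"):
        print(date, score)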
Active instead of passive code analysis
To find high-risk technical debt in your code, you can run an analysis of your codebase nightly or, even better, on every commit. As I mentioned earlier, a hotspot is complicated, high-risk code that your developers are actively working on. [1] Once identified, hotspots can be monitored and discussed regularly for inclusion in your prioritised backlog.
But code health can also be monitored in the CI build and test phase. The Codescene analysis in figure 2 illustrates this. The file already has a McCabe complexity of 68 and a high degree of code duplication, meaning that many functions are similar and could probably be expressed using shared abstractions. If a developer commits additional code to this file (or changes it), you would want to know that this is a potentially high-risk commit.
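If you do not have a commercial tool in place, you can approximate hotspot detection with a short script. The sketch below follows the hotspot idea from [1] - frequent change combined with complexity - but uses git change counts and raw file size as crude stand-ins; the thresholds and the six-month window are illustrative only.

```python
# A minimal sketch of flagging hotspot files: code that churns often AND
# is large. File size stands in for complexity here; real tools do better.
import subprocess
from collections import Counter

def change_frequencies(repo: str = ".", since: str = "6 months ago") -> Counter:
    """Count how often each file appeared in recent commits."""
    names = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", "--format=", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return Counter(names)

def hotspots(repo: str = ".", min_changes: int = 20, min_lines: int = 400):
    """Yield (path, changes, lines) for files that churn often and are large."""
    for path, changes in change_frequencies(repo).most_common():
        if changes < min_changes:
            break  # most_common() is sorted, so nothing below the bar remains
        try:
            with open(path, encoding="utf-8") as f:
                lines = sum(1 for _ in f)
        except (FileNotFoundError, UnicodeDecodeError):
            continue  # file was deleted, or is binary
        if lines >= min_lines:
            yield path, changes, lines

for path, changes, lines in hotspots():
    print(f"{path}: {changes} recent changes, {lines} lines")
```

A report like this is a conversation starter for backlog prioritisation, not a verdict: a file can churn heavily for perfectly healthy reasons.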
Code Health
Another useful input metric offered by Codescene measures the quality of your code and the risk implicit in the code itself. Though a subjective measure, code health, according to Adam Tornhill, the creator of Codescene, fills several vital gaps in code analysis.
Code analysis tools have been around for a long time and should be part of any developer's or team lead's toolbox. When using code analysis tools as the source for input metrics, try to follow two basic rules:
Toxicity
Toxicity is a measure of code that has poor internal quality and is hard to maintain or extend. In many ways, this input metric is not a single measure but a collection or index of several metrics. Nevertheless, many teams find it useful, so look it over and see if it can help your product.
Toxicity in code can be indexed based on aggregated measures of common problems.[2]
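To illustrate the index idea, here is a minimal sketch in the spirit of [2]: each metric scores points in proportion to how far it exceeds an agreed threshold, and a file's toxicity is the sum of those scores. The metric names and thresholds below are illustrative, not canonical.

```python
# A minimal sketch of a toxicity index: sum the threshold exceedances
# across several quality metrics. Thresholds here are illustrative.
THRESHOLDS = {
    "file_length": 500,           # lines
    "method_length": 30,          # lines, worst method in the file
    "cyclomatic_complexity": 10,  # worst method in the file
    "parameter_count": 6,         # worst method in the file
}

def toxicity(measurements: dict[str, float]) -> float:
    """Score points for every metric that exceeds its threshold."""
    score = 0.0
    for metric, value in measurements.items():
        threshold = THRESHOLDS[metric]
        if value > threshold:
            score += value / threshold
    return round(score, 2)

# Example: a long file with one oversized, branchy method.
print(toxicity({"file_length": 1200, "method_length": 85,
                "cyclomatic_complexity": 24, "parameter_count": 3}))
# -> 7.63 (2.4 + 2.83 + 2.4; parameter_count is under its threshold)
```

The value of the index is in the trend and in comparing files against each other, not in the absolute number.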
Input Metrics from a Value Stream Mapping
I want to briefly describe one final set of input metrics, which comes from the Lean concept of value streams. The non-value-added, or waste, time can be measured in real time to give you a point-in-time view of development progress. Each gap (wasted time or hand-offs) in the value stream map can be used as a metric to gauge efficiency.
The gaps in the lower segment of the VSM represent wait states and often indicate extended handover times. These times impact your overall delivery times and must be reduced or eliminated.
You can also calculate and monitor summary metrics based on a Value Stream Map, including Total Process Time, Activity Ratio, and Total Lead Time. If you are unfamiliar with Value Stream Mapping or want to learn how to apply it to your development processes, I recommend reading Karen Martin and Mike Osterling's Value Stream Mapping: How to Visualize Work and Align Leadership for Organizational Transformation (2014).
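The arithmetic behind these summary metrics is simple: Total Process Time is the sum of hands-on work across all steps, Total Lead Time adds the waiting time between steps, and the Activity Ratio is the quotient of the two. Here is a minimal sketch, with step names and times invented purely for illustration:

```python
# A minimal sketch of VSM summary metrics. Each step records hands-on
# process time and the wait time before it (both in hours here).
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    process_time: float  # value-adding work, hours
    wait_time: float     # idle time / hand-off delay before the step, hours

def summarise(steps: list[Step]) -> dict[str, float]:
    total_process = sum(s.process_time for s in steps)
    total_lead = total_process + sum(s.wait_time for s in steps)
    return {
        "total_process_time": total_process,
        "total_lead_time": total_lead,
        "activity_ratio": round(total_process / total_lead, 2),
    }

stream = [
    Step("refine story", 2, 8),
    Step("implement", 16, 4),
    Step("code review", 1, 24),
    Step("test", 4, 16),
    Step("deploy", 1, 40),
]
print(summarise(stream))
# {'total_process_time': 24, 'total_lead_time': 116, 'activity_ratio': 0.21}
```

An activity ratio around 20 percent, as in this invented example, would mean four-fifths of your lead time is waiting - exactly the waste the gaps in the map make visible.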
Continuous Monitoring
We have become accustomed to DevOps and Operations teams using real-time monitoring to assess the health of a production application or platform. We should apply the same paradigm to the development process by integrating and monitoring specific input metrics directly in our CI and build systems for indications of impending trouble. Indeed, all the tools I cited above can and should be integrated into the build process. If a developer or test specialist gets real-time feedback on a commit or test - for example, that it touches an already high-risk file - they can react early enough in the development cycle to minimise disruption and cost. When used effectively, these input metrics will help you identify potential risks in your product code and also help you prioritise and pay down technical debt.
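As a concrete illustration, here is a minimal sketch of such a CI gate. It assumes the team maintains a hotspots.txt file (for example, exported from its analysis tool) and that builds run against a main branch; the file name, base branch and exit-code policy are all assumptions you would adapt to your own pipeline.

```python
# A minimal sketch of a CI gate that warns when a change touches a file
# on the team's hotspot list. Run it as a step in the build pipeline.
import subprocess
import sys

def changed_files(base: str = "origin/main") -> set[str]:
    """Files touched by the commits being built, relative to the base branch."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(out.split())

def main() -> int:
    with open("hotspots.txt", encoding="utf-8") as f:
        hotspots = {line.strip() for line in f if line.strip()}
    risky = changed_files() & hotspots
    if risky:
        print("WARNING: this change touches known hotspots:")
        for path in sorted(risky):
            print(f"  {path}")
        return 1  # fail the build, or downgrade to a warning, per team policy
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Whether such a gate fails the build or merely annotates it is a team decision; the point is that the feedback arrives at commit time, not at release time.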
A final word on metrics: they cannot be the goal. Good metrics only provide hints about where problems may exist and help guide us to see risks. More important is to evaluate the processes and activities behind the metrics and, ultimately, to engage with the people working in your team: product owners, developers, testers, release managers, DevOps, and management. No tool or metric is going to solve process or behavioural problems, but they can, in most cases, illuminate a problem so that you can take action.
Note: As with any outcome-based metric, input metrics (or, more accurately, input measurements) are most useful when applied to teams, services or programs. They should never be used to compare individuals. [4]
[1] https://codescene.io/docs/guides/technical/hotspots.html
[2] https://erik.doernenburg.com/2008/11/how-toxic-is-your-code/
[3] Ibid.
[4] John Seddon, an occupational psychologist, researcher and professor.