The Story Point Problem
https://www.dhirubhai.net/pulse/understanding-agile-story-points-suren-gaur/

When we hear of the difficulties of making decisions in the presence of uncertainty, especially about software features and capabilities, there are straightforward ways to solve this problem.

The numbers we use on projects come in two types:

  • Ordinal measures tell us the relative position of one item compared with another.
  • Cardinal measures are numbers that say how many of something there are, such as one, two, three, four, or five.

Story Points are Ordinal numbers; Stories are Cardinal numbers. Story Points are relative measures of effort. They are not measures of duration or cost [1].

Story Points are arbitrary measures used by Scrum teams to determine the Relative (Ordinal) effort of the work. They tell the team how hard a story is, based on its perceived complexity, risk, and unknowns, each of which is related to effort. These Relative (Ordinal) measures are the antithesis of Business Management measures of work planning and accomplishment. Those business measures are in Hours and the rated Dollars for the direct labor needed to produce the outcome (assuming no material cost).

Story Points don't tell us the duration or cost of this relative effort. Nor do they tell us the absolute effort needed to perform the work. They aren't normalized across work efforts, across teams, or across the program. Story Point effort estimates are not calibrated across the project but rather are developed for the work at hand. The units of measure for Story Points can and will change as the program progresses.

Business operates in units of dollars and duration for the work needed to produce the needed capabilities in exchange for those dollars and that time. Business does not operate in units of Story Points.

The killer question is: what is a Story Point worth to those paying for the work? Agile teams rarely produce comparable, calibrated Story Points for dissimilar or similar work. This is a key difference between Business estimates and Agile estimating. Most businesses have an external Basis of Estimate process to calibrate the cost and duration of planned work. Teams working on different parts of the project, with different assessments of effort, Story Point values, and project costs, end up with dissimilar units of measure for a Story Point.

When Agile teams have different approaches to applying Story Points, the physical effort can still be calculated for each team and rolled up to the Total Story Point count for the project to give the Physical Percent Complete of an individual Feature.

The program-level budget can flow down to the planned Work in the Product Backlog and be connected with the Total Story Point count built bottom-up from the Agile Planning process.

From there, all Physical Percent Complete calculations remain the same, in units of Story Points and Dollars.
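
One way to read this roll-up is sketched below. It is an illustration, not the author's worked method: it assumes each team reports Physical Percent Complete in its own Story Point scale and that the flowed-down budget is the common unit for combining teams. The names `TeamStatus`, `percent_complete`, and `program_percent_complete`, and all figures, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamStatus:
    name: str
    budget_dollars: float    # budget flowed down to this team's backlog (assumed)
    planned_points: float    # total Story Points, in the team's own scale
    completed_points: float  # Story Points for work that is physically done


def percent_complete(team: TeamStatus) -> float:
    """Physical Percent Complete measured in the team's own Story Point scale."""
    return team.completed_points / team.planned_points


def program_percent_complete(teams: list[TeamStatus]) -> float:
    """Budget-weighted roll-up. Story Points are never added across teams."""
    total_budget = sum(t.budget_dollars for t in teams)
    earned = sum(t.budget_dollars * percent_complete(t) for t in teams)
    return earned / total_budget


teams = [
    TeamStatus("Team A", budget_dollars=200_000, planned_points=400, completed_points=100),
    TeamStatus("Team B", budget_dollars=100_000, planned_points=50, completed_points=25),
]
print(f"{program_percent_complete(teams):.1%}")  # 33.3%
```

Team A's 400 points and Team B's 50 points are never summed directly, since the two teams' points are not calibrated to each other. Only the dollars are.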

With the proper application of Story Points at the agile estimating level, the Business can produce a Cardinal estimate of the cost of the work with some simple rules:

  • If a single team produces the Story Point estimates, that team remains intact during the Sprints, its cost remains constant during the performance period, and it has a known capacity for work measured in Story Points from past performance, then the Story Points can be converted to Dollars for the Business.
  • If any of these conditions is false, the Story Point estimate (relative effort) is no longer valid for the business.
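
The conversion described by these rules can be sketched as follows. This is a minimal illustration that assumes all of the conditions hold: one intact team, a fixed cost per Sprint, and a capacity taken as the mean of past Sprint velocities. The function names and the dollar and velocity figures are hypothetical.

```python
def dollars_per_story_point(sprint_cost: float, past_velocities: list[float]) -> float:
    """Cost of one Story Point for a stable team, derived from past performance."""
    if not past_velocities:
        raise ValueError("need at least one completed Sprint to calibrate")
    mean_velocity = sum(past_velocities) / len(past_velocities)
    return sprint_cost / mean_velocity


def estimate_cost(story_points: float, sprint_cost: float,
                  past_velocities: list[float]) -> float:
    """Convert a Story Point estimate to dollars for this team only."""
    return story_points * dollars_per_story_point(sprint_cost, past_velocities)


# Hypothetical team: $60,000 per Sprint, three past Sprints of 28, 32, and 30 points.
print(dollars_per_story_point(60_000, [28, 32, 30]))  # 2000.0
print(estimate_cost(120, 60_000, [28, 32, 30]))       # 240000.0
```

The rate is valid only for the team that produced the velocities. If the team changes or its cost changes, the calibration has to be redone.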

Measuring the Project with Stories and Their Completion Rate

There is a conjecture that measuring Stories is better than measuring Story Points. Here's the simple answer to that conjecture:

This can only be valid if the Stories are statistically similar enough that their individual variances (the range of actual effort versus estimated effort) across the collection of Stories are de minimis. That is, the Stories are statistically identical.

If this is not the case, then using Stories rather than Story Points as the measure of effort, and converting those measures into units meaningful to the business, is a fool's errand. It violates the principles of statistical process control: since the unit of measure of plan and progress itself has unaccounted-for statistical variance, the data received is bogus.
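
A quick check of this condition can be sketched as below. It assumes actual effort per completed Story is available in hours and uses the coefficient of variation (standard deviation divided by mean) as the similarity measure. The 10% threshold is a hypothetical stand-in for "de minimis", not a value from the text.

```python
from statistics import mean, stdev


def coefficient_of_variation(actual_hours: list[float]) -> float:
    """Sample standard deviation of Story effort relative to its mean."""
    return stdev(actual_hours) / mean(actual_hours)


def stories_are_countable(actual_hours: list[float], threshold: float = 0.10) -> bool:
    """True only if Stories are similar enough to be counted as equal units.

    The threshold is an assumed cutoff chosen for illustration.
    """
    return coefficient_of_variation(actual_hours) <= threshold


similar = [15, 16, 17, 16, 16]  # hypothetical actual hours per Story
mixed = [4, 40, 12, 80, 8]

print(stories_are_countable(similar))  # True
print(stories_are_countable(mixed))    # False
```

For the second set, a count of completed Stories says almost nothing about the effort consumed, which is the situation the paragraph above warns against.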

If you don't have statistically identical relative efforts for all Stories, never use Stories. Only use Story Points.

A second caution is the false assumption that the future is always like the past. I got a book for Christmas, Fool Proof by Greg Ip, about our false belief that the future will be like the past. On any non-trivial project this is never the case, so assuming that all Stories are the same size turns us into the fool in Fool Proof.

Decision Analysis - Ordinal and Cardinal Measures

Decision Analysis is a principle, technique, and application for addressing complex decisions in a structured manner. One approach uses a form of multi-criteria decision analysis to evaluate multiple conflicting criteria. This method is the Analytic Hierarchy Process (AHP). AHP was developed in the 1970s by Dr. Thomas Saaty. It has been studied, refined, and applied for over 40 years as an effective approach to complex group decision-making. [7]

In our agile software development world, AHP is rarely found. I came to AHP through a seminar by Dr. James T. Brown and his book The Handbook of Program Management: How to Facilitate Project Success with Optimal Program Management.

AHP structures decision problems as a hierarchy, shown below. At the top is the decision objective. The next level consists of the decision criteria, followed by any number of options or alternatives. AHP can be used as a Value Management System to organize the criteria and assess the trade-offs between costs and benefits when considering total value.

[Figure: AHP hierarchy of decision objective, criteria, and alternatives]

AHP is based on the principle that all measurements are relative. People are generally very good at comparing things relative to other things. AHP provides a framework for making relative comparisons using a rational decision structure based on scaled pairwise comparisons (Borda Ranking), using a scale that converts stakeholder preferences and priorities into ratio measures. Using this method, the performance, cost, time, and risks of alternatives can be articulated as ratios that can then be compared with one another. This decision model for software development projects addresses performance, cost, time, and risk.
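
The step from pairwise comparisons to ratio-scale weights can be sketched as follows. Saaty's method derives the weights from the principal eigenvector of the comparison matrix. This sketch uses the row geometric mean instead, a common approximation that gives the same answer when the judgments are perfectly consistent. The comparison values and the function name `ahp_priorities` are hypothetical.

```python
from math import prod


def ahp_priorities(pairwise: list[list[float]]) -> list[float]:
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how many times more important criterion i is
    than criterion j, so pairwise[j][i] should equal 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


criteria = ["performance", "cost", "time", "risk"]
# Hypothetical stakeholder judgments (perfectly consistent for clarity).
pairwise = [
    [1,     2,     4, 4],
    [1 / 2, 1,     2, 2],
    [1 / 4, 1 / 2, 1, 1],
    [1 / 4, 1 / 2, 1, 1],
]

weights = ahp_priorities(pairwise)
print({c: round(w, 3) for c, w in zip(criteria, weights)})
# {'performance': 0.5, 'cost': 0.25, 'time': 0.125, 'risk': 0.125}
```

Real judgments are rarely this consistent, so a full AHP application also computes a consistency ratio before trusting the weights. That step is omitted here.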

Mathematically, the value equals performance over cost plus time. The performance, cost, and time risks must also be considered.

The relative measures of these values are Ordinal numbers. This is one role for Story Points in agile development. The Technology Readiness Level (TRL) is a relative measure of technical maturity used in our space and defense domain.

This is an Ordinal measure used to make decisions in the same way Story Points can be used, if properly formed and properly used.

Technology Readiness Levels (TRL) are a type of measurement system used to assess the maturity level of a particular technology. Each technology project is evaluated against the parameters for each technology level and is then assigned a TRL rating based on the project's progress. There are nine technology readiness levels. TRL 1 is the lowest, and TRL 9 is the highest.

When technology is at TRL 1, scientific research is beginning, and those results are being translated into future research and development. TRL 2 occurs once the basic principles have been studied and practical applications can be applied to those initial findings. TRL 2 technology is very speculative, as there is little to no experimental proof of concept for the technology.

When active research and design begin, technology is elevated to TRL 3. Analytical and laboratory studies are required at this level to see if a technology is viable and ready for development. Often during TRL 3, a proof-of-concept model is constructed.

Once the proof-of-concept technology is ready, the technology advances to TRL 4. During TRL 4, multiple component pieces are tested with one another. TRL 5 is a continuation of TRL 4. However, technology at TRL 5 is identified as a breadboard technology and must undergo more rigorous testing than technology at TRL 4. Simulations should be run in environments that are as close to realistic as possible. Once the testing of TRL 5 is complete, technology may advance to TRL 6. A TRL 6 technology has a fully functional prototype or representational model.

TRL 7 technology requires the working model or prototype to be demonstrated in a space environment. TRL 8 technology has been tested and "flight qualified" and is ready for implementation into an existing technology or technology system. Once a technology has been "flight-proven" during a successful mission, it can be called TRL 9.

Risk assessment can be made with Ordinal values as well. Each risk factor's severity can be rated on an Ordinal scale. The ordinal risk values are then combined with additive weighting or multiplication to compute an aggregate measure of overall risk. [1]
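
The two aggregation styles mentioned here can be sketched as below. The risk factors, the 1 to 5 scale, and the weights are all hypothetical, and the function names are illustrative only.

```python
def additive_risk_score(ratings: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted average of ordinal severity ratings (additive weighting)."""
    total_weight = sum(weights[factor] for factor in ratings)
    return sum(weights[factor] * rating for factor, rating in ratings.items()) / total_weight


def multiplicative_risk_score(likelihood: int, impact: int) -> int:
    """Likelihood-times-impact score, as used in a typical risk matrix."""
    return likelihood * impact


# Hypothetical ratings on a 1 (low) to 5 (high) ordinal scale.
ratings = {"technical": 4, "schedule": 2, "cost": 3}
weights = {"technical": 0.5, "schedule": 0.3, "cost": 0.2}

print(round(additive_risk_score(ratings, weights), 2))  # 3.2
print(multiplicative_risk_score(likelihood=3, impact=4))  # 12
```

Both results are still built from ordinal inputs, so a score of 12 is not "twice as risky" as a score of 6. The arithmetic ranks risks but does not turn them into Cardinal quantities, which is the concern raised in [1].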

Reminder of Core Issue with Ordinal Measures

First, some definitions:

  • A Cardinal number says how many of something there are
  • An Ordinal number tells the position of something relative to something else

Measurement theory is a well-debated topic, and the debate was going on long before agile software developers started using Story Points [2], [4]. The core issue is that Ordinal measures are subjective and provide little predictive value. This issue starts with the process of assigning numbers to objects so that the unique relationships of the objects are reflected in the unique numbers themselves.

One side of the conversation (at least for Story Points) maintains that the scales on which the measurements rest are representative and unique, and that the nature of the scale determines the statistics that should be used. The opposing viewpoint is that within the measures there is, at best, only a loose relationship between representativeness and uniqueness. Therefore, no relationship exists between the scales and the statistics that should be used [3].

Business Operates with Cardinal Numbers

The business makes decisions with Cardinal numbers: dollars, hours, weight, throughput, and all the other ...ilities.

The Wrap Up

Using Ordinal numbers in TRL, AHP, and similar decision-making processes is useful for making relative trade-offs.

Monetizing those choices can be part of that decision-making process. Or it can be applied after the relative value measures, for example, have been made.

But in the end, a Cardinal value is needed for the business to make a decision if that decision impacts the balance sheet.

And of course

No Credible Decision Can Be Made in the Presence of Uncertainty without making Estimates of the Impact of those Decisions. To suggest otherwise willfully ignores the principles of Managerial Finance, Microeconomics of Software Development, and Probabilistic Decision Making.

A Final Reminder

Five Immutable Principles of Project Success no Matter the Domain, Context, Management Tools, or Processes

The basis of success for all projects, no matter the domain, project management process (Agile or Traditional), project management tools, or technology, starts with answering the questions posed by the Five Immutable Principles, published in Performance-Based Project Management, American Management Association, February 2014.

  1. What does Done look like in units of measure meaningful to the decision-makers, starting with Measures of Effectiveness (MOE), Measures of Performance (MOP), Technical Performance Measures (TPM), and Key Performance Parameters (KPP)?
  2. What is the Plan to reach Done, with the needed outcomes fulfilling these measures on the needed date, for the needed cost, to deliver the needed Capabilities to accomplish a mission or fulfill a strategy?
  3. What Resources (staff, facilities, funding, and regulatory compliance) will be needed to reach Done as needed?
  4. What are the Impediments to reaching Done at the needed time, for the needed cost, with the needed Capabilities?
  5. How will Progress to Plan be measured in units of measure meaningful to the decision-makers? The passage of time, consumption of money, and production of Stories and Story Points are not measures of progress to Plan. Delivery of Capabilities to accomplish the mission is.

References

[1] "Problem with Scoring Methods and Ordinal Scales in Risk Management," Douglas Hubbard and Dylan Evans, IBM Journal of Research and Development, May 2010.

[2] "Evaluating Risk: A revisit of the Scales, Measurement Theory, and Statistical Analysis Controversy," J. D. Solomon, Daniel Vallero, and Kathryn Benson, International Reliability and Maintenance Symposium, 2017.

[3] "Measurement Scales and Statistics: The Misconception Misconceived," J. T. Townsend and F. G. Ashby, Quantitative Methods in Psychology, 1984.

[4] "On the Theory of Scales of Measurement," S. S. Stevens, Science, Vol. 103, No. 2684, June 7, 1946, pp. 677-680.

[5] How to Measure Anything: Finding the Value of "Intangibles" in Business, Douglas Hubbard, John Wiley & Sons, 2014.

[6] Decision Analysis for the Professional, Fourth Edition, Peter McNamee and John Celona (download this book and read how to make decisions in the presence of uncertainties and stop listening to the nonsense of #Noestimates)

[7] "Application of the AHP in Project Management," Kamal M. Al-Subhi Al-Harbi, International Journal of Project Management, 19 (2001), pp. 19-27.

[8] "A Methodology for Project Selection Using Economic Analysis and the Analytical Hierarchy Process," Jeffrey L. Battin, Captain USAF, Air Force Institute of Technology, September 1992.

[9] Scrum + Engineering Practices: Experiences of Three Microsoft Teams, Laurie Williams, North Carolina University; Gabe Brown, Adam Meltzer, and Nachiappan Nagappan, Microsoft Corporation.

AHP is specifically used by a DoD-sponsored group working on digital engineering. It is similar to a deep learning network, with the significant difference that, unlike deep learning, the links are explicit and scrutable. Practical Software Measurement is attempting something that looks similar, but IMO lacks the explicit rigor of AHP to measure value. IMO, this looks too much like simple story point weighting and assumes much of the baggage. I've suggested using AHP more explicitly.

Alex Lyaschenko

PMO | Portfolio Planning & Delivery | PMP | P3O Practitioner | AgilePM Practitioner | Six Sigma

1 year ago

Great article, Glen. "Story Points are measures of relative effort." Are they relative because there are two types of uncertainty, What and Who? What: the required volume of work. Who: team participants often have different levels of skill, and the required effort (in hours) depends on who will perform the work. Duration = Volume / Productivity (volume units in hours).

Colin Hammond

Creator of AI Software Requirements Analysis Tools - automated estimation, QA and insight.

1 year ago

Glen Alleman, great article, and I learned about AHP, but I was rather disappointed not to see any acknowledgment of or guidance towards function points (FP) or COSMIC function points (CFP), both ISO standards that provide a reliable and consistent absolute measure of size. Knowing an estimate or measure in FP or CFP allows you to plan and control most dimensions of a project, namely scope, resources, schedule, and quality. Furthermore, applying these measures appropriately helps to reduce most risks too. The process of sizing from requirements or backlog is mostly automated. No-brainer? HTTPS://cosmic-sizing.org Https://ifpug.org Https://www.ScopeMaster.com

Keith Moody, CD2 MAMSc MDS PMP MSP

KCCKS Inc. Strategic Management, Program and Project Consultant

1 year ago

Great article, which deserves more than a cursory reading, which I will do later. I am a firm believer in measuring, modeling, and simulating, and in employing such techniques, including some mentioned herein. The problem I've faced is that decision-makers often don't have the patience to support the process, while doers don't have the time to learn and apply it. Without executive authority, it's very difficult to apply these useful techniques. If we are to apply them successfully, they need to be simplified to shield the impatient executives and time-pressured workers from the gory mathematical details. Btw, thanks for reminding me of the AHP, which I studied over 20 years ago.
