Tool: Spreadsheet for Project Duration Estimation (and Cake!)

Tool: Spreadsheet for Project Duration Estimation (and Cake!)

Last week, I presented a process for doing project prioritization. Part of the process requires the estimation of the duration of the project, as the labour is likely the main fixed cost of a software development project. However, getting an accurate estimate of a project’s duration can be useful for more than just prioritization, for example, planning and coordination with other teams. This week, I’ll share a method I’ve used to produce those predictions. From my experience, using this method reduces the number of late projects significantly. I attribute the success of the method to three factors: clarity, communication of uncertainty, and decomposition.

Let’s use this example throughout the article. Imagine that you want to bake a carrot cake. How long is it going to take before you can serve someone a slice of cake? Actually take 20 seconds to think about it and write down your prediction.

Got it?

I’ll play along with you. I’ve baked a few cakes in the past, so I estimate it’ll take about two and a half hours to make the cake. That same level of effort used to produce the estimate is often all people do. Let’s see how we can refine that.

Clarify

Funny question: what’s a carrot cake? The above estimate is completely meaningless to you unless you have the same understanding of what’s a carrot cake as I do. Is it one of Claire Saffitz layered brown butter carrot cake or the simpler square carrot cake and whipped cream from Chef Jean-Pierre? Or maybe you’re already laughing because you planned on baking a boxed carrot cake and buying the shelf stable cream cheese frosting they sell next to it. All three would require a significantly different time investment.


??Make sure people have a shared understanding of the expectations regarding a project. No serious estimates should be made before this is the case.


For software projects, I find that clarifying the deliverables is a big help in making sure everyone agrees on the objective. If the project is very complex, then the first phase should be to document the work to be done in greater detail. If nothing else, this will help evaluate whether the solution is even feasible.

For my carrot cake, I’ll go with Chef Jean-Pierre’s recipe. I never tried it and it seems like a much simpler version than Claire’s (which I did make in the past; it’s delicious!) Figuring that it is a simpler recipe, I’ll reduce my estimate to two hours.

Take the time to clarify what you had in mind when thinking about a carrot cake. Please write your answer in the comments; I need more carrot cake recipes! ??

Communication of Uncertainty

When I write “two hours”, it’s implied that I don’t actually expect to be able to bake a cake in exactly 7200 seconds. The problem is that this fuzziness is ambiguous unless efforts have been made to clearly define my uncertainty. This is even worse when multiple people contribute estimates with varying degrees of fuzziness. Should you communicate the median of your duration estimates? Or maybe share a surer bet, something closer to the 90th percentile to avoid the pain of dealing with a project that is running late? If you’re a leader organizing project planning, this is very important to communicate to the people doing the predictions.

However, successfully communicating uncertainty isn’t easy. There needs to be:

  • an agreement about how to express it,
  • people need to be comfortable with uncertainty,
  • but also feel safe to communicate their genuine predictions without fearing retribution.

As a side note, I firmly believe that managers have a duty to protect the estimates provided by the team they support, especially if they seem high. Ask why the estimates are high rather than ask them to revise them. Estimates are a communication tool too!

As I wrote a few weeks ago, a good way to communicate uncertainty is to give a lower and upper bound on the duration. For example, when using the 10th and 90th percentiles, denoted p10 and p90 respectively, we say that, 8 times out of 10, the duration should fall between the p10 and the p90 bounds. The wider the interval, the more uncertainty people feel there is in the duration.

Specifically for durations, I like to characterize the shape of the uncertainty using a Weibull distribution. It is defined only for positive values, is skewed towards shorter times, but allows for a longer right tail, which represents the chance of a project running longer than anticipated. From the p10 and p90 estimates, you can find the parameters of the distribution, which, in turn, allows you to derive all the other statistics.

Obviously, my cake won’t take exactly two hours. In fact I think it’ll take between one hour if everything goes well and five hours if I have a big issue. Maybe my two hours estimate was a little on the optimistic side!

Decomposition

Even with a clear picture of the end goal, the way to get there can still be fuzzy. The best way to clear that fuzziness is to start decomposing the problem into smaller components, forcing ourselves to clarify the steps needed to produce the required deliverables.

Specifically, break down the steps into milestones. Describe each milestone in enough detail that someone else could pick up that task and execute it. If a given milestone is too complex, break it down further until you can easily express the work to be done in a few sentences.

For each milestone, I also recommend clarifying the definition of “done”. What needs to be true for someone to be completely done with the milestone?

  • Is there a deliverable? (There should almost always be one, even if it’s a tiny memo to report on the work done.) When is it considered “delivered”?
  • Does someone need to review it?
  • Is there cleanup that needs to happen before one can truly be done with that milestone?
  • Is there a presentation to your stakeholders to prepare?
  • Or maybe there is a patent application that needs to be filed?

For example, documentation and testing often cause delays (or worse, are skipped) because it wasn’t accounted for during the planning phase. Decomposition can help catch those instances.

Finally, instead of providing p10 and p90 estimates for the whole project, do so at the milestone level. Since they’re more granular, you are likely to have an easier time coming up with accurate estimates.

For my carrot cake, here is my decomposition into milestones. I looked at the recipe while doing the decomposition. I definitely got a few surprises!

Already, I can see that my initial two hours estimate was totally off! Decomposition (and reading the recipe) clarified what each step entailed and led to a more accurate picture of how long baking this cake is going to take! (And this wasn’t fake. Now I’m resigned to start at least a day in advance before being able to serve a nice slice of carrot cake.)

To get the duration of the whole project, resist the temptation to simply sum each percentile. The chance that everything goes almost perfectly (i.e., the duration is smaller than the p10 estimate) is much lower than the 1 in 10 chance communicated by the p10 total duration. It’s like saying that you’d have a 50% chance of getting five tails in a row when flipping a coin because the chance of getting a tail on each coin flip is 50%. Instead, you need to run a Monte Carlo simulation.


? Monte Carlo Simulation

Used to combine values produced by probability distributions. The simulated combined values are then used to generate statistics like the percentiles and the moments (mean and variance) of the resulting distribution. It is an alternative to deriving a theoretical formula (which is often impossible).


All you need to run this simulation is the following (although you’ll need some math background or rely on pre-built tools as I gloss over a lot of details):

  • A formula for finding both the scale and shape parameters of the Weibull distribution based on the duration estimates. That distribution characterizes the probability distribution for the duration of that milestone.
  • A formula to compute the inverse of the cumulative distribution function, which is used to generate samples.
  • Basic knowledge of how to use dynamic arrays in Excel or Google Sheets. Alternatively, you can get away with using formula over a large array of samples sprawled over as many cells as you have samples. It is slower because the software needs to render each sample, but it is simpler to prototype.


A snapshot of how this can be laid out. Available as a free template for readers! (See below.)

The process goes like this:

  1. For each milestone: Derive the shape and scale parameters from the p10 and p90 estimates. Generate at least 500 (preferably 10,000) samples from that distribution.
  2. Aggregate the values by summing up one sample from each milestone. This should give you as many results as you had samples for each milestone (e.g., 10,000). Each of those sums is a simulated outcome derived from the generation of likely generation for each milestone. Some take longer, some get done quickly. It’s as if you ran that project 10,000 times and could now use the observed durations to derive conclusions about the distribution of the duration for the whole project.
  3. Produce statistics using the generated sample.

I am providing my readers with a free template spreadsheet (.xslx) that showcases how such a spreadsheet can be created. Feel free to adapt it to your needs!

Applying this process to the carrot cake example, I get the following percentiles:

How to Use Simulated Statistics

Unless you have a strong background in mathematics, you are probably unfamiliar with having a function that can produce a slew of statistics rather than a single value. It can feel unwieldy. What does it get you?

If you have to keep a single value, I suggest you look at the 80th or 90th percentile for making plans requiring an estimate of the duration of the project. This way, you are less likely to suffer from late projects if you use a higher percentile rather than the mean or the median. If you finish early, I’m sure you can find something to do with the extra time.

The real value is when you use those numbers in the Decision Loops process to make structured decisions. When comparing projects, you might see that some projects have more uncertainty associated with their execution (i.e., a larger interval between their p10 and p90 values). This can be the source of useful conversations with the project team or indicate that you may want to use risk mitigating strategies to handle that uncertainty.

Comparing the distributions (e.g., by plotting the p10, p25, p50, p75, and p90) of multiple projects together may reveal that certain projects are much safer bets than others, even if their mean durations are similar. This tool will help you make more informed decisions.


As you may have seen from the link to the spreadsheet, I’ve finally set up my website. I’ll be posting my newsletter both here and there every week. With a bit more work on my part, it should be able to display mathematical formulas in a sane manner.

If you are interested in the derivation of the parameter and inverse cumulative distribution for the Weibull, let me know in the comments, and I’ll make an appendix to this post.

Have a wonderful week!

要查看或添加评论,请登录

Adam Salvail, MSc的更多文章

  • Case Study: How We Did Project Prioritization at Meta

    Case Study: How We Did Project Prioritization at Meta

    Working at Meta was a very interesting experience. One of the most unusual aspects of their culture was how much of a…

    1 条评论
  • Portfolios Aren’t Just for Stocks

    Portfolios Aren’t Just for Stocks

    I generally reach for structured decision-making tools whenever I’m overwhelmed by the decisions I need to make. This…

  • Everything Is Comparable With Maps of Indifference

    Everything Is Comparable With Maps of Indifference

    As a mathematician, I am still surprised when I discover an economic concept that originally seemed “fixed in reality”,…

  • Decision Loops - Your First Loop

    Decision Loops - Your First Loop

    This week, I want to equip you with your first decision-making tool, which I’ve named the First Loop. This method…

  • Avoid Going Bankrupt

    Avoid Going Bankrupt

    Zero is an interesting number. Due to the way multiplication work, it can be sticky, making it hard to get away from it.

  • A better way to make decisions

    A better way to make decisions

    I was in one of those meetings. Five managers and a director surrounding a table with a spreadsheet projected on the…

    2 条评论

社区洞察

其他会员也浏览了