Expected Value and Simulation Methods in Schedule Risk Analysis
Rasoul Abdolmohammadi, PMP, RMP, CCP
Principal in Planning & Scheduling at PETRONAS
1. Abstract
Despite the prevalence of Monte Carlo simulation in risk analysis today, the statistical concept of expected value is still used by some practitioners and endorsed by reputable sources. How the practical approaches relate to the statistical theory underlying expected value and Monte Carlo simulation needs clarification. This paper explores various applications of expected value in project schedule risk analysis, evaluates their advantages and disadvantages, and proposes the most suitable solution. Building on the mathematical concepts, it presents three methods of risk analysis using expected value and recommends a fourth method, Overall Schedule Risk Simulation (OSRS), as the most effective solution, one that embodies the core concept of probabilistic expected value.
2. Introduction
Despite its ongoing use among some practitioners and its inclusion in certain references, the expected value method remains a topic of debate in schedule risk analysis. This paper aims to elucidate the statistical concepts underlying the expected value method and to present various approaches for using expected value to estimate project schedule contingency. Both deterministic and probabilistic approaches in expected value analysis are explored, and ambiguities in their definitions are clarified.
Following the discussion of mathematical concepts, the paper describes the following methods:
1- Expected Value with Deterministic Impact: This method employs a deterministic number as the risk impact.
2- Expected Value with Probabilistic Impact using the PERT Formula: This approach uses the Program Evaluation and Review Technique (PERT) formula in calculations.
3- Expected Value with Probabilistic Impact using Monte Carlo Simulation: This method incorporates random sampling and Monte Carlo simulation for calculations.
4- Probabilistic Expected Value using Monte Carlo Simulation for Both Impact and Probability: This advanced method applies Monte Carlo simulation to both the impact and probability.
The paper also discusses the advantages and disadvantages of each method and recommends the best solution.
3. Mathematical Concepts
Sample Space: If the set of possible outcomes of an experiment is known, this set is called the sample space of the experiment.
Event: Any subset of Sample Space is known as an Event.
Probability of the Event: A number between 0 and 1 representing the likelihood of a specific event or set of events occurring within a sample space.
Independent Events: Two events E and F are independent if the knowledge that F has occurred does not affect the probability that E occurs.
Random Variable: In some cases, the quantity of interest is a function of the outcome of an experiment rather than the outcome itself. Such real-valued functions defined on the sample space are known as random variables.
Since the value of a random variable is determined by the outcome of the experiment, we may assign probabilities to the possible values of the random variable.
Random variables can be discrete or continuous.
Discrete Random Variable: If a random variable can take a countable number of possible values, it is a discrete random variable.
Probability Mass Function: A function that gives the probability of each possible value of a discrete random variable.
Bernoulli Random Variable: Consider an experiment whose outcomes are classified as either "success" (represented by 1) or "failure" (represented by 0). The variable that takes these values is a Bernoulli random variable.
Binomial Random Variable: Suppose there are n independent experiments, each with a success probability p and a failure probability 1 − p. The number of successes in the n experiments is a binomial random variable with parameters (n, p).
Continuous Random Variable: If the possible values of a random variable are uncountable, it is called a continuous random variable.
Probability Density Function: A function that describes the relative likelihood of the values of a continuous random variable. The probability that the variable falls within an interval is the area under this function over that interval.
Expected Value: The expected value of a random variable is a weighted average of the possible values that it can take. Each value is weighted by its probability.
The expected value of a binomial random variable with parameters n and p is np.
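For reference, the definitions above can be written compactly. The symbols p(x) for the probability mass function and f(x) for the probability density function are notational assumptions and do not appear in the original text.

```latex
E[X] = \sum_{x} x \, p(x) \quad \text{(discrete)}, \qquad
E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx \quad \text{(continuous)}

\text{Bernoulli}(p):\quad E[X] = 1 \cdot p + 0 \cdot (1 - p) = p

\text{Binomial}(n, p):\quad E[X] = \sum_{k=0}^{n} k \binom{n}{k} p^{k} (1 - p)^{n-k} = np
```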
Reasonable Number of Experiments to Rely on Expected Value:
Let's consider an example. A taxi driver wants to estimate his income during winter. He assumes that he earns $300 per day in clear weather and does not work in rainy weather. The probability of clear weather on a winter day is 70%. What is his average daily income during winter?
Using the expected value concept, his average daily income will be 0.7 * 300 = 210.
Let's proceed with some questions based on our example.
How much will he earn if he works only one day? It is obviously not 210; it is either 300 or 0. What is his average income for two days of work? Considering all possible scenarios (working both days, working neither day, or working only one of the two days), the average daily earnings could be 300, 0, 150, and 150. Notably, 210 does not appear among these outcomes.
Let’s extend this analysis to three, four, five, and six days to further explore the results.
As observed in the table above, the average income does not reach 210 in any of these cases. However, as the number of days increases, the possible averages tend to cluster more closely around 210. To verify this trend, the standard deviation of each set of outcomes around 210 is displayed in the table below:
The next question is how many experiments are needed to rely on the expected value. This depends on the desired level of accuracy and confidence. Addressing this requires understanding two key concepts: the "law of large numbers," which states that as the number of experiments increases, the sample mean converges to the expected value; and "confidence intervals and margin of error," which determine the sample size needed to estimate the expected value within a specific margin of error.
As a rule of thumb, a sample size of 30 or more is often considered sufficient.
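The taxi example can be reproduced numerically. The sketch below enumerates the possible average daily incomes for a given number of days and measures how far they spread around 210. It assumes that days are independent and that the spread is probability-weighted; the tables in the text may have been computed differently, and the function names are illustrative.

```python
from math import comb, sqrt

DAILY_INCOME = 300                  # income on a clear day, from the example
P_CLEAR = 0.7                       # probability of clear weather, from the example
EXPECTED = DAILY_INCOME * P_CLEAR   # expected daily income (0.7 * 300)

def average_income_distribution(days):
    """Return {average daily income: probability} for `days` independent days."""
    dist = {}
    for clear_days in range(days + 1):
        prob = (comb(days, clear_days)
                * P_CLEAR ** clear_days
                * (1 - P_CLEAR) ** (days - clear_days))
        dist[DAILY_INCOME * clear_days / days] = prob
    return dist

def spread_around_expected(days):
    """Probability-weighted root-mean-square deviation of the average from 210."""
    dist = average_income_distribution(days)
    return sqrt(sum(p * (avg - EXPECTED) ** 2 for avg, p in dist.items()))

for days in (1, 2, 6, 30):
    averages = sorted(average_income_distribution(days))
    print(days, "days:", len(averages), "possible averages,",
          "spread around 210 =", round(spread_around_expected(days), 1))
```

Under these assumptions the spread shrinks in proportion to one over the square root of the number of days, which is the behavior the law of large numbers describes.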
4. Expected Value (EV) in Project Risk Analysis
4.1. Example:
To explore the application of expected value in project risk analysis, a sample project is defined as follows:
As shown in the picture below, the project consists of 6 activities. The entire project duration is 200 days. Activities 1, 2, and 3 are sequential, as are activities 4, 5, 6, and 3. Consequently, the critical path of the project consists of activities 4, 5, 6, and 3, while activities 1 and 2 have a float of 30 days.
Assume that 5 risks, with the following characteristics, affect this project.
4.2. First Method: Expected Value with Deterministic Impact:
This method includes the following steps:
1- Analyze each risk in the project schedule network to find its impact on the schedule using the Critical Path Method (CPM).
2- Apply the risk impact determined in step 1, together with the risk probability, in the EV formula: EV = Impact * Probability.
3- Aggregate all EVs calculated in step 2 to determine the total impact of all risks.
4- Add the total risk impact to the total project duration.
Let’s apply this method to the example.
1- Considering the critical path of the schedule network, the impact of the risks on the project will be as follows:
2- Multiplying the impact of the risks by their respective probabilities yields the following results:
3- Aggregating the impacts of all risks gives the total risk impact:
21d + 30d + 25d + 40d + 0d = 116d
4- The total project duration, taking all risks into account, will be 200d + 116d = 316d.
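The four steps can be sketched in a few lines. The per-risk expected values and the 200-day duration are the figures reported above; the function name `ev_contingency` and the single-risk input (35 days at 60%) are illustrative assumptions, since the risk table itself is not reproduced here.

```python
def ev_contingency(risks):
    """risks: list of (cpm_impact_days, probability) pairs.

    Returns the per-risk expected values and their sum (steps 2 and 3).
    """
    evs = [impact * probability for impact, probability in risks]
    return evs, sum(evs)

# Per-risk expected values as reported in the worked example (days).
reported_evs = [21, 30, 25, 40, 0]
base_duration = 200

total_risk_impact = sum(reported_evs)                        # 116
risk_adjusted_duration = base_duration + total_risk_impact   # 316 (step 4)

# Hypothetical single risk, for illustration only: 35-day impact, 60% probability.
example_evs, example_total = ev_contingency([(35, 0.6)])
print(total_risk_impact, risk_adjusted_duration, round(example_total, 1))
```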
Critique of the First Method:
First Point:
In this method, the impacts of all risks are aggregated. However, in reality, it is very unlikely that all risks occur simultaneously. This exaggerates the total impact of risks.
Second Point:
If a risk occurs, it will typically affect the related activity with its full impact. However, in this method, each risk is accounted for by multiplying its impact by its probability. This is the averaging behavior discussed earlier in this paper under the concept of expected value.
Contrary to the first point, this tends to underestimate the impact of risks on the project.
Third Point:
In this method, risks are analyzed individually, and their impacts are aggregated to determine the overall impact on the total project. However, in reality, multiple risks can occur simultaneously. For instance, if Risk No. 1 and Risk No. 5 occur together, their combined impact on the project could be 55 days, whereas the sum of their individual impacts is only 35 days (35 + 0 = 35 days).
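One way the 35, 0, and 55-day figures can arise is sketched below. The path lengths follow the example (a 200-day critical path and a parallel path with 30 days of float), but the assignment of both risks to the parallel path and the activity-level delays of 65 and 20 days are assumptions chosen to reproduce those figures, not data from the risk table.

```python
# Path lengths consistent with the example: critical path 200 days,
# parallel path 170 days (30 days of float).
CRITICAL_PATH = 200
PARALLEL_PATH = 170

def project_delay(delay_on_critical=0, delay_on_parallel=0):
    """Project-level delay (days) after applying activity delays to each path."""
    finish = max(CRITICAL_PATH + delay_on_critical,
                 PARALLEL_PATH + delay_on_parallel)
    return finish - CRITICAL_PATH

# Assumed activity-level delays (not given in the text); both hit the parallel path.
RISK_1_DELAY = 65
RISK_5_DELAY = 20

risk_1_alone = project_delay(delay_on_parallel=RISK_1_DELAY)                 # 35
risk_5_alone = project_delay(delay_on_parallel=RISK_5_DELAY)                 # 0
both_risks = project_delay(delay_on_parallel=RISK_1_DELAY + RISK_5_DELAY)    # 55

print(risk_1_alone + risk_5_alone, "days when summed individually,",
      both_risks, "days when analyzed together")
```

Float absorbs the smaller delay when that risk is analyzed alone, but once the larger delay has consumed the float, the smaller one passes straight through to the project finish date.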
Conclusion:
Even if the first and second points partly counterbalance each other's biases, the third point remains a fundamental issue in schedule risk analysis. This method is therefore not recommended.
4.3. Second Method: Expected Value with Probabilistic Impact Using PERT Formula:
In this method, instead of a single deterministic number for the risk impact, a three-point estimate is considered: an optimistic impact (O), a most likely impact (M), and a pessimistic impact (P). The PERT formula, (O + 4M + P) / 6, is then used to calculate the expected value of the risk impact. For the earlier example, the table would look like this:
After the effective impact of each risk is calculated using the PERT formula, the same steps as described in the first method are followed.
Aggregating the impacts of all risks gives the total risk impact:
24d + 31d + 31d + 41d + 0d = 127d
The total project duration, considering all risks, will be 200d + 127d = 327d.
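A minimal sketch of the second method follows. The PERT formula and the per-risk values summed above come from the text; the three-point estimate (20, 35, 62) and the 60% probability are hypothetical inputs used only to show the calculation.

```python
def pert_mean(optimistic, most_likely, pessimistic):
    """PERT approximation of the expected impact: (O + 4M + P) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

def pert_expected_value(optimistic, most_likely, pessimistic, probability):
    """Expected value of one risk: PERT impact multiplied by its probability."""
    return pert_mean(optimistic, most_likely, pessimistic) * probability

# Hypothetical three-point estimate in days, for illustration only.
example_impact = pert_mean(20, 35, 62)                  # 37.0
example_ev = pert_expected_value(20, 35, 62, 0.6)

# Per-risk values as summed in the worked example (days).
reported_evs = [24, 31, 31, 41, 0]
total_risk_impact = sum(reported_evs)                   # 127
risk_adjusted_duration = 200 + total_risk_impact        # 327
print(example_impact, total_risk_impact, risk_adjusted_duration)
```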
Critique of the Second Method:
The first, second and third points mentioned for the first method are also applicable to this method.
Conclusion:
The only advantage of this method over the first one is its consideration of probabilistic impact rather than deterministic impact. This aspect makes this method slightly more accurate than the first. However, the fundamental issue highlighted in the third point of the first method still persists here. Therefore, this method is also not recommended.
4.4. Third Method: Expected Value with Probabilistic Impact Using Random Sampling:
The inputs for this method are similar to those of the second method: a probabilistic impact and a probability of occurrence for every risk. However, this method uses the random sampling inherent in Monte Carlo simulation.
How does Monte Carlo Simulation work?
Monte Carlo simulation generates random numbers using methods such as pseudo-random number generators (PRNGs). PRNGs are algorithms that produce sequences of numbers approximating the properties of random numbers. While these numbers are not truly random, they are adequate for most practical purposes. Typically, these numbers are uniformly distributed between 0 and 1. To match the specified distribution of each risk impact, techniques such as inverse transform sampling or the Box-Muller transform (for normal distributions) are used to convert the uniform numbers into the required distribution.
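As an illustration of inverse transform sampling, the sketch below converts uniform random numbers into a triangular distribution. The triangular shape and the parameters (20, 35, 62) are assumptions made for the example; the text does not state which distribution was used for the risk impacts.

```python
import random
from math import sqrt

def triangular_inverse_cdf(u, low, mode, high):
    """Map u in [0, 1) to a triangular(low, mode, high) value via the inverse CDF."""
    cut = (mode - low) / (high - low)   # value of the CDF at the mode
    if u < cut:
        return low + sqrt(u * (high - low) * (mode - low))
    return high - sqrt((1 - u) * (high - low) * (high - mode))

def sample_impacts(n, low, mode, high, seed=0):
    """Draw n impacts by feeding PRNG output through the inverse CDF."""
    rng = random.Random(seed)           # pseudo-random number generator
    return [triangular_inverse_cdf(rng.random(), low, mode, high) for _ in range(n)]

samples = sample_impacts(10_000, low=20, mode=35, high=62)
sample_mean = sum(samples) / len(samples)   # should be near (20 + 35 + 62) / 3 = 39
print(round(sample_mean, 2))
```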
In this method, the following steps are taken:
1- Analyze risks in the project schedule network to assess their impact on the schedule using CPM. Define each impact as a distribution function.
2- Set up a formula in Excel that multiplies the risk impact defined in step 1 by the risk probability, following the EV formula: EV = Impact * Probability.
3- Set up a formula in Excel that aggregates all EVs from step 2 to obtain the total impact of all risks.
4- Set up a formula in Excel that adds the total risk impact to the total project duration.
5- Run the simulation to generate random numbers as risk impacts for thousands of iterations.
After these five steps are executed, a distribution graph for the total project duration is generated.
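The five steps can be sketched in Python instead of Excel. The three risks, their probabilities, and the triangular impact distributions below are hypothetical placeholders, not the paper's inputs; only the 200-day base duration comes from the example.

```python
import random

BASE_DURATION = 200  # days, from the example

# Hypothetical inputs: (probability, optimistic, most likely, pessimistic) in days.
RISKS = [
    (0.6, 20, 35, 62),
    (0.5, 40, 60, 90),
    (0.3, 10, 25, 50),
]

def simulate_third_method(risks, iterations=5000, seed=1):
    """Sample each impact, weight it by its fixed probability, and sum per iteration."""
    rng = random.Random(seed)
    durations = []
    for _ in range(iterations):
        total_ev = sum(
            p * rng.triangular(low, high, mode)   # argument order is (low, high, mode)
            for p, low, mode, high in risks
        )
        durations.append(BASE_DURATION + total_ev)
    return sorted(durations)

durations = simulate_third_method(RISKS)
mean_duration = sum(durations) / len(durations)
p80_duration = durations[int(0.8 * len(durations))]   # 80th percentile
print(round(mean_duration, 1), round(p80_duration, 1))
```

Because every sampled impact is scaled by its probability, no iteration can fall below the base duration plus the probability-weighted optimistic impacts, which keeps the output range narrow.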
Referring to the same example mentioned earlier in this paper and using the duration impact distributions described in the second method, the inputs will be as follows:
Running the simulation, the result will be as follows:
Critique of the Third Method:
The first, second, and third points mentioned for the first and second methods are applicable to this method as well. The advantage of this method lies in its probabilistic distribution output, which offers valuable insights into the range of possible project durations.
A Common Mistake:
As mentioned in the second point of the critique of the first method, one drawback of the expected value method is its lack of alignment with reality. For instance, if a risk with a 50% probability and a 100-day impact occurs, it will realistically affect the project by 100 days. However, in the expected value method, the calculated impact would be 50 days, which never occurs in reality. As described earlier in this paper under "Reasonable Number of Experiments to Rely on Expected Value," this issue is mitigated with a large number of experiments. The mistake lies in treating the number of risks as the number of experiments and expecting to resolve the issue by increasing the number of risks. Risks are not repeated experiments, so even with hundreds of risks the problem persists. What partly offsets the issue is the opposing effect of the first and second points, not the number of risks.
Conclusion:
The main advantage of this method over the second one is its ability to generate a duration distribution output. This type of output provides decision makers with a clearer perspective on project duration, enabling them to make more informed decisions.
4.5. Fourth Method: Overall Schedule Risk Simulation (OSRS), or Probabilistic Expected Value Using Random Sampling for Both Impact and Probability:
The same inputs used in the third method are sufficient to run this method. The difference lies in the fact that in the third method, random sampling is used only for risk impact and then multiplied by risk probability. In contrast, in this method, random sampling is used for both risk probability and impact. The following steps are taken in this method:
1- Analyze risks in the project schedule network to assess their impact on the schedule using Critical Path Method (CPM). Define the impacts of these risks as a distribution function.
2- Define a broad schedule (which can encompass one or multiple activities) within a schedule risk analysis software.
3- Define risks with their probability and impact in the software and assign them to the activity or activities.
4- Run the simulation.
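The steps above are normally carried out in schedule risk analysis software. As a rough stand-in for the single-activity case, the sketch below samples both the occurrence and the impact of each risk. The risk inputs are the same hypothetical placeholders used for illustration elsewhere, not the paper's data.

```python
import random

BASE_DURATION = 200  # single activity representing the whole project (days)

# Hypothetical inputs: (probability, optimistic, most likely, pessimistic) in days.
RISKS = [
    (0.6, 20, 35, 62),
    (0.5, 40, 60, 90),
    (0.3, 10, 25, 50),
]

def simulate_osrs(risks, iterations=5000, seed=1):
    """Per iteration, decide whether each risk occurs, then apply its full sampled impact."""
    rng = random.Random(seed)
    durations = []
    for _ in range(iterations):
        delay = 0.0
        for p, low, mode, high in risks:
            if rng.random() < p:                            # the risk occurs this iteration
                delay += rng.triangular(low, high, mode)    # full impact, not p * impact
        durations.append(BASE_DURATION + delay)
    return sorted(durations)

durations = simulate_osrs(RISKS)
mean_duration = sum(durations) / len(durations)
print(round(durations[0], 1), round(mean_duration, 1), round(durations[-1], 1))
```

Iterations in which no risk occurs finish at the base duration, and iterations in which several risks occur carry their full combined impact, so the output spans a wider range than the probability-weighted version.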
Running this method with one activity:
Considering one activity with a duration of 200 days representing the total project duration, and assigning the risks as defined in the third method with the same probabilities and distribution functions, the final output will be as follows:
The table below illustrates the differences in results between this method and the third method.
As shown in the table above, the difference in the mean values between the fourth method and the third method is not significant (1%). However, the fourth method covers a wider range of results compared to the third method.
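A short derivation suggests why the means are close while the ranges differ, at least for a single risk on a single activity. It assumes that the occurrence indicator B (a Bernoulli variable with parameter p) is independent of the impact I, which has mean \mu and variance \sigma^2; these symbols are notational assumptions.

```latex
\text{Third method: } X_3 = p\,I, \qquad
E[X_3] = p\mu, \qquad
\operatorname{Var}(X_3) = p^{2}\sigma^{2}

\text{Fourth method: } X_4 = B\,I, \qquad
E[X_4] = E[B]\,E[I] = p\mu, \qquad
\operatorname{Var}(X_4) = E[B^{2}]\,E[I^{2}] - (p\mu)^{2}
= p\sigma^{2} + p(1-p)\mu^{2}
```

Under these assumptions the two methods share the same expected value, while the variance of the fourth method is at least as large as that of the third, since p\sigma^2 \ge p^2\sigma^2 and p(1-p)\mu^2 \ge 0.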
The difference in range arises because the third method multiplies the impacts by their probabilities, as described in the second point under the first method, whereas the fourth method applies the full impact whenever a risk occurs. As discussed earlier, this effect is partly balanced out when a larger number of risks is considered. This principle is elaborated in my paper titled "Overall Schedule Risk Analysis rather than Expected Value Analysis". The reliability of results increases with a higher number of risks, which differs from the concept of experiments described earlier in this paper under "Reasonable Number of Experiments to Rely on Expected Value". Therefore, mathematical solutions to determine a precise number of experiments for low error in results are not directly applicable here.
However, as discussed in my previous paper titled "Overall Schedule Risk Analysis rather than Expected Value Analysis", a practical rule of thumb suggests that the number of risks should ideally be above 10 for more reliable results.
Running this method with several activities:
Since schedule risk analysis software is used in this method, more activities can be defined to achieve a more reliable result. For example, considering the same schedule with 6 activities defined earlier in this paper and assigning risks to each activity as previously described, the result will be as follows:
Below is the table illustrating the difference in results between having 6 activities (Fourth Method (2)) and having only one activity (Fourth Method (1)):
Critique of the Fourth Method:
First Point:
In this method, the first issue of previous methods is addressed because random sampling determines whether risks exist. This means that based on the probability of risks, they only affect some iterations.
Second Point:
The second issue of previous methods is resolved because the probability is not multiplied by the impact.
Third Point:
The third issue of previous methods is resolved because various combinations of risks are applied in every iteration during simulation.
Fourth Point:
The issue of not fully considering the project network is not completely resolved in this method. However, defining several activities can somewhat mitigate this limitation.
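To show how defining several activities brings the network logic into the simulation, the sketch below models the six-activity example with two merging paths. The individual activity durations and all risk inputs are hypothetical; they were chosen only so that the baseline matches the 200-day duration and the 30-day float stated in the example.

```python
import random

# Hypothetical activity durations (days): path 4-5-6-3 is 200 days (critical),
# path 1-2-3 is 170 days (30 days of float on activities 1 and 2).
DURATIONS = {1: 40, 2: 50, 3: 80, 4: 30, 5: 40, 6: 50}

# Hypothetical risks: (activity, probability, optimistic, most likely, pessimistic).
RISKS = [
    (1, 0.6, 40, 65, 90),
    (2, 0.4, 10, 20, 30),
    (5, 0.5, 20, 30, 50),
    (3, 0.3, 10, 25, 50),
]

def project_finish(d):
    """Both paths merge before activity 3, so the later path drives the finish."""
    return max(d[1] + d[2], d[4] + d[5] + d[6]) + d[3]

def simulate_network(iterations=5000, seed=1):
    """Sample risk occurrence and impact per activity, then recompute the finish."""
    rng = random.Random(seed)
    finishes = []
    for _ in range(iterations):
        d = dict(DURATIONS)                       # fresh copy for each iteration
        for activity, p, low, mode, high in RISKS:
            if rng.random() < p:
                d[activity] += rng.triangular(low, high, mode)
        finishes.append(project_finish(d))
    return sorted(finishes)

finishes = simulate_network()
print(project_finish(DURATIONS), round(sum(finishes) / len(finishes), 1))
```

With the paths modeled explicitly, a delay on activities 1 or 2 affects the finish date only after it exceeds the available float, which a single-activity model cannot represent.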
Fifth Point:
This method involves risk-driven schedule risk analysis with Monte Carlo simulation, focusing on one or a few activities in the schedule network. It is distinguished from traditional Schedule Risk Simulation (SRS) because SRS emphasizes a reliable schedule network to ensure maximum accuracy. This philosophy is not fully adhered to here, as having only one or a very limited number of activities does not achieve the expected level of accuracy that SRS provides. Therefore, this method is termed Overall Schedule Risk Simulation (OSRS).
Sixth Point: What is Probabilistic Expected Value?
As mentioned in the fifth point, this method essentially involves risk-driven schedule risk analysis with Monte Carlo simulation. However, considering the following descriptions, it can also be viewed as a probabilistic expected value method.
"Probabilistic Expected Value" isn't a formal term, but it can describe a scenario where both the impacts and their probabilities are treated probabilistically, often necessitating advanced techniques like Monte Carlo simulations or integrations over multiple distributions.
In Monte Carlo simulation, the existence of a risk in each iteration is managed through the following process:
For each risk, generate a random number between 0 and 1. If this number is less than the probability of the risk occurring, the risk exists in that iteration; otherwise, it does not. After running this process across many iterations, we can expect that, on average, the risk will exist in approximately r iterations, where r is the probability of the risk multiplied by the total number of iterations. The number of iterations in which the risk exists follows a binomial distribution with parameters n and p, where n is the number of iterations in the simulation and p is the probability of the risk. As described earlier, the binomial distribution provides the probability of achieving a certain number of successes in a fixed number of trials.
This method is sometimes considered a Probabilistic Expected Value method because the existence of a risk across the iterations of a Monte Carlo simulation can be modeled with a binomial distribution.
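The occurrence test described above can be checked with a short sketch. The 30% probability and the 10,000 iterations are illustrative values, and the function name is hypothetical.

```python
import random

def count_occurrences(probability, iterations, seed=0):
    """Count the iterations in which a uniform draw falls below the risk probability."""
    rng = random.Random(seed)
    return sum(1 for _ in range(iterations) if rng.random() < probability)

p, n = 0.3, 10_000                     # illustrative risk probability and iteration count
occurrences = count_occurrences(p, n)
expected = n * p                       # binomial mean, np
std_dev = (n * p * (1 - p)) ** 0.5     # binomial standard deviation, sqrt(np(1 - p))

print(occurrences, "occurrences; binomial mean", expected,
      "and standard deviation", round(std_dev, 1))
```

The observed count should typically land within a few standard deviations of np, which is the binomial behavior the paragraph refers to.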
Conclusion:
Most of the issues with the first three methods are resolved in this approach. The remaining considerations are:
1- In essence, this method is akin to the Schedule Risk Simulation method but operates at an overall schedule network level. Due to its application on a broader scale of the schedule network, it may not provide the same level of accuracy as an analysis conducted with a complete schedule network.
2- The binomial distribution is not directly used as an input in this method, but the random sampling process in Monte Carlo simulation reproduces the characteristics of a binomial distribution.
3- Although "probabilistic expected value" is not a formal concept in statistical theory, it can be loosely interpreted as treating the binomial distribution as the probabilistic distribution for probability within expected value calculations, alongside any distribution function for impact.
4- This method can be applied with just one activity in the schedule network or with multiple activities. As the number of activities increases, so does the accuracy of the model. When applied to the complete schedule, this model transforms into Schedule Risk Simulation (SRS).
5. Conclusion
All four methods can be utilized for the risk analysis of project-specific risks; however, the first and second methods are not recommended. The third method, while not fully addressing the issues of the first two methods, offers the advantage of generating a probabilistic distribution output. When employing this method, it's crucial to consider that the reliability of results increases with a greater number of risks. An empirical rule of thumb suggests having more than 10 risks for more dependable outcomes.
The fourth method is recommended because it resolves most of the problems of the other methods and has the potential capability to be converted into Schedule Risk Simulation (SRS).
6. References
- Introduction to Probability Models, Eleventh Edition, 2014, Sheldon M. Ross, University of Southern California
- Probability and Statistics for Engineers and Scientists, 2012, Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, Keying Ye, Prentice Hall
- Monte Carlo Statistical Methods, 1999, Christian P. Robert, George Casella, Springer
- Simulation, 2022, Sheldon M. Ross, Academic Press
- Risk Analysis and Contingency Determination Using Expected Value, 2021, AACE International Recommended Practice No. 44R-08
- Integrated Cost and Schedule Risk Analysis and Contingency Determination Using Expected Value, AACE International Recommended Practice No. 65R-11