Risks, Their Sources, Root Causes, and Handling Strategies
Glen Alleman MSSM
Vietnam Veteran, Applying Systems Engineering Principles, Processes & Practices to Increase the Probability of Program Success for Complex Systems in Aerospace & Defense, Enterprise IT, and Process and Safety Industries
Risk identification during early design phases of complex systems is commonly implemented but often fails to identify events and circumstances that challenge program performance.
Inefficiencies in cost and schedule estimates are usually held accountable for cost and schedule overruns, but the true root cause is often the realization of programmatic risks. A deeper understanding of frequent risk identification trends and biases pervasive during space system design and development is needed, for it would lead to improved execution of existing identification processes and methods.
Risk management means building a model of the risk, a model of the risk's impact on the program, and a model for handling the risk. Because it is a risk, the corrective or preventive action has not yet occurred.
Probabilistic Risk Assessment (PRA) is the basis of these models and provides the Probability of Program Success. Probabilities result from uncertainty and are central to the analysis of the risk. Scenarios, model assumptions, and model parameters are based on current knowledge of the behavior of the system under a given set of uncertainty conditions.
Since risk is the outcome of Uncertainty, distinguishing between the types of uncertainty in the definition and management of risk on complex systems is useful when building risk assessment and management models.
Incomplete knowledge about some characteristics of the system or its environment is the primary source of Epistemic uncertainty.
Naturally occurring variations associated with the physical system are primary sources of Aleatory uncertainty.
There is a third uncertainty found on some programs that is not addressed here, since this uncertainty is not correctable.
Ontological uncertainty creates risk from inherent variations and incomplete information that is not knowable.
Separating Aleatory and Epistemic Uncertainty for Risk Management
Knowing the percentage of reducible uncertainty versus irreducible uncertainty is needed to construct a credible risk model. Without this separation, not knowing which uncertainty is reducible and which is irreducible inhibits the design of the corrective and preventive actions needed to increase the probability of program success.
Separating the types of uncertainty serves to increase the clarity of risk communication, making it clear which type of uncertainty can be reduced and which types cannot be reduced. For the latter (irreducible risk), only a margin can be used to protect the program from uncertainty.
As uncertainty increases, the ability to precisely measure the uncertainty is reduced to where a direct estimate of the risk can no longer be assessed through a mathematical model. While a decision in the presence of uncertainty must still be made, deep uncertainty and poorly characterized risks lead to the absence of data and risk models.
Epistemic Uncertainty Creates Reducible Risk
The risk created by Epistemic Uncertainty represents resolvable knowledge, with elements expressed as probabilistic uncertainty of a future value related to a loss in a future period of time. Awareness of this lack of knowledge provides the opportunity to reduce this uncertainty through direct corrective or preventive actions.
Epistemic uncertainty, and the risk it creates, is modeled by defining the probability that the risk will occur, the time frame in which that probability is active, the probability of an impact or consequence from the risk when it does occur, and finally, the probability of the residual risk when the handling of that risk has been applied.
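The elements of this model can be sketched in a few lines of Python. This is a minimal illustration, not a method prescribed by the article: it assumes a simple expected-value formulation (probability of occurrence times probability of impact times cost of impact) and treats the residual risk as a residual probability of occurrence. The class name, field names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EventRisk:
    """Hypothetical record for one event-based (epistemic) risk."""
    name: str
    p_occurrence: float    # probability the risk occurs while it is active
    active_window: tuple   # (start_day, end_day) in which that probability applies
    p_impact: float        # probability of a consequence, given the risk occurs
    impact_cost: float     # assumed cost of the consequence
    p_residual: float      # assumed probability of occurrence after handling

    def is_active(self, day: int) -> bool:
        start, end = self.active_window
        return start <= day <= end

    def exposure(self, handled: bool = False) -> float:
        """Expected cost exposure, before or after handling is applied."""
        p = self.p_residual if handled else self.p_occurrence
        return p * self.p_impact * self.impact_cost


risk = EventRisk(
    name="Late delivery of a software build",
    p_occurrence=0.30,
    active_window=(40, 120),
    p_impact=0.50,
    impact_cost=200_000.0,
    p_residual=0.10,
)

before = risk.exposure()
after = risk.exposure(handled=True)
print(f"exposure before handling: {before:,.0f}")
print(f"exposure after handling:  {after:,.0f}")
print(f"exposure bought down:     {before - after:,.0f}")
```

Comparing the bought-down exposure with the cost of the handling work is one way to judge whether a handling plan is worth funding.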
Epistemic uncertainty statements define and model these event-based risks:
For these types of risks, an explicit or implicit risk-handling plan is needed. The word handling is used with a special purpose. “We handle risks” in a variety of ways. Mitigation is one of those ways. In order to mitigate the risk, new effort (work) must be introduced into the schedule. We are buying down the risk, or we are retiring the risk, by spending money and/or consuming time to reduce the probability of the risk occurring. Or we could be spending money and consuming time to reduce the impact of the risk when it does occur. In both cases, actions are taken to address the risk.
Reducible Cost Risk
Reducible cost risk is often associated with unidentified reducible Technical risks, changes in technical requirements, and their propagation that impact cost. Understanding the uncertainty in cost estimates supports decision-making for setting targets and contingencies, risk treatment planning, and the selection of options in the management of program costs. Before reducible cost risk analysis can take place, the cost structure must be understood. Cost risk analysis goes beyond capturing the cost of WBS elements in the Basis of Estimate and Cost Estimating Relationships. This involves:
Reducible Schedule Risk
Schedule Risk Analysis (SRA) is an effective technique to connect the risk information of program activities to the baseline schedule, to provide sensitivity information of individual program activities to assess the potential impact of uncertainty on the final program duration and cost.
Schedule risk assessment is performed in 4 steps:
Reducible Technical Risk
Technical risk is the impact on a program, system, or entire infrastructure when the outcomes from engineering development do not work as expected, do not provide the needed technical performance, or create higher than planned risk to the performance of the system. Failure to identify or properly manage this technical risk results in performance degradation, security breaches, system failures, increased maintenance time, a significant amount of technical debt [2], and additional cost and time for end item delivery for the program.
Reducible Cost Estimating Risk
Reducible cost-estimating risk is dependent on technical, schedule, and programmatic risks, which must be assessed to provide an accurate picture of program cost. Cost risk estimating assessment addresses the cost, schedule, and technical risks that impact the cost estimate. To quantify these cost impacts from the reducible risk, sources of risk need to be identified. This assessment is concerned with three sources of risk and ensures that the model calculating the cost also accounts for these risks:
Aleatory Uncertainty Creates Irreducible Risk
Aleatory uncertainty and the risk it creates come not from the lack of information, but from the naturally occurring processes of the system. For aleatory uncertainty, more information cannot be bought, nor can specific risk reduction actions be taken to reduce the uncertainty and resulting risk. The objective of identifying and managing aleatory uncertainty is to be prepared to handle the impacts when risk is realized.
The method for handling these impacts is to provide margin for this type of risk, including cost, schedule, and technical margin.
Margin is the difference between the maximum possible value and the maximum expected value and is separate from Contingency. Contingency is the difference between the current best estimate and the maximum expected estimate. For systems under development, the technical resources and the technical performance values carry both margin and contingency.
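The two definitions reduce to two subtractions over three values. The sketch below shows the arithmetic for a single technical resource; the function name and the mass figures are hypothetical and chosen only for illustration.

```python
def margin_and_contingency(current_best, max_expected, max_possible):
    """Return (margin, contingency) for one resource, per the definitions above."""
    if not (current_best <= max_expected <= max_possible):
        raise ValueError("expected current_best <= max_expected <= max_possible")
    contingency = max_expected - current_best   # room for expected growth
    margin = max_possible - max_expected        # protection beyond expected growth
    return margin, contingency


# Hypothetical mass budget for a vehicle, in kilograms.
margin, contingency = margin_and_contingency(
    current_best=410.0,   # current best estimate of the design
    max_expected=450.0,   # estimate including expected growth
    max_possible=500.0,   # hard limit, for example launch capability
)
print(f"contingency: {contingency} kg, margin: {margin} kg")
```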
Schedule Margin should be used to cover the naturally occurring variances in how long it takes to do the work. The Cost Margin is held to cover the naturally occurring variances in the price of something being consumed in the program. Technical margin is intended to cover the naturally occurring variation of technical products.
Aleatory uncertainty and the resulting risk are modeled with a Probability Distribution Function (PDF) that describes the possible values the process can take and the probability of each value. The PDF for the possible durations of the work in the program can be determined. Knowledge can be bought about the aleatory uncertainty through Reference Class Forecasting and past performance modeling. This new information allows us to update (adjust) our estimates, since our past performance on similar work provides information about our future performance. But the underlying processes are still random, and our new information simply creates a new aleatory uncertainty PDF.
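One simple way to turn past performance into a distribution is to take empirical percentiles of actual-to-planned duration ratios from a reference class of similar work. The sketch below assumes that approach. The ratios and the planned duration are invented for illustration, and a real reference class would need far more care in selecting comparable work.

```python
import statistics

# Hypothetical reference class: actual duration / planned duration
# for eleven past work packages of a similar kind.
past_ratios = [0.92, 0.95, 1.00, 1.02, 1.05, 1.08, 1.10, 1.15, 1.22, 1.30, 1.45]

# Nine cut points dividing the data into ten equal-probability bins.
deciles = statistics.quantiles(past_ratios, n=10, method="inclusive")
p50_ratio = deciles[4]   # 50th percentile
p80_ratio = deciles[7]   # 80th percentile

planned_days = 20.0      # hypothetical planned duration for the new work
p50_days = planned_days * p50_ratio
p80_days = planned_days * p80_ratio

print(f"P50 duration: {p50_days:.1f} days")
print(f"P80 duration: {p80_days:.1f} days")
```

The spread between the planned duration and the chosen percentile is the naturally occurring variance that margin has to cover. Gathering more history refines the distribution but does not remove the variance.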
The first step in handling Irreducible Uncertainty is the creation of Margin: Schedule Margin, Cost Margin, and Technical Margin, to protect the program from the risk of irreducible uncertainty. Margin is defined as the allowance in the budget, and programmed schedule … to account for uncertainties and risks. [255]
Margin needs to be quantified by:
Irreducible Schedule Risk
Programs are over budget and behind schedule, to some extent, because uncertainties are not accounted for in schedule estimates. Research and practice are now addressing this problem, often by using Monte Carlo methods to simulate the effect of variances in work package costs and durations on total cost and date of completion. However, many such program risk approaches ignore the significant impact of probabilistic correlation on work package cost and duration predictions.
Irreducible schedule risk is handled with Schedule Margin, which is defined as the amount of added time needed to achieve a significant event with an acceptable probability of success. Significant events are major contractual milestones or deliverables.
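The effect of correlation can be seen without simulation. For a chain of serial work packages, the variance of the total duration is the sum of the individual variances plus twice the sum of the pairwise covariances. The sketch below assumes, purely for illustration, that every pair of work packages shares the same correlation coefficient; the function name and numbers are hypothetical.

```python
import math
from itertools import combinations


def total_duration_std(sigmas, rho):
    """Std dev of the summed duration, given a common pairwise correlation rho."""
    variance = sum(s * s for s in sigmas)
    variance += sum(2 * rho * a * b for a, b in combinations(sigmas, 2))
    return math.sqrt(variance)


# Ten serial work packages, each with a 3-day standard deviation (hypothetical).
sigmas = [3.0] * 10
independent = total_duration_std(sigmas, rho=0.0)
correlated = total_duration_std(sigmas, rho=0.5)

print(f"std dev assuming independence: {independent:.1f} days")
print(f"std dev with rho = 0.5:        {correlated:.1f} days")
```

Under these assumptions, treating correlated work as independent understates the spread of the completion date by more than half, which in turn understates the margin needed.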
With minimal or no margins in schedule, technical, or cost present to deal with unanticipated risks, the successful acquisition is susceptible to cost growth and cost overruns.
The Program Manager owns the schedule margin. It does not belong to the client, nor can it be negotiated away by the business management team or the customer. This is the primary reason to CLEARLY identify the Schedule Margin in the Integrated Master Schedule. [108] It is there to protect the program deliverable(s). The schedule margin is not allocated to over-running tasks; rather, it is planned to protect the end item deliverables.
The schedule margin should protect the delivery date of major contract events or deliverables. This is done with a Task in the IMS that has no budget (BCWS). The duration of this Task is derived from Reference Classes or Monte Carlo Simulation of aleatory uncertainty that creates a risk to the event or deliverable.
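A minimal sketch of deriving the duration of that margin task from a Monte Carlo simulation follows. It assumes three serial tasks with triangular duration distributions and an 80 percent confidence target; the task values, the confidence level, and the helper names are hypothetical, and the sketch ignores correlation between tasks.

```python
import random


def simulate_totals(tasks, trials=20_000, seed=1):
    """Simulate total duration of serial tasks given (low, likely, high) days."""
    rng = random.Random(seed)
    totals = [
        sum(rng.triangular(low, high, likely) for low, likely, high in tasks)
        for _ in range(trials)
    ]
    totals.sort()
    return totals


def percentile(sorted_values, p):
    """Value at or below which a fraction p of the sorted samples fall."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


# (optimistic, most likely, pessimistic) durations in days, all hypothetical.
tasks = [(8, 10, 15), (18, 20, 30), (4, 5, 9)]

deterministic = sum(likely for _, likely, _ in tasks)   # plan built from most likely values
totals = simulate_totals(tasks)
p80 = percentile(totals, 0.80)
schedule_margin = p80 - deterministic

print(f"deterministic duration: {deterministic} days")
print(f"80% confidence duration: {p80:.1f} days")
print(f"schedule margin task duration: {schedule_margin:.1f} days")
```

The resulting value would be entered as the duration of the unbudgeted margin task placed in front of the protected event or deliverable.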
The Master Schedule, with a schedule margin to protect against the impact of aleatory uncertainty, represents the most likely and realistic risk-based plan to deliver the needed capabilities of the program.
Cost Contingency
Cost Contingency is a reserved fund held by the Government Program Manager, added to the base cost estimate to account for cost uncertainty. It is the estimated cost of the “known unknowns” cost risk that impacts the planned budget of the program. This contingency is not the same as Management Reserve: Cost Contingency is not intended to absorb the impacts of scope changes, escalation of costs, and unforeseeable circumstances beyond management control. Contingency is funding that is expected to be spent and therefore is tightly controlled. Contingency funds are for risks that have been identified in the program.
Irreducible cost risk is handled with Management Reserve and Cost Contingency, which are program cost elements related to program risks and are an integral part of the program’s cost estimate. Cost Contingency addresses the Ontological Uncertainties of the program. The Confidence Levels for the Management Reserve and Cost Contingency are based on the program’s risk assumptions, program complexity, program size, and program criticality.
When estimating the cost of work, that resulting cost number is a random variable. Point estimates of cost have little value in the presence of uncertainty. The planned unit cost of a deliverable is rarely the actual cost of that item. Covering the variance in the cost of goods may or may not be appropriate for Management Reserve.
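To see why a point estimate carries little information, the sketch below estimates the confidence level of a budget built by adding up most-likely costs. It assumes right-skewed triangular cost distributions, which is a modeling choice made here for illustration; the cost items and function name are hypothetical.

```python
import random


def confidence_of_point_estimate(items, trials=20_000, seed=7):
    """Fraction of simulated totals that come in at or under the point estimate."""
    rng = random.Random(seed)
    point_estimate = sum(likely for _, likely, _ in items)
    hits = 0
    for _ in range(trials):
        total = sum(rng.triangular(low, high, likely) for low, likely, high in items)
        if total <= point_estimate:
            hits += 1
    return point_estimate, hits / trials


# (low, most likely, high) cost in thousands of dollars, all hypothetical.
items = [(90, 100, 140), (45, 50, 80), (190, 200, 260)]

point_estimate, confidence = confidence_of_point_estimate(items)
print(f"point estimate: {point_estimate}")
print(f"probability of finishing at or under it: {confidence:.1%}")
```

Because each distribution has more room to grow than to shrink, the sum of most-likely values sits well below the median of the total, so the point estimate has much less than an even chance of being met.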
Assigning Cost Reserves needs knowledge of where in the Integrated Master Schedule these cost reserves are needed. The resulting Integrated Master Schedule, with schedule margin, provides locations in the schedule where cost reserves are aligned with the planned work and provides the ability to layer cost reserves across the baseline to determine the funding requirements for the program. This allows the program management to determine realistic target dates for program deliverables and the cost reserves, and schedule margin, needed to protect those delivery dates.
Irreducible Technical Risk
“The last 10 percent of the technical performance generates one-third of the cost and two-thirds of the problems.” (Norman Augustine’s 15th Law)
If Margin is the difference between the maximum possible value and the maximum expected value, and Contingency is the difference between the current best estimate and the maximum expected estimate, then for systems under development, the technical outcome and technical performance values both carry margin and contingency.
Technical Margin and Contingency serve several purposes:
For any system, in any stage of its development, there is a maximum possible, maximum expected, and current best estimate for every technical outcome. The current best estimate of a technical performance value changes as the development team improves the design and the understanding of that design matures.
For a system in development, most technical outcomes should carry both margin and contingency. As designs mature, the estimate of any technical resource usually grows. This is true historically and, independent of exactly why, development programs must plan for it to occur.
The goal of Technical Margin (unlike Cost and Schedule Margin) is to reduce the margins (for example, Size, Weight, and Power) to as close to zero as possible, to maximize mission capabilities. The technical growth and its handling include:
Expected technical growth: contingency accounts for expected growth
Unplanned technical growth: margins account for unexpected growth
Ontological Uncertainty
On the scale of uncertainty, from random naturally occurring processes (aleatory) to the lack of knowledge (epistemic), Ontological uncertainty lies at the far end of the continuum and is a state of complete ignorance. [344] Not only are the uncertainties not known, but the uncertainties also may not be knowable. While the truth is out there, [3] it cannot be accessed because it is simply not known where to look in the first instance. Ontological uncertainty has been described as operating outside the experience base, where things are done for the first time and operate in a state of ignorance.
Management of uncertainty is the critical success factor of effective program management. Complex programs and the organizations that deliver them can involve people with different genders, personality types, cultures, first languages, social concerns, and/or work experiences.
Such differences can lead to ontological uncertainty and semantic uncertainty. Ontological uncertainty involves different parties in the same interactions having different conceptualizations about what kinds of entities inhabit their world; what kinds of interactions these entities have; how the entities and their interactions change as a result of these interactions. Semantic uncertainty involves different participants in the same interactions giving different meanings to the same term, phrase, and/or actions. Ontological uncertainty and semantic uncertainty can lead to intractable misunderstandings between people.
When new technologies are introduced into these complex organizations, concepts, principles, and techniques may be fundamentally unfamiliar and carry a higher degree of ontological uncertainty than more mature technologies. A subtler problem is that these uncertainties are often implicit rather than explicit and become an invisible and unquestioned part of the system. Since epistemic and ontological uncertainty represent our lack of knowledge, reducing the risks produced by this uncertainty requires improving our knowledge of the system of interest or avoiding situations that increase these types of uncertainty. To reduce Epistemic and Ontological uncertainty, there must be a reduction in the uncertainty of the model of system behavior (ontology) or in the model’s parameters (epistemology).
[1] The use of the Design Structure Matrix provides visibility and modeling of these dependencies. Many models consider the dependencies as static or fixed at some value. But the dependencies are dynamic, driven by nonstationary stochastic processes that evolve as the program evolves.
[2] Technical debt is a term used in the Agile community to describe the eventual consequences of deferring complexity or implementing incomplete changes. As a development team progresses, there may be additional coordinated work that must be addressed in other places in the schedule. When these changes do not get addressed within the planned time or get deferred for a later integration, the program accumulates a debt that must be paid at some point in the future.
[3] Apologies to Mulder and Scully and the X-Files.