Six Common Pitfalls of Risk Management
Glen Alleman MSSM
Applying Systems Engineering Principles, Processes & Practices to Increase Probability of Program Success for Complex System of Systems, in Aerospace & Defense, Enterprise IT, and Process and Safety Industries
Two Types of Uncertainty = Two Types of Risk
Uncertainty creates risk. Risk management allows decisions to be made in the presence of that uncertainty.
This uncertainty is of two kinds:
Aleatory uncertainty - the natural variability of a process over repeated trials, such as the outcome of a coin toss. It is irreducible.
Epistemic uncertainty - uncertainty arising from our incomplete knowledge of the system. It can, in principle, be reduced.
A critical difference between the two types is that epistemic risk may be reduced by improving our knowledge, whereas aleatory uncertainty represents an absolute limit to our understanding. To use the coin toss example: having thrown the coin a thousand times, we can confidently express the probability of heads occurring, but that is all we can say about the next coin toss.
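To make the distinction concrete, here is a minimal Python sketch, with the coin's bias and the trial counts as illustrative assumptions: repeated tosses steadily shrink the epistemic uncertainty about the coin's bias, while the aleatory uncertainty about any single toss remains untouched.

```python
# Repeated trials reduce *epistemic* uncertainty about the coin's bias,
# but the *aleatory* uncertainty about the next toss never goes away.
import random

random.seed(1)
TRUE_P_HEADS = 0.5   # assumed 'true' bias, unknown to the observer in practice
tosses = [random.random() < TRUE_P_HEADS for _ in range(1000)]

for n in (10, 100, 1000):
    estimate = sum(tosses[:n]) / n
    # Rough 95% margin of error for a proportion: shrinks as n grows (epistemic)
    margin = 1.96 * (estimate * (1 - estimate) / n) ** 0.5
    print(f"after {n:4d} tosses: P(heads) ~ {estimate:.2f} +/- {margin:.2f}")

# However well we estimate P(heads), the outcome of the *next* toss is still
# a 50/50 proposition -- that residual variability is the aleatory part.
```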
Sources of Risk Created by Epistemic Uncertainty
As with other risk domains, we can refine our description of epistemic risk into various areas that we know have traditionally been the source of problems.
Assumptions - In engineering, we introduce epistemic risk whenever we make assumptions about the world. We may assume because we lack data, or we may make simplifying assumptions to make our job easier. In either case, the uncertainty we introduce also carries a risk.
A further problem with design assumptions is that often they are implicitly rather than explicitly stated. Thus, they become an invisible and unquestioned part of the landscape, unknown 'knowns' if you will.
Safety analyses - Epistemic uncertainty is also a significant problem for safety analyses because while some data may be 'hard,' such as the failure rates of specific components, other data may be highly uncertain, such as human error rates or the distribution of environmental effects such as lightning. Such uncertainties then become buried in the analysis as assumptions, often leading to the effect known as 'tail truncation,' where the likelihood of extreme events can be significantly underestimated.
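A hedged illustration of tail truncation, using invented distributions rather than real safety data: if an analyst models a quantity that is really heavy-tailed (lognormal here) with a thin-tailed normal of the same mean and standard deviation, the probability of an extreme event is underestimated by orders of magnitude.

```python
# Tail truncation: a thin-tailed model fitted to heavy-tailed reality
# drastically underestimates the likelihood of extreme events.
import math

def normal_tail(x, mean, std):
    """P(X > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2)))

def lognormal_tail(x, mu, sigma):
    """P(X > x) for a lognormal distribution with log-scale parameters mu, sigma."""
    return 0.5 * math.erfc((math.log(x) - mu) / (sigma * math.sqrt(2)))

mu, sigma = 0.0, 1.0                                 # assumed 'true' heavy-tailed model
mean = math.exp(mu + sigma ** 2 / 2)                 # moment-matched parameters for
std = math.sqrt(math.exp(sigma ** 2) - 1) * mean     # the analyst's normal model

threshold = 10.0  # an 'extreme' load, e.g. a rare environmental effect
print(f"P(extreme), heavy-tailed model : {lognormal_tail(threshold, mu, sigma):.1e}")  # ~1e-2
print(f"P(extreme), normal assumption  : {normal_tail(threshold, mean, std):.1e}")     # ~6e-5
```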
Subjective evaluation - Epistemic uncertainty becomes even more dominant when we are asked to evaluate the likelihood of a rare event for which little or no empirical data exists and for which we must rely on subjective 'expert estimation,' with all the potential for biases that this entails.
Design errors - Design errors are yet another source of epistemic uncertainty. Here, we can view the design as a 'model' of the system. The possibility of an error in the model introduces uncertainty about whether the model will correctly predict the system's behavior.
Dealing with the Risks Created by Uncertainty
Dealing with aleatory risk - Because we express aleatory uncertainty as process variability over a series of trials, aleatory risk is always expressed relative to a duration of exposure. The classical response to such variability is to build in redundancy, such as backup components or additional design margins, to reduce the risk to an acceptable level over the duration of exposure.
But with aleatory risk, we also hit a fundamental limit of control. While we can reduce the risk exposure by, for example, introducing redundancy, if we keep playing the game long enough, eventually, we'll lose.
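A minimal numeric sketch of that limit, with invented failure rates and exposure lengths: redundancy makes each trial safer, but the cumulative probability of at least one failure still climbs toward 1 as exposure grows.

```python
# Redundancy reduces the per-trial failure probability, but over a long enough
# exposure the chance of at least one failure still approaches 1.
p_component = 0.01  # assumed per-trial failure probability of a single component

for k in (1, 2, 3):                     # k redundant components in parallel
    p_trial = p_component ** k          # all k must fail in the same trial
    print(f"redundancy k={k}: per-trial failure probability = {p_trial:.0e}")
    for trials in (10**2, 10**4, 10**6):
        p_any = 1 - (1 - p_trial) ** trials
        print(f"   over {trials:>9,} trials: P(at least one failure) = {p_any:.4f}")
```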
Dealing with epistemic risk - If epistemic uncertainty represents our lack of knowledge, then reducing epistemic risk requires us to improve our understanding of the system of interest, or to avoid implementations that increase uncertainty. In doing so, we seek to reduce uncertainty either in our model of system behavior or in the model's parameters.
Looking at these two aspects of uncertainty, the reduction of epistemic risk provides a theoretical justification for at least two well-worn safety engineering principles: the avoidance of complexity and the use of components with well-understood performance.
Complexity in and of itself is not a direct cause of system accidents, but what complexity does do is breed epistemic uncertainty. Complex systems are difficult to completely understand and model, usually require more design assumptions, and are more likely to contain design errors leading to greater epistemic risk. So simplifying a system design has potentially more 'bang for the buck' in terms of enhancing safety.
Similarly, using components with well-understood and characterized behavior improves our certainty over parameters such as component failure rates and reduces our modeling uncertainty.
The uncertainty introduced by design assumptions can be reduced by making all assumptions an explicit part of the design and revisiting them regularly to see whether they remain valid or can be removed and replaced with real data. A key point at which assumptions should be checked is when the context in which the system is used changes. Formal or rigorous design methods and processes can reduce the uncertainty introduced by design errors.
Early successes but an uncertain future - The difference between aleatory and epistemic risk goes some way towards explaining the early successes of the safety engineering community in improving safety. Early efforts focused on the aleatory risk presented by the random failure of system components.
Significant gains in safety could be made through the improvement of reliability, the use of redundant components, and increased design margins to handle environmental variation.
1. Reliance on subjective judgment - People see things differently: one person's risk may even be another person's opportunity. For example, using new technology in a project can be seen as a risk (when focusing on the increased chance of failure) or as an opportunity (when focusing on the advantages of being an early adopter). This is a somewhat extreme example, but the fact remains that individual perceptions influence the way risks are evaluated. Another problem with subjective judgment is that it is subject to cognitive biases, that is, errors in perception. Many high-profile project failures can be attributed to such biases. Given these points, potential risks should be discussed from different perspectives, with the aim of reaching a common understanding of what they are and how they should be dealt with.
2. Using inappropriate historical data - Purveyors of risk analysis tools and methodologies exhort project managers to determine probabilities using relevant historical data. The word relevant is important: it emphasizes that the data used to calculate probabilities (or distributions) should come from situations similar to the one at hand. Consider, for example, the probability of a particular risk: that a particular developer cannot deliver a module by a specified date. One might have historical data for the developer, but the question remains of which data points should be used. Clearly, only those data points from projects similar to the one at hand should be used. But how is similarity defined? Although this is not an easy question to answer, it is critical to the relevance of the estimate. This is the reference class problem, sketched below.
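The sketch below is a hypothetical illustration of why the reference class matters, with invented delivery records and an invented similarity rule: the estimated probability of a late delivery changes substantially depending on which past projects we treat as 'similar'.

```python
# The probability estimate depends on the chosen reference class.
from dataclasses import dataclass

@dataclass
class DeliveryRecord:              # hypothetical fields, for illustration only
    team_size: int
    new_technology: bool
    delivered_late: bool

history = [
    DeliveryRecord(5, False, False),
    DeliveryRecord(6, True,  True),
    DeliveryRecord(4, True,  True),
    DeliveryRecord(5, False, False),
    DeliveryRecord(7, True,  False),
]

def p_late(records):
    return sum(r.delivered_late for r in records) / len(records)

print("reference class = all past deliveries :", p_late(history))                                    # 0.40
print("reference class = new-technology only :", p_late([r for r in history if r.new_technology]))   # ~0.67

# Neither estimate is 'right' until the similarity criterion itself can be
# defended -- which is exactly the reference class problem.
```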
3. Focusing on numerical measures exclusively - There is a widespread perception that quantitative risk measures are better than qualitative ones. However, even where reliable and relevant data is available, the measures must still be based on sound methodologies. Unfortunately, ad hoc techniques abound in risk analysis: see Cox's risk matrix theorem and the limitations of risk scoring methods for more. Risk metrics based on such techniques can be misleading (a small numerical illustration follows below). As I have pointed out elsewhere, qualitative measures may be more appropriate and accurate in many situations than quantitative ones.
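The following is an illustrative sketch, not any particular published method: a simple 3x3 probability-times-impact scoring matrix, with assumed bin edges and risk values, ranks a small expected loss above a much larger one.

```python
# A risk-matrix score can reverse the ranking implied by expected loss.
def bin_index(value, edges):
    """Map a value to 1 (low), 2 (medium) or 3 (high) using two bin boundaries."""
    return 1 + sum(value > e for e in edges)

PROB_EDGES = (0.3, 0.7)       # assumed boundaries between probability bins
IMPACT_EDGES = (10.0, 50.0)   # assumed boundaries between impact bins ($k)

risks = {
    "Risk A": (0.75, 11.0),   # likely, but cheap if it occurs
    "Risk B": (0.65, 49.0),   # slightly less likely, far more costly
}

for name, (p, impact) in risks.items():
    score = bin_index(p, PROB_EDGES) * bin_index(impact, IMPACT_EDGES)
    print(f"{name}: matrix score = {score}, expected loss = {p * impact:.2f}")

# Risk A scores 6 against Risk B's 4, yet Risk B's expected loss (31.85)
# dwarfs Risk A's (8.25): the ad hoc score points attention the wrong way.
```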
4. Ignoring known risks - It is surprising how often known risks go unaddressed. The reasons for this usually have to do with politics and mismanagement.
5. Overlooking the fact that risks are distributions, not point values - Risks are inherently uncertain, and any uncertain quantity is represented by a range of values (each with an associated probability) rather than a single number. Because historical data is often scarce or unreliable, distributions are frequently assumed a priori: that is, analysts assume the risk follows a particular distributional form (say, normal or lognormal) and then estimate the distribution's parameters from the historical data. Further, analysts often choose simple distributions that are easy to work with mathematically. These distributions often do not reflect reality; for example, they may be vulnerable to "black swan" occurrences because they do not account for outliers.
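As a hedged sketch of treating a risk as a distribution rather than a single number, here is a small Monte Carlo run over an assumed triangular duration model; the parameters are illustrative, not calibrated data.

```python
# Report a risk as percentiles of a distribution instead of one point value.
import random

random.seed(7)
N = 100_000
# Optimistic, pessimistic and most-likely task durations in days (assumed)
samples = sorted(random.triangular(10, 40, 18) for _ in range(N))

def percentile(sorted_xs, q):
    return sorted_xs[int(q * (len(sorted_xs) - 1))]

print(f"mean: {sum(samples) / N:.1f} days")
for q in (0.50, 0.80, 0.95):
    print(f"P{int(q * 100)} : {percentile(samples, q):.1f} days")

# Quoting only the most-likely value (18 days) hides how much probability
# mass sits in the long right-hand tail of this distribution.
```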
6. Failing to update risks in real time - Risks are rarely static: they evolve over time, influenced by circumstances and events both inside and outside the project. For example, the acquisition of a key vendor by a mega-corporation will likely affect the delivery of that module you are waiting on, and quite likely in an adverse way. Such a change in risk is obvious; there may be many that aren't. Consequently, project managers need to reevaluate and update risks periodically. To be fair, this is a point that most textbooks make, but it is advice that is not followed as often as it should be.
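As a minimal, hypothetical sketch of updating a risk when new information arrives, here is a Bayesian revision of the probability that the module slips after the vendor-acquisition news; all probabilities are invented for illustration.

```python
# Bayes' rule: revise P(module late) after observing 'key vendor acquired'.
p_late = 0.20                      # prior belief that the module slips

# Assumed likelihoods of seeing the acquisition news in each world
p_news_given_late = 0.50
p_news_given_on_time = 0.10

p_news = p_news_given_late * p_late + p_news_given_on_time * (1 - p_late)
p_late_updated = p_news_given_late * p_late / p_news

print(f"P(late) before the news: {p_late:.2f}")
print(f"P(late) after the news : {p_late_updated:.2f}")   # ~0.56
```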
This brings me to the end of my (subjective) list of risk analysis pitfalls. Regular readers of this blog will have noticed that some of the points made in this post are similar to ones made in previous work on estimation errors. This is unsurprising: risk analysis and project estimation both deal with an uncertain future, so they can be expected to share problems and pitfalls. One could generalize the point: any activity that involves gazing into a murky crystal ball will be plagued by similar problems.