Thinking About Process Safety Indicators
This well-cited paper from Hopkins (2009) discusses some of the inconsistencies that existed at the time (and likely still exist) around process safety indicators, and the definitions and uses of terms like leading and lagging.
This paper sparked a number of interesting follow-up papers from other authors – I’ll cover some of these soon (I have David Wood’s and Linda Bellamy’s responses written up, ready to post).
Note: I’ve done a poor job of summarising this and have skipped heaps, so I recommend checking out the full paper.
First, he points out that, broadly speaking, two distinctions among safety indicators can be drawn: personal safety vs process safety, and leading vs lagging.
He argues that between personal and process safety, it is “really a distinction between different types of hazards” [** some of the response articles disagree with this position, though]
Again, broadly speaking, process safety hazards result from processing activities in the plant, like escapes of toxic chemicals and explosions. Personal safety hazards tend to operate more at the individual level and typically have little to do with plant processing activities.
He argues that “most injuries and fatalities are a result of personal safety hazards rather than process hazards and, as a result, injury and fatality statistics tend to reflect how well an organisation is managing personal safety hazards rather than process safety hazards”.
Therefore, an organisation that wants to assess how well it manages process safety hazards “cannot therefore rely on injury and fatality data; it must develop indicators that relate specifically to process hazards”.
He points to the Esso Longford gas plant accident, where the operator had an “impeccable lost time injury rate and yet was managing its major hazards quite poorly”. In contrast, the airline industry is said to clearly understand the distinction, where “no one would make the mistake of thinking that an airline’s lost time injury rate provided an indication of how well it was managing air safety”.
Drawing on the Baker report into the Texas City disaster, he notes how BP relied “exclusively or predominantly on lagging indicators to assess process safety performance”, and that this was “ill-advised”. He points out how the Baker report sometimes blurred the distinctions between lead and lag, and between process and personal safety hazards/indicators.
From the quoted material in the Baker report, you’d be led to believe that BP’s issue was mainly its use of lagging rather than leading indicators, and not the other critical finding: that it was relying largely on personal safety indicators.
In any case, the Baker panel was said to be happy with lag indicators providing they were the “right kind”, e.g. process.
He points to other inconsistencies with lead/lag distinctions used in the Baker report and elsewhere – some focusing on the realisation of harm.
He points to a range of failures which may not be regarded as lagging indicators, like the failure of a system to stay within safety-critical parameters. Another common argument is that lagging indicators provide feedback after an accident, whereas leading indicators provide feedback on performance before an accident.
He doesn’t agree with this logic, because “there are many situations in which lag indicators provide a good indication of how well a safety management system is performing”. For the Baker panel, BP should have considered using a composite lag indicator made up of fires, explosions, loss of containment (LOC) events and more.
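As an aside, a composite indicator of this kind is straightforward to compute. The sketch below is purely a hypothetical illustration – the event categories, weights and exposure figure are my own assumptions; the paper doesn’t specify how such a composite would be constructed:

```python
# Illustrative sketch of a composite lagging process-safety indicator,
# along the lines Hopkins suggests the Baker panel could have considered.
# All event counts, weights and exposure values are made-up assumptions.

# Counts of process-safety events recorded at a site in one year.
events = {
    "fires": 12,
    "explosions": 1,
    "loss_of_containment": 34,
}

# Hypothetical severity weights: rarer, higher-consequence events count more.
weights = {
    "fires": 3.0,
    "explosions": 10.0,
    "loss_of_containment": 1.0,
}

# The composite is a weighted sum, normalised by exposure (e.g. millions
# of hours worked) so that sites of different sizes can be compared.
exposure_million_hours = 2.5

composite = sum(events[k] * weights[k] for k in events)
composite_rate = composite / exposure_million_hours

print(f"Composite score: {composite:.1f}")
print(f"Composite rate per million hours: {composite_rate:.1f}")
```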
He again points out the inconsistencies with definitions and uses of these terms, including in the Baker report itself.
He suggests that whether something counts as an indicator depends on the relevant time period: the term itself implies there must be sufficient instances of the event over that period to meaningfully talk about a rate. Hence, if there are enough of these events, charting the indicator over time can provide some evidence about the safety management system [** or reporting, and other factors].
In contrast, fatalities and major incidents, which may be extremely rare and occur no more than once every few years (if that), make less sense as indicators of safety. This limitation is even more acute for the rarest events, like major accidents. In these situations it makes sense for sites to develop more frequent indicators, like injuries, near misses, loss of containment etc. [** and especially via prospective learning, but this wasn’t directly covered in this argument]
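To make the rate argument concrete, here’s a minimal sketch using assumed Poisson event frequencies (not data from the paper) showing why annual counts of very rare events carry almost no signal, while frequent precursor events produce a usable trend:

```python
# Why very rare events make poor indicators: when the expected frequency is
# well below one event per counting period, the observed annual count is
# nearly always zero, so a "rate" tells you almost nothing year to year.
# Frequencies below are illustrative assumptions.
import math
import random

random.seed(42)

def simulate_annual_counts(events_per_year: float, years: int) -> list[int]:
    """Draw a yearly event count for each year (Knuth's Poisson sampler)."""
    counts = []
    for _ in range(years):
        threshold = math.exp(-events_per_year)
        k, p = 0, 1.0
        while p > threshold:
            k += 1
            p *= random.random()
        counts.append(k - 1)
    return counts

# A major accident expected roughly once a decade vs. a precursor event
# (e.g. loss of containment) expected around 30 times a year.
print("major accidents: ", simulate_annual_counts(0.1, years=10))
print("precursor events:", simulate_annual_counts(30.0, years=10))
```

The major-accident series is almost all zeros, so charting it reveals nothing about the safety management system; the precursor series is dense enough to show a trend.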
Next he discusses some limits of indicators around reactive and active monitoring, such as active monitoring of equipment that is past its inspection date. He says this is an example of a safety activity, but not necessarily indicative of safety itself. That is, “An organisation might score poorly on this indicator if it has fallen behind in its schedule of testing, even though all equipment might be functioning perfectly”.
Likewise, an “organisation might be scrupulous in its testing schedule yet find that many of the items tested in fact failed the test”.
He also contrasts an indicator that measures monitoring activity with one that measures equipment adequacy. The first can be considered an input measure and the second an output measure; both can be relevant, but the Baker report doesn’t mention the latter.
Next he discusses an HSE UK guide on developing process safety indicators. It defines three types (a brief illustrative sketch follows the list):
a. Measures of routine, safety-related activity like the proportion of safety critical instrument tests done on schedule
b. Measures of failures discovered during routine safety activity, e.g. alarms that failed during testing
c. Measures of failures revealed by unexpected events.
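As a rough illustration of how these three types of measures might be computed in practice (all figures and thresholds below are invented assumptions, not values from the HSE guide):

```python
# Illustrative sketch of the HSE guide's three indicator types.
# All counts are hypothetical.

# (a) Routine safety-related activity: proportion of safety-critical
# instrument tests completed on schedule.
tests_scheduled = 200
tests_done_on_time = 184
activity_measure = tests_done_on_time / tests_scheduled

# (b) Failures discovered during routine safety activity: alarms that
# failed when tested.
alarms_tested = 150
alarms_failed = 9
failure_on_test_measure = alarms_failed / alarms_tested

# (c) Failures revealed by unexpected events: e.g. controls that did not
# operate as intended during real plant upsets.
upsets = 12
upsets_with_control_failure = 2
unexpected_failure_measure = upsets_with_control_failure / upsets

print(f"(a) tests on schedule:      {activity_measure:.0%}")
print(f"(b) failed during testing:  {failure_on_test_measure:.0%}")
print(f"(c) failed on real demands: {unexpected_failure_measure:.0%}")
```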
He considers these indicators in the context of leading and lagging. A leading indicator identifies failings in vital aspects of the risk control system during routine checks of its operation, whereas a lagging indicator reveals failings following an incident or adverse event.
He points out limitations in this approach and examples that don’t fit neatly across the categories. In any case, as he argues, in the domain of process safety, “the important thing is to identify measures of how well the process safety controls are functioning”; hence, “Whether we call them lead or lag indicators is a secondary matter”.
Going back to the three categories of measures (a–c), he asks which is the best indicator of safety. Interestingly, he argues category (c); by his logic, this is the closest of the three to the incident itself. [** Others in the rebuttal papers disagree with Hopkins here]
He draws on defence-in-depth logic to make his case: category (c) events have demonstrated actual failures in the defences.
He argues that in the absence of a high enough frequency to talk of a rate, the priority should shift upstream to the more measurable activities designed to ensure that controls remain in place, i.e. to precursors. [** Again, other authors disagree with his logic that category (c) measures are necessarily the best representations of ‘safety’]
In any case, his argument here is that, wherever possible, “safety indicators should be based on the undesired events themselves, rather than their precursors”.
Next he discusses the ‘truism’ that indicators are only worth developing if they’re going to be used to drive improvement. He uses the example of BP, which tracked loss of containment events; the figure worsened from 2002 to 2004.
The senior managers had “individually constructed personal performance contracts with their immediate superiors which served as the basis of bonus payments”. These contracts used weighted metrics across several categories, like financial performance, reliability and safety. Nevertheless, the largest weighting went to financial outcomes and cost reduction.
For safety, the measures included fatalities, days away from work, recordable injuries and vehicle incidents. But process safety measures weren’t included. Therefore, “Process safety, then, was completely missing from the incentive system at Texas City”.
Further, “For the most senior people in a corporation it is hard to see how safety bonuses can provide any significant financial motivation”. He notes that the financial benefits from hitting business targets can far outweigh the safety component, and therefore the impact of safety bonuses must be largely symbolic (since they affect the overall bonus proportionally so little).
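A back-of-envelope sketch makes the arithmetic plain. All figures below are hypothetical assumptions of mine – the paper doesn’t give actual weightings or bonus amounts:

```python
# Why a small safety weighting makes the safety bonus largely symbolic.
# All figures are hypothetical illustrations, not values from the paper.

base_bonus = 500_000  # hypothetical total annual bonus for an executive

# Hypothetical performance-contract weightings, skewed toward financial
# outcomes as the paper describes for the Texas City managers.
weights = {
    "financial_performance": 0.50,
    "cost_reduction": 0.20,
    "reliability": 0.20,
    "safety": 0.10,
}

# Even a swing from worst to best safety performance moves the total
# bonus by only the safety share of the pool.
max_safety_effect = base_bonus * weights["safety"]
max_financial_effect = base_bonus * (
    weights["financial_performance"] + weights["cost_reduction"]
)

print(f"Max effect of safety score on bonus:   ${max_safety_effect:,.0f}")
print(f"Max effect of financial scores:        ${max_financial_effect:,.0f}")
```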
Moreover, the use of indicators in incentive systems must be carefully managed, since “there is an incentive to manage the indicator itself rather than the phenomenon of which it is supposed to provide an indication”. Hence, “Indicators of safety-related activity are inherently dubious from this point of view”.
He also points out how carefully indicators (and, by extension, incident classifications) should be defined. For instance, the Baker report spells out the range of manifestations that count as a ‘fire’ for counting purposes, as well as what doesn’t count as a fire. Nevertheless, “It is clear that even such an apparently discrete and countable event as a fire needs to be carefully defined if it is to be part of an indicator used to drive performance”.
Measures of safety-related activities can, of course, vary in quality, and it’s often “possible to increase quantity by sacrificing quality”. He gives the example of the number of audit corrective actions closed out or completed – which can be achieved by closing items out in the easiest way possible.
In concluding, he argues:
• “I have examined the meaning of the terms leading and lagging in two recent influential publications and found that they are not used with any consistency”
• And he doesn’t “think there is much point in trying to pin down a precise meaning since in different contexts these terms are used to draw attention to different things”
• Based on the HSE document, this suggests “that process safety indicators must be chosen so as to measure the effectiveness of the controls upon which the risk control system relies”
• Hence, “Whether they be described as lead or lag is ultimately of little consequence”
Reference: Hopkins, A. (2009). Thinking about process safety indicators. Safety Science, 47(4), 460–465.
Study link: https://www.processsafety.com.au/s/WP-53-Thinking-About-Process-Safety-Indicators.pdf
My site with more reviews: https://safety177496371.wordpress.com