Fact or Myth #1: All 'RCA' Thinking is Linear.
In my travels over the past 35+ years talking to RCA analysts around the world, as well as those outsiders who look into our 'RCA' bubble, I find many misconceptions about RCA. This happens in every space; just think about RCM, RBM, APM, CBM and the like. Everyone experiences how other people view their craft.
One of the more popular myths about an effective RCA approach is that 'RCA' is obsolete because it promotes linear thinking. I'll admit this paradigm is more pervasive in the Safety community than in the Reliability community; nonetheless, it exists. I've seen this as well from leadership in cases where their RCA initiatives were not producing the results they expected. What they often don't consider is the many reasons why such efforts fall short of expectations, including lack of leadership support and/or lack of clear expectations.
Is this a Fact or a Myth? I will leave the conclusions up to those that are actual RCA veterans who will answer based on their experience, rather than from hearsay.
I believe that much of this linearity belief comes from those who consider all RCA to be the equivalent of the traditional use of the 5-Whys (not modified versions like 5x5).
Figure 1: Traditional Understanding of the 5-Whys Approach
If everyone considered all RCA as the equivalent structure of this 5-Whys approach, then our nay-sayers would be right...but that is not the case. In a traditional 5-Whys approach, one would simply ask WHY five levels deep and would then arrive at THE root cause. Let's look at this from a technical standpoint and not an 'us' versus 'them' perspective.
Asking only 'Why' is a very narrow line of questioning that promotes linearity, because it connotes that we want a singular answer (linear) and that we want someone's opinion. The fact is that undesirable events in complex organizations don't happen linearly. Cause-and-effect relationships happen in parallel most of the time. Things happen in various combinations, on any given day, and come together to form a unique sequence of factors that results in a bad outcome. So using a traditional 5-Whys to analyze such occurrences will not yield a comprehensive understanding of what actually went wrong.
CONSIDER FIRST ASKING 'HOW CAN?'
This seems like semantics but really is an epiphany! Most undesirable outcomes are observable (they are not decision reasoning stored in someone's head that we can't see). Therefore, there were physics at play that led up to that bad outcome, physics that we could see.
To make this point, consider the difference between asking 'How did a crime occur?' versus 'Why did a crime occur?'. Are your answers the same?
This change in initial questioning is the difference between linearity and non-linearity. Consider the use of Boolean Logic gates. I'm not going to make this over-complicated and get into the Boolean Algebra because it's not necessary. I'm going to use the basic Boolean Logic gates we use for our RCAs (using our PROACT RCA methodology).
Let's try some practical examples to make our points. If I were in the midst of an RCA (based on how I personally would approach an RCA), I might come across a process that failed due to a fatigued bearing. My next natural question would be 'How could a bearing fatigue?' From a logic standpoint, because I won't know the correct answer until I have adequate evidence, I would use a Boolean AND/OR gate to explain my logic (Figure 2). We use this symbol when exploring what may have happened.
Figure 2: Utilizing a Boolean Logic Gate for representing AND/OR Logic
My possible answers (hypotheses) to that question may be Resonance, Misalignment and/or Imbalance. Whatever the evidence proves to be true, we continue to delve deeper into. Oftentimes two paths are true and we follow them both. The evidence leads the analysis, not the analyst.
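For readers who think in code, the AND/OR gate above can be sketched in a few lines of Python. This is only an illustration and not part of the PROACT methodology: the `Hypothesis` class, the `paths_to_pursue` function and the evidence outcomes shown are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """One possible answer to a 'How could ...?' question."""
    name: str
    # None = not yet checked; True/False = what the evidence showed.
    verified: Optional[bool] = None


def paths_to_pursue(hypotheses):
    """AND/OR gate: one, several, or all hypotheses may turn out to be true.

    Only the hypotheses backed by evidence are carried deeper into the tree.
    """
    return [h.name for h in hypotheses if h.verified is True]


# Hypothetical evidence outcomes for 'How could a bearing fatigue?'
bearing_fatigue = [
    Hypothesis("Resonance", verified=False),
    Hypothesis("Misalignment", verified=True),
    Hypothesis("Imbalance", verified=True),
]

print(paths_to_pursue(bearing_fatigue))  # ['Misalignment', 'Imbalance']
```

In this made-up case two paths survive, so both would be explored further; a hypothesis that has not been checked yet (`verified=None`) is never treated as true.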
Let's try another logic gate, this time using an AND gate. I may be investigating an Event where a fire was involved. My question remains the same, 'How could a fire have occurred?' My logic tree branch may look like the following in Figure 3.
Figure 3: Utilizing a Boolean Logic Gate for representing AND Logic
We are all familiar with the fire triangle, where in order to have a fire we need adequate oxygen, fuel AND an ignition source. So this is how our logic may be expressed.
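The fire triangle is a plain Boolean AND, which a short Python sketch can make concrete. The function name `fire_can_occur` is an assumption for illustration; the loop simply lists every combination of the three conditions.

```python
from itertools import product


def fire_can_occur(oxygen: bool, fuel: bool, ignition: bool) -> bool:
    """AND gate: all three legs of the fire triangle must be present."""
    return oxygen and fuel and ignition


# Walk through all eight combinations of the three conditions.
for oxygen, fuel, ignition in product([True, False], repeat=3):
    outcome = fire_can_occur(oxygen, fuel, ignition)
    print(f"oxygen={oxygen}, fuel={fuel}, ignition={ignition} -> fire={outcome}")
```

Only the single combination where all three conditions are true yields a fire, which is why each leg of an AND gate has to be verified with evidence before the branch is accepted.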
To make my final point, let's use another basic Boolean Logic gate, the symbol for OR. When using an OR gate, we are making binary statements. Let's assume we are investigating an unexpected process shutdown involving a critical valve. Our question may be 'How could the valve have failed?' (Figure 3).
At a higher level, until we get into the deeper physics, the valve either could have failed open or closed. We would have to let our evidence tell us which was verified to be true.
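A minimal Python sketch of this binary choice follows. It assumes the two failure modes are mutually exclusive, which is my reading of "binary statements" above; the function name `verified_failure_mode` and the evidence values are hypothetical.

```python
def verified_failure_mode(evidence):
    """OR gate read as a binary choice: at most one mode can be verified.

    `evidence` maps each failure mode to True/False as shown by the evidence.
    Returns the verified mode, or None while the evidence is inconclusive.
    """
    verified = [mode for mode, is_true in evidence.items() if is_true]
    if len(verified) > 1:
        # Both modes verified at once contradicts the binary assumption.
        raise ValueError(f"Conflicting evidence: {verified}")
    return verified[0] if verified else None


# Hypothetical evidence for 'How could the valve have failed?'
print(verified_failure_mode({"failed open": False, "failed closed": True}))
# failed closed
```

Returning `None` when nothing is verified keeps the analysis honest: the branch stays open until evidence, not opinion, settles it.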
In summary, I just wanted to express the differences between logic expressed linearly versus the reality of complex environments where logic happens in parallel and in combinations.
Think about the above examples I provided using the logic gates. What if we applied the traditional 5-Whys approach to those cases? Would there be a difference in our results? Would we have only 1 conclusion (root cause)? Would there likely have been more root causes had we expanded our questioning and used evidence instead of hearsay to back up our hypotheses?
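One way to explore these questions is to walk the same small tree two ways. The sketch below reuses the bearing example; the tree layout, the verified flags and both function names are assumptions for illustration, and the end points are the deepest verified causes in this toy tree rather than true root causes.

```python
# Each node is (label, verified_by_evidence, children).
# The verified flags are hypothetical.
tree = ("Process failed", True, [
    ("Bearing fatigued", True, [
        ("Resonance", False, []),
        ("Misalignment", True, []),
        ("Imbalance", True, []),
    ]),
])


def linear_walk(node):
    """5-Whys style: follow only the first verified answer at each level."""
    label, _, children = node
    for child in children:
        if child[1]:
            return linear_walk(child)
    return [label]


def tree_walk(node):
    """Logic-tree style: follow every verified branch, collect all end points."""
    label, _, children = node
    verified = [child for child in children if child[1]]
    if not verified:
        return [label]
    found = []
    for child in verified:
        found.extend(tree_walk(child))
    return found


print(linear_walk(tree))  # ['Misalignment']
print(tree_walk(tree))    # ['Misalignment', 'Imbalance']
```

With these made-up flags the linear walk stops at one cause while the tree walk surfaces two, which is the difference in results the questions above are pointing at.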
If you're interested in learning more about our PROACT RCA Approach, we'd love to hear from you...let's make it happen!
Principal at Prelical Solutions, LLC.
4y Why True RCA Works on Any Undesirable Outcome! https://www.dhirubhai.net/pulse/why-true-rca-works-any-undesirable-outcome-robert-bob-latino/
The right match, lasting success | ReflACT | Behavioral analysis & coaching for selection that works
5y Thanks for your article. I would like to provoke you a little bit today ;-) How would an RCA go if you first ask: "How is this system supposed to work?" followed by the question: "What's happening?". Aren't these much stronger strategic questions? Furthermore, I find it especially enlightening to examine not only defective systems, but also systems that function normally. Did you know that the most valuable information is hidden in normally functioning systems, rather than in faulty parts? It is therefore crucial that first of all system characteristics are correctly defined (e.g. by using physical laws, first principles and a good model of reality). Then map out cycles and define contrasts. Depending on this, collect fresh data and interpret it graphically. At this point, I do not use frequentist statistical methods, because the information content is poor and the underlying assumptions for engineering systems are invalid. Perhaps surprising: the observed output variation is NEVER a function of random chance. Machines don't roll dice! By not taking the probabilistic detour, further convergence can quickly be made in a next step (by isolation, dissection, etc.), until a rich causal explanation is found. This is not the same as a probabilistic "Root Cause", which is inductively demonstrated by brainstorming and hypothesis testing, with the inevitable uncertainty. You could improve the system on this basis, but zero defects will remain unreachable. With a rich causal explanation, zero defects is usually achieved and the solution is often much cheaper!
Sr Reliability Engineer, CWI
5y I’ve found that the “how can” method normally leads to hypotheses that might be missed using other methods, normally due to preconceived ideas.
Assuring the peak performance, energy efficiency, and sustainability of your plant's physical assets is my job and my passion.
5y Great article Bob. It seems to me that RCA differs from 5-Why in that with RCA, rather than try to identify what happened, I prefer to eliminate factors that I know DID NOT contribute using the DOE NE 1004 taxonomy, then start the Boolean mapping process.
I help you achieve self-learning, decisive leadership that makes the difference.
5y Thanks Rob, good insight into the limits of 5Why. If I may, I'd like to add 2 tips. The first one is "beware of the facts": the WHY question and the HOW CAN question both yield facts and hypotheses. Mixing those two is one of the main pitfalls in Problem Solving and RCA. In most cases a lot of facts are already known and can simply be asked for and visualized. After the facts have been visualized, we are left with one or more unknown causes. Analysing an unknown cause requires a different approach. The second tip is: compare possible causes (HOW CAN) to the exact specification of the unknown cause (IS & IS NOT) to find the most probable cause.
Example:
Facts: I hit my foot on the door. WHY? Because I didn't see the door post. WHY? Because it was dark in the hallway. WHY? Because the light in the hallway is not working. WHY? ... I don't know (<= unknown cause).
Hypotheses: HOW CAN the lights not work?
- House main power fuse blown
- Power station down the road is down
- Lightbulb broken
- Light switch incorrectly wired
Facts: IS / IS NOT
- The light in the hallway (IS the problem) - the lights in the wardrobe (IS NOT the problem)
- Not working (IS) - only dimmed or flickering (IS NOT)
- My house - neighbor's house
- This evening - yesterday evening (IS NOT)
- etc.
Conclusion: when we compare these facts and hypotheses, we can conclude that the broken lightbulb is the most likely cause. We've eliminated the other possible causes. The AND and OR Boolean operators are crucial to visualize cause and effect, but we use those ONLY for proven facts. Just to keep things very clear.