Work Order Cause Codes – How Deep Should You Go?


What is the Right Level of Detail for Failure Data?

The question becomes, “How far down do you go when analyzing the cause?” The answer: it depends. SAE JA1012 states that failure modes should be described in enough detail to make it possible to select an appropriate failure management policy (i.e., maintenance strategy), but not in so much detail that an excessive amount of time is wasted on the analysis itself. In figure 1, the component is the impeller and the component problem is “loose”. This would be my choice for the failure mode.


Unfortunately, some organizations do not capture any failure data. The lack of failure data in the CMMS greatly impedes leadership’s ability to manage by exception. The other extreme is root cause failure analysis (RCFA), but you certainly can’t do this for every functional failure. So, what is the ideal solution? If there is a middle ground, it is a strategy that focuses on smaller but recurring failures which, over time, drive up O&M cost.

How much failure data do you need to run a bad actor report? The answer to “how much data” depends on which sort metric you are going to use. If a reliability team ran this report once a month and looked at the top 10 worst offenders, what would be the very next question? I suspect they would want to know why. Thus, for a given asset, they would want to drill down on the failure mode to arrive at a cause.
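The bad actor report and drill-down described above can be sketched in a few lines. This is a minimal illustration, not a real CMMS query: the `WorkOrder` fields, the three sort metrics, and the sample asset names are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical work order record; field names are assumptions, not a CMMS schema.
@dataclass(frozen=True)
class WorkOrder:
    asset: str
    component: str
    problem: str
    cause: str
    cost: float
    downtime_hours: float

ALLOWED_METRICS = {"cost", "downtime_hours", "count"}

def bad_actors(work_orders, metric="cost", top_n=10):
    """Rank assets by the chosen sort metric, worst offender first."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    totals = defaultdict(float)
    for wo in work_orders:
        totals[wo.asset] += 1.0 if metric == "count" else getattr(wo, metric)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

def drill_down(work_orders, asset):
    """Count each (component, problem, cause) failure mode for one asset."""
    counts = defaultdict(int)
    for wo in work_orders:
        if wo.asset == asset:
            counts[(wo.component, wo.problem, wo.cause)] += 1
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative sample data only.
orders = [
    WorkOrder("PUMP-101", "bearing", "seized", "lack of lubrication", 1200.0, 6.0),
    WorkOrder("PUMP-101", "bearing", "seized", "lack of lubrication", 1100.0, 5.0),
    WorkOrder("PUMP-101", "impeller", "loose", "improperly installed", 800.0, 3.0),
    WorkOrder("FAN-202", "belt", "worn", "unknown", 300.0, 1.0),
]

worst = bad_actors(orders, metric="cost", top_n=1)
modes = drill_down(orders, worst[0][0])
```

The drill-down only answers “why” if the cause was recorded on each work order. With the cause left blank, the report still ranks assets but stops at the component problem.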


The technician knows the component he replaced (or repaired), and he should know the component problem. For example, if the failed component is a bearing, he might also recognize that the bearing was seized, worn or damaged. But what is the cause of the bearing failure? Was it overloaded, improperly installed, or lacking lubrication? There will be times when the cause is not known. But is the cause truly “not known”, or is this the limit of the technician’s knowledge?

Why is it Important to Capture a 3-Piece Failure Mode?

The language of RCM is the failure mode. Some organizations pursue accreditation for essential staff in asset management and reliability. These same organizations have a CMMS for work order management. The trick is to get both speaking the same language. If you’ve been involved with RCM analysis, you know that one of the questions asks you to identify likely failure modes. A cross-functional team is assigned to explore all the possible failure modes and how to avoid them by choosing a suitable task (maintenance strategy). The output from this RCM analysis is then translated to the CMMS, where preventive maintenance (PM) records are created to auto-generate work orders (by time and usage-based criteria). In order to create a defensible PM application, every PM-jobplan combination should address a failure mode.

Whenever a functional failure occurs for an asset, a work order is created. At job completion, the failure data is identified. If the work order is designed to capture the failure mode (failed component, component problem, and cause code) then comparisons can be made back to the RCM analysis document which contains the maintenance strategies and failure modes. RCM updates are expected with any facility as conditions change over time, hence a living program.

There are some RCM software programs that store this data outside the CMMS. Or, if the CMMS is configurable, you could set up an RCM strategies and failure mode application with the same features inside it. Either way, the failure mode on the CMMS work order should be compared to the RCM failure mode. At this point, you have established a 3-way match capability inside the CMMS, which also supports ease of update.
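As a rough sketch of the comparison described above, the 3-piece failure mode from a work order can be looked up against the failure modes recorded in the RCM analysis. The `FailureMode` tuple, the `rcm_strategies` table, the strategy text, and the three result labels are hypothetical names chosen for illustration; a real CMMS or RCM tool would store these differently.

```python
from typing import NamedTuple, Optional, Tuple

class FailureMode(NamedTuple):
    component: str  # failed component
    problem: str    # component problem
    cause: str      # cause code

# Hypothetical RCM analysis output: failure mode -> maintenance strategy.
rcm_strategies = {
    FailureMode("bearing", "seized", "lack of lubrication"): "PM: lubricate bearing",
    FailureMode("impeller", "loose", "improperly installed"): "Procedure: verify torque",
}

def match_to_rcm(observed: FailureMode, strategies: dict) -> Tuple[str, Optional[str]]:
    """Compare a work order failure mode against the RCM failure modes."""
    if observed in strategies:
        # The RCM analysis already has a strategy for this failure mode.
        return ("covered", strategies[observed])
    same_component_problem = any(
        fm.component == observed.component and fm.problem == observed.problem
        for fm in strategies
    )
    if same_component_problem:
        # Known component and problem, but a cause the analysis did not anticipate.
        return ("new_cause", None)
    # Nothing comparable in the analysis: a candidate for an RCM update.
    return ("not_in_rcm", None)
```

A result of "new_cause" or "not_in_rcm" is the kind of signal that would prompt an update to the living RCM program.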

How Important is the Cause Code?

The cause code is the third element of the RCM failure mode. Capturing the cause code is the hardest part of failure coding, but without it the asset may fail again in the next 6-9 months. It won’t matter how fast or how well you did the previous repair if the real cause is not acted on. Figure 3 illustrates a way to set up choice lists on the work order where the user can stop at the basic cause or drill down further if there is human involvement. Whatever value is chosen last becomes the cause code for the failure mode.

Again, there will be times when the technician simply doesn’t know. In this situation, the work order can be electronically routed for review and update. Example roles that could help include the maintenance supervisor, maintenance manager, reliability engineer, maintenance engineer, planner, and HSE representative.
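The choice-list behavior and the review routing described above could look something like the sketch below. The cause hierarchy is invented for illustration (it is not the content of figure 3), and `CAUSE_TREE`, `resolve_cause_code`, and `needs_review` are hypothetical names.

```python
# Hypothetical two-level choice list: basic cause -> optional human-factor detail.
CAUSE_TREE = {
    "lack of lubrication": ["PM not performed", "wrong lubricant used"],
    "improperly installed": ["lack of skills", "poorly worded procedure"],
    "overloaded": [],
    "unknown": [],
}

def resolve_cause_code(selections):
    """Validate the selected path and return the last value chosen."""
    if not selections:
        raise ValueError("at least a basic cause must be selected")
    basic = selections[0]
    if basic not in CAUSE_TREE:
        raise ValueError(f"unknown basic cause: {basic}")
    if len(selections) == 1:
        # The user stopped at the basic cause.
        return basic
    if len(selections) == 2 and selections[1] in CAUSE_TREE[basic]:
        # The user drilled down one level further.
        return selections[1]
    raise ValueError(f"invalid drill-down path: {selections}")

def needs_review(cause_code):
    """Flag work orders whose cause should be routed for review and update."""
    return cause_code == "unknown"
```

For example, `resolve_cause_code(["improperly installed", "lack of skills"])` returns the drilled-down value, while a work order resolved to "unknown" would be flagged for routing to one of the reviewer roles listed above.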


Failure Modes versus Failure Mechanisms

Per Daniel Delay, P.E., CMRP, “Individuals who perform repairs need to be educated to create repair records that identify failure modes (using choice lists)”. Mr. Delay also identified the following failure mechanisms (figure 4).


Figure 5 - Failure Modes and Failure Mechanisms

Some organizations have maintenance/reliability engineers who, with their advanced training, can add more detail about the failure mechanism. This information could be stored on the work order. It would not be practical, however, to involve the maintenance/reliability engineer in every functional failure, so business rules would be created stating when additional resources are needed to evaluate further.
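One way to express such business rules is a simple predicate evaluated when the work order is completed. The criteria and thresholds below (asset criticality, downtime, repeat failures) are assumptions picked for illustration; each organization would define its own.

```python
from dataclasses import dataclass

# Hypothetical summary of a completed failure work order.
@dataclass
class FailureEvent:
    asset_criticality: int   # assumed scale: 1 (low) to 5 (high)
    downtime_hours: float
    repeats_last_12_months: int

def needs_engineer_review(event, *, min_criticality=4, max_downtime_hours=8.0, max_repeats=3):
    """Return True when any assumed escalation rule is met."""
    return (
        event.asset_criticality >= min_criticality
        or event.downtime_hours > max_downtime_hours
        or event.repeats_last_12_months >= max_repeats
    )
```

Keeping the thresholds as keyword arguments makes the rules easy to tune as the reliability team learns how much engineering time is actually available.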

When is it Important to Capture Human Factors?

Sometimes the failure analysis needs to move from physical cause to human factors. This analysis doesn’t necessarily have to be stored inside the CMMS, but it needs to be addressed if it is a factor contributing to functional failures. There are different categories of human factors (as shown in figure 6): human root, personal action, department procedures, and organizational mindset. If a failure is frequent enough that downtime is impacting production and repair costs are adding up, the reliability team might check that human factors are not part of the problem. For example, the cause may be a lack of skills, the supervisor may have assigned the wrong person, or the repair procedure may have been poorly worded. As a corrective action, the entire shift might be trained on precision maintenance (or whatever the deficient skill was).


Figure 6 - Human Factors

Design the CMMS for Reliability Engineering

Imagine the value of this information in the hands of the reliability team. Imagine the decisions they can make to reduce reactive maintenance.

Imagine the improvement to corporate profitability due to increased availability and less downtime.

Imagine the improvement to all industries in their ability to leverage data from the CMMS to make more informed decisions.

Nastaran Adeli, M. Sc., MA, MBA

Digital Transformation | Asset Information & Data Strategy | Technology Management & Innovation

2 years ago

John Reeve can we add a devil's advocate in this meeting asking "Even if we design the CMMS to support failure analytics, would it be utilized to capture failure data?" Around 5 years ago I was told flat out by a maintenance supervisor that it's not his crew's job to fill out forms about what was wrong. Their only job is to fix what is broken. Frankly, as harsh as it sounds, I agree with him. I can't think of a different solution given the current state of CMMS capabilities, but I see the possibility of better solutions coming up soon.

Brett Wilson

Reliability/ Maintenance Analyst

3 years ago

The use of both Problem and Cause is important as it shows the alignment of what was reported versus found, which helps in narrowing down issues faster. I agree with a good but simple list of codes. I once saw a small nightmare with 200 options, which meant inaccurate reporting. However, Problem and Cause codes can't be the only section; in complex components the coding-only path could confuse issues as they are lumped together. I am a fan of the remarks and then using a word cloud to pick out the nuggets. Combining them is the best outcome, but getting people's input and getting them to understand the value their input brings is not an overnight task.

Ronan O'Sullivan

Business Development Manager

3 years ago

Nigel Stocks speak to you Friday

John, very good read and very thought provoking. The only part I need to figure out is how to convey the benefits to the maintenance trades that are doing the work, what's in it for them, why they want to do it to make their work easier in the future. Got to get their buy in to make it all work. Thanks for writing the article John.

Marc W. Yarlott, PE CRL CLS

Director at Veolia North America

3 years ago

John, great article. I found the point about the human factors in the cause of particular interest and I like the idea of including them. Without doing a full root-cause failure analysis (RCFA) including the human elements in a cause might allow the reliability team to determine core issues (training, organization, or work process) that are contributing to failures. Thanks.
