"PM" Basics - often misunderstood!
You don't replace your car headlights at regular intervals of six months. You probably don't wait to replace your tires until after they've worn through. You probably don't use oil analysis to see what is wrong before changing your car's lube oil. Each component and system in your car has a function that is prone to failure in one way or another. When it happens, each failure has a set of consequences: some with little importance and others with great importance.
Your decision on what to do (or not do) about those failures is actually based on a few factors:
- You may just blindly follow the manufacturer's recommendations.
- You may understand what can fail, and how those failures occur.
- You probably understand the implications (consequences) of certain failures, but not all.
- You may or may not have an understanding of the various maintenance approaches, technologies or actions that are available to deal with failures.
- Depending on vehicle ownership, you may or may not be motivated to care enough to take those actions.
Your car
Headlights can be run to failure, then replaced. Failures are not frequent and there are two headlights, so you can get away with (tolerate) one being out for a short time. They will fail due to random causes and you cannot tell when that will happen. They don't "wear out", so they give you no warning of their imminent failure.
If your lubricating oil gets excessively dirty it can clog filters, damage engine components and result in a very expensive repair - potentially even replacement of the car if the repair is costly enough. Changing oil and filters regularly will keep the oil clean so that problem is entirely avoided.
Running your road tires to failure is a different matter. Wear on tires is more obvious: you can see it and check the tread depth indicators (if you look). Imminent failure due to wearing out can be predicted. In the case of a worn-out tire, the lack of traction or a blowout could result in an accident and, depending on where and when it happens, it could prove fatal. Checking those tires is clearly worth doing.
Allowing the spare tire in your trunk to deflate has no consequence at all unless something goes wrong with one of the four that you rely on as rubber on the road. At that point, a flat spare is very inconvenient and embarrassing. That can be avoided with a simple periodic check.
In these examples you can see that running to failure, condition monitoring, preventive replacement and periodic tests (checks) can all be used, depending on the circumstances. We all have a reasonable understanding of cars and the consequences of their failure. Our plant systems are more complex and not so easily understood. A failure that may seem inconsequential on its own can be catastrophic because of the complex interactions that occur in many plant systems.
Your factory or plant
At work the systems are complex and highly interactive, they have a great deal of embedded automation, they interact with operators and maintainers, and you depend on them for continued operations. Knowledge of the various potential failures, and of the maintenance options available to you, is imperative.
Failures lead to business risks. The options for risk management include:
- Accepting the risk (doing nothing)
- Avoiding it (taking some physical actions)
- Transferring it (getting someone else to act)
- Reducing it to a tolerable level (possibly a combination of the above).
The actions you can take to manage risks are:
- Reactive (accept the risks by allowing the failure to occur),
- Proactive (avoid, minimize, or transfer the risks),
- Engineering, to design out failures (avoiding or transferring the risks) or to reduce their consequences.
The decision to act at all will be made on the basis of whether or not it is “worth it” to act. The consequences you avoid must be more costly (or more risky) than the actions you choose.
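The "worth it" test can be sketched as a simple comparison of expected annual costs. This is a minimal illustration only: the function name, the `effectiveness` parameter and all of the figures are hypothetical, and it looks at money alone, so safety or environmental risk would need to be weighed separately.

```python
def is_worth_acting(failures_per_year, cost_per_failure,
                    task_cost, tasks_per_year, effectiveness=1.0):
    """Return True when the consequences avoided outweigh the cost of acting.

    effectiveness is the assumed fraction of failures the task prevents.
    """
    avoided_per_year = failures_per_year * cost_per_failure * effectiveness
    spent_per_year = task_cost * tasks_per_year
    return avoided_per_year > spent_per_year


# Hypothetical figures, for illustration only.
# Oil changes: rare but very expensive engine damage, cheap task.
oil_change = is_worth_acting(failures_per_year=0.1, cost_per_failure=6000.0,
                             task_cost=80.0, tasks_per_year=2)

# Replacing headlights every six months: cheap failure, pointless task.
headlight_swap = is_worth_acting(failures_per_year=0.2, cost_per_failure=30.0,
                                 task_cost=25.0, tasks_per_year=2)

print(oil_change, headlight_swap)
```

With these made-up numbers the oil change pays for itself and the scheduled headlight swap does not, which matches the intuition from the car examples.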
Timing of actions
When you act can matter and it can be related to the nature of the failure and how it manifests:
- Scheduled (rigid adherence to some time or usage-based frequency or fixed interval), like oil and filter changes in your car.
- In response to some condition or sign that work is needed, like changing tires when they show signs of excess wear.
- Unscheduled (no fixed interval is used at all; this applies only to the reactive approach), like changing a burnt-out headlight.
- Flexible scheduling, which shifts scheduled or unscheduled work to convenient “windows of opportunity” when it can be done with minimal disruption to production. This implies that the work can be deferred or advanced with minimal consequences. It is enabled by checking equipment and system condition, catching degradation before it goes "too far", and timing your corrective action to be soon, but at the least inconvenient moment.
Your options, as with the car, include:
Run-To-Failure
Wait until the equipment fails. This approach is sometimes the only option. In electronic circuit boards and light bulbs, for instance, you get no warning of degradation and the failure will occur randomly. If you allow equipment to run to failure, it should be non-critical to operations, safety and environmental considerations. The consequences of its failure must be tolerable.
By default, most reactive maintenance will be done on an unscheduled basis, whenever the equipment fails. Because the equipment is of low criticality, its repair may even be deferred to a suitable window of opportunity. However, if you find that more than half of your work is repair work done in response to failures, then you are probably over-using this option.
Preventive Maintenance
We measure kilometers, cycles, throughput, fuel consumption, and running hours to give an indication of "wear out". We prevent failures by eliminating their causes through cleaning (to remove dirt and contaminants), lubrication (top up, cleaning, replacement), minor adjustments (resetting), replacements, periodic restorations (overhauls) and other failure prevention actions. In all cases, our goal is to take action before the failure occurs and thus prevent it, hence the name.
Condition Based (Predictive) Maintenance
This is actually two distinct activities, the first of which is recognizing that the equipment is failing (condition monitoring) and the second is correcting the defects before they progress to an unacceptable level (proactive repair). Often both are lumped under the heading of "condition-based maintenance," “on-condition maintenance”, or "predictive maintenance" because we predict imminent failure and then act. When the condition that indicates imminent failure falls outside acceptable limits, corrective repairs are made. If it remains within acceptable limits, then nothing beyond the inspection is done.
Timing of the monitoring inspections is critical. If you go too long between checks you increase risk of failure. If you check too often, you increase the costs of checking needlessly.
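The trade-off between checking too rarely and too often can be illustrated with a toy cost model. It rests on assumptions that are not from the article: each defect gives a fixed warning period (`warning_days`) between becoming detectable and failing, defects start at random times relative to the inspection schedule, and every inspection that falls inside the warning period catches the defect. All names and figures are hypothetical.

```python
def annual_cost(interval_days, warning_days, check_cost,
                failure_cost, defects_per_year):
    """Expected yearly cost of inspecting every interval_days."""
    checks_per_year = 365.0 / interval_days
    # If the interval is longer than the warning period, only a fraction
    # of defects will have an inspection fall inside their warning window.
    catch_probability = min(1.0, warning_days / interval_days)
    missed_per_year = defects_per_year * (1.0 - catch_probability)
    return checks_per_year * check_cost + missed_per_year * failure_cost


# Hypothetical figures, for illustration only.
candidates = [7, 14, 30, 60, 90]
costs = {
    days: annual_cost(days, warning_days=30, check_cost=50.0,
                      failure_cost=20000.0, defects_per_year=0.5)
    for days in candidates
}
best = min(costs, key=costs.get)

for days in candidates:
    print(f"every {days:>2} days: {costs[days]:8.0f} per year")
print("lowest cost interval:", best)
```

In this sketch, weekly checks cost more than they need to, and checks spaced wider than the warning period let failures through. Because the model assumes inspections never miss a detectable defect, a real program would likely want some margin below the warning period.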
Detective Maintenance
When you check tire pressures on your car, you should check 5 tires: 4 on the wheels and 1 spare in the trunk. Checking the 4 road tires is a condition monitoring activity. The check of the spare is actually a failure-finding test, or detective maintenance. Finding it deflated gives you the opportunity to restore its condition BEFORE you need it. You are increasing the spare tire's availability for service.
Like the spare tire in the car, your plant has a number of protective, backup or stand-by devices - probably more than you realize. These can be alarms, shutdowns, redundant equipment, warning lights, warning signs, first aid kits, defibrillator kits and other things that are only used when needed. Bear in mind that they are only needed when something else goes wrong. They provide protection in one form or another. Do they all work? Are they ready to operate when needed? Without checking, you cannot know for certain. If they pass inspection or their functional check, then you do nothing more until the time for the next test arrives. If they fail, you correct them.
Testing identifies the protective devices that have already failed. It does not prevent the failures, nor does it predict them. It finds them after the fact but, hopefully, before you need the devices to operate. By testing regularly and correcting any defects found, you are increasing their availability to act and therefore minimizing the consequences to your operation if a device is called on to perform its protective function.
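The link between test frequency and availability can be sketched with a common approximation: if a hidden failure occurs at a random moment between two tests, it goes unnoticed for about half the test interval on average. The function name and the figures below are hypothetical, and the approximation only holds when the test interval is much shorter than the device's mean time between failures (MTBF).

```python
def average_unavailability(test_interval_years, mtbf_years):
    """Approximate fraction of time a protective device sits failed, unnoticed.

    Assumes failures are random and test_interval_years << mtbf_years.
    """
    return test_interval_years / (2.0 * mtbf_years)


# Hypothetical device with an assumed MTBF of 20 years.
tested_yearly = average_unavailability(1.0, 20.0)
tested_monthly = average_unavailability(1.0 / 12.0, 20.0)

print(f"tested yearly:  unavailable about {tested_yearly:.1%} of the time")
print(f"tested monthly: unavailable about {tested_monthly:.2%} of the time")
```

Testing more often shrinks the time a failed device goes unnoticed, at the price of more testing effort, so the interval is again a judgment about what is "worth it".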
Redundancy may be built into a system and sometimes into the design of equipment. A great deal of electronic equipment has redundant circuitry to ensure its high level of reliability. If the primary unit fails, the secondary unit is available.
All maintenance managers know that having a spare is an excellent way to guard against loss of service. Unfortunately, this is also an expensive option that is (or should be) limited to situations where failure is absolutely unacceptable. Redundancy does not eliminate the failures or maintenance action; it merely allows service to continue when failures of the spared equipment or systems occur. Adding redundancy adds assets and capital cost as well as the potential for additional failure modes and maintenance. If you have redundant equipment or systems you also need to carry out detective maintenance on them to make sure they’ll be useful when needed.
Redesign
Designing out failures and the need for maintenance can be expensive, and it is really only viable at the early stages of the asset life cycle, when the asset is being designed and built. Retrofitting design changes once you are in operation is extremely disruptive and costly.
Proactive maintenance is our first line of defense against failures with severe consequences. Once the plant or equipment is in service, we use re-design only if the consequences of failures that cannot be prevented or predicted are beyond our tolerable limits.
Redesign is often an outcome of the modern quality approach known as Six Sigma, but it is important to keep in mind that it should really be limited to a "fallback" option dealing with a physical root cause of failures. It is often less expensive to maintain properly than to redesign the asset. Unless you design your systems right from the outset, the engineering approach to maintenance improvement is almost always the most expensive. Fortunately, at the design stage we have methods that work well to help us avoid the need to do re-design later.
Understanding where to use preventive, predictive, detective and run-to-failure tactics is key to setting up an effective proactive maintenance program. You will be able to choose tactics that are suitable to the failures you might reasonably expect to see. The best way to ensure that the program is both effective and efficient is to use Reliability Centered Maintenance (RCM-R®).
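As a rough illustration of matching tactics to failures, the car examples can be written as a small rule set. This is a simplified sketch built from the examples in this article, not the RCM decision logic itself; the `FailureMode` fields and the ordering of the rules are assumptions, and the sketch leaves out the "worth it" test that any real choice would also need.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    hidden: bool          # only revealed when something else goes wrong
    gives_warning: bool   # degradation can be detected before failure
    age_related: bool     # likelihood of failure rises with age or usage
    tolerable: bool       # consequences of failure can be accepted


def choose_tactic(fm: FailureMode) -> str:
    """Pick a maintenance tactic for one failure mode (illustrative rules)."""
    if fm.hidden:
        return "detective"          # periodic failure-finding test
    if fm.gives_warning:
        return "condition-based"    # monitor, then repair on condition
    if fm.age_related:
        return "preventive"         # scheduled replacement or restoration
    if fm.tolerable:
        return "run-to-failure"
    return "redesign"               # cannot prevent, predict or tolerate


car = [
    FailureMode("headlight burns out", hidden=False, gives_warning=False,
                age_related=False, tolerable=True),
    FailureMode("road tire wears out", hidden=False, gives_warning=True,
                age_related=True, tolerable=False),
    # Assumes no practical warning for a typical driver (no oil analysis).
    FailureMode("engine oil gets dirty", hidden=False, gives_warning=False,
                age_related=True, tolerable=False),
    FailureMode("spare tire deflates", hidden=True, gives_warning=False,
                age_related=False, tolerable=False),
]

tactics = {fm.name: choose_tactic(fm) for fm in car}
for name, tactic in tactics.items():
    print(f"{name}: {tactic}")
```

The point of the sketch is only that the tactic follows from the nature of the failure and its consequences, not from habit or from a blanket policy.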
Costs
The illustration at the top of this article shows the relative costs of various maintenance programs. The most expensive are at the extremes - doing nothing proactive, and overdoing preventive. Completely reactive maintenance programs are the default. They occur because of negligence and ignorance - the understanding of the value of being proactive and its application is lacking. Overdoing maintenance occurs if there is a poor understanding of preventive methods - they will be overused and can actually lead to more failures. Manufacturer recommendations are often heavy on preventive actions that, if followed faithfully, can actually result in additional failures.
Preventive (if not over-used) and condition monitoring programs will save you money. Allowing some run-to-failure can also save you money if allowed only where its consequences can be tolerated. Getting the mix right is what RCM does for you - it produces your lowest cost and lowest risk program.
CMMS Team Lead at Arctic LNG2
5 years ago: To what category does Over fall? If it is over mixture of PvM and CBM should it not be displayed in their category variables???
Managing Director /Owner at Blu Sky Engineering & Consulting
5 years ago: Good context in a simplified way. I prefer the TTTA route with risks. Treat, Transfer , Terminate and lastly Accept the risk that remains or cannot be further mitigated.
Fleet | Logistics Professional | M.Sc | LSSGB | ISO 55,000
5 years ago: James, thanks for sharing this as you made it quite simple to make understanding for all those who are new to PM program.
Organizational Effectiveness Resource & Certified RCFA Principal Investigator - Retired yet seeking opportunities to teach & mentor investigators
5 years ago: I liked the article. It should be easy to communicate its content to others leading to discussion. My only comment would be there is an opportunity between reactive and proactive to manage the failures. You can respond to early indicators (detection). You can have early indicators a tire is worn or battery is having problems you just need to respond. Good article. Thanks for sharing.