It's About Time! 5 Points About Reliability (& Failure)
Salvador Dalí. (Spanish, 1904-1989). The Persistence of Memory. 1931. Oil on canvas, 9 1/2 x 13" (24.1 x 33 cm). © Salvador Dalí, Gala-Salvador Dalí

We've all heard the phrase "Time is money". In the reliability world, understanding how time affects failure and reliability can be a competitive advantage. The five points below illustrate how the ever-forward march of time can affect reliability and, in turn, profitability.

  • Reliability, by definition (at least in some definitions), is literally a function of TIME! In their report, Reliability-centered Maintenance, Nowlan & Heap defined reliability as "... the probability that an item will survive to a specified operating age..." (i.e. to a certain time). In some cases, the probability of failure increases as the equipment ages. In other cases, the probability of failure decreases. Weibull analysis can identify which failure patterns apply to your equipment. Most failure patterns look like straight lines on a Weibull plot, where the slope of each line is the Weibull shape parameter (see figure below). The second figure shows the conditional probability of failure for various Weibull shape parameters (i.e. these are the more familiar failure patterns).
Weibull plot (straight lines) depicting 5 of the 6 failure patterns found in Reliability-centered Maintenance
Conditional probability of failure for various Weibull shape parameters
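The link between the Weibull shape parameter and the failure pattern can be sketched in a few lines. This is a minimal illustration, not the author's analysis: it uses the standard two-parameter Weibull hazard rate as the continuous analogue of conditional probability of failure, and the scale value of 1000 hours and the sample ages are assumed purely for demonstration.

```python
import math

def weibull_reliability(t, beta, eta):
    """Probability of surviving to age t: R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate: h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def hazard_trend(beta, eta=1000.0, ages=(100.0, 500.0, 900.0)):
    """Classify how the failure rate changes with age for a given shape."""
    rates = [weibull_hazard(t, beta, eta) for t in ages]
    if all(math.isclose(r, rates[0]) for r in rates):
        return "constant"
    return "increasing" if rates[-1] > rates[0] else "decreasing"

# Assumed shape values: below 1, exactly 1, and above 1.
for beta in (0.5, 1.0, 3.0):
    print(f"beta={beta}: failure rate is {hazard_trend(beta)} with age")
```

A shape below 1 corresponds to an infant mortality pattern, a shape of 1 to random failures, and a shape above 1 to wear-out.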
  • Benjamin Franklin once said "nothing can be said to be certain, except death and taxes". If he had worked in Reliability, he might have added "failures" to this list. Failure is inevitable, and given enough TIME failures will happen. Trending how quickly failures accumulate provides a lot of valuable insight. From a Crow-AMSAA plot, we can determine whether our overall reliability is improving or deteriorating. We do this by observing the slope (beta) on the plot: on a log-log plot of cumulative failures against cumulative time, a slope below 1 means failures are arriving more slowly (reliability is improving), while a slope above 1 means they are arriving faster (reliability is deteriorating). In addition to showing how reliability is trending over time, the Crow-AMSAA plot may hint at major step changes in our reliability (perhaps an improvement was made to a pump). It can also hint at the nature of the failure modes involved (batch problems and infant mortality failure patterns tend to have a distinctive appearance).
Crow-AMSAA plots of failures accumulating over time
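As a rough sketch of how the Crow-AMSAA slope might be estimated, the snippet below fits the model N(t) = lambda * t^beta by least squares on the log-log values. This graphical regression is only one option (maximum likelihood estimates are also common), and the failure times are invented for illustration.

```python
import math

def crow_amsaa_fit(failure_times):
    """Least-squares fit of ln N = ln(lambda) + beta * ln t.

    failure_times: cumulative operating time at each failure.
    Returns (beta, lam).
    """
    times = sorted(failure_times)
    xs = [math.log(t) for t in times]
    ys = [math.log(n) for n in range(1, len(times) + 1)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    lam = math.exp(mean_y - beta * mean_x)
    return beta, lam

# Hypothetical data: the gaps between failures keep growing.
improving = [1.0, 4.0, 9.0, 16.0, 25.0]
beta, lam = crow_amsaa_fit(improving)
trend = "improving" if beta < 1 else "deteriorating or steady"
print(f"beta={beta:.2f}, lambda={lam:.2f}: reliability is {trend}")
```

With real data, a kink in the plotted points (rather than a single straight line) is what would suggest a step change worth investigating.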
  • When failures seem to be predictable and repeatable, we tend to think we can avoid failure by prescribing a TIME-based maintenance action. The plot below shows two curves. The first, narrower curve shows a typical range in which a PM may be completed for a filter replacement on a rotary lobe blower (in this case a time-based PM frequency of 6 months). Depending on your maintenance program, this PM may be completed a little earlier or later than the nominal 6 months (this leads to a distribution around 6 months). The second curve shows the time in service it may take for the filter to plug with debris and collapse (on average ~9 months for this curve). Unfortunately, because both the execution time of the PM and the rate at which the filter collects dust/debris vary considerably in this example, the PM curve and the failure curve overlap substantially. This overlap can lead to a plugged or collapsed filter prior to PM change-out, and dust/debris in the blower (and ultimately in the process). Being able to plot the normal variation in time of both maintenance activities and failure mechanisms can provide real insights that we can use to improve reliability. As maintenance & reliability professionals, we can use this information to guide an updated PM frequency or perhaps a change to a condition monitoring approach (e.g. monitoring pressure drop).
Distributions of PM completion time and filter time to failure
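The overlap between the two curves can be turned into a number. The sketch below assumes, purely for illustration, that PM completion time and filter life are independent and normally distributed; the means come from the example above (6 and 9 months), but the standard deviations (1 and 2 months) are hypothetical. A Weibull model of filter life would be a natural alternative.

```python
import math
from statistics import NormalDist

def prob_failure_before_pm(pm_mean, pm_sd, fail_mean, fail_sd):
    """P(filter fails before the PM is done), assuming independent normals.

    The difference (failure time - PM time) is itself normal, with mean
    fail_mean - pm_mean and standard deviation sqrt(pm_sd**2 + fail_sd**2).
    """
    difference = NormalDist(fail_mean - pm_mean, math.hypot(pm_sd, fail_sd))
    return difference.cdf(0.0)

# Standard deviations are assumed values, in months.
p_six_month_pm = prob_failure_before_pm(6.0, 1.0, 9.0, 2.0)
p_four_month_pm = prob_failure_before_pm(4.0, 1.0, 9.0, 2.0)
print(f"6-month PM: {p_six_month_pm:.1%} chance the filter fails first")
print(f"4-month PM: {p_four_month_pm:.1%} chance the filter fails first")
```

Under these assumed spreads, roughly one filter in eleven would plug before its 6-month change-out. Comparing candidate PM frequencies this way shows how much risk each one removes.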
  • Failures are not typically isolated events, but rather a sequence of cause & effect events that take place over TIME. An example of this can be a gear drive failure. The gear drive's reliability can be described as a function of time, but perhaps more relevant to our reliability and maintenance roles may be to understand how time affects the events that led up to the failure. The example below shows how a mineral oil may age and oxidize in a gear drive (in this case over a period of 1-5 years). Understanding the time frame involved can influence decisions related to lubricant analysis frequency, oil change frequency, system design and possibly even lubricant selection.
Plot of oil life in a gear drive application
  • Failures rarely happen without warning, although we may wish we had more TIME between a potential failure and a true functional failure. The time from an observed potential failure to a functional failure is referred to as the P-F Interval. In the case of the gear drive above, the observation of oxidized oil (by lubricant testing) can be considered a potential failure. The seizure of the gear box or dangerous vibration levels could be considered the functional failure (the functional requirement is defined by the user). With proper analysis and sufficient data, P-F intervals can be estimated. The plot below shows the P-F interval estimates for several potential failures observable with a typical lubricant analysis report (e.g. contamination, oxidation or viscosity issues, and excess wear metals). As noted in the previous point, failures do not happen as singular events, and each of these potential failures may lead to eventual functional failure (water contamination may lead to oil oxidation and viscosity changes, which in turn may lead to mechanical wear / metal debris, which could lead to high vibration and ultimately functional failure).
Curves showing typical P-F intervals for lubricant analysis defects
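One simple way to start estimating a P-F interval is to pair each detection of a potential failure with the functional failure that followed. The sketch below does only that; the history is invented, and the function and variable names are hypothetical rather than taken from any particular tool.

```python
from statistics import median

def pf_intervals(observations):
    """Return P-F intervals from (time_detected, time_failed) pairs."""
    intervals = []
    for detected, failed in observations:
        if failed < detected:
            raise ValueError("functional failure cannot precede detection")
        intervals.append(failed - detected)
    return intervals

# Hypothetical history for one potential failure (e.g. oxidized oil):
# (month the defect was detected, month of functional failure).
history = [(10, 16), (22, 26), (31, 40), (45, 50)]

intervals = pf_intervals(history)
shortest = min(intervals)
typical = median(intervals)
print(f"P-F intervals (months): {intervals}")
print(f"shortest: {shortest}, median: {typical}")
```

The shortest observed interval is usually the more conservative basis for setting an inspection frequency, and a common rule of thumb is to inspect at no more than half the P-F interval.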

No company or organization has the TIME or money (resources) to completely eliminate failure. Even on our best days, with a great design, precision installation, capable predictive maintenance, and great planning / scheduling / maintenance response times, the probability of failure at any given moment is always greater than 0. Understanding how failure and reliability relate to TIME is therefore a prerequisite to optimal management of our organizations. Contact Alejandro Erives at Blackstart Reliability LLC to get started applying these ideas to your business.

Rod Jenkins

Organizational Effectiveness Resource & Certified RCFA Principal Investigator - Retired yet seeking opportunities to teach & mentor investigators

5 years ago

Nice article, and easy to follow and understand. I agree with the information, but the filter and plugging is not a clear relationship issue. The equipment plugging is the filter, so they are inclusive events. I think a better example is a redundant pair or trio of pumps with different failure rates, where eventually you lose the redundancy because multiple pumps fail at the same time, i.e. one with 6 months, the other with 8 months.

JD Solomon

How to Get Your Boss's Boss to Understand by Communicating with FINESSE | Solutions for people, facilities, infrastructure, and the environment.

5 years ago

Nice, straightforward article. Time is indeed the important dimension. Maybe your next article can be on why it is missing from the definition of risk. Thanks for posting this one.

Vanessa Madrigal

Maintenance & Reliability Professional | Latina Entrepreneur | Passionate Advocate for Diversity & Inclusion

5 years ago

Great article, Alejandro! My concept of time relating to a failure, however, isn't always linear. Failure oftentimes starts at conception: the conception of a design to fulfill the objective it is meant to serve. We don't seem to give sufficient time and consideration to defining the parameters of what we intend our equipment/system to do to meet our end goal. Terrance's comment brings me back to our conversations about what is a Maintenance Engineer vs. a Reliability Engineer. They are not one and the same, but they do influence each other (in my opinion). Hope all is well, and good luck to you in your new endeavor!
