MTBF free Availability
The classic formula for availability is MTBF divided by MTBF plus MTTF. Standard. And pretty much wrong most of the time.
Recently working for a bottling plant design team we pursued the design options to improve availability and throughput of the new line. The equipment would remain basically the same, filler, capper, labeler, etc. So we decided to gather the last 6 months or so of operating data which included up and . the data included time to failure and time to repair information.
The plant traced availability using the classic formula, simply the total operating or up time by the total run time. We learned during plant visits that the longer runs tended to get better availability. And the data, looking at short runs (1 day) versus long runs (5 days) did show a marked difference in availability. It also showed the MTBF went up with longer runs. The MTTR remained pretty constant or went up just a little.
Recall we have the ‘time to’ data. A little work with a spreadsheet and we fit distributions to the data. Weibull, and Exponential clearly fit different equipment results. And, we had a wealth of data, permitting very good data analysis and distribution fit decisions. For sake of this article, let’s assume the equipment operating time is well described by a Weibull distribution with a beta less than 1. And the time to repair data is well described by the lognormal distribution. While this wasn’t universal across all the types of bottles and equipment, it will work for the purpose of this article
So why do we commonly use MTBF with the assumption of an underlying exponential distribution? There are many reasons: ignorance, “always done that way”, lack of ‘time to’ data or in rare cases the assumption is valid. Let’s remove the ignorance excuse and ‘always....’ and make the case to get the right data at the same time
Why is using MTBF so bad for availability calculations? Let’s look at one example, the design of the bottling line. There are up to 10 steps in the high volume bottling process and each step involve highly complex and expensive equipment. The design team was balancing the line availability and throughput per shift with the cost of the equipment. The other part of the design consideration was the cost of storing finished goods with the change over time between flavors and bottle sizes. The idea was to add just enough redundant equipment that the line change over time and line availability permitted frequent line changes which significantly reduce finished goods storage costs.
In a perfect plant, each flavor and bottle size combination would have a dedicated line. In the expected run times of the flavor and bottle size combinations ranged from a few hours per month to two weeks a month, with most running for less than a week. Also, consider the equipment cost a few million dollars per unit.
Therefore, we built model of the existing line configuration and the various proposed line configurations. The model would permit the simulation of various expected line management policies to determine ability to reduce finished good storage costs. The model would significantly influence the million dollar decisions in the project.
Back to why we want to look at the underlying and very common assumption - MTBF and MTTR often assume the exponential distribution. This distribution does not account for changes in the failure or repair rate. The first hour and hour of operation after that have exactly the same average failure rate or repair rate.
Recall the observation and supporting data that suggest neither MTBF nor MTTR are constant, both seem to depend to some degree on the length of the run. Well, I’m a statistician and the plant had years worth of data on the equipment. Happiness.
First get the data and determine the best fitting distribution. This is basic regression analysis. Here is an example of the difference between assuming the exponential and fitting a distribution.
Here the data was drawn at random from a Weibull Beta 2 Eta 1000 distribution. the Weibull fit is pretty good as expected. Forcing the fit to exponential overestimates the failure rates at earlier times, and those earlier times of often of most interest.
Second, determine how to calculate availability given the fitted distributions. This second step had me hitting the books.
The general formula to calculate availability is based on expected values and not MTBF and MTTR based on exponential distributions. ‘Expected Values’ if you are like most engineers you only have a vague recollection of this statistical term. Most associated this phrase with the mean or average value, which is mostly true. The availability formula changes from
to this
If are familiar with the Weibull distribution you recall if is equal to 1, the characteristic life is the theta. The same as for an exponential distribution. When the beta is not equal to one, the characteristic life is not equal to theta. The characteristic life is another way to say expected value.
For the exponential distribution the expected value calculation is very commonly used to calculate MTBF.
Whereas, for the Weibull distribution the formula is
and, for the Lognormal distribution the formula is
Therefore, with the data properly described with the appropriate we calculate the expected values and determine the availability. Easy.
Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article consider subscribing to the ongoing Accendo Reliability.
Senior Director- Presales at KPIT
7 年Hi Fred, the concept explained in a very simple fashion. The main takeaway is to do a distribution fit before going to the next step. With software tools available to do this, it should not be a deal. Obviously, in the bottling plant change over case you have mentioned, it's not just a reliability issue but also a system throughput issue, something that Lean tries to solve with concepts like SMED ( single minute exchange of die)
All things asset PROCESS, INTEGRITY, RELIABILITY, MAINTENANCE and LIFE EXTENSIONS
7 年I think the confusion stems from what we understand of what is meant by Expected Value of a continous variables. It is not an arithmetic mean of operating runtimes but population mean
Senior Reliability Engineer at ASML
7 年Hi Fred! MTBF works reasonable ok if you have multiple failure modes combined which together form an exponential behavior. Indeed if you have a single failure mode with a single non-exponential behavior Weibull is best.
Principal Director Intelligent Asset Management |Gen AI| Digital Manufacturing Operations & Artificial Intelligence | HSE at Accenture
7 年Its a good read but the formula you have used its Inherent availability but we have others achieved, operational,Mean availability. I have other queries also , will send in separate. Thanks
Proprietor/Consulting Engineer at Long Term Quality Assurance (LTQA)
7 年The classification of availability is somewhat flexible and is largely based on the types of downtimes used in the computation and on the relationship with time (i.e., the span of time to which the availability refers). As a result, there are a number of different classifications of availability, including: ? Instantaneous (or Point) Availability ? Average Uptime Availability (or Mean Availability) ? Steady State Availability ? Inherent Availability ? Achieved Availability ? Operational Availability A wide range of availability classifications and definitions exist. The ones indicated here are the most common but variations exist and you should be aware of how they are calculated and what they mean so that you can make an appropriate choice for the analysis you are performing