A Series of Unfortunate MTBF Assumptions
This image comes from the Dictionary of French Architecture from the 11th to the 16th Century (1856) by Eugène Viollet-le-Duc (1814-1879).

The MTBF calculation, total operating hours divided by the number of failures, produces a larger number if we make a series of convenient assumptions. We just need more time in the operating hours and fewer failures in the count of failures.
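As a quick numerical sketch of that arithmetic (every figure below is invented for illustration, not drawn from real field data):

```python
# MTBF = total operating hours / number of failures.
# All numbers are hypothetical, chosen only to show the direction of the bias.

units = 100

actual_hours = units * 2_000      # each unit truly accumulated about 2,000 hours
actual_failures = 8               # every customer-perceived failure counted

assumed_hours = units * 4_380     # assumed 24/7 operation for six months
counted_failures = 5              # "no fault found" returns quietly excluded

print(f"honest estimate:     {actual_hours / actual_failures:,.0f} hours")
print(f"flattering estimate: {assumed_hours / counted_failures:,.0f} hours")
```

Same product, same period: pad the hours and trim the failure count, and the reported MTBF more than triples.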

While we really want to understand the reliability performance of field units, we often make a series of small assumptions that impact the accuracy of MTBF estimates.

Here are just a few of the MTBF assumptions I’ve seen; in some cases I’ve seen nearly all of them made by a single team. Reliability data holds useful information if we gather and treat it well.

Assumptions Around Use

Since we only know the shipment date to a customer and when they call, let’s say the unit operated full time from shipment till they called.

We suspect it takes time to actually transport a unit from our factory to the customer. It may even take time to install and place units into service. Yet that remains unknown unless we ask some customers for information. Wouldn’t want to bother the customers.

Also, let’s assume every unit is in use. No spares or stored units, and nothing sitting in a warehouse or on a store shelf.

We do learn about some failures, let’s assume all of them, through a call or product return from the customer. Thus, we can only assume all other units are still in service, operating full time.

This set of assumptions tends to increase the number of hours we count as operating hours. That padding makes MTBF (or reliability, for that matter) look better than it actually is.

Spend the time to understand the interval from shipment until the start of service. It may be a distribution, from nearly immediate to months or years before a unit is placed into service. You should know this information.

Spend the time to understand the typical operating hours per day. Some items do work 24/7, while others run only on occasion. Maybe you have different classes of customers that use your product in very different ways. Again, you should know this information.
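Here is a minimal sketch of what crediting only the in-service time might look like; the shipment dates, installation lags, and duty cycles are assumptions made up for the example, not real customer data:

```python
from datetime import date

# Hypothetical field units: (ship date, days from shipment to installation,
# operating hours per day). All values are illustrative assumptions.
units = [
    (date(2024, 1, 15), 30, 24.0),  # runs 24/7 once installed
    (date(2024, 2, 1),  90,  2.5),  # occasional-use customer, slow install
    (date(2024, 3, 10), 14,  8.0),  # single-shift industrial customer
]

as_of = date(2024, 9, 1)

def operating_hours(ship_date, install_lag_days, hours_per_day, as_of):
    """Credit hours only after installation, at the customer's duty cycle."""
    days_in_service = max(0, (as_of - ship_date).days - install_lag_days)
    return days_in_service * hours_per_day

naive = sum((as_of - ship).days * 24 for ship, _, _ in units)
adjusted = sum(operating_hours(ship, lag, hpd, as_of)
               for ship, lag, hpd in units)

print(f"naive hours (24/7 from shipment):        {naive:,.0f}")
print(f"adjusted for install lag and duty cycle: {adjusted:,.0f}")
```

The difference between those two totals flows straight into the numerator of the MTBF estimate.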

Only Real Failures Count

Have you noticed that sometimes a customer will call to complain that the product isn’t working, and when you receive the product back from them, it works just fine? Funny (strange), isn’t it? About 25% of product returns (it varies by industry and specific product, of course) result in no trouble found, or no fault found. Many of these products are then cleaned up and shipped out to other customers.

Sometimes the customer suggests a product is a failure when it is the wrong color (my ex-wife did this once). Or, it’s a failure if it doesn’t solve the problem they thought it should. Or, it’s a failure if they no longer need the product. Sometimes they call a product a failure when it doesn’t function as expected.

When analyzing returned products, we want to know what to do differently or better to avoid future product failures. The units with something missing or broken are great: we can get to a root cause right in the lab and implement design or process changes to fix it.

When the analysis finds nothing wrong, do we question our analysis to make sure we are evaluating the product as the customer did? Rarely. The customer wanted the unit to operate outside on a cold day, while in the nice warm, clean lab it starts just fine. Did we just miss an opportunity to improve product reliability? Probably.

If it is an ordering mistake, the wrong color, doesn’t do what we thought it would do, or doesn’t operate as we expected (where is the on/off switch…), are those failures? Did they return the product? If so, the customer called it a failure.

By not counting software bugs, ordering or use errors, or any other group of claims or reasons for a product return, we still incur the cost of the return and the potential permanent loss of a customer. It is a failure; it just requires more than hardware-related changes.
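To make the effect concrete, here is a hedged sketch with made-up return categories, counts, and fleet hours, showing how the failure count we choose moves the estimate:

```python
# Hypothetical return reasons over one reporting period.
returns = {
    "confirmed hardware fault": 12,
    "no fault found in the lab": 7,
    "software bug": 4,
    "ordering or use error": 3,
    "did not meet expectations": 5,
}

fleet_hours = 1_500_000  # assumed total operating hours for the installed base

hardware_only = returns["confirmed hardware fault"]
every_return = sum(returns.values())  # everything the customer called a failure

print(f"counting only confirmed hardware faults: {fleet_hours / hardware_only:,.0f} hours")
print(f"counting every return:                   {fleet_hours / every_return:,.0f} hours")
```

Which of those two numbers better reflects what customers experience is exactly the question this section raises.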

Let’s Smooth to View Trends

Smoothing data is a nicer way of saying averaging. While MTBF is technically an average, an average of averages ‘smooths’ a monthly or weekly reported MTBF value just a bit more.

Let’s say we ship products weekly and group the products made in a specific week into a cohort for our analysis. The MTBF values are then estimated for each week’s production, assuming that week’s products all go into service at the same time.

Each week we add another batch to the products in the field, and we age all other batches one week. Pretty soon that is a lot of weekly MTBF values. With normal variation, let alone active changes to components, design, or processes, the week-to-week variability of MTBF may cloud or obscure the trends represented in the data.

Let’s assume a rolling average over 3 or 6 months of data will help us spot trends. Also, let’s assume there is no meaningful ramp or decline of production at the start, seasonally, or at the end of production.
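A small sketch of that smoothing step follows; the weekly MTBF values are invented, and a 4-week window stands in for the 13- or 26-week (3- or 6-month) windows mentioned above, just to keep the example short:

```python
# Invented weekly cohort MTBF values (hours); a real series would come from
# the weekly cohort analysis described above.
weekly_mtbf = [52_000, 48_500, 61_000, 39_000, 55_500, 47_000,
               58_000, 44_500, 50_000, 62_500, 41_000, 53_000]

def rolling_average(values, window):
    """Trailing average, reported once a full window of weeks is available."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

print(rolling_average(weekly_mtbf, window=4))
```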

Do you see any problems with this approach? I’ll leave this one open for your comments on why smoothing may be an issue.

Summary

Our ability to gather and analyze field data often plays a central role in understanding how well a product is performing in the hands of customers. If we make a series of unfortunate assumptions during the gathering, interpreting, or presenting of the data, we are likely to obscure the very information we seek to understand.

Add your comment on why smoothing as described above is a potential problem. Plus, add some of the other unfortunate MTBF assumptions you have seen (and hopefully exposed and corrected!)


Fred Schenkelberg is an experienced reliability engineering and management consultant with his firm FMS Reliability. His passion is working with teams to create cost-effective reliability programs that solve problems, create durable and reliable products, increase customer satisfaction, and reduce warranty costs. If you enjoyed this article consider subscribing to the ongoing series at Accendo Reliability.

Hilaire (Ananda) Perera P.Eng.

Proprietor/Consulting Engineer at Long Term Quality Assurance (LTQA)

4 years ago

Traditional reliability predictions based on handbook methods using MTTF (MTBF) are inaccurate. When you need to use MTTF, do not use single point estimates; use it with confidence levels. Prognostic Health Management (PHM) is more suitable for reliability prediction and remaining life assessment, since it considers actual operational and environmental loading conditions. Currently, research is being conducted to build up physics-based damage models for electronics, obtain the life cycle data of products, and assess the uncertainty in remaining useful life prediction in order to make PHM more realistic. In the future, due to the increasing amount of electronics in the world and the competitive drive toward more reliable products, PHM will be looked upon as a cost-effective solution for predicting the reliability of all electronic products and systems. Prognostics is the process of predicting the future reliability of a product by assessing the extent of deviation or degradation of a product from its expected normal operating conditions. Health monitoring is a process of measuring and recording the extent of deviation and degradation from a normal operating condition. To learn and apply PHM methodology, I recommend joining the CALCE Prognostic and Health Management Group: https://www.prognostics.umd.edu/
