Create Living FMEAs with Industry 4.0

Introduction

The goal for this chapter is this: how can we make the FMEA process:

  • Easier to perform?
  • More accurate?
  • More comprehensive?
  • With more context?
  • An ever-green, on-going part of the daily management process?

To get started, we will take a quick look at the traditional FMEA process in the next section.

FMEA Overview

A Failure Mode and Effect Analysis is a detailed quality process that lists out potential failure modes, the effects and causes of those failures, and then calculates the overall risk associated with those failures, typically expressed as a Risk Priority Number (RPN): the product of the severity, occurrence and detection ratings. The team then attempts to find ways to mitigate the effects, eliminate the occurrences or improve the probability of detection.
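
As a quick illustration, here is a minimal sketch (in Python) of the RPN arithmetic described above, assuming the common 1-10 rating scales for severity, occurrence and detection; your organization's rating tables may differ.

```python
# Minimal sketch of the Risk Priority Number (RPN) arithmetic.
# Severity, occurrence and detection are each assumed to be rated 1-10.

def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number = severity * occurrence * detection (range 1-1000)."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} rating must be 1-10, got {value}")
    return severity * occurrence * detection

# A severe effect (8) with occasional occurrence (4) and weak detection (6)
# scores 192 and would likely land near the top of the action list.
print(rpn(8, 4, 6))  # 192
```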

FMEA techniques have been around for over 50 years. They gained widespread adoption thanks to the automotive industry and the QS-9000 supplier requirements. This standard requires suppliers to conduct product/design and process FMEAs in an effort to eliminate failures before they happen.

An FMEA can provide significant savings for a company by improving throughput and by reducing the internal costs of quality and external warranty expenses.

While an FMEA can be performed on a product or on a process, the principles and steps are the same. The objective of a product or design FMEA is to uncover problems with the product that will result in safety hazards, product malfunctions or a shortened product life. Process FMEAs uncover problems related to the manufacture of a product, and they will be our primary focus here.

FMEA Process

I’ve worked with many different customers who use slightly different process approaches for FMEAs. There are also standards defined by various governing bodies. However, they all have certain things in common.

For a process FMEA, it is necessary to define three things up front:

  • The Team – usually this is a team of 4-6 people who have some level of knowledge of the manufacturing process and expertise in diagnosing failure modes and their effects and causes
  • The Scope – What process will the team be investigating? What are the beginning and end points?
  • The Process Map – Once the scope is determined, the process steps should be defined prior to the event

Once in the event itself, the FMEA usually proceeds by listing out the following for each step (a minimal sketch of one resulting worksheet row follows this list):

  • What is the function or purpose of each step (optional in some forms of FMEA)
  • The complete set of failure modes – how can that step of the process fail to achieve its purpose?
  • What is the complete set of effects for each of those failure modes (there can be more than one)?
  • What is the severity of that effect? How much will that effect impact safety, product quality or throughput?
  • What is the complete set of potential causes for each failure mode (there can be more than one)?
  • What potential prevention methods exist for that cause?
  • How likely is that cause to occur?
  • What detection methods exist for that cause?
  • What is the probability of detecting that cause when it does occur?
  • What is the risk management plan?
  • This can include ways to mitigate the effect of the failure, eliminate occurrences of the cause or to raise the probability of detection
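
To make the worksheet structure concrete, here is a minimal sketch of how one row produced by these steps might be represented in software. The field names and the bakery-flavored example values are illustrative, not taken from any particular FMEA standard.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FmeaRow:
    """One line of a process FMEA worksheet (field names are illustrative)."""
    process_step: str
    function: str
    failure_mode: str
    effect: str
    severity: int               # 1-10
    cause: str
    prevention: str
    occurrence: int             # 1-10
    detection_method: str
    detection: int              # 1-10
    actions: List[str] = field(default_factory=list)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

row = FmeaRow(
    process_step="Oven conveyor", function="Transport loaves through the bake zone",
    failure_mode="Conveyor stops", effect="Burned product / line downtime", severity=8,
    cause="Conveyor pin fatigue", prevention="Scheduled pin replacement", occurrence=4,
    detection_method="Drive-fault alarm on the PLC", detection=3,
)
print(row.rpn)  # 96
```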

This is usually a long, time-consuming exercise for the entire team. It often takes a full week or more to complete a thorough FMEA on a decent-sized process.

Unfortunately, once they are completed, they are rarely kept up to date. Once the initial set of actions is completed, the FMEA will often sit in a drawer or on a hard drive, only to be pulled out during audits or other calamities!

In our next section, we will look at how Industry 4.0 can help.

How Industry 4.0 Can Help

How can Industry 4.0 help to make the FMEA process easier, more valuable or both?

Where Automation Can Help

Where can we best apply these technologies? Going back to the list of FMEA steps, the following stand out as potential areas where various solutions can provide value:

  • Generating and maintaining the list of failure modes, effects and causes
  • Generating and maintaining the ratings for severity, occurrences and detection
  • Expanding the solution space of potential actions

Combining Industry 4.0 and FMEA

Let’s walk through an example of how we can get some of the information for our FMEA from our systems.

We start with a simple FMEA worksheet. This example just uses the basic columns.

The columns for the failure modes and causes are highlighted. Where would this information exist in our current (or future) solutions?

One source of information could be from the PLC controller for the machine. From this controller, we can get a *lot* of information. We can trap fault codes that tell us the particular failure mode experienced by the machine. We can also track input or process variables that help us manually or automatically diagnose which causal factors were underlying the event. We can also track downtime, rejects, failed tests and so forth to trap the specific effect of this event.
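
As a rough sketch of that idea, the snippet below maps raw PLC fault codes to the failure modes, effects and candidate causes already defined in the FMEA. The fault codes, descriptions and causes here are hypothetical.

```python
# Hypothetical mapping of PLC fault codes to FMEA failure modes and causes.

FAULT_CODE_MAP = {
    "F101": {"failure_mode": "Conveyor stopped", "effect": "Line downtime",
             "candidate_causes": ["Pin failure", "Drive overload"]},
    "F217": {"failure_mode": "Oven under temperature", "effect": "Underbaked product",
             "candidate_causes": ["Burner fault", "Door left open"]},
}

def classify_event(fault_code: str) -> dict:
    """Look up the FMEA context for a raw fault code. Unknown codes are flagged
    so they can be reviewed and added to the failure-mode list."""
    return FAULT_CODE_MAP.get(
        fault_code,
        {"failure_mode": "Unclassified", "effect": "Unknown", "candidate_causes": []},
    )

print(classify_event("F101")["failure_mode"])  # Conveyor stopped
```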

Alternatively, we could also capture operator entries on failure modes, or even their estimates of the cause. An operator-facing PLC screen can allow for that type of input, and we will see a more detailed example later.

Also, when a new failure mode, effect or cause is identified, it should be added to the list in the data collection system, and the lists within the FMEA should automatically be updated with that information.

Ultimately, by tracking this information in our shop floor systems on a 24/7 basis, we can then feed this information back into the FMEA as updates to the Severity, Occurrence and Detection risk numbers. This keeps the analysis up to date with the process and allows tracking of the real impact of the failure modes and causes over time.
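
One possible way to turn those 24/7 actuals into an updated Occurrence rating is a simple frequency-to-rating lookup, sketched below. The thresholds are purely illustrative; the real rating table should come from the FMEA standard your team follows.

```python
# Illustrative only: convert the observed frequency of a cause (events per
# 1,000 cycles, computed from shop-floor data) into a 1-10 occurrence rating.

OCCURRENCE_TABLE = [            # (max events per 1,000 cycles, rating)
    (0.01, 1), (0.1, 2), (0.5, 3), (1, 4), (2, 5),
    (5, 6), (10, 7), (20, 8), (50, 9), (float("inf"), 10),
]

def occurrence_rating(event_count: int, cycles: int) -> int:
    rate_per_thousand = 1000 * event_count / cycles
    for threshold, rating in OCCURRENCE_TABLE:
        if rate_per_thousand <= threshold:
            return rating

# 14 pin failures across 9,500 oven cycles is roughly 1.5 per 1,000 -> rating 5
print(occurrence_rating(14, 9500))
```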

Where to Get the Data

Our next step is to look at where we can get the data for the analysis.

As we will see, there is a massive amount of information out there already – it is a matter of accessing and analyzing it.

Data Originates at Every Step in the Value Chain

To show just how much data is available, let’s walk through an example first.

At one of our bakery customers, we were initially told that they did not have much data available for the project. When we walked the floor with the customer, we catalogued how much data they were already capturing – much of it was in Excel / Access or locked away in the PLCs, but it was there.

Just a few examples:

  • Data from Receiving Inspection
  • From inventory, they tracked ingredient dispensing by lot to comply with the Food Safety Modernization Act
  • At Mixing and Dividing, the machine controller captured the process time and many different process variables; they also had a checkweigher
  • They also had a standalone SPC system in use at this station and others
  • At Proofing and Baking, they captured process variables such as the oven temperature, conveyor speed and more
  • At Final Inspection, they had a vision system that checked the color and other factors on the bread
  • At shipping, they had another checkweigher and tracked information about production against the work order

Throughout the entire process, they used spreadsheets to track non-conformances, scrap and downtime issues.

We worked with the customer to pull the information from the existing systems and to create quick web forms on top of a SQL database to replace the spreadsheets. In an upcoming section we will see how this information was used to improve the process.

Where to Get the Data – Common Systems

It is hard to give a definitive list of systems where we can source the information we will need.

Unfortunately for this conversation – this varies with every customer. Different companies have different systems implemented in different ways.

But there are standard systems where we look for the data. The list below is not meant to be comprehensive, but these are the common places where we start to look for the data we can use in the process.

  • Industrial Internet of Things (also useful to access raw PLC data)
  • MES
  • Quality Systems
  • Supply Chain Systems
  • ERP
  • Employee Time Tracking Systems (e.g., Kronos)
  • RFID Tracking

High level data such as scrap, rework costs and so forth can be found in ERP systems. ERP exists in just about every company we work with. The ERP system can also be used to apply business context such as costing to data we collect from other sources.
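
For example, a costing rate pulled from the ERP work-center master can be joined to downtime events collected on the floor to express losses in dollars. The sketch below uses pandas; the column names and rates are hypothetical.

```python
# Sketch: apply ERP business context (a standard cost rate) to downtime events
# collected on the floor. Column names and rates are hypothetical.
import pandas as pd

events = pd.DataFrame({
    "line": ["Line 1", "Line 1", "Line 2"],
    "cause": ["Pin failure", "Burner fault", "Pin failure"],
    "downtime_min": [45, 20, 30],
})
erp_rates = pd.DataFrame({          # e.g. from ERP work-center master data
    "line": ["Line 1", "Line 2"],
    "cost_per_min": [38.0, 52.0],
})

costed = events.merge(erp_rates, on="line")
costed["downtime_cost"] = costed["downtime_min"] * costed["cost_per_min"]
print(costed.groupby("cause")["downtime_cost"].sum().sort_values(ascending=False))
```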

But the details of the process usually have to be found elsewhere.

MES systems and Data Historians contain massive amounts of data that can be extremely helpful in this process. Depending on the industry and the process undergoing the FMEA, these may be nearly sufficient in themselves.

However, for other companies Industry 4.0 systems can take this information to another level of detail. Industry 4.0 systems such as I-IoT can capture and provide information at any level of detail useful to the analysis. They are also almost infinitely flexible to fill any gap current systems may have in the data collection process. For example, additional sensors can be used in the process to detect causal factors that were previously impossible to capture.
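
As a hedged illustration of that last point, the sketch below polls an added vibration sensor on a conveyor drive and logs a potential cause when readings exceed a threshold. The sensor read is simulated here; in practice it would come from the I-IoT platform (OPC UA, MQTT, a vendor API), and the alarm limit would be set from your own baseline data.

```python
# Illustrative sketch: monitor an added vibration sensor on a conveyor drive
# and log a potential cause when readings exceed an assumed alarm limit.
import random
import time

VIBRATION_LIMIT_MM_S = 7.1          # assumed limit; set from your own baseline data

def read_vibration() -> float:
    # Stand-in for the real I-IoT read (OPC UA, MQTT, vendor API, ...)
    return random.gauss(4.0, 2.0)

def log_fmea_event(cause: str, value: float) -> None:
    # Stand-in for writing the detected cause to the shared event store
    print(f"FMEA event: cause={cause}, vibration={value:.1f} mm/s")

for _ in range(100):                # polling loop (shortened for the sketch)
    reading = read_vibration()
    if reading > VIBRATION_LIMIT_MM_S:
        log_fmea_event("Conveyor drive bearing wear", reading)
    time.sleep(0.01)
```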

Another note on I-IoT and sensors – for many companies, such as machine builders, this is an opportunity to build these capabilities into the products they ship to their customers. They can then run a live FMEA process on their own products in the field, based on the telematics data they receive back from usage.

Additional systems such as Time Systems or RFID can also help to provide contextual information to the analysis.

There are many additional potential sources of data, as we will see in the next section highlighting quality data.

Where to Get the Data – Quality

When looking at using systems to provide this data, there are typically a large number of systems at each company that contain some form of quality data.

For example, we worked with a firearms manufacturer that asked us to provide a consolidated view of quality data across their facilities. Within their plants they had 57 different systems that contained quality data. It was nearly impossible for any single person at the company to know about all the systems, let alone use them for analysis.

Automating Root Cause Analysis

The next step is to focus on what type of information we capture from the process. One key is to collect information beyond simple performance data. It is not enough to look at a metric such as first pass yield (FPY); it is important to know what is driving the rejects. Without that information, you can tell if things are getting better or worse, but you will not know why!

For the FMEA process – it is the “why’s” that are absolutely critical to know.

Once you start to collect it, there are many ways to organize the diagnostic data. One way to do it is on a fishbone (or Ishikawa) diagram. It can also be done through pareto diagrams with drilldowns, treemap diagrams, and many more.

The key is that you want to have that structure defined as you are capturing information from the process. Then it will be available for the analysis that you want to do later.

There are several approaches to this categorization. One is to have the operators enter information when events occur on the shop floor. Another is to capture fault codes and other data from the machines and to define a map from those codes to the diagnostic categories.

Another way is to use machine learning capabilities to automate the categorization efforts. We will discuss this in more depth in the section on Machine Learning.

Customer Example (Bakery)

This is an example from the bakery referenced earlier. At this plant, we combined automated data collection from the PLCs with data input by the operators to categorize downtime and quality events. We then designed a process where the team would be able to compare performance between the lines & shifts, identify trends and compare performance of the current period to past periods. We helped design the team meeting to quickly identify where the previous shift had issues and create focus on getting those issues addressed as rapidly as possible. We also looked to create standard work to drive consistency from shift to shift in how they performed. With this tool, we were able to measure our progress in that effort.

Where we really started to make huge gains was when we stepped down to another level of detail. For every instance of downtime or quality issues, we started using the existing system to track the Area of the issue, the Category of the problem and the Cause. Then we trained the continuous improvement team to use the tool to identify where the plant's biggest issues were happening.
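
The drilldown itself is straightforward to reproduce against the collected event data. Below is a small pandas sketch of the Area / Category / Cause pareto described here; the event records are made up for illustration.

```python
# Sketch of the Area -> Category -> Cause drilldown using pandas.
# The event records below are made up for illustration.
import pandas as pd

events = pd.DataFrame({
    "area":     ["Equipment Loss", "Equipment Loss", "Equipment Loss", "Quality Loss"],
    "category": ["Oven", "Oven", "Divider", "Final Inspection"],
    "cause":    ["Conveyor pin failure", "Burner fault", "Blade jam", "Color out of spec"],
    "downtime_min": [180, 25, 40, 0],
})

# Pareto of downtime by area, then drill into the largest area by category/cause
by_area = events.groupby("area")["downtime_min"].sum().sort_values(ascending=False)
top_area = by_area.index[0]
by_cause = (events[events["area"] == top_area]
            .groupby(["category", "cause"])["downtime_min"]
            .sum().sort_values(ascending=False))
print(by_area)
print(by_cause)
```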

After just a couple clicks we can start to identify those big issues. In this example, we clicked on the biggest problem area of “Equipment Loss” and then we clicked on the biggest category of “Oven”.

The trend chart then shows some critical information:

  • Most days we have very little downtime as a result of the oven
  • But then we see huge spikes of downtime on a periodic basis
  • So we immediately know we have a preventive maintenance issue here

After just a bit more digging in the detailed information at the bottom, we identified that the conveyor pins were failing after a number of cycles in the oven. We analyzed how often this was failing and put a preventive maintenance program in place to inspect and replace those pins. By doing so, we were able to eliminate the single biggest source of downtime in the plant.

As we saw in an earlier slide, these events can be fed back into the FMEA analysis to highlight the failure modes, the causes and the effects of those issues. The FMEA also gives us the structure to capture what actions to take in the event those causes occur.

Automatically Identify, Classify and Prioritize with Machine Learning

I mentioned earlier that we can utilize machine learning and artificial intelligence to automate parts of the FMEA process or to increase the solution space.

Among the many uses for machine learning is improving the probability of detection. If we think about how we monitor for issues, sometimes it is not possible to directly detect that a particular issue is happening.

But it may be possible to look at other process variables or inputs to infer what we seek. If we can identify variables that are correlated to (or predictive of) that particular failure cause, then we can monitor those other variables to indirectly detect our causal factor. Statistical analysis and machine learning can help us identify which other process variables to use for this process.
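
A very simple version of that screening is a correlation check across the variables you already log, sketched below. The variable names and values are hypothetical; in practice you would use far more history and, likely, more robust methods than a raw correlation.

```python
# Sketch: screen already-logged process variables for correlation with a cause
# that cannot be sensed directly. Variable names and values are hypothetical.
import pandas as pd

history = pd.DataFrame({
    "dough_temp_c":   [27.1, 28.4, 26.9, 29.0, 27.5, 28.8],
    "mixer_load_amp": [41.0, 46.5, 40.2, 48.1, 42.3, 47.0],
    "proof_humidity": [82.0, 80.5, 83.1, 79.8, 81.7, 80.1],
    "under_proofed":  [0, 1, 0, 1, 0, 1],   # the cause we cannot sense directly
})

# Variables strongly correlated with the cause become candidates for
# indirect detection (and for alarm limits in the control system).
correlations = history.corr()["under_proofed"].drop("under_proofed").abs()
print(correlations.sort_values(ascending=False))
```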

Another key application of AI/ML is to automatically classify events into different causal and failure categories to reduce the load on operator input. Instead of having our workers try to determine the failure mode and/or cause, we can have the algorithm make that determination and simplify the reporting.
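
As one hedged example, a basic text classifier can learn to assign free-text event notes to causal categories, as sketched below with scikit-learn. The notes and labels are invented, and a real deployment would need a much larger labeled history plus a review loop for low-confidence predictions.

```python
# Sketch: a simple scikit-learn text classifier that assigns free-text event
# notes to causal categories. Notes and labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

notes = [
    "conveyor stopped, pin sheared at oven exit",
    "oven temp dropped, burner lockout alarm",
    "pins worn, chain skipping on oven conveyor",
    "low bake temperature after burner fault reset",
]
labels = ["Conveyor pin failure", "Burner fault",
          "Conveyor pin failure", "Burner fault"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(notes, labels)
print(model.predict(["chain jumping, sheared pin found under oven"]))
```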

Broaden the Solution Space: Eliminate, Mitigate and Detect

Another big impact of Industry 4.0 is to broaden the solution space. For many problems in manufacturing, the fix will be some alteration to the physical equipment involved. Going back to the fishbone diagram, these fixes primarily relate to the “Machine” branch. However, when the problem is related to Man, Material, Method, Management or Environment, the fix is often process- or system-based.

For all of these issues, the many technologies of Industry 4.0 can provide tremendous benefits. As we saw in the bakery example, we can identify and eliminate causes of downtime or quality events using these solutions.

In other cases, we can mitigate the effects when those issues occur. A simple example is rapid notification of a downtime event, combined with remote support from maintenance through augmented reality, so that the operator can address simple problems directly instead of waiting for a technician to travel to the machine.

Finally, we also discussed the ability to improve our probability of detection through additional sensors, IoT technologies and machine learning algorithms.

The FMEA as a Living Document

The next area to cover is the FMEA as a living document. That is the central idea of making the document ever-green so that it never gets out of date. Currently, the FMEA is typically a one-time effort that is performed as a project, perhaps updated for a while, but it quickly grows out of date.

The intention of this new approach is for the FMEA to remain live and continuously updated.

Continuous Updates from Data Sources

The biggest step in achieving the FMEA as a living application is being able to collect information from different sources and utilize it to continuously update the FMEA analysis. In this approach, the FMEA app will have an ongoing integration with those source systems so that it continues to receive information about the process.

In a traditional project-based approach, the data collection is performed and then manual entries are made in Excel or a bespoke FMEA solution as the analysis is performed. The lists of failure modes, effects and causes are entered manually. The RPN numbers are estimated and entered manually. The actions are handled the same way – there is no closed loop established with the existing systems. Thus, as the manufacturing process changes, the analysis becomes stale and out of date.

To create our ever-green approach, we must have the FMEA application tied into the source systems themselves. The lists of failure modes, effects and causes should be common across any system that captures information about the manufacturing process. The RPN numbers can either be automatically adjusted based on actuals or a weekly/monthly process can be established to review the underlying data to make manual adjustments. But even in this case, the app should provide access to any data required to make those manual adjustments.
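
A minimal sketch of that closed loop is a scheduled job that pulls event actuals from the shop-floor event store and refreshes the Occurrence ratings in the FMEA data. The table and column names below are hypothetical, and occurrence_rating() refers to the frequency-to-rating helper sketched earlier.

```python
# Sketch of a scheduled closed-loop update: pull event actuals from the
# shop-floor event store and refresh occurrence ratings in the FMEA data.
# Table/column names are hypothetical; occurrence_rating() is the
# frequency-to-rating helper sketched earlier.
import sqlite3

def refresh_occurrence(db_path: str, cycles_this_period: int) -> None:
    with sqlite3.connect(db_path) as conn:
        actuals = conn.execute(
            "SELECT failure_mode, cause, COUNT(*) FROM events "
            "WHERE period = 'current' GROUP BY failure_mode, cause"
        ).fetchall()
        for failure_mode, cause, event_count in actuals:
            rating = occurrence_rating(event_count, cycles_this_period)
            conn.execute(
                "UPDATE fmea_rows SET occurrence = ? "
                "WHERE failure_mode = ? AND cause = ?",
                (rating, failure_mode, cause),
            )
```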

This closed-loop integration is critical to the success of this approach.

The FMEA Application

The final topic to discuss in going digital is how to present the FMEA application. Given the number of possible outputs, data sources and goals for improvement, there is no single recommendation on which approach to use. In this section, we’ll cover a few that we have seen succeed with our customers.

Purpose-Built FMEA Solution

One approach is to use a purpose-built FMEA system. These types of systems are specifically designed for FMEA analysis and have many, if not most, of the mapping outputs built into the software. There are several such packages in the market, but it is outside the scope of this document to get into the details of one system versus another.

One of the key factors to consider when looking at those systems is how easy it is to feed data into and out of the solution. Depending on how much you want to automate the criteria for assessing the RPN factors, this could be important. But it is critical to be able to keep the failure modes / causes in sync with the data collection systems on the floor.

Business Intelligence Solutions

The next option on our list is to utilize business intelligence (BI) systems such as PowerBI, Tableau or Qlik. These systems are very good at taking fragmented information across disparate sources, consolidating that information together and then providing cohesive views of that data.

These systems will also provide for ongoing data collection from those source systems. The downside with these tools is that none of them have the FMEA format pre-configured. It must be done as part of an initial implementation. However, once that initial process has been completed, the system should be reasonably easy to modify going forward.

Industrial Internet of Things Platforms

Our next solution in the list is an Industrial Internet of Things (I-IoT) solution. These solutions have similar advantages/disadvantages to the business intelligence options with a couple of key exceptions.

The first is that when a data source does not exist in the BI approach, another new system must be implemented to perform that data collection. When using an I-IoT approach, that system itself can plug practically any data collection gap that exists. In addition, the I-IoT system can act as a hub for data on the shop floor to persist data and also to integrate systems where necessary.

Where I-IoT systems typically fall a bit short is the analysis of historical information. While it is possible to do so in those systems, that function is much easier in a BI system. Some of our customers have utilized a blended approach between these two solutions in a very successful hybrid application.

Custom Software Development

The final approach we have seen with our customers is the custom software approach where everything is built from scratch. Some of the customers we have worked with have tremendous capabilities within their IT departments. A few of those customers had already put custom solutions in place for data collection. In those cases, it made sense for them to continue that development to add these visualizations and workflows in place on top of their existing code base.

Standard Work Integration

The final step is to standardize the solution. I cannot emphasize enough the key role of standard work here. When we think about standard work in manufacturing, the first thing that pops into mind is getting all the operators to work with the equipment in the same way every time. And absolutely that is part of the solution to many problems.

But we also have to think about leader standard work. Setting the culture and making a solution part of the process means having the team leaders, supervisors, value stream managers and executives all be a part of the solution.

Where event categorization is being done partially or entirely by the operators, it is critical that standard work be built up around that process.

  • The operators must have standard work to assign failure modes / causes when events occur
  • Maintenance should have standard work to update those when they work on a machine
  • Supervisors should have standard work to review what is being reported throughout the day
  • Plant leadership should be reviewing that data on a daily / weekly basis, as well

We have seen operator input be a very comprehensive and accurate means of tracking – but only when the review and use of that information is embedded deeply into the culture of the organization through standard work.

Summary

The key to this new approach is that you no longer think of an FMEA as a project. There is a project to create the FMEA in the first place. But once that FMEA is created, it is not a project that ends there. It becomes a management tool that is used on a daily / weekly basis to monitor and improve operations. There is more time involved up front to make this happen, but this one-time effort can be leveraged across all the processes within the manufacturing facility – the system integrations only have to happen once.

Another insight that we have had in our experience is that the initial FMEA can be used to inform the ongoing data collection plan within the plant. This makes the rollout something of an iterative process where there may be a “light” initial FMEA done in the traditional manner to identify what data is going to be required over the long term.

One key to using the FMEA as an ongoing management tool is to use it to track actions taken to improve the manufacturing process. This can then be tied back into the results to see the real impact of improvement efforts and whether those efforts should be replicated on other processes in the facility.

The other big key for use as an ongoing management tool is to build the FMEA into standard work on the shop floor and for leadership. This is always true, but is especially important when manual tagging for reason codes is performed for events.

Finally, it is worth mentioning the potential impact of embedding I-IoT into your products. For example, if you are a manufacturing machine builder there can be enormous internal and external benefits to having this capability built into your machines.

Starting internally, the usage data you receive from your machines in the field can be fed back into your design/product FMEA and used to improve future iterations of your design. It can also dramatically improve your customer service through reduced downtime events, proactive alerts, faster fixes and improved first time fix rates. Finally, it can improve sales through the promise of product and service improvements, additional revenue streams for value-add reporting, and improved perception of your product in the market.
