Legacy MES – When is the right time to jump out of the water?
The Boiling Frog Syndrome

The boiling frog is a fable describing a frog being slowly boiled alive. If the frog is put suddenly into boiling water, it will jump out, but if it is put in tepid water and then brought to a boil slowly, it will not perceive the danger and will be cooked to death.

The analogy is self-explanatory. Year after year, many companies, especially those that implemented MES systems many years ago, see the global maintenance effort increase and the competitive advantage of their digital solutions decrease.

The maintenance effort does not necessarily come from the original MES itself, which is often frozen as if preserved in formaldehyde, but from the entire solution that has grown around it, in hundreds or thousands of patches developed to overcome its limitations. Furthermore, any significant change or new business idea that deviates from the standards the solution was designed for is severely handicapped.

It's all about the debt

To explore this, I will use the concept of technical debt. Although the concept has existed for a long time, it was only in 2016 that the academic community agreed on a definition: "a collection of design or implementation constructs that are expedient in the short term, but set up a technical context that can make future changes more costly or impossible. Technical debt presents an actual or contingent liability whose impact is limited to internal system qualities, primarily maintainability and evolvability".

According to Deloitte, there are three main types of technical debt that adversely affect organizations.

Design Debt: Arises when IT doesn't follow a holistic approach to architecture and solution design. This can be the consequence of an expedited design phase or of adding applications or modules to the solution without considering the overall design. This debt adds significant risk to future innovation and can lead to governance issues within the IT department.

Software Code Debt: Comes from poorly written, complex, obsolete, unused, duplicated, or untested code that works but may not be up to standards for automatic testing or future development.

Infrastructure Debt: Results from aging IT infrastructure components that are essential for applications and services. As organizations grow, maintaining and updating these systems in a timely manner becomes more challenging due to limited time and budget.

Technical debt - analogy to financial statements - source: www.purepower.com/blog/what-is-technical-debt

The debt of Legacy MES systems

What is interesting about legacy MES systems is that they accumulate all three types of debt. Manufacturers in high-tech areas of production, such as semiconductors, were early adopters of MES technology. In fact, they adopted this type of system even before the term MES was coined.

Critical Manufacturing is now headquartered in a location that was once a Texas Instruments semiconductor factory, installed in the 1970s. In what was undoubtedly a pioneering use at the time, the factory used an MES system with VT100 style terminals with a satellite connection to the mainframe systems where it ran. Therefore, while the role of MES systems in industrial digitalization processes is being discussed in numerous industrial segments, it is worth remembering that some industries have been using them for almost 50 years. The main reason for this adoption was the need for control and traceability.

Texas Instruments plant in Portugal (TISEP, 1974)

But this is where one of the main problems lies. Many of these industries continue to use production solutions that were often created 20, 30 or more years ago.

Existing commercial systems, despite some configuration possibilities being quite advanced for the time, quickly became insufficient to keep up with the evolution of the business. There were not many proven alternative solutions back then, so the most frequent option was to extend the solution with patches or home-grown peripheral systems. Of course, these in turn ended up generating additional issues, which were resolved with more patches and additional systems.

This new compound solution became a contraption which, over time, significantly increased all three types of debt. First, design debt, because the solution was never thought through holistically from an architectural point of view. There are many points of failure and a lot of overlapping and duplicated functionality, resulting in a heavy maintenance and support effort.

Second, code debt. One of the most widely used solutions in the semiconductor segment was Workstream, whose extensibility was (and still is!) done using COBOL programming.

And finally, infrastructure debt. It seems unbelievable, but some top companies that still use these systems have done anything and everything to maintain the hardware needed to run legacy MES (e.g. end-of-life hardware like HP AlphaServer and HP PA-RISC 9000), including buying parts on eBay!

Global maintenance costs are therefore increasing, but they are often hidden or spread across internal headcount and various external services, on top of the direct impact on production through planned or unplanned downtime.

However, despite all these costs, the biggest of all is the impact on the organization’s business by hindering the ability to innovate and to provide customers with enhanced products and services. Once again using semiconductor as an example, a Frontend facility requires several scenarios to be supported by advanced MES capabilities.

Critical Manufacturing has launched a Technical Guide called “Advanced MES Capabilities for Semiconductor Front-End Manufacturing”, explaining those scenarios in MES-agnostic terms. It covers topics such as Experiments Management, Recipe Management and Chamber-Dependent Recipes, Run-to-Run, Reticle Management, Send-Ahead Wafers, Sorter Integration, Equipment Qualification, Equipment Dedication, Contamination, Sampling and Process Queue Time Constraints, among others.

Send Ahead Wafer Scenario - Advanced MES Capabilities for Semiconductor Front-End Manufacturing

Further Reading: Critical Manufacturing - Advanced Scenarios in Semiconductor Manufacturing

How can this be done with Legacy MES? It can’t. And these functionalities are either covered by other applications, with all the problems of integration and data duplication, or by extensions to Legacy MES, in a tangle of patchwork.

So why aren’t companies jumping out of the water?

It’s about risk. Many executives or people in charge simply ask themselves: “Is it worth it? We’ve survived so far… why should we take the risk now?”

In fact, the risk is there. From “changing a tire while the car is still moving” to a “heart transplant”, analogies abound for what such an endeavor implies. MES has become so important in ensuring the quality of processes and products and optimizing resources, that factories that use it simply cannot live without it. Any downtime has a huge impact. High investment facilities need to operate nearly 100% of the time, with significant financial impact if they don’t.

But beyond its core functionality, the MES often takes on the role of information system backbone, with all surrounding applications depending on MES data and events. That is why, beyond the direct impact, stopping it really stops everything.

As its limitations have been overcome over the years with patches or home-grown peripheral systems, very few people, if any, know the ins and outs of these extensions. When developed internally, they tend to be poorly documented, and often no one has touched them for a long time.

These risks, of dependence and of lost knowledge, make replacement a large-scale project. The resources needed, the time required for planning, preparation and execution, and the risk mitigations make it expensive and very time-consuming.

With all those risks… can it be done?

Of course. It's been done countless times. In 2004 I was project manager of an MES replacement project in a high volume backend facility at Infineon. It was a large project with many people involved with significant knowledge in both manufacturing and MES systems, many of whom work at Critical Manufacturing today.

Infineon Semiconductor Backend (today part of AMKOR)

What we did then was nothing short of amazing. We migrated the solution using a dual system strategy, running the new and old MES systems in parallel for 6 months, until we disconnected the old one.
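As a rough illustration of the dual-system idea (not the actual Infineon implementation), a parallel run can be pictured as feeding every shop-floor event to both systems and logging any disagreement, so that the old MES is switched off only once the two agree consistently. All names below (`LegacyMes`, `NewMes`, `track_out`, `parallel_run`) are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class LegacyMes:
    """Hypothetical stand-in for the old system of record."""
    lots: dict = field(default_factory=dict)

    def track_out(self, lot_id: str, step: str) -> str:
        self.lots[lot_id] = step
        return f"{lot_id}@{step}"


@dataclass
class NewMes:
    """Hypothetical stand-in for the replacement system."""
    lots: dict = field(default_factory=dict)

    def track_out(self, lot_id: str, step: str) -> str:
        self.lots[lot_id] = step
        return f"{lot_id}@{step}"


def parallel_run(events, legacy, new):
    """Send each event to both systems and return the list of mismatches."""
    mismatches = []
    for lot_id, step in events:
        old_result = legacy.track_out(lot_id, step)  # legacy stays authoritative
        new_result = new.track_out(lot_id, step)     # new system runs in shadow
        if old_result != new_result:
            mismatches.append((lot_id, step, old_result, new_result))
    return mismatches


# Illustrative events only; a real plant would stream these from the shop floor.
events = [("LOT-001", "etch"), ("LOT-002", "litho"), ("LOT-001", "implant")]
legacy, new = LegacyMes(), NewMes()
mismatches = parallel_run(events, legacy, new)
print(len(mismatches), legacy.lots == new.lots)
```

In a real plant the comparison would have to cover far more than a return value (material state, genealogy, equipment status), and the criteria for disconnecting the old system would be agreed upfront.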

Since that time, and already in the context of Critical Manufacturing, we have done many successful migration projects. In fact, among the dozens of projects we carry out annually, some are greenfield facilities, but the majority are brownfields, with some sort of MES in place. Sometimes these are commercial solutions, but many times they are homegrown ones, neither of which is still able to meet the requirements of evolution and innovation.

The approach we used at Infineon, the parallel dual system strategy, is not always the most suitable one. In fact, there are many more cases of phased introduction or even big bang than the ones using dual system.

But there is no right or wrong migration strategy; the choice depends heavily on what fits the needs of the plant. This includes the level of risk that is acceptable and the amount of investment the business is willing to put into the project. The complexity of interdependence and interaction with other systems and applications, the level of automation, and the impact of downtime will also guide the choice of migration strategy.

To start the reflection that must be done together, we use this white paper. It is necessarily high-level, because the devil is in the details, and there is no other way than a more in-depth conversation between those who need to do it and those who have already done it.

Guide to Successful MES Replacement - Migration strategies explained

Further Reading: Critical Manufacturing - Guide to Successful MES Replacement

Conclusion - MES migration: If not now, then when?

One thing is clear though: the cost implied by technical debt ultimately amounts to an accumulation of financial liabilities. But beyond the financial impact, accumulated technical debt affects the organization’s current and future business by hindering the ability to innovate. And the consequences grow each time the company decides not to change its legacy system and extends its life for another period.

It is important to perceive this non-decision as a decision. And one with gradually more serious impacts. The water is boiling in many organizations, without anyone paying attention... until they get to the point where no further decision will be necessary...

While the requirements and specifications take a bottom-up approach, deciding when it is time to jump out of the water must be top-down. A very well defined legacy system can take you far in businesses where the fundamental design of the product does not change (the system design of an MES for an automotive factory will fundamentally never change if the assembly line is mostly reusable, so infra debt, software debt or design debt won't matter much), but in places like job shops or bespoke manufacturing, the MES keeps accumulating software debt and design debt. Infra debt for OT networks can most often be absorbed. So it really comes down to the leadership team rallying the manufacturing teams to modernize and reduce tech debt, depending on what they see as an opportunity or a threat. ROI is often overestimated or highly underestimated. At the end of the day, it is only the leadership's will, the cash available in the business and the appetite to endure change management that drive such projects.

Grant Vokey

Retired and gone kayaking

11 months ago

Although the timing may be considered an evaluation of ROI, the analysis of that information is no easy task. It also depends on how well the legacy system was deployed in the first place, on any changes in the company’s manufacturing model that can (or can’t) be supported by the legacy system, and on any cash flow concerns the company may have. If the legacy system was deployed well, making small changes is likely easier than a full deployment, especially if the company is tight on cash flow at the time. However, if there are limitations in the legacy system that are hampering growth, then breaking through that barrier may be very important. The other concern may be the availability of support for the legacy system. At some point, manufacturing management is forced to bite the bullet and let go of their pet system. To really be able to evaluate this timing, one must have a deep understanding of the company’s MOM program and the strategic direction of the company as a whole. No simple task in my opinion!

Nelson Ferreira

Industry 4.0 Coordination at Bosch Thermotechnik Gmbh; Bosch Termotecnologia SA

11 months ago

Francisco, Great article with very useful insights

Koel Banerjee

Business Systems Analyst @Lycra | Expertise in MES Implementation

11 months ago

Excellent article… the patchwork done to “keep the lights on” is never a long-term solution. Poor source control and the loss of people who have full knowledge of the system make it even worse. The knowledge of legacy MES developed 30 years ago is slowly dripping away. IT is the most taken-for-granted area in manufacturing organizations.

Thomas Seubert

Manufacturing Execution Systems Engineer at Mara Technologies USA

11 months ago

The answer is actually quite simple, but often ignored: it is based on the ease of the new deployment and its Return on Investment (RoI). Coupled with that is how well the new system supports the existing processes. If the new system requires major reworking of existing processes and technologies and requires considerable capital, the frog will not jump. Period.

More articles by Francisco Almada Lobo