Industry Pi.0 - It's all about what you do with the data

Not long ago, in a comment on an article I wrote on LinkedIn about Industry 5.0 (which I consider a deviation from the necessary focus on smart manufacturing technologies and business models), someone brilliantly called the stage we live in Industry Pi, or Industry 3.14159265.

It's funny because it's easy to memorize, but also because it carries a huge dose of truth: anyone with field experience in industry and in the application of technology can easily see that, with a lot of variability, we are closer to the 3rd industrial revolution than the 4th.

From Industry 1.0 to Industry 4.0

More than in the automation, control, and traceability components, where this shows most clearly is in the industry's attitude towards data. It is therefore relatively easy to assess a company's maturity from the way it collects, processes, and uses the data it produces.

Many companies still live in a pre-Industry 3.0 stage, in almost absolute obscurity. The factory has little or no data, much of it collected manually or through relatively rudimentary applications.

In the Pi stage, probably where most companies are, there is a set of software solutions in place, such as ERP, MES, and QMS, along with some automation solutions. It is common at this stage to have data collected from machines, but it is stored in silos, in isolated files or databases, which eventually someone will try to analyze.

Tell me what you do with your data, and I'll tell you which stage you are in

Trying to evolve from the Pi stage (perhaps into a Pi++ stage?), some companies seek to interconnect these solutions and begin to tackle the problems of disparate information sources. Furthermore, they already understand that data is one of their greatest assets and therefore seek to centralize information in data warehouses.

In a maturity stage already on the way to Industry 4.0, there are companies that have realized they need centralized solutions for collecting and storing data from various applications. This is when we see companies looking to create centralized data lakes that they imagine will not only finally become the single source of truth, but also promise to be the door to the artificial intelligence “El Dorado”.

And here a new phase begins, often called pilot purgatory. Many companies embark on machine learning and data science initiatives, lured by much-hyped promises of great insights and predictive models that will boost the company's performance, only to end up with quite disappointing results. Unfortunately, more often than not, the solutions turn out to be inefficient, costly, and difficult to scale.

Most of the time it is not due to lack of data – many manufacturers are gathering loads of data and sending it to central data lakes. The problem is that data scientists then spend most of their time creating data sets and cleaning data, not running advanced algorithms to uncover valuable insights.

With AI everywhere, why is it so difficult to use it in Manufacturing?

The title of this section says it all. We're surrounded by AI in everything we do, lately with LLMs becoming omnipresent. But if this is the case, why isn't manufacturing leveraging it yet?

The challenge is that, unlike in the past, truly extracting value from data requires organizations to employ a combination of three roles: data engineers, data scientists, and domain specialists. This multidisciplinary approach is crucial for overcoming the "pilot purgatory".

Data engineers are responsible for building and maintaining the infrastructure and architecture that allow for efficient data collection, storage, and access. This includes setting up databases, data warehouses, and data pipelines. They ensure that data is available, reliable, and correctly formatted for analysis. Without robust data engineering, organizations can struggle with data silos, poor data quality, and inefficiencies that impede the scaling of data initiatives.
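As a toy illustration of what that plumbing looks like in practice, here is a minimal sketch of an ingestion step (Python with pandas; the folder names, file layout, and column names are hypothetical, and writing Parquet assumes pyarrow is installed) that validates raw machine CSV exports and lands them in a single, analysis-ready table:

```python
import pandas as pd
from pathlib import Path

RAW_DIR = Path("raw_exports")                       # hypothetical drop folder for machine CSV exports
CURATED = Path("curated/sensor_readings.parquet")   # hypothetical curated store

REQUIRED_COLUMNS = {"machine_id", "timestamp", "value"}

def ingest(raw_dir: Path = RAW_DIR, target: Path = CURATED) -> pd.DataFrame:
    """Read all raw CSV exports, validate and normalize them, and write one curated table."""
    frames = []
    for csv_file in sorted(raw_dir.glob("*.csv")):
        df = pd.read_csv(csv_file)
        missing = REQUIRED_COLUMNS - set(df.columns)
        if missing:
            # Reliability: reject malformed exports instead of silently polluting the store
            raise ValueError(f"{csv_file.name} is missing columns: {missing}")
        df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)   # one consistent time zone
        df = df.drop_duplicates(subset=["machine_id", "timestamp"])   # makes re-runs idempotent
        frames.append(df)
    curated = pd.concat(frames, ignore_index=True)
    target.parent.mkdir(parents=True, exist_ok=True)
    curated.to_parquet(target, index=False)          # requires pyarrow (or fastparquet)
    return curated
```

The specific tooling matters less than the principle: someone has to own availability, reliability, and format before any analysis starts.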

Data scientists, on the other hand, analyze and interpret complex data to help make informed business decisions. They use a variety of techniques from statistics, machine learning, and predictive modeling to uncover insights and patterns within data. Their expertise is vital in turning raw data into actionable insights. However, without proper data infrastructure and domain knowledge, their ability to deliver meaningful results can be limited.

Finally, domain specialists (aka subject matter experts) have deep knowledge and understanding of the specific area or industry the organization operates in. They provide context to the data and help in formulating relevant questions and interpreting the results in a meaningful way. Without their input, data analysis can lack practical applicability or miss crucial industry-specific nuances.

There are many references to these roles and to the need for them to work together; the topic is not new at all. As an example, you can find below a schematic view of these roles, taken from an HP blog post, "The New Data Science Team: Who's on First?", in which the author, funnily enough, says that "AI is too important to leave to the data scientists alone".


Overlapping roles of data scientists, data engineers, and domain experts

To get rich from your data, you first need data enrichment

Once again, this title says it all. The key to moving beyond pilot projects is the collaboration between the three roles described before. And one of the main topics they need to collectively resolve is data enrichment and contextualization.

The idea is not new and not specific to manufacturing. Yet, it seems most people simply forget to analyze how other industries resolve their problems and apply similar solutions. In fact, data enrichment is a common practice in a lot of industries.

In marketing and CRM, companies start with basic customer data like names, email addresses, and purchase histories and enrich it with additional information such as demographic details, social media activity, or browsing behavior, to gain a deeper understanding of their customers; in financial services, customer data is enriched with investment history and transaction patterns to help understand the customer's financial behavior; e-commerce platforms enrich their customer data with browsing patterns, product preferences, and feedback or reviews; in healthcare, patient records are enriched with lab results, genetic information, or lifestyle data, so that better diagnosis and treatment planning can be done.

And in manufacturing it should be no different. If, for instance, the target is to perform predictive maintenance, then operational data from machinery needs to be combined with machine information and specs, maintenance history, the material currently being processed, the program being used, and so on. In terms of quality, sensor data or other quality parameters collected from the equipment can be enriched with the specific production batch, product specifications, machine settings, etc., to predict quality or pinpoint the root cause of quality issues.
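To make this concrete, below is a minimal sketch of such an enrichment step (Python with pandas; the machine IDs, column names, and values are made up for illustration): raw, feature-poor sensor readings are joined to MES-style batch context, so that each reading is tagged with the batch, product, and recipe running at the time.

```python
import pandas as pd

# Raw, feature-poor sensor data: just a machine, a timestamp, and a couple of readings
sensor = pd.DataFrame({
    "machine_id": ["M01", "M01", "M02"],
    "timestamp": pd.to_datetime(["2024-03-01 08:05", "2024-03-01 09:10", "2024-03-01 08:20"]),
    "spindle_temp_c": [61.2, 74.8, 58.9],
    "vibration_rms": [0.31, 0.55, 0.27],
})

# MES-style context: which batch, product, and recipe was started on each machine, and when
mes_context = pd.DataFrame({
    "machine_id": ["M01", "M01", "M02"],
    "batch_id": ["B1001", "B1002", "B2001"],
    "product": ["widget-A", "widget-B", "widget-A"],
    "recipe": ["R-17", "R-22", "R-17"],
    "batch_start": pd.to_datetime(["2024-03-01 07:30", "2024-03-01 09:00", "2024-03-01 08:00"]),
})

# Enrichment: attach to each reading the most recent batch started on the same machine
# (an "as of" join on time, grouped per machine)
enriched = pd.merge_asof(
    sensor.sort_values("timestamp"),
    mes_context.sort_values("batch_start"),
    left_on="timestamp",
    right_on="batch_start",
    by="machine_id",
)

# The readings now carry batch, product, and recipe context, so a quality or maintenance
# model can learn per-product and per-recipe behaviour instead of treating it as noise
print(enriched[["machine_id", "timestamp", "spindle_temp_c", "batch_id", "product", "recipe"]])
```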

What we're adding is data context. And data context is all the meaningful relationships between data sources that support the use cases.

Almost every company of a certain size that I deal with has some sort of data lake, or at least a data lake project. And in the vast majority of those cases, they are storing the diverse data in separate areas of the same "data lake", with no contextualization.

And this is where MES comes into play. I don't know of any better source of manufacturing context, capable of enriching otherwise uncontextualized and feature-poor equipment- or sensor-level data, than Manufacturing Execution Systems.

Clean up your data lakes

The idea of using MES not only as a contextual layer, but also as the main structure of a Common Data Model, is tremendously important, and I am convinced it will become a game changer in our industry. I will deal with it in another article, but for now, I would like to simply pass on some of the reasons why data lake projects are not yet successful in manufacturing.

As luck would have it, I found a great article from Cognite, "Clean up your data lakes: Data contextualization in the manufacturing industry" (cognite.com).

It dates from 2020, and although we likely advocate a different solution to the problem, the findings are spot on; I would not have said it any better.

First, for many data lakes, metadata is an afterthought.

Metadata is often an overlooked aspect of many data lakes. Although raw data is theoretically accessible for a wide array of immediate, potential, and not-yet-discovered uses, proactive management of metadata typically takes a back seat.

“Raw data — absent of well-documented and well-communicated contextual meaning — is like a set of coordinates in the absence of a mapping service. Those lucky few who intuitively understand the coordinates without a map may benefit. For all the rest, it’s the map that provides the meaning. Without a map, coordinates alone are useless to the majority.”

Second, data lakes lack contextualization.

There's a notable absence of contextualization in data lakes. While certain applications may utilize raw data effectively, the majority require data that has been further refined through contextual processing. This can include aggregation, enrichment, or modifications from machine learning.

"Data lakes that only store data in an untransformed, raw form offer little relative value. These vast amounts of expensively extracted and stored data are rendered unusable to anyone outside the data lake project team itself.

Third, liberating the data to democratize insights

Many mid-sized manufacturing companies, primarily dealing with IT data, might find a basic data catalog sufficient. Yet, large operators managing a combination of operational technology (OT) and information technology (IT) data—especially in environments rich with IoT data and complex brownfield data scenarios—require robust, enterprise-grade solutions for data contextualization.

By effectively contextualizing data, businesses can unlock substantial immediate benefits and time savings in industrial performance optimization and advanced analytics efforts.

"By liberating data from their silos, defining the relationships between them, and making it all available in the cloud, manufacturers create a foundation on top of which they can build both advanced and low-code digital tools that make insights available across the organization, enabling remote monitoring and diagnostics. This lets engineers focus on solving operational problems, improving existing products and services, and developing new solutions."

Michael Deng

Grid Software ASIA Commercial and APAC Alliance Leader GE Vernova

7 months ago

Manufacturing also brings up something new - Pi.0 indeed is an interesting way to address the criticality and frustration of data management in this sector. From Hadoop to Azure/AWS, the data lake doesn't really solve the real problems, if only because the diversity of subsectors and endless verticalization hinder a universal data standard and best practice. I do believe there is a reason why data fabric has surged as the mainstream architecture solving a large part of these challenges. Federation/virtualisation has been embedded into the DF methodology based on metadata/cataloging, so no actual data sources will be touched. But Pi.0 carries this to a deeper granularity. …good thoughts and sharing.

Hammad EL FILALI

Digital, Data and AI from strategy to execution | ex PwC | ex Deloitte | Data Mesh Trainer and Expert| VP- Global Head of Capital Market Data Centre of Excellence at Crédit Agricole CIB

7 months ago

Francisco Almada Lobo thanks for this article. I'm not sure data lakes are nowadays the right response to enable data valorisation, including in the industrial field. Data mesh is the new paradigm to get the best benefit from data and make the industry state of the art! Happy to discuss this further with you.


Interesting piece, Industry PI reflects the reality of many manufacturers being closer to Industry 3.0 than a truly data-driven Industry 4.0. Looking forward to your next article Francisco Almada Lobo!

Craig Pinegar

Solution Architect | SCP | SCE | ERP | EPPM | MES | PLM | FSM | WMS | TMS | DQM | MDM | iPaaS | Analytics | Director | Investor | Husband | Dad

7 months ago

Really enjoyed this piece, Francisco Almada Lobo!
