Using Data Models To Clean Up Your Digital Data Dump
In the early days of the Cloud, there were hundreds (yes, literally hundreds) of companies that would grab some data (OPC UA data) from your production system, store it in their Cloud and say, “There you go. Now you can optimize your production processes!” In my opinion, it was a complete scam, the equivalent of the snake oil salesmen of the 19th century.
Unfortunately, a lot of companies followed that path. Because it was easy to move data to a Cloud system, they did. It was a case of “I can, so I should.” What they didn’t realize was that, in the end, they had created a digital garbage dump: a pile of raw values with no structure and no context.
In this scenario, there is effectively no way to make wide use of that data within a manufacturing facility. It can only be used by a few skilled technicians who have a prior understanding of the data format and context.
This is shocking to IT folks when they first get involved in industrial data collection. IT folks often come from a different planet than their manufacturing colleagues. Everything in the IT world is well standardized. They have the benefit of a limited set of devices (printers, switches, routers, Windows PCs and so on) that almost all follow a set of known, well-developed standards. Standards compliance is a key purchase requirement for these devices: they integrate well into an IT architecture that isn’t all that different from corporation to corporation.
With that background, their heads explode when they learn about factory floor devices and connectivity. In manufacturing systems, not only is every production system different from every other production system, but many devices are unique, have specific requirements and follow one of many different connectivity standards, if they follow any standard at all.
WHY DATA MODELING?
Effective data modeling brings some order to this chaos. It enables the efficient use of industrial data by different systems operated by different kinds of users, each with unique needs. It means your team can make decisions based on data they trust to be complete and correct. It also provides the means to manage and secure that data over its life cycle (data governance).
A well-designed data model standardizes how data is stored, how it is structured and how it is represented. Data in an effective data model carries the context and metadata that allow the data to be fully understood and used properly.
Context is a defined, common structure for the data: common units and data types. Metadata is additional data that records when, where and how the data was gathered; it makes data easier to locate, understand, use and manage. Together, context and metadata provide the foundation for data governance, the processes and standards for gathering, storing, processing and disposing of data.
BUT WHAT IS DATA MODELING?
A data model can be expressed as a simple name-value pair or as a complicated structure like an OPC UA information model. A simple name-value pair like [SPD35, 22.5] is a data model, though an entirely inadequate one. It fails to identify the source device, what kind of data is being reported, when the data was collected and what the units are. More importantly, it isn’t related to any other data in the system, which makes it less useful to consumers of the data. Simple data models are only valuable to consumers who have prior knowledge of the structure and content of the data.
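As a minimal sketch of the difference (every field name and value below is invented for illustration, not drawn from any standard or product), here’s that same pair in Python, before and after context and metadata are added:

    # The bare name-value pair: useless without prior knowledge.
    raw = ("SPD35", 22.5)

    # The same value wrapped in context (structure, units, data type)
    # and metadata (when, where and how it was gathered).
    # Every field name here is hypothetical; real systems vary.
    modeled = {
        "tag": "SPD35",
        "description": "Zone 3 conveyor speed",  # what is being reported
        "value": 22.5,
        "dataType": "float",                     # context: data type
        "units": "m/min",                        # context: common units
        "source": "PLC-07 via OPC UA",           # metadata: how it was gathered
        "asset": "Plant1/Line3/Conveyor2",       # metadata: where it came from
        "timestamp": "2024-05-01T13:47:02Z",     # metadata: when it was collected
    }

A consumer who has never seen SPD35 before can now locate, understand and use that value, and govern it over its life cycle.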
Often, data models are more complicated, but they don’t have to be. A UDT (User-Defined Data Type) for a ControlLogix PLC can be considered a data model. UDTs bring together a set of related data values that can be moved and processed as a unit. A UDT is a convenient shortcut for working with related data, but it’s usually insufficient as a data model; most controllers provide little in the way of context. A UDT delivered as a data model to a higher-level system would leave a lot of questions unanswered, and it would be only marginally valuable to consumers of that data model.
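To see why, here’s a rough sketch of a hypothetical motor UDT, written as a Python dataclass rather than controller tags (the member names are invented for illustration):

    from dataclasses import dataclass

    # Roughly what a motor UDT carries: related values that move
    # and get processed together as a single unit.
    @dataclass
    class MotorUDT:
        speed: float     # speed in what units? RPM? m/min? The UDT doesn't say.
        current: float   # amps, presumably, but nothing here enforces that
        running: bool
        fault_code: int  # meaningful only to someone with the fault table

    m = MotorUDT(speed=1750.0, current=12.4, running=True, fault_code=0)

A higher-level consumer receiving this structure still doesn’t know which motor it describes, on which line, in which plant; the units and valid ranges of each member; or when the values were sampled. That missing context is exactly what a real data model has to supply.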
There is no generally accepted methodology for modeling data. Some tools and some standards exist, but most applications model data in their own way, depending on the needs of the applications being supported.
CONCLUSION
Manufacturing data models are needed to share manufacturing data across enterprise and Cloud applications with consumers who have different knowledge and different uses for the data. These users need to be able to quickly understand the data: its source, its structure and its underlying representation.
More and more tools are becoming available to meet this need. Some SCADA systems and some manufacturing databases provide this functionality. These systems are often impressive, but they aren’t designed to, and have only a limited ability to, share that data in an unrestricted way with lots of other systems. These tools are data collectors, modelers and consumers of the data all in one. They aim to provide a solution, not to collect data for distribution.
A tool that provides just this sort of functionality is the Intelligence Hub from HighByte (https://www.highbyte.com/). This tool collects and integrates data to improve data quality and to reduce the time spent preparing data for use in enterprise and Cloud applications. HighByte calls this approach Industrial DataOps, modeled after the DataOps practices used in software engineering. It comprises the practices, processes and technologies that combine, integrate and model data to improve quality, speed and collaboration, and to share that data widely across the enterprise.
I expect to see more and more tools that provide this sort of functionality in the future, eliminating the digital garbage dumps we see today.
John
(From the Real Time Automation Best Darn Newsletter)