Spacetime Data Hub Technology Connecting the Physical and Digital Worlds
Advances in information technology have made it possible to project physical spaces into virtual spaces known as “digital twins”. In a digital twin, the virtual space mirrors the real-world space: buildings, factory machinery, weather, and other elements are all represented in the virtual space. Data is essential to achieving this, and the data used to construct the virtual space is collected from the real world. LiDAR or 3D scanning technology captures the data needed to reconstruct real-world space in virtual space. IoT sensors collect the data that defines the environment of the virtual space or the movements of virtual objects. Various statistical data, performance information of factory workers, driving records of drivers, and other data are refined and processed before being provided in the virtual space.
However, virtual spaces can also generate diverse data that is difficult to collect in the real world, thanks to the simulations that run in them. Predictions about the future are also generated through artificial intelligence technology. These simulation results are visualized through integration with technologies such as AR/VR, and virtual objects are sometimes projected into the physical space.
Digital Twins and Spatiotemporal Data
According to Professor Tobia Lakes of Humboldt University in Berlin, about 80 percent of the world's data has time and space attributes. In other words, most data generated in the physical space has spatiotemporal characteristics. Data collected for simulations in particular tends to depend strongly on location and time. Therefore, to store and process data efficiently in a digital twin, it is crucial to understand the nature of spatiotemporal data.
Firstly, all data has a lifecycle. Even the 3D maps that make up the virtual space need to be updated whenever the structure of the physical space changes. IoT data is sometimes used in real time but can also be used for statistical or analytical purposes. Thus, all data has a period in which it is used, and analyzing how long past data remains in use is crucial. Depending on its age, data should be categorized as hot, warm, or cold, with separate storage allocated for each category.
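The age-based categorization described above can be sketched in a few lines of Python. This is only an illustration: the function name storage_tier and the one-day and thirty-day thresholds are assumptions made for the example, not values taken from any particular data hub, and real cut-offs would depend on how the data is actually used.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical thresholds chosen for illustration only.
HOT_MAX_AGE = timedelta(days=1)
WARM_MAX_AGE = timedelta(days=30)


def storage_tier(recorded_at: datetime, now: datetime) -> str:
    """Return "hot", "warm", or "cold" based on the age of a record."""
    age = now - recorded_at
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"


# Example: three records of different ages, evaluated at a fixed "now".
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
readings = {
    "live_sensor": now - timedelta(minutes=5),
    "last_week": now - timedelta(days=7),
    "last_year": now - timedelta(days=365),
}
tiers = {name: storage_tier(ts, now) for name, ts in readings.items()}
```

In this sketch the tier depends only on age. A production system would likely also consider access frequency, which the age-based rule above does not capture.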
Secondly, all spatiotemporal data can be categorized as fixed or moving data. Fixed data is collected continuously from a single point, such as an automatic weather station or sensors attached to factory equipment. Analysis of fixed data is usually organized by collection point, and time series analysis is commonly used. Because fixed data is tied to a specific location, analysis of spatial correlations is also frequently performed. Moving data is collected from moving objects, such as vehicles or drones, by attaching sensors to them. Because moving data is collected along a trajectory, both the collection time and the location change continually. Path analysis and artificial intelligence methods such as reinforcement learning are often used to analyze moving data. The storage method for fixed and moving data needs careful consideration, depending on how the data will be analyzed or used.
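To make the distinction concrete, the following Python sketch models the two kinds of records and one typical operation on each: a per-station average for fixed data, and a path length for moving data. The record types, field names, and helper functions are all hypothetical, and the haversine formula with a mean Earth radius of 6371 km is used as a simple stand-in for real path analysis.

```python
import math
from dataclasses import dataclass


@dataclass
class FixedReading:
    station_id: str  # the location is implied by the collection point
    t: float         # seconds since an arbitrary epoch
    value: float


@dataclass
class MovingReading:
    object_id: str
    t: float
    lat: float       # degrees
    lon: float       # degrees
    value: float


def mean_by_station(readings):
    """Group fixed readings by collection point and average their values."""
    sums = {}
    for r in readings:
        total, count = sums.get(r.station_id, (0.0, 0))
        sums[r.station_id] = (total + r.value, count + 1)
    return {sid: total / count for sid, (total, count) in sums.items()}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def path_length_km(readings):
    """Sum the distances between consecutive points of one trajectory."""
    ordered = sorted(readings, key=lambda r: r.t)
    return sum(
        haversine_km(a.lat, a.lon, b.lat, b.lon)
        for a, b in zip(ordered, ordered[1:])
    )


fixed = [
    FixedReading("A", 0.0, 10.0),
    FixedReading("A", 60.0, 20.0),
    FixedReading("B", 0.0, 5.0),
]
drone = [
    MovingReading("drone-1", 0.0, 0.0, 0.0, 1.0),
    MovingReading("drone-1", 60.0, 0.0, 1.0, 1.2),
]
station_means = mean_by_station(fixed)
drone_path_km = path_length_km(drone)
```

The fixed record needs no coordinates of its own because the station identifies the location, while every moving record must carry its position. That difference is one reason the two kinds of data may call for different storage layouts.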
Thirdly, spatiotemporal data includes both structured and unstructured data. Most data collected through IoT sensors is structured. However, data used to construct three-dimensional space, such as point clouds or images, is unstructured. The format of spatiotemporal data plays a crucial role in selecting a data repository.
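As a rough illustration of how format can drive the choice of repository, the sketch below routes a record to a store based on its kind. The kind labels and the store names "table_store" and "object_store" are placeholders invented for this example; they do not refer to any specific product.

```python
# Hypothetical data kinds, following the examples in the text above.
STRUCTURED_KINDS = {"iot_reading"}
UNSTRUCTURED_KINDS = {"point_cloud", "image"}


def choose_repository(kind: str) -> str:
    """Pick a (placeholder) repository name for a given kind of data."""
    if kind in STRUCTURED_KINDS:
        return "table_store"   # e.g. a relational or time series store
    if kind in UNSTRUCTURED_KINDS:
        return "object_store"  # e.g. file or blob storage
    raise ValueError(f"unknown data kind: {kind}")


routes = {k: choose_repository(k) for k in ("iot_reading", "point_cloud", "image")}
```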
Connecting Real and Virtual Spaces Through a Data Hub
The core of digital twin technology is the interaction between the physical and virtual spaces. This requires a pathway connecting the two. Data from the physical space collected through IoT sensors is synchronized with the virtual space, and the results obtained in the virtual space are reflected back onto physical objects or people. The pathway that supports this interaction is the Information Processing Layer.
The Information Processing Layer is where the actual data interaction occurs to synchronize the two spaces. Its main roles are data storage, data processing, and data mapping. Firstly, all data generated in the real-world and virtual spaces is stored and managed in this layer. Secondly, its processing functions include data collection, preprocessing, analysis, mining, data fusion, and more. Thirdly, data mapping supports synchronization through correlation analysis and time series analysis of data between the two spaces. It is therefore natural for companies and research institutions working on digital twin technology to take an interest in data lake or data hub technologies that provide the data pathway between the two spaces.
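One simple way to picture data mapping is to pair physical sensor samples with the virtual samples closest to them in time, then measure how well the two series agree. The sketch below does this with nearest-timestamp alignment and a Pearson correlation coefficient. It is an assumption about what such a step might look like, not a description of any specific Information Processing Layer; the function names, the tolerance parameter, and the sample values are all made up for illustration.

```python
import math
from bisect import bisect_left


def align_nearest(physical, virtual, tolerance):
    """Pair each physical sample with the nearest-in-time virtual sample.

    Both inputs are lists of (timestamp, value) sorted by timestamp.
    Pairs whose timestamps differ by more than `tolerance` are dropped.
    """
    v_times = [t for t, _ in virtual]
    pairs = []
    for t, p_val in physical:
        i = bisect_left(v_times, t)
        # The nearest virtual sample is either just before or just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(virtual)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(v_times[k] - t))
        if abs(v_times[j] - t) <= tolerance:
            pairs.append((p_val, virtual[j][1]))
    return pairs


def pearson(pairs):
    """Pearson correlation of (x, y) pairs; assumes neither series is constant."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Made-up temperature samples: (seconds, degrees Celsius).
physical = [(0.0, 20.0), (10.0, 21.0), (20.0, 23.0)]
virtual = [(0.5, 20.2), (10.4, 21.1), (19.8, 23.3), (40.0, 30.0)]

pairs = align_nearest(physical, virtual, tolerance=1.0)
r = pearson(pairs)
```

A correlation close to 1 would suggest that the virtual space is tracking the physical one for this signal, while a low value could flag a synchronization problem worth investigating.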
A data hub is a platform technology, with applications in various fields, that enables data collection, storage, management, processing, and analysis. One notable example is the Data Hub by Hitachi, which serves as a foundation for building smart factories based on digital twin principles. It stores data produced in the real-world space, such as operational technology (OT) and information technology (IT) data, and enables its utilization in the virtual space. To achieve this, Hitachi has established a data hub capable of collecting, processing, and storing data.
The key functions of Hitachi's Data Hub include collection, refinement, storage, and management. It is designed to collect various types of data, manage the processing flow through a user interface (UI), and scale to large processing workloads. Additionally, it establishes a data lake for storing and managing large volumes of structured or unstructured data.
The Center of Digital Twins: Data
A digital twin encompasses a variety of data that is generated in both physical and virtual spaces and exchanged between them. To facilitate this, the Information Processing Layer collects, stores, and manages data. It also serves as the pathway for interaction between the real-world and virtual spaces. Handling diverse spatiotemporal data efficiently requires spatiotemporal computation functions at every stage of a digital twin, from storage to analysis and simulation. Therefore, data processing and storage technologies capable of handling vast and diverse types of data are indispensable to digital twins.