Course: IoT Foundations: Operating System Applications
Overview of data management and processing
- [Instructor] Data management and processing are core components of many IoT applications. A data-centric view of IoT follows the system architecture, which at a high level splits into three parts: local or field networks, network edges, and the cloud. Sensing data is usually generated in a field network and transferred within that network, or between field, edge, and cloud entities. With that in mind, data management is about managing data from its generation, collection, and fusion to its processing, storage, and retrieval. Although the data management functions can be mapped onto physical IoT devices, it helps to view them beyond those physical entities, because the data management entities may differ. For example, we may need middleware components running on devices, the edge, and the cloud to accomplish some data management tasks. What and how much data management functionality to implement is application-specific. In a simple application, each sensor can just report its data to an endpoint, and users access the data from that endpoint; the data management functions, such as generation, collection, storage, and retrieval, are straightforward. In a complex application, data needs to be processed and fused from data streams coming from multiple sources. Let's look at what this means in OS applications. First of all, data is generated by data sources, such as sensors, and once it's generated, another entity should collect and process it. In general, a data source needs to be exposed so it can be used by a data subscriber. For this, you might use a standardized protocol on top of your OS to provide a discovery interface. With a messaging protocol stack included in the OS application, such as CoAP or MQTT, a data collection middleware residing on one device can obtain the messages published by the sensors and transfer them to another device or entity through a network.
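As a concrete sketch of the collection step, the snippet below parses one published sensor message into a uniform record that collection middleware could forward or store. The topic layout ("field/<device>/<sensor>") and the JSON payload fields ("value", "ts") are illustrative assumptions for this example, not part of the MQTT or CoAP specifications:

```python
import json
from datetime import datetime, timezone

def collect_reading(topic, payload):
    """Parse one published sensor message into a uniform record.

    Assumes a hypothetical topic layout "field/<device>/<sensor>" and a
    JSON payload carrying "value" and an ISO-8601 "ts" timestamp.
    """
    _, device, sensor = topic.split("/", 2)
    msg = json.loads(payload)
    return {
        "device": device,
        "sensor": sensor,
        "value": float(msg["value"]),
        # Normalize every timestamp to UTC so records from different
        # field networks can be compared directly.
        "ts": datetime.fromisoformat(msg["ts"]).astimezone(timezone.utc),
    }

record = collect_reading(
    "field/drone-7/temperature",
    '{"value": 21.5, "ts": "2024-05-01T12:00:00+00:00"}',
)
```

The same record shape can then feed a local file, a database driver, or an uplink to the edge, regardless of which sensor produced it.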
Since data usually needs to be stored locally in a file or a database structure on a device, our application needs a file system or database driver for that purpose. For time-series data acquired from multiple sensors, we also need a mechanism to ensure the accuracy of the data. For example, inaccurate timestamps attached to a set of data may cause a fresh data set to be treated as a stale one, so data with inaccurate timestamps should be adjusted or filtered out; otherwise it may introduce errors into later analysis. If we consider a swarm of sensors on moving devices, such as micro UAVs or drones, the data collection process gets more challenging, as we need to deal with possible connection disruptions between the drones and with timing synchronization. Data fusion is the technique of combining data from multiple sources to get a better result than any single source alone. It can be implemented at different levels: on a single device, at the cluster head of a set of neighboring nodes, or at the gateway of a field IoT network. A typical example is data fusion on a single device equipped with multiple sensors, such as accelerometer, gyroscope, inertial, and magnetic sensors. The data fusion algorithm on the device can determine device orientation and linear acceleration with higher precision than a single accelerometer can. Another typical example is industrial health monitoring of electric motors. Accurately predicting the failure of a motor may depend on data not from one source but from many; if we have a gateway that collects all the data, the fusion algorithm can run on the data collected from sensors on multiple motors. After a data fusion job is done, we may need some processing or filtering of the data before storing it in a local or remote data repository.
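To make these two ideas concrete, here is a minimal sketch of a staleness filter that drops readings whose timestamps fall outside an accepted window, together with a complementary filter, one common single-device fusion technique that blends a gyroscope rate with an accelerometer-derived angle. The record field names, the 5-second window, and the 0.98 blend factor are all illustrative choices:

```python
def filter_stale(readings, now, max_age_s=5.0):
    """Drop readings whose timestamp is in the future or older than
    max_age_s seconds; such readings would distort later analysis."""
    return [r for r in readings if 0.0 <= now - r["ts"] <= max_age_s]

def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """Fuse two orientation estimates: integrate the gyroscope rate for
    short-term accuracy, and pull toward the accelerometer-derived
    angle to correct the gyroscope's long-term drift."""
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle

# The reading at ts=90.0 is too old and the one at ts=104.0 is in the
# future (a bad clock), so only the ts=100.0 reading survives.
fresh = filter_stale(
    [{"ts": 100.0, "v": 1}, {"ts": 90.0, "v": 2}, {"ts": 104.0, "v": 3}],
    now=103.0,
)

# One fusion step: a stationary gyroscope with an accelerometer
# reporting a 10-degree tilt nudges the estimate toward that tilt.
angle = complementary_filter(angle=0.0, gyro_rate=0.0,
                             accel_angle=10.0, dt=0.01)
```

Running the filter repeatedly as samples arrive keeps the estimate tracking fast motion from the gyroscope while the accelerometer term slowly removes accumulated drift.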
Sometimes you may need basic processing tasks done at the edge, where some data can be filtered out before being transferred to the data repository. The data repository at the edge may then only need to store a small amount of processed data out of the large volume generated by the sensors. Some basic software components we'll need to add to our OS-based applications include math processing libraries, and possibly messaging protocol stacks that can help exchange the data between the data sources, the edge, and the cloud endpoints.
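One simple way to do that edge filtering is a deadband filter, which forwards a sample only when it differs from the last forwarded value by more than a threshold, so a slowly drifting sensor produces only a handful of stored records. The 0.5-unit threshold below is an arbitrary illustrative choice:

```python
def deadband_filter(samples, epsilon=0.5):
    """Keep only samples that differ from the last forwarded value by
    at least epsilon; intermediate near-duplicate readings are dropped
    before they reach the data repository."""
    forwarded = []
    last = None
    for value in samples:
        if last is None or abs(value - last) >= epsilon:
            forwarded.append(value)
            last = value
    return forwarded

# Five raw temperature samples collapse to two stored records: the
# initial reading and the one jump larger than the threshold.
stored = deadband_filter([20.0, 20.1, 20.2, 21.0, 21.1], epsilon=0.5)
```

The trade-off is that small genuine changes below the threshold are lost, so epsilon should be chosen from the sensor's noise floor and the application's accuracy needs.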