Toward a greener IoT with better data management and edge processing
Nicolas Guilbaud
Helping teams implement cybersecurity solutions on their electronic products.
Since the early 2000s and the rise of the Internet of Everything, our industry has focused on collecting data and extracting value from it. Devices, both personal and professional, progressively became connected to the internet, pushing more and more data to data centers. This centralized architecture was a great opportunity to gather all the data in one place. It fueled artificial intelligence and made possible the development of all these remarkable algorithms that help people drive cars, monitor equipment with condition-based monitoring applications, and so many other useful applications.
It is now clear that the infinite world we imagined at the beginning of the century no longer exists. We live in a constrained world of limited energy, limited water... We have to rethink the way we work and ask whether we process our data in the most efficient way. Here comes the concept of dark data: data that is not useful for any application or usage, yet is still stored in the cloud and transmitted over telecom infrastructure. According to some estimates [1] [2], such data can represent up to 90% of the overall volume, which means that only 10% of the data transmitted is really useful. Maybe it is time to think differently about the data infrastructure. In the sections below we will review some possible solutions to reduce this amount of data.
1) What data do I need?
The first point seems trivial: at the beginning of the project, sprint, or product increment, clearly define the use cases and the data flows required to implement them. Avoid pushing useless data to the cloud just because it is available on the device. Making that list will also help you with privacy topics (aka GDPR) and security threat analysis.
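This "only push what the use case needs" rule can be sketched in a few lines. Here is a minimal illustration in Python; the field names and the hourly-graph use case are hypothetical, not from the article:

```python
# Fields required by a hypothetical "hourly temperature graph" use case.
REQUIRED_FIELDS = {"device_id", "timestamp", "temperature_c"}

def filter_payload(raw: dict) -> dict:
    """Drop every field the declared use case does not need."""
    return {k: v for k, v in raw.items() if k in REQUIRED_FIELDS}

raw_sample = {
    "device_id": "sensor-42",
    "timestamp": 1700000000,
    "temperature_c": 21.5,
    "wifi_rssi": -60,            # available on the device, but unused
    "debug_trace": "0xdeadbeef", # should never leave the device
}

print(filter_payload(raw_sample))
```

The explicit `REQUIRED_FIELDS` set doubles as documentation for privacy reviews: it is the list of what actually leaves the device.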
2) Do I still need this data?
The fact that data is in the cloud and our application needs it does not mean it will stay useful for its whole lifetime. For instance, if the goal of the stored data is to draw an hourly graph but we store at a one-second rate, it may be interesting to keep only one value per hour: people will not see the difference on the graph, and we save a factor of 3,600 on storage. Retention policies on time-series databases like InfluxDB [3] or Druid [4] are nice tools to automate this: they let you create rules to compute means, drop samples, keep only max or min values, or apply any more complicated rule you can imagine.
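The downsampling that such a retention rule performs can be sketched in plain Python (in production, InfluxDB or Druid apply rules like this automatically; the 1 s → 1 h collapse below is just an illustration):

```python
from collections import defaultdict
from statistics import mean

def downsample_hourly(samples):
    """Collapse (unix_ts, value) samples at ~1 s rate into hourly means."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts // 3600].append(value)  # group samples by hour
    # One (hour_start, mean) pair per hour instead of 3,600 raw points.
    return {hour * 3600: mean(vals) for hour, vals in sorted(buckets.items())}

# Two hours of fake one-second data: 7,200 points in, 2 points out.
raw = [(t, float(t % 10)) for t in range(7200)]
hourly = downsample_hourly(raw)
print(len(raw), "->", len(hourly))  # 7200 -> 2
```

Only the two hourly means are kept and transmitted; the per-second samples never need to leave the retention window.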
3) Keep your data local
Technology keeps improving energy efficiency, for instance by replacing hard drives with SSDs [5] and by improving processor performance per watt. But in the meantime, the amount of data stored in the cloud grows year after year; and even if the cloud is increasingly powered by green energy for decarbonization, cooling still consumes water and transmitting data to the cloud still consumes energy. A new architecture paradigm is therefore needed: edge computing.
In edge computing, data is processed as close as possible to its source, so that only useful data is pushed to the cloud.
Several solutions are possible, such as putting the edge processing, for instance artificial intelligence inference, in a gateway.
The 5G standard also makes it possible to host the edge directly in the operator's core network through the MEC (Multi-access Edge Computing) feature. MEC is well standardized (by ETSI, with edge computing support in the 3GPP 5G architecture), and several telecom operators like Verizon [6] in the US already support these services.
When local connectivity is used between the sensors and the gateway, we cannot rely on this MEC technology, and the gateway has to support its own edge processing framework. To avoid breaking the (good) habits of cloud developers, it is important to keep their best practices: the edge processing framework should support containerization and microservice-style architectures, which is entirely possible on modern embedded Linux.
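A minimal sketch of what such gateway-side edge processing does: raw readings stay local, and only a summary plus out-of-range alerts go over the uplink. The `push_to_cloud` callback, the threshold, and the message shapes are illustrative assumptions, not part of any specific framework:

```python
from statistics import mean

ALERT_THRESHOLD = 80.0  # hypothetical limit for this sensor

def process_window(readings, push_to_cloud):
    """Summarize a window of local readings; uplink only what matters."""
    push_to_cloud({
        "type": "summary",
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
    })
    # Raw samples stay on the gateway; only anomalies are forwarded.
    for value in readings:
        if value > ALERT_THRESHOLD:
            push_to_cloud({"type": "alert", "value": value})

sent = []
process_window([20.1, 21.0, 95.3, 19.8], sent.append)
print(len(sent), "messages instead of 4")  # 2 messages instead of 4
```

In a containerized deployment, this logic would simply run as one microservice on the gateway, next to the AI inference container.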
We can push the paradigm further: why not push the computation directly into the sensor?
At the beginning of the article we talked about our world full of constraints. The drawback of edge computing is that it is less scalable than a pure serverless architecture in the cloud, because edge computation power is limited, but this is less and less true. First, for AI applications, more and more processors now include dedicated AI accelerators (NPU/TPU) with very good power efficiency, of a few mW/TOPS. Second, embedded processors now use small technology nodes, around 16 nm for microprocessors and 40 nm or even 28 nm for microcontrollers, which means that memory (close to MB of RAM) and frequencies (close to GHz) can be higher, allowing more complex processing to be hosted. These architectures also rely on very efficient memory layouts, where the code and the data sit very close to the processing unit (inside the same chip), so accessing the data to process it is more efficient.
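A back-of-envelope calculation shows why such accelerator efficiencies make in-sensor inference plausible. The numbers below are illustrative assumptions, not taken from any datasheet:

```python
# Assumed NPU efficiency: 5 TOPS per watt (i.e. 0.2 mW per GOPS sustained).
TOPS_PER_WATT = 5.0
# Assumed workload: a small CNN needing 50 million operations per inference.
OPS_PER_INFERENCE = 50e6

# 1 TOPS/W means 1e12 operations per joule.
ops_per_joule = TOPS_PER_WATT * 1e12
energy_j = OPS_PER_INFERENCE / ops_per_joule  # joules per inference

print(f"{energy_j * 1e6:.0f} microjoules per inference")  # 10 microjoules
```

At roughly 10 µJ per inference under these assumptions, even a coin-cell-powered sensor could afford to run inference locally instead of streaming raw samples to the cloud.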
In this article we analysed different solutions to improve our energy footprint:
1. Push to the cloud only the data your use cases need.
2. Apply retention policies to drop data that is no longer useful.
3. Keep your data local with edge computing.
References:
[1] Dark Data: https://en.wikipedia.org/wiki/Dark_data
[5] AWS sustainability page: https://sustainability.aboutamazon.com/environment/the-cloud?energyType=true
[6] Verizon MEC offer: https://www.verizon.com/business/solutions/5g/edge-computing/public-mec/