Architecting for Dynamic Analytics in IoT
Denis Canty
Quick notes
1: Architecting solutions in IoT is a complex game, with partitions between cloud, edge gateway and devices causing many challenges for developers.
2: Analytics will play a crucial role in IoT, and analytics applications should be designed to be dynamic and to function across distributed compute topologies using cascading design paradigms.
3: Analytics application developers should understand the difference between “insight from data” and “impact from data” for their customers.
The expected flood of data from billions of connected devices raises many challenges for how IoT solutions will be architected. Common design paradigms from device to cloud will allow more flexibility in how compute is best utilised for machine learning. A big challenge is where to place data analytics: on the device, at the edge, or in the cloud? But why does it have to be in one place? Distributed computing paradigms must exist for IoT enablement.
Just what role does data analytics play in the Internet of Things? Whilst there is no single definition of IoT, it is useful to define it from a machine learning (ML) perspective: “applying algorithms to data from smart connected devices that leads to process (industrial IoT) and life (consumer IoT) optimisation”. We have been using data for decades in commercial and non-commercial applications; one could suggest data analytics has simply passed the “hype baton” to IoT. In fact, one could easily argue that IoT is the train data analytics has been waiting for all these years. Billions of devices producing data. A marriage made in heaven.
Up to now, many of the use cases or customer challenges that data analytics has helped solve have been fairly static in their application area. You had a set of data that described a particular process, a length of time that the data spanned, and some form of process or time efficiency you were trying to improve. Enter the data scientist. Rinse, repeat. Even the slight evolution to predictive analytics still looked at a historical dataset in this manner to recommend future performance.
Where the challenges lie
Two frequent problem statements from developers working on data analytics use cases in IoT are 1: “I can’t place any analytics on the device or gateway, as it just doesn’t have the resources to cope with the amount of data” and 2: “We need to store all the data”. Neither is necessarily true.
Looking at the first statement: if developers classify the data into two buckets at the edge and device, namely 1: useful data and 2: use-later data, they can begin to perform local data reduction through pre-analytics. This minimises data storage and transfer rates, and frees up compute for device-based analytics; a minimal sketch follows.
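To make the two-bucket idea concrete, here is a minimal Python sketch of the split at a gateway. The alarm band and buffer size are assumptions for illustration only, not part of any real product:

```python
from collections import deque

# Assumed alarm band: readings in this range are "useful" and forwarded
# for analytics immediately; everything else is "use later" data.
ALARM_LOW, ALARM_HIGH = 60.0, 90.0
use_later = deque(maxlen=10_000)  # bounded local store, oldest dropped first

def classify(reading):
    """Split a reading into the 'useful' or 'use later' bucket."""
    if ALARM_LOW <= reading <= ALARM_HIGH:
        return reading            # useful: hand to local/edge analytics now
    use_later.append(reading)     # use later: keep locally, don't transmit
    return None
```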
Now focusing on the second challenge: the first step for any data scientist after receiving a dataset is to cleanse it. The very existence of this step suggests we don’t need all the data to make the decisions required for business impact, as there is still a lot of junk data being generated by these IoT devices. Why would anyone want temperature data every second if the temperature never changes for two hours? Developers need to be careful about wanting too much.
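To put a number on the temperature example: at one sample per second, two unchanged hours is 7,200 readings. A report-by-exception filter with a heartbeat, sketched below with assumed thresholds, would forward roughly a dozen of them:

```python
import time

DEADBAND = 0.5      # assumed: ignore changes smaller than this (degrees)
HEARTBEAT_S = 600   # assumed: still send one reading every 10 minutes

_last_value = None
_last_sent_at = 0.0

def report_by_exception(value, now=None):
    """Return the value if it changed meaningfully or a heartbeat is due,
    else None (the sample is dropped at the source)."""
    global _last_value, _last_sent_at
    now = time.time() if now is None else now
    changed = _last_value is None or abs(value - _last_value) >= DEADBAND
    if changed or (now - _last_sent_at) >= HEARTBEAT_S:
        _last_value, _last_sent_at = value, now
        return value
    return None
```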
Introducing Haze Computing
IoT gateways are now an essential part of IoT applications. But where does the edge begin and end? Normally, one would associate the edge with the gateway in an IoT application. However, compute exists all the way from the devices to the cloud, so why should the gateway be treated any differently? In fact, in many cases the pooled compute of the devices connected to a gateway can exceed what is available at the gateway itself. The challenge lies in how devices are configured for machine-to-machine (M2M) communication, and in how the many different protocols can be supported. The prediction here is that the classic gateway will be squeezed from above by the cloud and from below by devices, and may even become distributed by design.
Design for IoT exists for the network of devices, the edge and the cloud, but not holistically. The holy grail of IoT is to be hardware, operating system and network agnostic. To get there, an essential step will be to create common design paradigms from device to cloud, allowing more flexibility in how compute is best utilised for data analytics that leads to true business impact for customers.
What is being proposed here is a dynamic model for your analytics applications, which I will name haze computing, where you begin with a pooled view of your resources. Each data analytics (DA) app that you build analyses the local and global compute available to it across cloud, edge and device(s), and haze data managers aggregate that view and decide how and where analytics take place, dynamically.
By being a little more clever at the data source, one can reduce both the amount of data kept locally and the amount pushed to the cloud. The approach is to design a messaging service from cloud to device that serves a series of data managers (DMs), which stay in sync and take priority over other IoT messaging. Each IoT application would have its own cloud DM, edge DM and device DM, whose purpose is to stream data and optimise its value. This type of architecture ensures your applications and services remain extensible across cloud, edge and device. A sketch of how such a data manager might place analytics follows.
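The sketch below shows one way a haze data manager could use its pooled view to place an analytics task. The Node fields, the closest-first tier ordering and the resource units are all assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class Node:
    tier: str          # "device", "edge" or "cloud"
    free_cpu: float    # spare cores (illustrative units)
    free_mem_mb: int   # spare memory

class HazeDataManager:
    """Keeps a pooled view of compute across tiers and places analytics
    as close to the data source as capacity allows."""
    def __init__(self, nodes):
        self.nodes = nodes

    def place(self, cpu_needed, mem_needed_mb):
        for tier in ("device", "edge", "cloud"):   # closest-first preference
            for node in (n for n in self.nodes if n.tier == tier):
                if node.free_cpu >= cpu_needed and node.free_mem_mb >= mem_needed_mb:
                    return node
        return None  # no capacity anywhere: queue the task or shed load

# Example pool: a small sensor, one gateway and a cloud VM.
dm = HazeDataManager([Node("device", 0.2, 64),
                      Node("edge", 1.5, 1024),
                      Node("cloud", 16.0, 65536)])
print(dm.place(cpu_needed=1.0, mem_needed_mb=512).tier)  # -> "edge"
```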
This holistic design approach for IoT, shown at a high level in figure 1, has many advantages:
Scale
The architecture is fluid and dynamic, which means it can be scaled via the dynamic partitioning points shown in figure 1. For example, certain applications may not require cloud infrastructure at the beginning, but can still design in a minimum amount of cloud resources and connectivity, so that the developer can turn the cloud on when the application demands it, without an architecture redesign; a configuration sketch follows.
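As a sketch of what “design in a minimal cloud tier, turn it on later” could look like in practice (the tier names and config keys here are invented for illustration):

```python
# Every tier is declared from day one, so enabling the cloud later is a
# configuration change rather than an architecture redesign.
topology = {
    "device": {"enabled": True,  "analytics": ["threshold_classifier"]},
    "edge":   {"enabled": True,  "analytics": ["time_window_summary"]},
    "cloud":  {"enabled": False, "analytics": ["model_training"]},  # dormant
}

def active_tiers(cfg):
    return [tier for tier, spec in cfg.items() if spec["enabled"]]

print(active_tiers(topology))        # ['device', 'edge']
topology["cloud"]["enabled"] = True  # turn the cloud on, on demand
print(active_tiers(topology))        # ['device', 'edge', 'cloud']
```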
Security
The single view per IoT application also ensures that security can be better managed across device, edge and cloud. Security and privacy are still the main concerns for IoT practitioners across the industry, and applying a more holistic architecture design makes implementing next-generation security topologies much easier. One such topology is blockchain, of Bitcoin fame. If the cloud application acts as the parent blockchain, spawning multiple sidechains at the edge which in turn manage device-based sidechains, then you can create an ecosystem that is automatic, based on consensus, and fully auditable. A toy sketch of that hierarchy follows.
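This sketch is not a real blockchain (there is no consensus protocol, only hash-linked audit records), but it shows the parent-spawns-sidechain shape the paragraph describes; all names are illustrative:

```python
import hashlib, json, time

class Chain:
    """Toy hash-linked ledger: the cloud chain spawns edge sidechains,
    which spawn device sidechains. Consensus is out of scope here."""
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent
        self.blocks, self.children = [], []

    def append(self, record):
        prev = self.blocks[-1]["hash"] if self.blocks else "genesis"
        payload = json.dumps({"prev": prev, "t": time.time(), "rec": record},
                             sort_keys=True).encode()
        self.blocks.append({"hash": hashlib.sha256(payload).hexdigest(),
                            "rec": record})

    def spawn_sidechain(self, name):
        child = Chain(name, parent=self)
        self.children.append(child)
        self.append({"event": "spawned_sidechain", "child": name})  # auditable
        return child

cloud = Chain("cloud-parent")
edge = cloud.spawn_sidechain("edge-gw-01")
device = edge.spawn_sidechain("sensor-0042")
device.append({"event": "firmware_update", "ok": True})
```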
Energy Efficiency
If developers become more conscious and implement green computing paradigms at the haze level, where energy usage rates are much more visible, then best practices can be designed and built much more easily. Simply put, the more complex the algorithm and the more data we have to process, store and transfer, the more energy will be consumed.
Complexity
It has been shown that developers are slightly behind the trends within the IoT landscape. One main reason is that we normally have experts in only one of the areas required to build full-breadth IoT applications, and there is a myriad of technologies with design practices that are not in sync. Viewing your IoT application architecture as a single whole allows developers to eliminate design chasms, and also brings more simplicity to the IoT standards being driven by the IIC and OPC.
Cascading analytics model for IoT
If one creates a machine learning use case simply to sit on a server, gateway or device, that would be an inefficient use of one’s time. Why should analytics be static? Sure, having seemingly infinite compute available for analytics in the cloud has its appeal, but experience should tell us that making decisions as close as possible to the source increases business impact and/or reduces cost. The best example of this is in the fire industry, where two minutes is the difference between saving a building and not saving it. The only way analytics can help here is if it can predict the occurrence of a fire at the source, before it begins.
Having the type of dynamic haze architecture outlined above allows for a cascading model for analytics. Everyone will have seen the impact apps have had on our technology fingerprint as individuals. Analytics use cases are just apps. And like any app, there can be a store. In that store, you pick which product to install the app on, and those apps can be installed on everything from personal computers to tablets, phones and wearables. It’s still the same app at its core.
With container-based technologies such as Docker now becoming more mainstream, it is becoming easier to move or expand your applications across your IoT application domain. Applications that were originally built to run in the cloud can now be redeployed to lower-resourced hardware and function exactly the same. In fact, this next generation of technologies allows one to move or reallocate analytics applications dynamically, depending on the application. This ensures you can build your apps once and deploy to many, as the sketch below illustrates.
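Here is a sketch of “build once, deploy to many” using the Docker SDK for Python (pip install docker). The image name is a placeholder, and the edge resource limits are illustrative:

```python
import docker

client = docker.from_env()

# Cloud host: run the analytics image with no explicit limits.
client.containers.run("analytics-app:latest", detach=True)

# Edge gateway: the identical image, constrained to half a core and 256 MB
# so it fits lower-resourced hardware while behaving exactly the same.
client.containers.run("analytics-app:latest", detach=True,
                      nano_cpus=500_000_000,  # 0.5 CPU in nano-CPU units
                      mem_limit="256m")
```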
Whilst IoT can be a huge source of both data and challenges that can be solved with that data, it is easy for developers to get distracted. Regardless of consumer or industrial IoT, the best advice is to start small with a simple classifier at the device level. Build on this at the edge with some more advanced time-based analysis. Then, once you get to the cloud level, use the vast compute and applications there to run much more sophisticated machine learning algorithms on your data. However, ensure that this learning is fed back to your simpler models at the device and edge, so that your classifiers improve over time, thus applying reinforcement learning down through your cascaded analytics model.
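A minimal sketch of the feedback loop the cascade relies on: a device-level threshold classifier whose threshold the cloud periodically improves and pushes back down. The names and numbers are illustrative, not a real protocol:

```python
class DeviceClassifier:
    """Simple device-level classifier; the cloud learns a better threshold
    from the full history and feeds it back down the cascade."""
    def __init__(self, threshold=70.0):
        self.threshold = threshold

    def classify(self, reading):
        return "alert" if reading > self.threshold else "normal"

    def apply_feedback(self, new_threshold):
        self.threshold = new_threshold  # learning cascades back to the device

clf = DeviceClassifier()
print(clf.classify(72.5))   # 'alert' under the initial device-only threshold
clf.apply_feedback(75.0)    # cloud-side learning refines the cut-off
print(clf.classify(72.5))   # 'normal' after the feedback step
```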
And finally, the single most important aspect of building any analytics application in IoT is to keep the customer at the centre of the discussion right from the start. Too often, the fragmented nature of how business is done means that a huge amount of insight is discovered, but if there isn’t a dollar value attached to it, it will stay exactly that: insight. The key is to translate “insight” into “impact”. This is where your customers will see the key role analytics plays in their strategic IoT future.
Bio: Denis Canty is the Lead Technologist for Data Science and IoT at Tyco’s Innovation Garage, focused on sensing and on building analytics applications that solve direct customer technology challenges. He holds a Masters in Computer Science along with a Masters in Microelectronic Design. Denis is passionate about promoting STEM careers for future generations. You can find Denis on Twitter @deniscanty, and read his blog at deniscanty.wordpress.com.