Internet of Things - critical roles of data fusion, analytics, and intelligent agents

The objective of this article is to make readers aware of the data fusion, analytics, and intelligent-agent paradigms for formulating and solving data science problems in an interconnected Internet of Things (IoT) environment. For the various technical approaches and algorithms, please consult the author's books referred to at the end.

Anywhere between twenty and a hundred billion physical objects and devices are expected to be interconnected via the internet by 2020. These devices will be spewing large volumes of raw sensor data at a rapid pace. Apart from the billions of blue-sky potential applications that people like to talk about in armchair discussions, what are the real-life challenges facing data scientists as they strive to support the goals of their respective organizations? Yes, obtaining actionable insights is the overarching vocation, but there are items to address in order to formulate our thinking before we even get to that point:
  1. Because there will be so many devices to attend to, can they operate without the direct intervention of humans or others? In other words, will they be autonomous? This has implications for which relevant sensor readings and abstractions to publish without crowding the bandwidth. We can already see primitive levels of such autonomous behavior in our everyday life, like a thermostat switching off automatically. But much more complex situations are imminent as the widespread adoption of technologies like self-driving cars and intelligent domestic environments becomes progressively feasible.
  2. To what extent will they be able to communicate and interact? For example, my intelligent cupboard and refrigerator could coordinate with each other, as well as with the jars, milk cartons, etc., to produce a shopping list, upload it to the store, and provide me with an optimized aisle route based on what I need. Note that this is a case where I would want to go to the store myself for the experience; otherwise my personal assistant robot would be receiving the output. The communications aspect (via a well-defined protocol) also has implications for building modularized systems and hence for making the IoT more robust, so that one or more device failures don't shut down the connected world.
  3. Will they be able to perceive the environment and respond in a timely fashion? We see primitive levels of such behavior in our everyday life, for instance when a car's audio volume decreases upon sensing an incoming call. On the other hand, a more sophisticated perception-based response would be a vehicle issuing a warning or slowing down upon sensing a driver's fatigue.
  4. Will such a device be able to exhibit goal-directed behavior by taking its own initiative? Will it be able to resolve its own goals via planning and scheduling the way a robot does? We see such action in a GPS recommending alternative routes with the goal of efficiently reaching the desired destination, while a more sophisticated goal-directed behavior would be my personal assistant device deriving the state of my health by communicating with wearable devices, and contacting the doctor.
  5. Will such a device be able to learn from the environment over time? This is an imperative property, or else frequent interventions would be required to adjust the factory settings of physical devices. Most routine items can be automated very easily, but how about a learning period after which my coffee maker turns itself on later over the weekends, or knows to make extra coffee after sensing that I arrived late the night before?
  6. Will they be anthropomorphic, beyond just the standard sweet voice of Mandy in my GPS? Will it be able to understand and interact with me in natural language? Will it project human characteristics when required, or will it need to pass the Turing test? Does my iRobot Roomba have to look like a human even if it does the job well?
  7. Will they be able to transport themselves or their representatives, either physically or via an autonomous piece of code, to another device in order to extract information that they require? Relevant military examples would be launching and then monitoring a small UAV from a mother vehicle, dropping and monitoring sensors in hazardous areas, and sending virus-like code to execute on another machine.

These are the important characteristics of an intelligent agent, a computational entity (referring to the embedded software driving the device) with intentionality that performs user-delegated tasks autonomously.

An intelligent agent is a computational entity with intentionality that performs user-delegated tasks autonomously
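To make the characteristics listed above concrete, here is a minimal Python sketch of such an agent interface. The class and method names (perceive, decide, act, communicate, learn) are illustrative assumptions, not a reference to any established agent framework.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class IntelligentAgent(ABC):
    """Minimal sketch of an intelligent agent: an autonomous, goal-directed
    entity that perceives, communicates, acts, and learns over time."""

    def __init__(self, goals: List[str]):
        self.goals = goals                # goal-directed behavior (item 4)
        self.model: Dict[str, Any] = {}   # learned picture of the environment (item 5)

    @abstractmethod
    def perceive(self, sensor_readings: Dict[str, float]) -> Dict[str, Any]:
        """Turn raw sensor readings into a perception of the environment (item 3)."""

    @abstractmethod
    def decide(self, perception: Dict[str, Any]) -> str:
        """Choose an action autonomously, without human intervention (item 1)."""

    @abstractmethod
    def act(self, action: str) -> None:
        """Carry out the chosen action in the environment."""

    def communicate(self, peer: "IntelligentAgent", message: Dict[str, Any]) -> None:
        """Exchange abstractions with other devices via a shared protocol (item 2)."""
        peer.receive(message)

    def receive(self, message: Dict[str, Any]) -> None:
        self.model.update(message)

    def learn(self, feedback: Dict[str, Any]) -> None:
        """Adjust the internal model from experience rather than factory settings (item 5)."""
        self.model.update(feedback)
```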

So now we have sensors monitoring and tracking all sorts of environments. We have the cloud and other paradigms for ingesting big data, but how do intelligent agents translate that data into useful intelligence and transmit it to distributed humans and machines, enabling mobile and real-time responses? Enter multi-sensor, multi-source data fusion, a paradigm that has been around since the early eighties but has mostly been confined to the DoD. In my own opinion, analytics and data fusion are two sides of the same coin if we consider the underlying generic paradigms of computation.

Analytics and data fusion are two sides of the same coin

Data fusion is a “process dealing with the association, correlation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates, and complete and timely assessments of situations and threats, and their significance,” according to Frank White in a 1985 publication. Barring terms such as “position,” “identity,” and “threat,” which are typical of the defense domain in which the field originated, the rest of the processing concepts in the definition constitute analytics processes. High-level data fusion is a sub-field of data fusion; by my own definition, it is the “study of relationships among objects and events of interest within a dynamic environment,” and it combines the descriptive and predictive analytics processes.

High-level data fusion is the study of relationships among objects and events of interest within a dynamic environment
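As a toy illustration of the “combination of data … to achieve refined estimates” in the definition above, the sketch below fuses two noisy readings of the same quantity by weighting each with the inverse of its variance. The sensor names and numbers are invented for illustration; this is only one of many possible fusion rules.

```python
def fuse_two_readings(x1: float, var1: float, x2: float, var2: float):
    """Combine two noisy measurements of the same quantity.

    Each reading is weighted by the inverse of its variance, so the more
    reliable sensor dominates; the fused variance is smaller than either input.
    """
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused = (w1 * x1 + w2 * x2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)
    return fused, fused_var

# Hypothetical example: a radar and a camera estimating the same range (meters).
print(fuse_two_readings(x1=102.0, var1=4.0, x2=98.0, var2=1.0))
# -> (98.8, 0.8): the fused estimate sits closer to the more precise sensor.
```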

The central software module of a self-driving car is a good example of fusion: it collects data from the various sensors within the car, perceives the environment in terms of obstacles, visibility, etc., acts in the form of driving safely, and communicates with the passengers in an anthropomorphic fashion.

We often get bogged down in cosmetics, bypassing seriously challenging issues such as how to extract actionable insights by viewing problems as distributed fusion problems and by developing algorithms that combine traditional statistics with artificial intelligence, deep linguistic processing, and machine learning. However deep our learning is, there will always be a constant struggle in handling big data. So far we have put too little emphasis on the decentralization and distributed processing in a net-centric environment that is now inevitable given the IoT. Many years of research are associated with each of these paradigms, along with hundreds of algorithms that we can make use of.

The closeness of the two fields of fusion and analytics motivates us to introduce some basic concepts of fusion, starting with the well-known Joint Directors of Laboratories (JDL) model. The so-called JDL functional model was intended to facilitate communication among data fusion practitioners, rather than to serve as a complete architecture detailing various processes and their interactions.

Sources on the left of the JDL model diagram include local and remote sensors accessible to the data fusion system, information from reference systems, and human input. The main task of Source Preprocessing is the analysis of individual sensor data to extract information or improve the signal-to-noise ratio, and the preparation of data (such as spatiotemporal alignment) for subsequent fusion processing. The JDL model has the following four functional levels of fusion:

Level 1: Object Refinement

This level combines sensor data to obtain the most reliable and accurate tracking and estimation of an entity's position, velocity, attributes, and identity. Although this level is not considered part of high-level fusion, entity tracking is analogous to tracking a phenomenon such as the price of a stock.
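To illustrate the kind of estimation Level 1 performs, here is a minimal sketch of a one-dimensional Kalman filter tracking a noisy scalar signal, in the spirit of the stock-price analogy above. The noise variances and observations are made-up assumptions, not tuned values.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.25):
    """Track a scalar quantity (e.g., a stock price or an object's position)
    from noisy measurements with a one-dimensional Kalman filter."""
    estimate, error = measurements[0], 1.0   # initial state and its uncertainty
    track = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed to persist, but uncertainty grows.
        error += process_var
        # Update: blend prediction and measurement via the Kalman gain.
        gain = error / (error + meas_var)
        estimate += gain * (z - estimate)
        error *= (1.0 - gain)
        track.append(estimate)
    return track

# Hypothetical noisy price observations.
print(kalman_1d([100.0, 100.4, 99.8, 100.9, 101.2]))
```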

Level 2: Situation Refinement

The Situation Refinement level develops a description of current relationships among entities and events in the context of their environment. This is analogous to descriptive analytics.

Level 3: Threat Refinement

This level projects the current situation into the future to draw inferences about enemy threats, friend and foe vulnerabilities, and opportunities for operations. This is analogous to predictive analytics.

Level 4: Process Refinement

Process Refinement monitors the overall data fusion process to assess and improve real-time system performance (it has been placed on the edge of the data fusion domain due to its meta-level monitoring characteristics).

The Human Computer Interaction (HCI) block provides an interface to allow a human to interact with the fusion system. The Database Management System block provides management of data for fusion (sensor data, environmental information, models, estimations, etc.).
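The four levels can be read as stages of a pipeline wrapped by the meta-level Process Refinement loop. The sketch below only illustrates that control flow; every function here is a placeholder standing in for a substantial subsystem, not part of any published JDL implementation.

```python
# Placeholder stages; each would be a substantial subsystem in a real fusion system.
def preprocess(feeds):       return feeds               # alignment, signal-to-noise improvement
def refine_objects(obs):     return {"tracks": obs}     # Level 1: tracking, identity estimation
def assess_situation(objs):  return {"relations": objs} # Level 2: relationships (descriptive analytics)
def project_threats(sit):    return {"projection": sit} # Level 3: future projection (predictive analytics)
def refine_process(*stages): pass                       # Level 4: meta-level performance monitoring

def jdl_fusion_cycle(sensor_feeds):
    """One pass through the JDL functional levels (illustrative control flow only)."""
    observations = preprocess(sensor_feeds)
    objects = refine_objects(observations)        # Object Refinement
    situation = assess_situation(objects)         # Situation Refinement
    threats = project_threats(situation)          # Threat Refinement
    refine_process(observations, objects, situation, threats)  # Process Refinement
    return threats

print(jdl_fusion_cycle([{"sensor": "radar", "range_m": 102.0}]))
```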

The distinction between Level 2 and Level 3 is often artificial. Models for Level 2 fusion are temporal in many cases, and thus both the current situation and its projection into the future come from a single temporal model. Level 2 fusion is the “estimation and prediction of relations among entities, to include force structure and cross-force relations, communications and perceptual influences, physical context, etc.” Level 2 fusion is also called Situation Assessment (SA), a term equally appropriate for business domains. Moreover, drawing inferences about enemy threats, friend and foe vulnerabilities, and opportunities for operations requires generating Courses of Action (COAs). Here we take the hypothesis-evaluation approach, where COAs are candidate overall actions whose suitability needs to be evaluated via arguments of pros and cons and expected-utility measures.
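A minimal sketch of the hypothesis-evaluation idea: each COA is scored by its expected utility, the sum of outcome probability times outcome utility. The COAs, probabilities, and utilities below are invented purely for illustration.

```python
def expected_utility(coa):
    """Expected utility of a course of action: sum of p(outcome) * utility(outcome)."""
    return sum(p * u for p, u in coa["outcomes"])

# Hypothetical COAs, each with (probability, utility) pairs over its possible outcomes.
coas = [
    {"name": "hold position",  "outcomes": [(0.7, 10), (0.3, -20)]},
    {"name": "reroute convoy", "outcomes": [(0.9, 5), (0.1, -5)]},
]

best = max(coas, key=expected_utility)
print(best["name"], expected_utility(best))   # -> reroute convoy 4.0
```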

The DIKW (Data, Information, Knowledge, and Wisdom) hierarchy bears some resemblance to the JDL data fusion model in the sense that both start from raw transactional data and yield knowledge at increasing levels of abstraction. The DIKW information-processing hierarchy lets us conveniently divide analytics into modularized processes. The hierarchy organizes data, information, knowledge, and wisdom in layers, with an increasing level of abstraction and addition of knowledge, starting from the bottom-most data layer. Various analytical systems help to transform content from one layer to a higher one so that it can be better comprehended by analysts.

As we all know, data are transactional, physical, and isolated records of activity. Information is the semantic interpretation of data, and may represent relationships among data with meaning and purpose. Knowledge is the general awareness or possession of information, facts, ideas, truths, or principles. Knowledge is generally personal and subjective. Wisdom is the knowledge of what is true or right coupled with just judgment as to action. Thus “data” is the basic unit of “information,” which in turn is the basic unit of “knowledge,” which in turn is the basic unit of “wisdom.” The term “information” is sometimes used in a generic sense, representing any of the four layers of the DIKW hierarchy.

The standard three-layer robot architecture of perception-orientation-action resembles the above two paradigms, as does Col. Boyd's OODA (Observe-Orient-Decide-Act) loop, one of the first C4I (Command, Control, Communications, Computers, and Intelligence) architectures. “Observation” in OODA refers to scanning the environment and gathering information from it, “orientation” is the use of that information to form a mental image of the circumstances, “decision” involves considering options and selecting a subsequent course of action, and “action” refers to carrying out the conceived decision.

The Orient step in the OODA loop encapsulates both descriptive and predictive analytics, whereas the Decide step corresponds to prescriptive analytics. An example instantiation of the OODA loop for an intelligent house is as follows: 1) observation is sensing an intruder via a camera and/or a motion sensor; 2) orientation is identifying the intruder, possibly by consulting a database of past visitors; 3) decision could be to switch on the light, alert the owner, or even start a conversation asking the intruder to present an ID that can be matched against external data sources; and 4) action is carrying out the chosen option, such as switching on the light. An action in the real world generates further observations, such as the intruder's ID.
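The intruder scenario above maps naturally onto a loop in code. The sketch below uses an invented sensor event format and a toy visitor database purely to show one observe-orient-decide-act cycle, not a real home-automation API.

```python
# Toy database of previously identified visitors (orientation context).
KNOWN_VISITORS = {"face_001": "mail carrier", "face_002": "neighbor"}

def ooda_step(sensor_event):
    """One pass of the OODA loop for a hypothetical intelligent house."""
    # Observe: gather raw evidence from the camera / motion sensor.
    observation = {"motion": sensor_event["motion"],
                   "face_id": sensor_event.get("face_id")}

    # Orient: interpret the observation against what the house already knows.
    identity = KNOWN_VISITORS.get(observation["face_id"], "unknown")

    # Decide: choose a response based on the oriented picture.
    if not observation["motion"]:
        decision = "do nothing"
    elif identity == "unknown":
        decision = "switch on light and alert owner"
    else:
        decision = f"log visit by {identity}"

    # Act: carrying out the decision generates new observations for the next cycle.
    return decision

print(ooda_step({"motion": True, "face_id": "face_999"}))  # -> switch on light and alert owner
```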

In conclusion, the roles of data fusion, analytics, and intelligent agent paradigms become critical as we move towards a more connected world.

Dr. Subrata Das is the author of the books High-Level Data Fusion, published by Artech House, and Computational Business Analytics, published by CRC Press. Dr. Das also edited a special issue of Elsevier's Information Fusion Journal on agent-based information fusion. Arup Das, Chief Analytics Officer at Machine Analytics, contributed to this article. Thanks to Leslie Singer and Sebastien Das for editing and commenting on the draft.
