Causality for Engineers

Dr. PG Madhavan

Digital Twin maker: Causality & Data Science --> TwinARC - the "INSIGHT Digital Twin"!

Published: December 5, 2023

Here is a “pragmatic” explanation of Causality – I aim to explain “concepts” accurately at the expense of (sometimes unnecessary!) mathematical rigor. If Causality were a topic in my course today for Senior Engineering Undergrads, this would be my first lecture introducing the subject . . .

Where there is Causation, there is Correlation. The action of the person pushing from the very back is of course correlated with the truck’s movement. But Correlation does NOT imply Causation – the push by the guy on the truck bed is also correlated with the truck’s movement, yet it has no causal effect on the truck’s forward movement.

Therefore, Causation implies Correlation.

[Digression: A well-regarded book on Causation states that Causation does NOT imply Correlation; an extreme corner case of poor data selection is used as an example to make the author’s point – quite unnecessary and confusing . . .]

So Causation is Correlation PLUS something else! What is it?

Causation is Correlation PLUS something Else

Meinolf Sellmann has noted that “what we call "causal" relationships are actually also correlations that are just grounded in way more evidence” . . . TRUE! This “evidence” is the data generating process (DGP) that one has gleaned from observational data. The Directed Acyclic Graph (DAG), whether you provide it as an initial guess from expert knowledge or refine it from data, represents the DGP ...

Let us not dismiss an initial guess of the DGP as a sophomoric approach! In my domain of application, Industrial IoT, Machinery experts have a great deal of insight into which part of a machine can cause issues in another part (or among connected equipment), from decades of experience listening to, smelling and touching these machines. I use that information (= candidate DAG) to prime my Causal discovery from sensor data all the time. Of course, this knowledge may not be available in some other domains such as social sciences . . .

The holy grail is DGP – data generating process. From observed data, we are performing “inverse modeling” to estimate DGP structure and parameters when we do Causal Analysis.

The crux of Causal Discovery and Estimation is the inverse modeling process. It turns out that estimating the parameters of DGP from observed data is a highly ill-posed problem. Simple regularization based on norms, etc., won’t do to solve this ill-posed problem – the parameters so obtained may have no basis in reality.
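One way to see the ill-posedness is a two-variable linear example. The sketch below is my own illustration, not taken from any particular library; the coefficient 0.8 and the noise variances are assumed values. It computes the variances and covariance implied by a DGP where x drives y, and then builds a reversed DGP (y drives x) that implies exactly the same numbers. If the data are Gaussian, those second-order statistics are all the data can tell you, so the direction cannot be recovered from them.

```python
def cov_if_x_causes_y(b, var_x, var_noise_y):
    """(Var x, Cov xy, Var y) implied by  x = e_x,  y = b*x + e_y."""
    return (var_x, b * var_x, b * b * var_x + var_noise_y)


def cov_if_y_causes_x(c, var_y, var_noise_x):
    """(Var x, Cov xy, Var y) implied by  y = e_y,  x = c*y + e_x."""
    return (c * c * var_y + var_noise_x, c * var_y, var_y)


# Hypothetical "true" DGP: x drives y (all numbers are illustrative).
forward = cov_if_x_causes_y(b=0.8, var_x=1.0, var_noise_y=0.5)
var_x, cov_xy, var_y = forward

# A reversed DGP tuned to reproduce the same second-order statistics.
c = cov_xy / var_y
backward = cov_if_y_causes_x(c=c, var_y=var_y, var_noise_x=var_x - c * cov_xy)

print("x -> y implies (Var x, Cov xy, Var y):", forward)
print("y -> x implies (Var x, Cov xy, Var y):", backward)
```

Two structurally different DGPs fit the same covariance equally well, which is why extra constraints are needed.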

Now things get interesting . . .

Unrelated to the philosophical or even Econometric work on Causality in the distant past, another estimation method was brewing in the second half of the 20th century (at least according to my reading), finding applications in Telecommunications (starting in the 1950s with a theorem of Bussgang), Audio Signal Processing, etc. The overall field can be called Blind Source Separation or BSS. BSS is a method to solve the so-called “cocktail-party problem”. My version of the history till now is available in a short note and hence I will not elaborate here.

The key idea behind BSS is to exploit statistical INDEPENDENCE in the data – going beyond Uncorrelated to Independent makes a usable distinction only if the data is NON-Gaussian. The data model of BSS and that of Structural Causal Model (SCM) are IDENTICAL.
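For the linear case, the correspondence between the two data models can be written out in a few lines. The symbols below are my notation for this sketch (x for the observed variables, B for the matrix of direct causal effects, e for the independent noise terms, A for the mixing matrix); they are assumptions, not taken from the text above.

```latex
\begin{align}
  x &= Bx + e
    && \text{(SCM: each variable is a linear function of its direct causes plus noise)} \\
  x &= (I - B)^{-1} e = A e
    && \text{(BSS: observations are an unknown mixture } A \text{ of independent sources } e\text{)}
\end{align}
```

Estimating the mixing matrix A from x alone is the BSS problem; recovering B from it, via B = I - A^{-1}, gives the causal structure.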

The heavy machinery developed for BSS (a popular one is called ICA – Independent Component Analysis) has been brought to bear on Causal Analysis since 2006 (“A Linear Non-Gaussian Acyclic Model for Causal Discovery”) with excellent results, especially for multivariate time-series observations that fit the SCM (with Non-Gaussian and Independent noise terms and a few other assumptions such as “Markov” and “causally sufficient”).

So as Engineers, we can think of Causal Analysis as an ill-posed inverse modeling problem whereby DGP is estimated from observed data; this is solvable if the “system noise” is constrained – constraints being non-Gaussian and independent.
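To make the role of the non-Gaussian and independent noise constraint concrete, here is a simplified pairwise check. It is not the ICA-based algorithm of the 2006 paper; it is a minimal sketch of the same idea under assumptions of my own: two variables, uniform (non-Gaussian) noise, a true coefficient of 0.8, and the correlation between squared values as a crude stand-in for a proper independence test. The function names are hypothetical. The logic is that in the correct causal direction the regression residual is independent of the regressor, while in the wrong direction it is only uncorrelated.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(u, v):
    """Plain Pearson correlation coefficient."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def residual(target, regressor):
    """Residual of the least-squares regression of target on regressor."""
    mt, mr = mean(target), mean(regressor)
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    return [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]


def dependence(regressor, resid):
    """Crude dependence score: |corr| between squared, centered values."""
    mr = mean(regressor)
    return abs(pearson([(r - mr) ** 2 for r in regressor],
                       [e * e for e in resid]))


def infer_direction(x, y):
    score_xy = dependence(x, residual(y, x))  # hypothesis: x causes y
    score_yx = dependence(y, residual(x, y))  # hypothesis: y causes x
    direction = "x -> y" if score_xy < score_yx else "y -> x"
    return direction, score_xy, score_yx


# Simulated DGP (assumed for illustration): x drives y, uniform noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(5000)]
y = [0.8 * xi + rng.uniform(-1, 1) for xi in x]

direction, score_xy, score_yx = infer_direction(x, y)
print(direction, round(score_xy, 3), round(score_yx, 3))
```

With Gaussian noise instead of uniform noise, both scores would hover near zero and the direction would be undecidable, which is the sense in which the non-Gaussian constraint makes the problem solvable.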

As mentioned at the top of this section, in many practical applications (especially in Industrial IoT), a domain-expert-informed initial guess of the DAG provides an additional constraint that aids the convergence of the ICA algorithm.

It is like anything else in Science – you have to bring in constraints to go from one level to another, from Correlation to Causation. In that spirit, Correlation plus the constraints identified above gives us Causation!

Why DAG?

Other than causality arguments, a Directed Acyclic Graph is an awesome constraint. The corresponding Adjacency Matrix, with the variables arranged in causal order, has a highly exploitable form – TRIANGULAR!
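The triangular form appears once the variables are listed so that causes come before effects. The sketch below shows this on a small hypothetical machinery DAG; the node names and edge weights are made up for illustration. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to find a causal order and then permutes the rows and columns of the adjacency matrix accordingly.

```python
from graphlib import TopologicalSorter

# Hypothetical example; names and weights are illustrative only.
nodes = ["vibration", "motor_load", "temperature", "bearing_wear"]

# B[i][j] != 0 means nodes[j] is a direct cause of nodes[i].
B = [
    [0.0, 0.5, 0.0, 0.9],  # vibration    <- motor_load, bearing_wear
    [0.0, 0.0, 0.0, 0.0],  # motor_load   (no causes inside the model)
    [0.0, 0.7, 0.0, 0.4],  # temperature  <- motor_load, bearing_wear
    [0.0, 0.3, 0.0, 0.0],  # bearing_wear <- motor_load
]


def causal_order(adjacency):
    """Indices ordered so every cause precedes its effects.

    TopologicalSorter raises CycleError if the graph is not acyclic.
    """
    n = len(adjacency)
    parents = {i: {j for j in range(n) if adjacency[i][j] != 0}
               for i in range(n)}
    return list(TopologicalSorter(parents).static_order())


def permute(adjacency, order):
    """Apply the same reordering to rows and columns."""
    return [[adjacency[i][j] for j in order] for i in order]


def is_strictly_lower_triangular(matrix):
    n = len(matrix)
    return all(matrix[i][j] == 0 for i in range(n) for j in range(i, n))


order = causal_order(B)
B_sorted = permute(B, order)

print("causal order:", [nodes[i] for i in order])
for row in B_sorted:
    print(row)
print("triangular before:", is_strictly_lower_triangular(B))
print("triangular after: ", is_strictly_lower_triangular(B_sorted))
```

A graph with a feedback loop has no such ordering, so acyclicity is exactly what guarantees the triangular structure.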

PS: You may be surprised that there is no mention here of “ladder of causation”, intervention, Do-Calculus, Judea Pearl, etc. - that is another major domain of Causality! In Industrial IoT, Intervention or A/B test is almost never possible on a functioning plant floor. However, within the framework described in this note, Counterfactual simulations using observed data alone can be performed to identify root-causes when something fails or “what-if” experiments can be performed to identify performance improvement options.
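As a rough illustration of what a counterfactual “what-if” on observed data can look like, here is a minimal sketch for a linear SCM whose structure and coefficients are assumed to be already estimated. The variable names, coefficients and observed values are hypothetical, and this is one common recipe (recover the noise terms from the observation, override one variable, recompute the downstream variables), not necessarily the procedure used in any specific product.

```python
# Hypothetical linear SCM, listed in causal order; coefficients are illustrative.
ORDER = ["load", "temperature", "vibration"]
COEF = {
    "load": {},
    "temperature": {"load": 0.7},
    "vibration": {"load": 0.5, "temperature": 0.9},
}


def recover_noise(observed):
    """Noise term of each variable = observed value minus its parents' contribution."""
    return {v: observed[v] - sum(w * observed[p] for p, w in COEF[v].items())
            for v in ORDER}


def simulate(noise, overrides=None):
    """Recompute all variables in causal order, holding overridden ones fixed."""
    overrides = overrides or {}
    values = {}
    for v in ORDER:
        if v in overrides:
            values[v] = overrides[v]
        else:
            values[v] = noise[v] + sum(w * values[p] for p, w in COEF[v].items())
    return values


# One observed operating point (made-up numbers).
observed = {"load": 2.0, "temperature": 1.9, "vibration": 3.0}

noise = recover_noise(observed)
factual = simulate(noise)                               # reproduces the observation
counterfactual = simulate(noise, overrides={"load": 1.0})  # "what if load had been 1.0?"

print("factual:       ", factual)
print("counterfactual:", counterfactual)
```

Because the recovered noise terms are held fixed, the counterfactual answers the question for this particular observed situation rather than for the plant on average.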

#causality #iiot #twinarc #ICA #BSS #DGP #SCM

Divya Atre

Building brand & demand through content marketing, social media marketing and campaigns

12 months ago

It's great to see your passion for causality! The connection between prediction, optimization, and generation is indeed fascinating.

Ricky Ho

1 year ago

Nice article ! This is my passion. Can you say a bit about the connection between prediction, optimization, maybe generation from a causality perspective ? Is the problem settings the same as optimal policy search where we try to find the best action for a given state to maximize the expected future cumulative reward ? Or we try to quantify the effect (to the future cumulative reward) of taking an action at a given state relative to taking other actions ?

