Artificial Intelligence No 52: An introduction to causal machine learning
Image source: https://www.amazon.co.uk/Causality-Judea-Pearl/dp/052189560X/


UPDATE

Thanks again for the response and insightful comments on this article.

Like some of you said, Pearl is not the only person formulating the maths behind causality, but for me, Pearl's framework is the most mature, especially in terms of implementation.

To summarise, there are three stages to building a causal model:

a) Causal discovery (via data, surveys, or understanding the distribution in some way)

b) Causal model building (e.g., creation of the DAGs)

c) Causal inference (e.g., via the do-operator)
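As a rough, purely illustrative sketch of how these three stages fit together, here is a toy pipeline in plain Python; every function name here is hypothetical, and the "discovery" step is hard-coded rather than learned:

```python
# Illustrative three-stage pipeline: discovery -> model building -> inference.
# All names are hypothetical; a real project would use a library such as pgmpy.

def discover_structure(data):
    """Stage (a): infer candidate cause->effect edges from data.
    Hard-coded here purely for illustration."""
    return [("marketing", "sales")]

def build_model(edges):
    """Stage (b): turn the discovered edges into a DAG (adjacency dict)."""
    dag = {}
    for parent, child in edges:
        dag.setdefault(parent, []).append(child)
        dag.setdefault(child, [])
    return dag

def infer_effect(dag, cause, effect):
    """Stage (c): answer 'can cause reach effect?' by walking the DAG."""
    frontier, seen = [cause], set()
    while frontier:
        node = frontier.pop()
        if node == effect:
            return True
        if node not in seen:
            seen.add(node)
            frontier.extend(dag.get(node, []))
    return False

dag = build_model(discover_structure(data=None))
print(infer_effect(dag, "marketing", "sales"))   # True
print(infer_effect(dag, "sales", "marketing"))   # False: edges are directed
```

The point of the sketch is only the division of labour: discovery produces edges, model building produces a graph, and inference queries that graph.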

In terms of deployment, it's no different from any other model (that's the easy part)

The choice of which libraries to use is also rapidly evolving.

At Oxford, we are exploring TF Probability and pgmpy.

If you are using others and recommend them, please let me know.

original article follows

I have been a fan of causal machine learning, and I believe it will increasingly impact AI / ML techniques.

In this post, I explain the basics of causal machine learning. The post is based on three articles by Shawhin Talebi, which I found very useful in explaining these concepts.

Causality is not familiar to most machine learning developers. Causality, in this sense, is based on the work of Judea Pearl ('Causality' and 'The Book of Why: The New Science of Cause and Effect').

The ideas themselves have been around for a while, but they are making a comeback with the recognition that current machine learning and deep learning techniques do not address a class of problems: cause-and-effect problems.

Causality is concerned with the question of 'Why.'

There are many ways/stories to explain why something has happened, and related questions: what is the reason for a phenomenon, where is this phenomenon going next, and so on.

We can also think of causality in terms of the limits of current statistical thinking. These include:

1) Spurious correlation (correlation does not imply causation)

2) Simpson’s paradox: i.e., the same data gives contradictory conclusions depending on how you look at it

3) Symmetry: correlation is symmetric, but causation is not: "A causes B" does not imply that "B causes A"
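To make the Simpson's paradox point concrete, here is the well-known kidney-stone example in a few lines of Python (the counts are the standard textbook figures, used purely for illustration):

```python
# The classic kidney-stone example of Simpson's paradox (standard textbook
# counts): treatment A has the higher success rate within each severity
# group, yet the lower success rate overall.
data = {
    # severity: {treatment: (successes, total)}
    "small_stones": {"A": (81, 87),   "B": (234, 270)},
    "large_stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within each severity group, A beats B ...
for group, arms in data.items():
    print(group, rate(*arms["A"]) > rate(*arms["B"]))  # True for both groups

# ... but after aggregating over groups, B beats A.
totals = {t: [0, 0] for t in ("A", "B")}
for arms in data.values():
    for t, (s, n) in arms.items():
        totals[t][0] += s
        totals[t][1] += n
print(rate(*totals["A"]) < rate(*totals["B"]))  # True: the conclusion flips
```

The flip happens because severity is a confounder: doctors assign treatment A disproportionately to the harder (large-stone) cases, so the aggregate comparison is misleading without the causal structure.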

Thus, causality goes beyond correlation. Causality describes the cause and effect of elements in a system. A variable X can be said to cause another variable Y if an intervention on X results in a change in Y, but an intervention on Y does not necessarily result in a change in X (once all confounders are adjusted for).

Correlation, in contrast, is symmetric: if X correlates with Y, then Y correlates with X. Causation is not: if X causes Y, Y may not cause X.

Note that in statistics, a confounder (also confounding variable, confounding factor, extraneous determinant or lurking variable) is a variable that influences both the dependent variable and the independent variable, causing a spurious association. Confounding is a causal concept, and as such, cannot be described in terms of correlations or associations. The existence of confounders is an important reason why correlation does not imply causation.
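A quick simulated illustration of confounding, with assumed toy structural equations (Z causes both X and Y, and there is deliberately no X-to-Y edge):

```python
import random

random.seed(0)

# Toy confounder demo: Z causes both X and Y; X does NOT cause Y,
# yet X and Y come out strongly correlated.
n = 10_000
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.3) for zi in z]
y = [zi + random.gauss(0, 0.3) for zi in z]

def corr(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

print(corr(x, y))  # ~0.9: a spurious association, driven entirely by Z

# Adjusting for the confounder: once Z's contribution is removed,
# the residuals of X and Y are (nearly) uncorrelated.
rx = [xi - zi for xi, zi in zip(x, z)]
ry = [yi - zi for yi, zi in zip(y, z)]
print(abs(corr(rx, ry)))  # near 0 once Z is controlled for
```

This is exactly the "spurious association" the definition above describes: the X-Y correlation is real in the data, but it disappears once the common cause is accounted for.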

Causality is expressed as a directed acyclic graph (DAG) and also as a structural equation model (SEM).

A DAG is a special kind of graph for which all edges are directed (information flows in one direction) and no cycles exist (information that leaves a vertex cannot return to it). The vertices (circles) in a causal DAG represent variables, and the edges (arrows) represent causation, where a variable is directly caused by its parents.
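For example, a causal DAG can be stored as a plain adjacency dictionary, and acyclicity checked with a standard topological sort; this is a generic sketch, not any particular library's API:

```python
from collections import deque

# A causal DAG as an adjacency dict: edges point from cause to effect.
# Here Z is a parent of both X and Y, and X is a parent of Y.
dag = {"Z": ["X", "Y"], "X": ["Y"], "Y": []}

def is_acyclic(graph):
    """Kahn's algorithm: a directed graph is a DAG iff every vertex
    can be emitted in topological order."""
    indegree = {v: 0 for v in graph}
    for children in graph.values():
        for c in children:
            indegree[c] += 1
    queue = deque(v for v, d in indegree.items() if d == 0)
    emitted = 0
    while queue:
        v = queue.popleft()
        emitted += 1
        for c in graph[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return emitted == len(graph)

print(is_acyclic(dag))                       # True
print(is_acyclic({"A": ["B"], "B": ["A"]}))  # False: contains a cycle
```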

SEMs represent relationships between variables and have two characteristics. First, SEM equations are asymmetric, meaning the equality only works in one direction; hence, SEMs cannot be inverted. Second, the equations can be non-parametric, meaning the functional form is not known.
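A minimal sketch of such an SEM as Python assignments, with an assumed quadratic mechanism chosen purely to make the asymmetry visible:

```python
import random

random.seed(1)

# A two-equation SEM written as Python assignments. The assignment direction
# matters: X is set first, then Y is computed from X. The quadratic
# mechanism f(x) = x**2 is an assumed toy choice.
def sample():
    u_x = random.gauss(0, 1)   # exogenous noise for X
    u_y = random.gauss(0, 1)   # exogenous noise for Y
    x = u_x                    # X := U_x
    y = x ** 2 + u_y           # Y := f(X) + U_y
    return x, y

x, y = sample()

# Asymmetry: the second equation cannot be inverted to recover X from Y,
# because distinct values of X map to the same f(X):
print((2.0) ** 2 == (-2.0) ** 2)  # True: Y cannot tell +2 and -2 apart
```

Like ':=' in code, an SEM equation is an assignment, not an algebraic identity, which is why "solving for X in terms of Y" is not a meaningful operation.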

Image source: Shawhin Talebi

Causal Inference: Now that we have formulated the causal structure (model) of the problem as a DAG/SEM, we can look at causal inference, which uses the causal structure to answer causal questions such as:

  • Did the treatment directly help those who took it?
  • Was it the marketing campaign that led to increased sales this month, or the holiday?
  • How big of an effect would increasing wages have on productivity?

Causal inference estimates causal effects using a technique called the do-calculus, built around the do-operator.

In simple terms, as per the idea of the do-calculus, X causes Y if an intervention in X results in a change in Y, while an intervention in Y does not necessarily result in a change in X.

Thus, the do-operator is a mathematical representation of a physical intervention. The power of the do-operator is that it allows us to simulate experiments, given that we know the details of the causal connections. For example, suppose we want to ask: will increasing the marketing budget boost sales? With a causal model, we can simulate what would happen if we were to increase marketing spend. In other words, we can evaluate the causal effect of marketing on sales.
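The following toy simulation sketches the difference between observing and intervening, under assumed linear structural equations (Z confounds X and Y, and the true effect of X on Y is 2):

```python
import random

random.seed(2)

# Toy linear SEM with a confounder: Z -> X, Z -> Y, and X -> Y with a
# true causal coefficient of 2 (all equations assumed for illustration).
n = 50_000

def observe():
    """Sample from the untouched system."""
    z = random.gauss(0, 1)
    x = z + random.gauss(0, 1)
    y = 2 * x + z + random.gauss(0, 1)
    return x, y

def do(x_fixed):
    """do(X = x_fixed): delete the Z -> X edge and force X to a value,
    leaving the other structural equations intact."""
    z = random.gauss(0, 1)
    return 2 * x_fixed + z + random.gauss(0, 1)

# The naive observational slope of Y on X is biased upward by Z ...
xs, ys = zip(*(observe() for _ in range(n)))
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
         / sum((a - mx) ** 2 for a in xs))
print(round(slope, 1))   # ~2.5, not the true effect

# ... while the interventional contrast recovers the true effect of 2.
effect = (sum(do(1.0) for _ in range(n)) - sum(do(0.0) for _ in range(n))) / n
print(round(effect, 1))  # ~2.0
```

Seeing X = 1 (observation) and setting X = 1 (intervention) give different answers precisely because the confounder Z moves with X only in the observational regime.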

While causal inference is useful, how do we build a causal model?

For that, we need causal discovery.

Causal discovery aims to infer causal structure from data. In other words, given a dataset, derive a causal model that describes it.

There are four common assumptions that causal discovery algorithms make about the data (reference: Structural Agnostic Modeling: Adversarial Learning of Causal Graphs):

1. Acyclicity: the causal structure can be represented by a DAG, G

2. Markov property: all nodes are independent of their non-descendants when conditioned on their parents

3. Faithfulness: all conditional independences in the true underlying distribution p are represented in G

4. Sufficiency: no pair of nodes in G has a common external cause

If all this sounds a bit abstract, it is!

However, much of this is being implemented in Python libraries like the Causal Discovery Toolbox. One problem currently is that this is a fairly new field and the libraries are not mature. So, if you are working in this field and have some recommendations, I would welcome them.
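As a tiny, library-free illustration of the constraint-based idea behind many discovery algorithms: on an assumed chain X -> Y -> Z, X and Z are marginally correlated but become (nearly) independent once Y is conditioned on, and it is exactly this kind of independence signature that such algorithms test for:

```python
import random

random.seed(3)

# Assumed toy chain X -> Y -> Z: X and Z are dependent marginally,
# but (nearly) independent given Y.
n = 20_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 0.5) for xi in x]
z = [yi + random.gauss(0, 0.5) for yi in y]

def corr(a, b):
    """Plain Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

def partial_corr(a, b, given):
    """Correlation of a and b with the linear effect of 'given' removed."""
    r_ab, r_ag, r_bg = corr(a, b), corr(a, given), corr(b, given)
    return (r_ab - r_ag * r_bg) / ((1 - r_ag**2) * (1 - r_bg**2)) ** 0.5

print(corr(x, z))             # clearly nonzero: strong marginal dependence
print(partial_corr(x, z, y))  # ~0: X is independent of Z given Y
```

Real discovery algorithms (PC, SAM, and friends) run batteries of such conditional-independence tests, then search for the DAGs consistent with the results.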

Sources: the following three posts by Shawhin Talebi

https://towardsdatascience.com/causal-discovery-6858f9af6dcb

https://towardsdatascience.com/causality-an-introduction-f8a3f6ac4c4a

https://towardsdatascience.com/causal-inference-962ae97cefda



Joseph A di Paolantonio (comment):

When you say "…been around for awhile…": I was first introduced to Dr. Pearl's work around 1988, when, as the only Bayesian at Palo Alto Research Labs, I was asked to help the AI group understand a new concept: Bayesian Neural Networks. And then winter came.

Dr Pavan Sangha (comment):

D-separation plays a huge role in isolating causal influence from associational influence in causal graphs. Pearl's do-calculus is designed to provide sufficient conditions to isolate causal effects where possible. One example would be the front-door adjustment, and another the back-door adjustment. For a more formal treatment of Pearl's atomic-intervention concept, I recommend Shachter and Heckerman's paper on "causal influence diagrams", which formalises the atomic intervention through the introduction of responsiveness/limited responsiveness.
