An Open Question: Is Explainable Machine Learning attainable?
Michio Suginoo, CFA (He/Him)
CFA | Machine Learning | Data Science | Paradigm Shift | Technical Research Writer | Teleological Pursuit | UBA Postgraduate Student
The black box of Machine Learning has become a subject of daily criticism. In some cases, authorities have started demanding that Machine Learning be 'explainable' through their regulatory frameworks.
What is Explainable Machine Learning (EML)?
Is 'explainability' an attainable objective for Machine Learning/Deep Learning? Can traditional models claim solid explainability? Whether dealing with traditional models or Machine Learning models, the knowledge we obtain from any model might be more tenuous than we would like to assume.
An Architectural Limitation
By design, Machine Learning models lack explicit inductive and deductive formulations when discovering hidden relationships within a given dataset. The design was intentional: it was meant to address the limitations of the traditional algorithm paradigm, which explicitly specifies rules for mapping independent variables (the features) to the dependent variable (the label). As mapping tasks become increasingly complex, it becomes progressively difficult, or even impossible, to predetermine explicit rules to program under the traditional algorithm paradigm. This limitation of the traditional paradigm set the stage for the emergence of Machine Learning, which does not require explicit inductive or deductive formulations of mapping rules. (Suginoo, 2021)
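To make the contrast concrete, here is a minimal, hypothetical sketch: under the traditional paradigm the programmer writes the mapping rule explicitly, whereas a Machine Learning model infers a rule from labelled examples. The rule, the features, and the tiny dataset below are invented purely for illustration.

```python
# Hypothetical illustration: explicit rules vs. a learned mapping.
from sklearn.tree import DecisionTreeClassifier

# Traditional algorithm paradigm: the mapping rule is written explicitly.
def rule_based_label(income, debt):
    # A hand-crafted rule chosen by the programmer (invented for illustration).
    return 1 if income > 50_000 and debt < 10_000 else 0

# Machine Learning paradigm: the mapping is inferred from labelled examples.
X = [[60_000, 5_000], [30_000, 20_000], [80_000, 2_000], [25_000, 15_000]]
y = [1, 0, 1, 0]  # labels supplied with the data; no rule is specified by us
model = DecisionTreeClassifier().fit(X, y)

print(rule_based_label(55_000, 8_000))   # rule chosen by a human
print(model.predict([[55_000, 8_000]]))  # rule discovered from the data
```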
Liberated from the task of pre-specifying mapping rules, Deep Learning applications have in some cases demonstrated transformative success in discovering complex relationships that the conventional algorithm paradigm failed to capture. AlphaFold 2 is one such success. (Heaven, 2020)
Despite this success, Deep Learning has blind spots. While the architectural design grants Deep Learning the power to discover complex relationships in a given dataset, it also makes it difficult for us to establish a valid scientific explanation, in either an inductive or a deductive manner, of what is going on during the process. In this sense, the limit on 'scientific explainability' might be one of its architectural limitations. Deep Learning's freedom from predetermined mapping rules shapes both its strength and its weakness.
By the way, what is “Valid Scientific Explanation”?
Establishing Scientific Explanation
In order for us to establish the scientific validity of any model, the model needs to satisfy the most fundamental criterion: 'reproducibility'. (Nuzzo, 2015)
In practice, in an attempt to demonstrate the 'explainability' of their models, many developers of 'Explainable' Machine Learning (EML) present 'post-hoc' observations about the passages of their processes (either statistical or deterministic) in the development domain, and then offer their interpretations of those 'post-hoc' observations. Yet interpretation is different from explanation.
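As a concrete, purely hypothetical illustration of such a post-hoc observation, the sketch below computes permutation importances for a toy model; the dataset, model choice, and parameters are all assumptions made for illustration. It summarizes how a fitted predictor behaves, but it does not explain why the underlying mapping holds.

```python
# Hypothetical sketch of a 'post-hoc' observation in the development domain:
# permutation importance on a toy dataset and model (both invented here).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Post-hoc: observe how shuffling each feature degrades held-out accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: importance {score:.3f}")
```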
Now, let's contemplate 'explainability' scientifically in the context of 'reproducibility'. To qualify as a 'scientifically explainable model', a predictor generated by a Machine Learning model in the development domain, when presented with a totally new dataset in the deployment domain, needs to yield an a-priori expected result and demonstrate an a-priori expected passage through the process (either statistically or deterministically). If there is no guarantee that the a-priori expected passage(s) can be reproduced (statistically or deterministically) on a totally untested new dataset, the predictor, once validated in the development domain, would not qualify as 'scientifically explainable'.
Deep Learning models often demonstrate unexpected behaviours in the deployment domain. According to a famous paper by a group of Google researchers, "Underspecification Presents Challenges for Credibility in Modern Machine Learning", Machine Learning's lack of an inductive formulation in its valuation mechanism causes the problem of 'underspecification' and its consequence, model instability. (Google, 2020)
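To give the flavour of underspecification, here is a minimal, hypothetical toy; the data, model family, and the notion of 'shift' are all arbitrary assumptions. Two models that differ only in random seed can score almost identically on held-out development data and still disagree noticeably on perturbed, deployment-like inputs.

```python
# Hypothetical toy: equally 'valid' models in development can diverge in deployment.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two runs differing only in random seed (an arbitrary, 'underspecified' choice).
m1 = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=1).fit(X_tr, y_tr)
m2 = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=2).fit(X_tr, y_tr)

print("development accuracy:", m1.score(X_te, y_te), m2.score(X_te, y_te))  # typically similar

# A crude stand-in for a deployment-domain shift: perturb the held-out inputs.
X_shift = X_te + np.random.default_rng(0).normal(scale=2.0, size=X_te.shape)
disagreement = np.mean(m1.predict(X_shift) != m2.predict(X_shift))
print("disagreement under shift:", disagreement)  # often non-trivial
```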
As of today, is there any scientifically qualified, reproducible, 'Explainable' Machine Learning (EML) model? This is one of my open questions.
Language Ambiguity: Explainability or Interpretability?
Although it is not impossible for us to observe and interpret the process, the 'explainability' of Deep Learning appears tenuous, or even an unattainable dream, at least as of today. In addition, we have the issue of 'reproducibility' with Deep Learning.
What is called EML today might need to be renamed "Observable Machine Learning" or "Interpretable Machine Learning". And the question remains whether those models are reproducible enough to qualify as 'scientific' before they can qualify as 'explainable'. Overall, if the predictor in question does not demonstrate 'reproducibility', it would fail to qualify as a scientific method: its 'explainability' would become 'scientifically' irrelevant.
Here is an insightful perspective from Cynthia Rudin: "Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead". (Rudin, 2019) The title is self-explanatory. Here are some excerpts:
“Let us stop calling approximations to black box model predictions (Machine Learning predictors) explanations.”
“Calling these “summaries of predictions,” “summary statistics,” or “trends” rather than “explanations” would be less misleading.”
“Since the definition of what constitutes a viable explanation is unclear, even strong regulations such as “right to explanation” can be undermined with less-than-satisfactory explanations.”
At the very least, we need to reach a consensus on the definitions of 'explainability' and 'interpretability' to avoid unnecessary confusion.
Business as Usual: Engineering Achievements without Scientific Consensus
To avoid confusion, I would like to clarify my position: I advocate exploring the use of Machine Learning as an engineering tool, not as a scientific method.
Appreciating the benefit of engineering achievements without scientific validity, however eccentric it may sound, we are surrounded by many such cases. Take airplanes as an example. There is no consensus among scientists on the explanation of why airplanes fly. (Regis, 2020) Despite the absence of scientific consensus on how they work, our civilization embraces airplanes.
Tenuous Scientific Validity in Traditional Model Use Cases
Furthermore, if the 'explainability' of Deep Learning is questionable in the new algorithm paradigm space, our conventional models might also be more questionable than we would like to assume from the perspective of 'scientific explainability'. One example can be observed in our frequent use of the 'p-value'. The British statistician Ronald Fisher, the inventor of the 'p-value', did not intend to promote it as a scientifically reproducible metric of significance.
Fisher "intended it simply as an informal way to judge whether evidence was significant in the old-fashioned sense: worthy of a second look."
"Fisher intended it to be just one part of a fluid, non-numerical process that blended data and background knowledge to lead to scientific conclusions." (Nuzzo, 2014)
Against the inventor's intention, the 'p-value' seems to have been misused in practice, treated as a scientifically reproducible metric of significance, more frequently than we would imagine. (Baker, 2016) In 2016, the American Statistical Association publicly issued a statement of caution on the use of the 'p-value'. (American Statistical Association, 2016)
Given that the 'p-value' is embedded in the 'confidence interval', how solid would the reproducibility of 'hypothesis testing' be?
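As a hint at how fragile this can be, the toy simulation below (effect size, sample size, and number of repetitions are arbitrary assumptions) draws repeated samples from the same two populations and shows that the resulting p-values, and therefore the 'significant / not significant' verdict, scatter widely from one sample to the next.

```python
# Hypothetical toy: p-values scatter widely across repeated samples
# drawn from the very same populations, undermining their reproducibility.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
p_values = []
for _ in range(20):
    # Two groups with a small true difference in means (arbitrary choice).
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.4, scale=1.0, size=30)
    p_values.append(stats.ttest_ind(a, b).pvalue)

print([round(p, 3) for p in p_values])
print("share 'significant' at 0.05:", np.mean(np.array(p_values) < 0.05))
```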
If the scientific explanation of Deep Learning, an example of the new model paradigm, is tenuous, we also have to maintain vigilance regarding the scientific validity of the traditional model paradigm.
Now, let's return to the famous mantra of George Box, a British statistician of the 20th century (Unknown, N.D.):
“All models are wrong, but some are useful”.
To some extent, our civilization might be driven by wrong but useful tools. Whether that is good or not is another question. Nevertheless, it certainly embeds some inherent risk within the construct of our civilization.
Regulatory Implication
Is ‘explainability’ an unattainable dream for Machine Learning by its design?
More critically, should Machine Learning's 'explainability' be an unattainable dream, then when a government demands 'explainability' in the use of Machine Learning, would the policy only give participants an incentive to deceptively game the term? If that is the case, authorities should refrain from requiring something unattainable.
Vigilance Required
Whether dealing with the conventional model paradigm or Machine Learning models, we need to maintain vigilance when we question model 'explainability' scientifically. The knowledge we obtain from any model might be more tenuous than we would like to assume.
Let me close my note with another reminder, often attributed to Stephen Hawking:
“(the worst fallacy of the human nature) is not our ignorance, but our delusion of knowledge”