Interpretability/Explainability: “Seeing Machines Learn”
In the article "Scenarios: Which Machine Learning (ML) to choose?" [1], as part of the "Architectural Blueprints—The “4+1” View Model of Machine Learning," which helps you to choose the right ML for your data, we indicated that “From a business perspective, two of the most significant measurements are accuracy and interpretability.” [Accuracy: The Bias-Variance Trade-off]
On the one hand, we claimed that “Evaluating the accuracy [2] of a machine learning model is critical in selecting and deploying a machine learning model.”
“On the other hand, measuring interpretability (reasoning) is a more complex task because there is neither a universal agreeable definition nor an objective quantitative measure.” Additionally, time, space, and sample complexity must be taken into account because they affect the amount of resources required to run your model. [3]
- But, which methods produce an interpretable predictive model?
The "AI/ML Black Box Paradox" refers to the inherent opacity of AI/ML systems with complex computational methods, where the prediction/decision-making and reasoning processes are often obscure and difficult to comprehend for humans. Some of the most sophisticated and accurate AI/ML models are impossible to understand or interpret, and there is an inverse relationship between accuracy and transparency. This lack of Interpretability/Explainability makes it challenging to explain how an AI/ML model arrives at its conclusions, leading to questions about the accuracy and reliability of the AI/ML model. These issues have prompted to explore approaches to address this "AI/ML Black Box Paradox."
In general, opaque computational methods, "Black Boxes", obtain higher accuracies than transparent ones.
To understand and trust your model's prediction/decision and reasoning, two equally important factors need to be considered: Interpretability and Explainability.
So, What’s Explainable AI? [4]
- Definitions
“The ability to determine and observe cause and effect from a machine learning." [5]
“Or, to put it another way, it is the extent to which you are able to predict what is going to happen, given a change in input or algorithmic parameters. It’s being able to look at an algorithm and go yep, I can see what’s happening here." [6]
One measure of interpretability based on “triptych predictivity, stability, and simplicity” is proposed by Vincent Margot in “How to measure interpretability?" [7]
“The ability to justify the reason and its importance of a machine learning results." [8]
“Explainability, meanwhile, is the extent to which the internal mechanics of a machine or deep learning system can be explained in human terms." [9]
“…the interpretability of a model is on a spectrum where some models are more interpretable than others. In other words, interpretability is the degree to which a model can be understood in human terms. One model is more interpretable than another if it is easier for a human to understand how it makes predictions than the other model… However, there is a grey area where you would find that people would disagree on the classification." [10]
The Interpretability Spectrum. Diagram: Conor O'Sullivan
- Importance
The purpose of interpretability encompasses several goals: interpretability serves multiple purposes in machine learning, leading to better understanding, fairer outcomes, and more reliable models.
The paper “Design/Ethical Implications of Explainable AI (XAI)" [11] addresses the design and ethical implications of eXplainable AI (XAI) and its necessity for user trust. The paper argues that there are three main reasons for this necessity: accountability/trustworthiness, liability/policy evaluation, and human agency/authority. It also defines three types of XAI systems: Opaque (black box), Interpretable, and Comprehensible systems.
- Problem-solving
Understanding a model's problem-solving capabilities, process, inputs, and outputs is essential before selecting your ML model. The applicable machine learning model depends on your problem and objectives. Machine learning approaches are deployed where it is highly complex or infeasible to develop conventional algorithms to perform the needed tasks or solve problems. Machine learning models are utilized in many domains, such as advertising, agriculture, communication, computer vision, customer services, finance, gaming, investing, marketing, medicine, robotics, security, weather, and bioinformatics research.
ML algorithms in bioinformatics research. Table: Noam Auslander et al.
"An example of the usage of each algorithm and the respective input data are indicated on the right. Abbreviations: Support Vector Machines (SVM); K-Nearest Neighbors (KNN); Convolutional Neural Networks (CNN); Recurrent Neural Networks (RNN); Principal Component Analysis (PCA); t-distributed Stochastic Neighbor Embedding (t-SNE), and Non-negative Matrix Factorization (NMF)."
There are specific applications of ML techniques integrated with bioinformatics in molecular evolution, protein structure analysis, systems biology, and disease genomics.
Bioinformatics ML Integration. Table: Noam Auslander et al.
- Process
Planning your model development needs to take into consideration the end-to-end lifecycle of your ML model.
End-to-end lifecycle of ML models. Diagram: Shrijayan Rajendarn
For example, the research article "An operational guide to translational clinical ML in academic medical centers" offers a framework for translating academic ML projects into practical use within an academic medical center setting. Here is a summary of the key points:
Challenges Addressed:
Proposed Strategy:
Pipeline for making usable ML solutions. Diagram: Mukund Poddar et al.
This framework aims to equip health systems with a roadmap for deploying "minimum viable data science tools" that can add value to clinical practice. By following this guide, academic medical centers can bridge the gap between research and practical application of ML with interpretability/explainability in a clinical setting. [Operations: ML Operations (MLOps), Continuous ML & AutoML]
- Explainability Methods
The two main types of explainability methods in ML are Model-Specific and Model-Agnostic.
Explainability Methods. Diagram: Matt Dancho
- Model-Specific explainability methods are for models that are explainable without any added processing and tend to be simpler models.
- Model-Agnostic explainability methods can be applied to any model.
Here is a table summarizing the key differences:
Key Differences of Explainability Methods. Table: Gemini
Overall, the best approach often involves a combination of both methods. You can leverage model-specific explainability if available for a deeper understanding and then use model-agnostic techniques for further analysis or when dealing with complex models.
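To make the distinction concrete, here is a minimal Python sketch (an illustration, not a prescribed workflow) that contrasts a model-specific explanation, reading the coefficients of a logistic regression, with a model-agnostic one, permutation importance computed on a gradient-boosted "black box"; the dataset and hyperparameters are arbitrary choices for the example.

```python
# Illustrative sketch: model-specific vs. model-agnostic explanations
# (assumes scikit-learn is installed; dataset and models are arbitrary examples).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Model-specific: a (scaled) logistic regression explains itself through its coefficients.
linear = make_pipeline(StandardScaler(), LogisticRegression(max_iter=5000)).fit(X_train, y_train)
coefs = sorted(zip(X.columns, linear[-1].coef_[0]), key=lambda t: abs(t[1]), reverse=True)
print("Top coefficients:", coefs[:5])

# Model-agnostic: permutation importance works for any fitted estimator,
# including this gradient-boosted ensemble treated as a black box.
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(black_box, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: t[1], reverse=True)
print("Top permutation importances:", ranked[:5])
```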
- Techniques
There are techniques to produce an interpretable predictive model such as an intrinsically interpretable algorithm or a post-hoc interpretable model.
Interpretable ML in terms of Prediction Accuracy vs. Model Explainability. Diagram: Zhang
There are basically two approaches to achieving interpretability: build a transparent model or apply post-hoc techniques. There are a number of techniques for improving machine learning interpretability, such as algorithmic generalization, attention to feature importance, Local Interpretable Model-Agnostic Explanations (LIME), Deep Learning Important Features (DeepLIFT), and layer-wise relevance propagation. [12]
The article “Explainable Machine Learning - XAI Review: Model Agnostic Tools" [13] reviews the main ideas underlying Model Agnostic explanations for ML models. In particular, the article focuses on the geometric interpretation of the models.
Interpretability of machine learning-based prediction models in healthcare. Table: Gregor Stiglic, et al.
“Model-Agnostic techniques work for any kind of machine learning models, while Model-Specific ones rely on a certain model structure. Global methods give an explanation for all the units in the dataset, whereas Local ones are just for a bunch of dataset units (but you may always repeat the Local explanation on all the units of interest).” Image: "Interpretability of machine learning-based prediction models in healthcare." [14]
A follow-up article, “LIME: Explain Machine Learning predictions - Intuition and Geometrical Interpretation of LIME," [15] covers the Local Interpretable Model-Agnostic Explanations (LIME) method for explaining predictions of Machine Learning models, which was developed by Marco Ribeiro et al. [16]
Steps of the LIME algorithm. Diagram: Giorgio Visani [17]
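As a concrete illustration, here is a minimal sketch of the tabular LIME workflow, assuming the open-source `lime` and `scikit-learn` packages; the dataset and classifier are arbitrary choices for the example.

```python
# Minimal LIME sketch: explain a single prediction of a random-forest
# classifier (assumes `pip install lime scikit-learn`; dataset is an example).
import lime.lime_tabular
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)

# LIME perturbs the neighborhood of one instance and fits a local surrogate model.
explanation = explainer.explain_instance(data.data[0], model.predict_proba, num_features=5)
print(explanation.as_list())  # local feature contributions (toward class 1 by default)
```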
- Data Science
Understanding your ML model should start with the data collection, transformation, and processing because, otherwise, you will get “Garbage In, Garbage Out” (GIGO).
Poor data quality carries a heavy cost for the success of your ML model. You will need a systematic method to improve your data quality. Most of the work is in data preparation, and consistency is key to data quality. [Data Science Approaches to Data Quality: From Raw Data to Datasets]
"Collecting better data, building data pipelines, and cleaning data can be tedious, but it is very much needed to be able to make the most out of data." The Data Science Hierarchy of Needs, by Sarah Catanzaro, is a checklist for "avoiding unnecessary modeling or improving modeling efforts with feature engineering or selection." [Serg Masis]
The Data Science Hierarchy of Needs. Diagram: Serg Masis
Establishing learning goals and objectives is significant, and organizing objectives helps to clarify them.
"Bloom's taxonomy is a set of three hierarchical models used for the classification of educational learning objectives into levels of complexity and specificity. The three lists cover the learning objectives in the cognitive, affective, and psychomotor domains.
Bloom's Revised Taxonomy. Diagram: Wikipedia
There are six levels of cognitive learning according to the revised version of Bloom's Taxonomy. Each level is conceptually different. The six levels are remembering, understanding, applying, analyzing, evaluating, and creating. The new terms are defined as:
This Bloom's taxonomy was adapted for machine learning.
Bloom’s Taxonomy Adapted for Machine Learning (ML). Diagram: Visual Science Informatics, LLC
There are six levels of model learning in the adapted version of Bloom's Taxonomy for ML. Each level is a conceptually different learning model. The levels are ordered from lower-order learning to higher-order learning. The six levels are Store, Sort, Search, Descriptive, Discriminative, and Generative. Bloom’s Taxonomy adapted for Machine Learning (ML) terms are defined as:
Conditional Generative Adversarial Network Model Architecture Example. Diagram: Jason Brownlee
Another decision point in choosing your ML model, which also impacts your model's interpretability and explainability, is the difference between a discriminative and a generative model. A discriminative approach focuses on a solution and performs better for classification tasks by dividing the data space into classes and learning the boundaries. A generative approach models how the data is distributed throughout the space and can generate new data points.
Discriminative vs. Generative. Table: Supervised Learning Cheatsheet
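The contrast can be sketched in a few lines of Python (an illustrative example only; the models and dataset are arbitrary choices, and the sampling step assumes scikit-learn 1.0+ where `GaussianNB` exposes `theta_` and `var_`):

```python
# Illustrative contrast: a discriminative model learns the decision boundary
# p(y|x) directly, while a generative model estimates p(x|y) and can sample.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# Discriminative: logistic regression only models class boundaries.
discriminative = LogisticRegression(max_iter=1000).fit(X, y)
print("p(y | x) for one flower:", discriminative.predict_proba(X[:1]).round(3))

# Generative: Gaussian Naive Bayes fits per-class feature distributions,
# so we can draw synthetic samples from the fitted p(x | y).
generative = GaussianNB().fit(X, y)
rng = np.random.default_rng(0)
class_idx = 0  # sample synthetic measurements for the first class
synthetic = rng.normal(loc=generative.theta_[class_idx],
                       scale=np.sqrt(generative.var_[class_idx]),
                       size=(3, X.shape[1]))
print("Synthetic samples for class 0:\n", synthetic.round(2))
```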
- Types of Variables in Data Science
Variables are the characteristics or attributes that describe a dataset. They can be classified into different types based on their nature and the type of data they represent.
Types of Variables in a Dataset. Diagrams: Avi Chawla
Key Variable Types in a Causal Dataset. Diagram: Causal Wizard
These types of variables can be grouped into:
Types of Variables in Data Science. Table: Gemini
Knowing and understanding the types of your dataset variables is important for:
- Visualization for Data Quality
You should check and analyze your data even before you train a model because you might discover data quality issues in your data. Identifying common data quality issues such as missing data, duplicated data, and inaccurate, ambiguous, or inconsistent data can help you find data anomalies and perform feature engineering. [Data Science Approaches to Data Quality: From Raw Data to Datasets]
TensorFlow Data Validation & Visualization Tools. Table: TensorFlow.org [19]
TensorFlow Data Validation provides tools for visualizing the distribution of feature values. By examining these distributions, you can identify anomalies in your data's distribution, scale, or labels.
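A minimal sketch of that TFDV workflow might look like the following (assuming `tensorflow-data-validation` is installed and run in a notebook; the CSV paths are placeholders for your own data):

```python
# Sketch of the TensorFlow Data Validation (TFDV) workflow described above
# (assumes `pip install tensorflow-data-validation`); the paths are hypothetical.
import pandas as pd
import tensorflow_data_validation as tfdv

train_df = pd.read_csv("train.csv")      # hypothetical training data
serving_df = pd.read_csv("serving.csv")  # hypothetical new/serving data

# Compute and visualize descriptive statistics for each feature.
train_stats = tfdv.generate_statistics_from_dataframe(train_df)
tfdv.visualize_statistics(train_stats)   # renders the interactive view in a notebook

# Infer a schema from the training statistics and validate new data against it.
schema = tfdv.infer_schema(train_stats)
serving_stats = tfdv.generate_statistics_from_dataframe(serving_df)
anomalies = tfdv.validate_statistics(serving_stats, schema=schema)
tfdv.display_anomalies(anomalies)        # highlights missing or out-of-range values
```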
Another way you can visualize data on a different axis is by using Facets for data analytics. Facets creates a graphical user interface where you can select the columns and axes you want to understand and analyze the associations between different features.
Facets for Data Analytics & Visualization. Table: Himanshu Sharma [20]
Feature engineering, also called feature extraction or feature discovery, is the use of domain knowledge to extract or create features (characteristics, properties, and attributes) from raw data. When feature engineering is performed effectively, it improves the machine learning process: it increases the predictive power and improves ML model accuracy, performance, and quality of results by creating extra features that effectively represent the underlying model. The feature engineering step is a fundamental part of the data pipeline, which leverages data preparation, in the machine learning workflow.
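As a toy illustration (the columns and derived features below are hypothetical, not taken from any study cited here), feature engineering with pandas can be as simple as deriving new columns from raw fields:

```python
# Toy feature-engineering sketch with hypothetical customer data:
# derive new features from raw fields using domain knowledge.
import pandas as pd

raw = pd.DataFrame({
    "signup_date": pd.to_datetime(["2023-01-05", "2023-03-20", "2023-06-11"]),
    "last_purchase": pd.to_datetime(["2023-02-01", "2023-03-25", "2023-09-30"]),
    "total_spent": [120.0, 45.5, 300.0],
    "n_orders": [4, 1, 10],
})

features = pd.DataFrame({
    # Recency: days between signup and the most recent purchase.
    "days_active": (raw["last_purchase"] - raw["signup_date"]).dt.days,
    # Monetary value per order, guarding against division by zero.
    "avg_order_value": raw["total_spent"] / raw["n_orders"].clip(lower=1),
    # Seasonality signal extracted from the date.
    "signup_month": raw["signup_date"].dt.month,
})
print(features)
```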
For example, in a study of "Modeling multiple sclerosis using mobile and wearable sensor data," the researchers' "data analysis objectives were to identify the most reliable, clinically useful, and available features derived from mobile and wearable sensor data. Their machine learning pipeline identifies the best-performing features..." [Shkurta Gashi et al.]
Study Design and Data Modeling Setup. Table adapted: Shkurta Gashi et al.
For accurate predictions, your data must not only be properly processed; it must also be the "right data." To improve your data quality with unsupervised ML, you can employ the Uniform Manifold Approximation and Projection (UMAP) algorithm. "The UMAP algorithm allows the construction of a high-dimensional graph representation of data and further optimization of a low-dimensional graph to be as structurally similar as possible." [21]
UMAP Projection to Various Datasets. Animation: Powered by umap-js
Dimensionality Reduction. Map: Andy Coenen and Adam Pearce [22]
UMAP Projection of Various Datasets with a Variety of Common Values. Table: Andy Coenen & Adam Pearce
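A minimal sketch of applying UMAP (assuming the `umap-learn` package; the digits dataset and parameters are arbitrary example choices):

```python
# Minimal UMAP sketch (assumes `pip install umap-learn scikit-learn`):
# project the 64-dimensional digits dataset down to 2-D for inspection.
import umap
from sklearn.datasets import load_digits

digits = load_digits()

reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)
embedding = reducer.fit_transform(digits.data)  # shape: (n_samples, 2)

# Each row is now a 2-D point; plotting it colored by `digits.target`
# reveals whether samples of the same class cluster together.
print(embedding.shape)
```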
- Animation
You can continue to obtain insight into your ML model decision reasoning during model training.
Gradient Descent of a Linear Fit in Three Dimensions. Animation: Tobias Roeschl [23]
Logistic Regression Weights are Updated at Each Iteration. Animation: Adarsh Menon [24]
Logistic Regression Curve and Surface Plot of Costs. Animation: Tobias Roeschl [25]
Support Vector Machine (SVM) Classifier Boundary Plot. Animation: Bruno Rodrigues [26]
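For readers who prefer code to animation, here is a small NumPy sketch of the gradient-descent updates these animations visualize, for a simple linear fit under mean-squared error (the data and learning rate are made up for illustration):

```python
# NumPy sketch of gradient descent for a linear fit y ≈ w*x + b under MSE.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=100)  # noisy ground truth

w, b, lr = 0.0, 0.0, 0.01
for step in range(200):
    y_hat = w * x + b
    error = y_hat - y
    grad_w = 2 * np.mean(error * x)   # d(MSE)/dw
    grad_b = 2 * np.mean(error)       # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b
    if step % 50 == 0:
        print(f"step {step:3d}  loss={np.mean(error ** 2):.3f}  w={w:.2f}  b={b:.2f}")
```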
Ultimately, you can explain your trained decision tree model employing interactive visualizations.
Training Data Flow Through a Decision Tree. Diagram: Stephanie Yee & Tony Chu [27]
- Visualization for Interpretability
After examining your trained model’s overall accuracy metric or getting finer-grained measurements such as Precision, Recall, and F1-Score, you can use diagnostic tools by visualizing plots such as Receiver Operating Characteristic (ROC) Curves and Precision-Recall Curves [28] that can help in interpretation.
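A short sketch of producing those diagnostic plots with scikit-learn (assumes scikit-learn 1.0+ and matplotlib; the dataset and classifier are arbitrary examples):

```python
# Sketch of the diagnostic plots mentioned above: F1-Score, ROC curve,
# and Precision-Recall curve for a fitted binary classifier.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay, f1_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("F1-Score:", round(f1_score(y_test, clf.predict(X_test)), 3))

fig, (ax_roc, ax_pr) = plt.subplots(1, 2, figsize=(10, 4))
RocCurveDisplay.from_estimator(clf, X_test, y_test, ax=ax_roc)        # ROC curve
PrecisionRecallDisplay.from_estimator(clf, X_test, y_test, ax=ax_pr)  # PR curve
plt.show()
```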
Interpretability is important if you must understand and interpret the phenomenon being modeled, debug a model, or begin to trust its decisions. TensorFlow provides a plot with which you can visually follow a decision tree structure. [29]
Visually Follow a Decision Tree Structure. Animation: TensorFlow
Random Forests. Diagram: Mathieu Guillame-Bert et al.
Random Forests are a popular type of decision forest model. Here, you can see a forest of trees classifying an example by voting on the outcome. Image: Mathieu Guillame-Bert et al.
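A hedged sketch of plotting a tree this way with TensorFlow Decision Forests (assuming the `tensorflow_decision_forests` package; the CSV path and "label" column are placeholders for your own data):

```python
# Sketch of training and plotting a decision forest with TF-DF
# (assumes `pip install tensorflow_decision_forests`; paths are hypothetical).
import pandas as pd
import tensorflow_decision_forests as tfdf

df = pd.read_csv("train.csv")  # hypothetical training data with a "label" column
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="label")

model = tfdf.keras.RandomForestModel()
model.fit(train_ds)

# Render an interactive plot of one tree from the forest (works in Colab/Jupyter).
tfdf.model_plotter.plot_model_in_colab(model, tree_idx=0, max_depth=3)
```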
- Visualization for Explainability
Explainability is important if you must understand and explain your features' importance, loss function, and model parameters and hyper-parameters.
An Artificial Neuron in Action. Animation: Anddy Cabrera
“An artificial neuron simply hosts the mathematical computations. Like our neurons, it triggers when it encounters sufficient stimuli. The neuron combines input from the data with a set of coefficients, or weights, which either amplify or dampen that input, which thereby assigns significance to inputs for the task the algorithm is trying to learn." [30]
Backpropagation is a fundamental algorithm used to train artificial neural networks. It is essentially a computational method for calculating the gradient of the error function with respect to the network's weights. In simpler terms, it helps the network learn from its mistakes by adjusting its parameters to minimize the error between its predicted output and the actual output.
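To make this concrete, here is a tiny NumPy sketch (a toy example, not a production training loop) of a single artificial neuron, a weighted sum passed through a sigmoid, trained by backpropagating the gradient of a cross-entropy loss:

```python
# Toy example: one artificial neuron trained with gradient descent.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                   # 3 input features
y = (X @ np.array([1.5, -2.0, 0.5]) > 0) * 1.0  # hidden rule the neuron must learn

w = np.zeros(3)
b = 0.0
lr = 0.5
for epoch in range(100):
    # Forward pass: combine inputs with weights, then squash through the sigmoid.
    y_hat = sigmoid(X @ w + b)
    # Backward pass: gradient of binary cross-entropy w.r.t. weights and bias.
    grad = y_hat - y
    w -= lr * X.T @ grad / len(X)
    b -= lr * grad.mean()

print("learned weights:", w.round(2), "bias:", round(b, 2))
```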
Deep Neural Networks (DNNs) are trained using large sets of labeled or unlabeled data and increasingly learn abstract features directly from the data without manual feature extraction. Traditional neural networks may contain around 2-3 hidden layers, while deep networks can have as many as 100-200 hidden layers.
Deep Learning Mimics Brain Learning Through Examples and Layers of Networks. Animation: Adatis
Although DNNs have high predictive power, they have low interpretability because deep networks are by nature black boxes whose inner workings are not fully explainable.
Neural Networks' Architectures: ANN, RNN, LSTM & CNN. Diagrams: A. Catherine Cabrera, and B. InterviewBit
Different neural networks have distinct architectures tailored to their functions and strengths. Here are descriptions of the major neural network architectures:
Here is a table summarizing the key differences:
Key Differences of Neural Networks' Architectures. Table: Gemini
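For a flavor of how two of these architectures differ in code, here is a minimal Keras sketch (assuming TensorFlow 2.x; the layer sizes and input shapes are arbitrary examples):

```python
# Minimal Keras sketches contrasting two architectures: a CNN for image-like
# inputs and an LSTM for sequences (assumes TensorFlow 2.x).
import tensorflow as tf
from tensorflow.keras import layers

# CNN: convolution + pooling layers extract local spatial patterns.
cnn = tf.keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# LSTM: recurrent gates carry information across time steps in a sequence.
lstm = tf.keras.Sequential([
    layers.Input(shape=(50, 8)),   # 50 time steps, 8 features each
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),
])

cnn.summary()
lstm.summary()
```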
ML Algorithm Cheat Sheet. Diagram: Microsoft
- Interactive Visualization Tools
“ML is great until you have to explain it.” The modelStudio is an R package that makes it easy to interactively explain ML models employing four techniques and explainable plots: 1.) Feature Importance, 2.) Break Down Plot, 3.) Shapley Values, and 4.) Partial Dependence. [32]
In essence: “Seeing Machines Learn”
Heat Maps of Neural Network Layers. Animation: TensorFlow Playground, Daniel Smilkov & Shan Carter [33]
"Fernanda Viégas and Martin Wattenberg started the OpenVis conference by opening up the black box of neural networks. A guided tour of the playground.tensorflow.org [34] revealed the beautiful process of neurons digesting wisely chosen features and learning from them in front of our eyes." [35]
Seeing Machines Think. Video: Fernanda Viégas and Martin Wattenberg [40]
Next, read the "Operations: ML Operations (MLOps), Continuous ML & AutoML" [36] article at https://www.dhirubhai.net/pulse/ml-operations-mlops-continuous-automl-yair-rajwan-ms-dsc.
---------------------------------------------------------
[23] https://towardsdatascience.com/gradient-descent-animation-1-simple-linear-regression-e49315b24672
[24] https://towardsdatascience.com/logistic-regression-explained-and-implemented-in-python-880955306060
[26] https://towardsdatascience.com/the-simplest-way-of-making-gifs-and-math-videos-with-python-aec41da74c6e
[28] https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python
[32] https://www.business-science.io/r/2022/02/22/my-4-most-important-explainable-ai-visualizations-modelstudio.html
[33] https://playground.tensorflow.org
Read the "ML Operations (MLOps), Continuous ML & AutoML” article at?https://www.dhirubhai.net/pulse/ml-operations-mlops-continuous-automl-yair-rajwan-ms-dsc