Interpretability: “Seeing Machines Learn”
Heat maps of neural network layers. Animation: TensorFlow Playground, Daniel Smilkov & Shan Carter


In the article "Scenarios: Which Machine Learning (ML) to choose?" [1], part of the "Architectural Blueprints—The “4+1” View Model of Machine Learning," which helps you choose the right ML for your data, we indicated that “From a business perspective, two of the most significant measurements are accuracy and interpretability.” [Accuracy: The Bias-Variance Trade-off]

ML Architectural Blueprints = {Scenarios, Accuracy, Complexity, Interpretability, Operations}

On the one hand, we claimed that “Evaluating the accuracy [2] of a machine learning model is critical in selecting and deploying a machine learning model.”

“On the other hand, measuring interpretability (reasoning) is a more complex task because there is neither a universal agreeable definition nor an objective quantitative measure.” Additionally, the complexity of time, space, and sample must be taken into account because it affects the amount of resources required to run your model. [3]

But which methods produce an interpretable predictive model?

The "AI/ML Black Box Paradox" refers to the inherent opacity of AI/ML systems with complex computational methods, where the prediction/decision-making and reasoning processes are often obscure and difficult to comprehend for humans. Some of the most sophisticated and accurate AI/ML models are impossible to understand or interpret, and there is an inverse relationship between accuracy and transparency. This lack of Interpretability or Explainability makes it challenging to explain how an AI/ML model arrives at its conclusions, leading to questions about the accuracy and reliability of the AI/ML model. These issues have prompted to explore approaches to address this "AI/ML Black Box Paradox."

In general, opaque computational methods ("black boxes") achieve higher accuracy than transparent ones.

In order to understand and trust your model's predictions/decisions and reasoning, there are two equally important factors to consider: Interpretability and Explainability.

So, What’s Explainable AI? [4]


Definitions

  • Interpretability

“The ability to determine and observe cause and effect from a machine learning [model]." [5]

“Or, to put it another way, it is the extent to which you are able to predict what is going to happen, given a change in input or algorithmic parameters. It’s being able to look at an algorithm and go yep, I can see what’s happening here." [6]

One measure of interpretability based on “triptych predictivity, stability, and simplicity” is proposed by Vincent Margot in “How to measure interpretability?" [7]

  • Explainability

“The ability to justify the reason and its importance of a machine learning [model's] results." [8]

“Explainability, meanwhile, is the extent to which the internal mechanics of a machine or deep learning system can be explained in human terms." [9]

“…the interpretability of a model is on a spectrum where some models are more interpretable than others. In other words, interpretability is the degree to which a model can be understood in human terms. One model is more interpretable than another if it is easier for a human to understand how it makes predictions than the other model… However, there is a grey area where you would find that people would disagree on the classification." [10]

The Interpretability Spectrum. Diagram: Conor O'Sullivan


Importance

The purposes of interpretability include:

  • Create white-box / interpretable models (intrinsic): This is a core benefit. By building models that are inherently understandable, you can see how they arrive at decisions, making them trustworthy and easier to debug.
  • Explain black-box / complex models (post-hoc): Even for complex models, interpretability techniques can help shed light on their inner workings. This allows you to understand why they make certain predictions and identify potential biases.
  • Enhance fairness of a model: Interpretability helps detect and mitigate bias in models. By understanding how features influence predictions, you can identify unfair patterns learned from the data and take steps to correct them.
  • Test sensitivity of predictions: Interpretability allows you to analyze how a model's predictions change with variations in the input data. This helps assess the robustness of the model and identify areas where it might be overly sensitive to specific features.

So, interpretability serves multiple purposes in machine learning, leading to better understanding, fairer outcomes, and more reliable models.
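To make the sensitivity-testing point concrete, here is a minimal, hypothetical sketch: nudge one input feature and measure how much the model's predicted probabilities move. The dataset, model, and perturbation size below are illustrative assumptions, not taken from the article.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and model (placeholders for your own).
X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def sensitivity(model, X, feature_idx, delta=0.05):
    """Average absolute change in predicted probability when one
    feature is nudged by +delta standard deviations."""
    X_perturbed = X.copy()
    X_perturbed[:, feature_idx] += delta * X[:, feature_idx].std()
    p_base = model.predict_proba(X)[:, 1]
    p_new = model.predict_proba(X_perturbed)[:, 1]
    return np.abs(p_new - p_base).mean()

# Rank features by how strongly small input changes move the predictions.
scores = [sensitivity(model, X, i) for i in range(X.shape[1])]
for i in np.argsort(scores)[::-1][:5]:
    print(f"feature {i}: mean |delta p| = {scores[i]:.4f}")
```

Features whose small perturbations swing the predictions the most are the ones to scrutinize for robustness and potential bias.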

The paper “Design/Ethical Implications of Explainable AI (XAI)" [11] addresses the design and ethical implications of eXplainable AI (XAI) and its necessity for user trust. It argues that there are three main reasons for this necessity: accountability/trustworthiness, liability/policy evaluation, and human agency/authority. It also defines three types of XAI systems: opaque (black box), interpretable, and comprehensible systems.


Problem-solving

Understanding a model's problem-solving capabilities, process, inputs, and outputs is essential before selecting your ML model. The applicable machine learning model depends on your problem and objectives. Machine learning approaches are deployed where it is highly complex or infeasible to develop conventional algorithms to perform the needed tasks or solve the problem. Machine learning models are utilized in many domains, such as advertising, agriculture, communication, computer vision, customer service, finance, gaming, investing, marketing, medicine, robotics, security, weather, and bioinformatics research.

Machine learning algorithms in bioinformatics research. Table: Noam Auslander et al.

"An example of the usage of each algorithm and the respective input data are indicated on the right. Abbreviations: Support Vector Machines (SVM); K-Nearest Neighbors (KNN); Convolutional Neural Networks (CNN); Recurrent Neural Networks (RNN); Principal Component Analysis (PCA); t-distributed Stochastic Neighbor Embedding (t-SNE), and Non-negative Matrix Factorization (NMF)."

There are specific applications of ML learning techniques integrated with bioinformatics in molecular evolution, protein structure analysis, systems biology, and disease genomics.

Bioinformatics ML Integration. Table: Noam Auslander et al.


Process

Planning your model development needs to take into consideration the end-to-end lifecycle of your ML model.

End-to-end lifecycle of ML models. Diagram: Shrijayan Rajendarn

For example, the research article "An operational guide to translational clinical ML in academic medical centers" offers a framework for translating academic ML projects into practical use within an academic medical center setting. Here is a summary of the key points:

Challenges Addressed:

  • Difficulty translating and explaining academic ML models into usable clinical tools.
  • Lack of guidance for navigating and interpreting the tasks involved.

Proposed Strategy:

  • Facilitate the transition from academic research to an explainable tool.
  • Define clear roles & responsibilities throughout the interpretability process.

Pipeline for making usable ML solutions. Diagram: Mukund Poddar et al.

This framework aims to equip health systems with a roadmap for deploying "minimum viable data science tools" that can add value to clinical practice. By following this guide, academic medical centers can bridge the gap between research and practical application of ML with interpretability or explainability in a clinical setting. [Operations: ML Operations (MLOps), Continuous ML & AutoML]


Explainability Methods

The two main types of explainability methods in ML are Model-Specific and Model-Agnostic.

Explainability Methods. Diagram: Matt Dancho

- Model-Specific explainability methods are for models that are explainable without any added processing and tend to be simpler models.

  • Advantages: These methods are often simpler to understand and interpret because they leverage the inherent structure of the model.
  • Disadvantages: These methods are limited to specific models. If you switch to a more complex model, these built-in explanations would not be available.

- Model-Agnostic explainability methods can be applied to any model.

  • Advantages: These methods are incredibly versatile and can be applied to any ML model, regardless of its complexity. This makes them a powerful tool for understanding even the most opaque black-box models.
  • Disadvantages: These methods can be computationally expensive, especially for complex models. Additionally, the explanations they provide might be less intuitive compared to model-specific methods.

Here is a table summarizing the key differences:

Key Differences of Explainability Methods. Table: Gemini

Overall, the best approach often involves a combination of both methods. You can leverage model-specific explainability if available for a deeper understanding and then use model-agnostic techniques for further analysis or when dealing with complex models.
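As one widely used model-agnostic technique, permutation feature importance shuffles a feature and measures the resulting drop in a score; only predictions are needed, not model internals. A minimal scikit-learn sketch follows; the dataset and "black-box" model are placeholders, not from the article.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Placeholder dataset and "black-box" model.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Model-agnostic: only a fitted estimator, data, and a score are required.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=20, random_state=0)
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"feature {idx}: {result.importances_mean[idx]:.3f} "
          f"+/- {result.importances_std[idx]:.3f}")
```

The same call works unchanged if you swap in a neural network or any other estimator, which is exactly the appeal of model-agnostic methods.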


Techniques

There are techniques to produce an interpretable predictive model such as an intrinsically interpretable algorithm or a post-hoc interpretable model.

Interpretable ML in terms of Prediction Accuracy vs. Model Explainability. Diagram: Zhang

There are two basic approaches to achieving interpretability: build a transparent model or apply post-hoc techniques. There are a number of techniques for improving machine learning interpretability, such as algorithmic generalization, paying attention to feature importance, Local Interpretable Model-Agnostic Explanations (LIME), Deep Learning Important Features (DeepLIFT), and Layer-wise Relevance Propagation. [12]

The article “Explainable Machine Learning - XAI Review: Model Agnostic Tools" [13] reviews the main ideas underlying model-agnostic explanations for ML models. In particular, the article focuses on the geometric interpretation of the models.

“Model-Agnostic techniques work for any kind of machine learning models, while Model-Specific ones rely on a certain model structure. Global methods give an explanation for all the units in the dataset, whereas Local ones are just for a bunch of dataset units (but you may always repeat the Local explanation on all the units of interest).” Image: "Interpretability of machine learning-based prediction models in healthcare." [14]

Interpretability of machine learning-based prediction models in healthcare. Table: Gregor Stiglic, et al.

A follow-up article, “LIME: Explain Machine Learning predictions - Intuition and Geometrical Interpretation of LIME," [15] covers the Local Interpretable Model-Agnostic Explanations (LIME) method for explaining predictions of machine learning models, which was developed by Marco Ribeiro et al. [16]

Steps of the LIME algorithm. Diagram: Giorgio Visani [17]
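Here is a minimal sketch of applying LIME to tabular data, assuming the open-source lime package; the dataset, classifier, and parameters are illustrative placeholders rather than the setup used in the referenced articles.

```python
import lime.lime_tabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Placeholder model to be explained.
data = load_iris()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# LIME fits a simple local surrogate model around one prediction.
explainer = lime.lime_tabular.LimeTabularExplainer(
    training_data=data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)
explanation = explainer.explain_instance(
    data.data[0], clf.predict_proba, num_features=4)
print(explanation.as_list())  # local feature contributions for this instance
```

The explanation is local: it tells you which features pushed this particular prediction up or down, not how the model behaves globally.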


Interpreting Uncertainty in ML

Bayes' theorem can be a powerful tool for enhancing the interpretability of machine learning models. It provides a probabilistic framework for understanding how evidence (data) updates our beliefs (model parameters).

These terms are central to Bayesian statistics, a framework for updating beliefs in light of new evidence; a small numeric sketch follows the list of definitions below.

  • Prior: This is your initial belief or guess about something, before any evidence is considered. It's represented as a probability distribution.
  • Likelihood: This is the probability of observing the evidence, given a specific hypothesis or model. It measures how well the data fits your hypothesis.
  • Evidence: This is the new information or data you've collected.
  • Posterior: This is your updated belief after considering the evidence. It's calculated using Bayes' theorem, which combines the prior belief and the likelihood to arrive at a more informed posterior belief.
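Here is a minimal numeric sketch of these four terms, using a hypothetical diagnostic-test example; the probabilities are made up for illustration.

```python
# Hypothetical numbers: 1% prevalence, 95% sensitivity, 10% false-positive rate.
prior = 0.01                 # P(disease)
likelihood = 0.95            # P(positive test | disease)
false_positive_rate = 0.10   # P(positive test | no disease)

# Evidence: total probability of observing a positive test.
evidence = likelihood * prior + false_positive_rate * (1 - prior)

# Posterior via Bayes' theorem: P(disease | positive test).
posterior = likelihood * prior / evidence
print(f"P(disease | positive test) = {posterior:.3f}")  # about 0.088
```

Even with a fairly accurate test, the posterior stays below 9% because the prior is low, which is exactly the kind of reasoning a Bayesian model makes explicit.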

Here is how Bayes' theorem can contribute to interpretability:

1. Probabilistic Explanations

  • Uncertainty Quantification: Bayesian models naturally quantify uncertainty in predictions. This can be crucial in high-stakes decision-making, as it helps to understand the reliability of a model's output.
  • Feature Importance: By examining the impact of different features on the posterior distribution, you can identify which features are most influential in driving predictions.

2. Model Inspection and Debugging

  • Sensitivity Analysis: Bayesian methods can be used to assess how sensitive model predictions are to changes in input features or prior assumptions. This can help identify potential biases or vulnerabilities in the model.
  • Model Comparison: Bayesian model comparison techniques allow us to compare different model structures and select the most appropriate one based on the evidence.

3. Human-Centric ML

  • Explainable ML: Bayesian models can provide human-understandable explanations for predictions, making them more trustworthy and transparent.
  • Interactive Learning: By incorporating human feedback into the Bayesian learning process, you can create models that are more aligned with human values and preferences.

Specific Techniques

  • Bayesian Neural Networks: These models incorporate uncertainty into deep learning, allowing for more robust and interpretable predictions.
  • Bayesian Optimization: This technique can be used to optimize hyperparameters in a principled and efficient way, providing insights into the optimization process.
  • Probabilistic Graphical Models: These models represent complex relationships between variables using a graphical structure, making it easier to understand the underlying mechanisms of a system.

Challenges and Future Directions

While Bayesian methods offer significant advantages for interpretability, they can be computationally expensive and require careful modeling choices. However, with the increasing availability of computational resources and advances in algorithms, Bayesian methods are becoming more practical for a wider range of machine learning applications.

By leveraging the power of Bayes' theorem, you can develop more transparent, reliable, and trustworthy ML systems that are better aligned with human values and needs.


Data Science

Understanding your ML model should start with the data collection, transformation, and processing because, otherwise, you will get “Garbage In, Garbage Out” (GIGO).

Data Quality Dashboard. Chart: iMerit [18]

There is a heavy cost of poor data quality to the success of your ML model. You will need a systematic method to improve your data quality. Most of the work is in your data preparation, and consistency is key to data quality. [Data Science Approaches to Data Quality: From Raw Data to Datasets]

"Collecting better data, building data pipelines, and cleaning data can be tedious, but it is very much needed to be able to make the most out of data." The Data Science Hierarchy of Needs, by Sarah Catanzaro, is a checklist for "avoiding unnecessary modeling or improving modeling efforts with feature engineering or selection." [Serg Masis]

The Data Science Hierarchy of Needs. Diagram: Serg Masis

Establishing learning goals and objectives is significant. Organizing objectives helps to clarify them.

"Bloom's taxonomy is a set of three hierarchical models used for the classification of educational learning objectives into levels of complexity and specificity. The three lists cover the learning objectives in the cognitive, affective, and psychomotor domains.

Bloom's Revised Taxonomy. Diagram: Vanderbilt University Center for Teaching

There are six levels of cognitive learning according to the revised version of Bloom's Taxonomy. Each level is conceptually different. The six levels are remembering, understanding, applying, analyzing, evaluating, and creating. The new terms are defined as:

  • Remembering: Retrieving, recognizing, and recalling relevant knowledge from long-term memory.
  • Understanding: Constructing meaning from oral, written, and graphic messages through interpreting, exemplifying, classifying, summarizing, inferring, comparing, and explaining.
  • Applying: Carrying out or using a procedure through executing, or implementing.
  • Analyzing: Breaking material into constituent parts, determining how the parts relate to one another and to an overall structure or purpose through differentiating, organizing, and attributing.
  • Evaluating: Making judgments based on criteria and standards through checking and critiquing.
  • Creating: Combining elements to form a coherent or functional whole; reorganizing elements into a new pattern or structure through generating, planning, or producing." [Anderson & Krathwohl, 2001, pp. 67-68]

This Bloom's taxonomy was adapted for machine learning.

Bloom’s Taxonomy Adapted for Machine Learning (ML). Diagram: Visual Science Informatics, LLC

There are six levels of model learning in the adapted version of Bloom's Taxonomy for ML. Each level is a conceptually different learning model. The levels are ordered from lower-order learning to higher-order learning. The six levels are Store, Sort, Search, Descriptive, Discriminative, and Generative. Bloom’s Taxonomy adapted for Machine Learning (ML) terms are defined as:

  • Store models capture three perspectives: Physical, Logical, and Conceptual data models. Physical data models describe the physical means by which data are stored. Logical data models describe the semantics represented by a particular data manipulation technology. Conceptual data models describe a domain's semantics in the model's scope. Extract, Transform, and Load (ETL) operations are a three-phase process where data is extracted, transformed, and loaded into store models. Collected data can be from one or more sources. ETL data can be stored in one or more models.
  • Sort models arrange data in a meaningful order and systematic representation, which enables searching, analyzing, and visualizing.
  • Search models solve a search problem to retrieve information stored within some data structure, or calculated in the search space of a problem domain, either with discrete or continuous values.
  • Descriptive models specify statistics that quantitatively describe or summarize features and identify trends and relationships.
  • Discriminative models focus on a solution and perform better for classification tasks by dividing the data space into classes by learning the boundaries.
  • Generative models understand how data is embedded throughout space and generate new data points.

Conditional Generative Adversarial Network Model Architecture Example. Diagram: Jason Brownlee

Another decision point in choosing your ML model, which also impacts your model's interpretability and explainability, is the difference between a discriminative and a generative model. A discriminative approach focuses on a solution and performs better for classification tasks by dividing the data space into classes by learning the boundaries. A generative approach models how data is embedded throughout the space and can generate new data points.

Discriminative vs. Generative. Table: Supervised Learning Cheatsheet
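To make the contrast concrete, here is a minimal scikit-learn sketch: logistic regression (discriminative) learns the class boundary directly, while Gaussian naive Bayes (generative) models how each class generates its features. The dataset and comparison are illustrative assumptions, not results from the article.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression  # discriminative: models P(y | x) directly
from sklearn.naive_bayes import GaussianNB           # generative: models P(x | y) and P(y)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

disc = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
gen = GaussianNB().fit(X_train, y_train)

print("Discriminative (logistic regression):", disc.score(X_test, y_test))
print("Generative (Gaussian naive Bayes):   ", gen.score(X_test, y_test))
```

Beyond accuracy, the generative model can also be sampled to produce synthetic feature vectors, which the discriminative model cannot do.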


Types of Variables in Data Science

Variables are the characteristics or attributes that describe a dataset. They can be classified into different types based on their nature and the type of data they represent.

Types of Variables in a Dataset. Diagrams: Avi Chawla

Key Variable Types in a Causal Dataset. Diagram: Causal Wizard

These types of variables can be grouped into:

  • Causal Inference: Variables that reveal information about another variable (to determine causality).
  • Core: Variables being measured or tested in an experiment.
  • Related: Variables that modify the relationship between two other variables.
  • Time-Related: Variables measured at a previous point in time (time series analysis).
  • Unobservable: Variables inferred indirectly through a mathematical model from other variables that can be directly observed or measured.

Types of Variables in Data Science. Table: Gemini

Knowing and understanding the types of your dataset variables is important for:

  • Choosing appropriate ML computational methods
  • Employing feasible Interpretability or Explainability methods
  • Considering proxy labels—an imperfect approximation of a direct label
  • Selecting suitable visualization techniques
  • Making informed decisions based on the data


Visualization for Data Quality

You should check and analyze your data even before you train a model because you might discover data quality issues in your data. Identifying common data quality issues such as missing data, duplicated data, and inaccurate, ambiguous, or inconsistent data can help you find data anomalies and perform feature engineering. [Data Science Approaches to Data Quality: From Raw Data to Datasets]

TensorFlow Data Validation & Visualization Tools. Table: TensorFlow.org [19]

TensorFlow Data Validation provides tools for visualizing the distribution of feature values. By examining these distributions, you can identify data distribution, scale, or label anomalies.
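A minimal sketch of this workflow, assuming the tensorflow_data_validation package; the file names train.csv and serving.csv are hypothetical placeholders for your own data.

```python
import tensorflow_data_validation as tfdv

# Compute descriptive statistics over the (hypothetical) training data.
train_stats = tfdv.generate_statistics_from_csv(data_location="train.csv")
tfdv.visualize_statistics(train_stats)  # interactive Facets view in a notebook

# Infer a schema and check new data against it for anomalies
# (missing values, unexpected categories, drifted distributions, ...).
schema = tfdv.infer_schema(statistics=train_stats)
serving_stats = tfdv.generate_statistics_from_csv(data_location="serving.csv")
anomalies = tfdv.validate_statistics(statistics=serving_stats, schema=schema)
tfdv.display_anomalies(anomalies)
```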

Another way you can visualize data on different axes is by using Facets for data analytics. Facets creates a graphical user interface where you can select the columns and axes you want to examine and analyze the associations between different features.

Facets for Data Analytics & Visualization. Table: Himanshu Sharma [20]

Feature engineering (also called feature extraction or feature discovery) is the use of domain knowledge to extract or create features (characteristics, properties, and attributes) from raw data. If performed effectively, feature engineering improves the machine learning process. Consequently, it increases predictive power and improves ML model accuracy, performance, and quality of results by creating extra features that effectively represent the underlying model. The feature engineering step is a fundamental part of the data pipeline, which leverages data preparation, in the machine learning workflow.
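Here is a minimal, hypothetical feature-engineering sketch with pandas; the column names and derived features are illustrative assumptions, not taken from the article.

```python
import pandas as pd

# Hypothetical raw transaction data.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-05 09:10", "2024-01-06 18:45"]),
    "amount": [120.0, 40.0],
    "income": [4000.0, 2500.0],
})

# Domain-driven features: spending relative to income and time-of-day signals.
df["amount_to_income"] = df["amount"] / df["income"]
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5

print(df)
```

Each derived column encodes domain knowledge the raw columns only imply, which is what gives feature engineering its lift in accuracy.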

For example, in a study of "Modeling multiple sclerosis using mobile and wearable sensor data," the researchers' "data analysis objectives were to identify the most reliable, clinically useful, and available features derived from mobile and wearable sensor data. Their machine learning pipeline identifies the best-performing features..." [Shkurta Gashi et al.]

Study Design and Data Modeling Setup. Table adapted: Shkurta Gashi et al.

For accurate predictions, your data must not only be properly processed, but you must also process the "right data." To improve your data quality with unsupervised ML, you can employ the Uniform Manifold Approximation and Projection (UMAP) algorithm. "The UMAP algorithm allows the construction of a high-dimensional graph representation of data and further optimization of a low-dimensional graph to be as structurally similar as possible." [21]
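A minimal sketch, assuming the umap-learn package (the Python counterpart of the umap-js animation below); the dataset and parameters are placeholders.

```python
import umap
from sklearn.datasets import load_digits

# Placeholder high-dimensional data (8x8 digit images -> 64 features).
X, y = load_digits(return_X_y=True)

# Build the high-dimensional neighbor graph and optimize a 2-D embedding.
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=42)
embedding = reducer.fit_transform(X)  # shape: (n_samples, 2)

print(embedding.shape)  # plot the embedding to spot clusters, outliers, and mislabeled points
```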

UMAP Projection to Various Datasets. Animation: Powered by umap-js

Dimensionality Reduction. Map: Andy Coenen and Adam Pearce [22]

UMAP Projection of Various Datasets with a Variety of Common Values. Table: Andy Coenen & Adam Pearce


Animation

You can continue to obtain insight into your ML model decision reasoning during model training.

Gradient Descent of a Linear Fit in Three Dimensions. Animation: Tobias Roeschl [23]
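Here is a minimal gradient-descent sketch for a linear fit, echoing the animation above; the synthetic data, learning rate, and step count are illustrative assumptions.

```python
import numpy as np

# Synthetic data around the line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 1, 100)

w, b = 0.0, 0.0   # parameters of the fit y_hat = w*x + b
lr = 0.01         # learning rate

for step in range(1000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # should approach w ~ 2, b ~ 1
```

Logging w, b, and the loss at each step is exactly what the animations above visualize: the parameters descending the error surface toward its minimum.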

Logistic Regression Weights are Updated at Each Iteration. Animation: Adarsh Menon [24]

Logistic Regression Curve and Surface Plot of Costs. Animation: Tobias Roeschl [25]

Support Vector Machine (SVM) Classifier Boundary Plot. Animation: Bruno Rodrigues [26]

Ultimately, you can explain your trained decision tree model employing interactive visualizations.

Training Data Flow Through a Decision Tree. Diagram: Stephanie Yee & Tony Chu [27]


Visualization for Interpretability

After examining your trained model's overall accuracy metric, or getting finer-grained measurements such as Precision, Recall, and F1-Score, you can use diagnostic tools by visualizing plots such as Receiver Operating Characteristic (ROC) Curves and Precision-Recall Curves [28] that can help in interpretation.
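A minimal sketch of these diagnostics with scikit-learn and matplotlib; the dataset and classifier are placeholders for your own model.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

fpr, tpr, _ = roc_curve(y_test, scores)
precision, recall, _ = precision_recall_curve(y_test, scores)
print("ROC AUC:", roc_auc_score(y_test, scores))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(fpr, tpr)
ax1.set(xlabel="False positive rate", ylabel="True positive rate", title="ROC curve")
ax2.plot(recall, precision)
ax2.set(xlabel="Recall", ylabel="Precision", title="Precision-Recall curve")
plt.show()
```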

Interpretability is important if you must understand and interpret the phenomenon being modeled, debug a model, or begin to trust its decisions. TensorFlow provides a plot with which you can visually follow a decision tree structure. [29]

Visually Follow a Decision Tree Structure. Animation: TensorFlow
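The article points at TensorFlow Decision Forests for this; as a lighter-weight alternative sketch, scikit-learn can render a trained tree's structure directly. The dataset and depth below are placeholders.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# Text rendering of the learned splits: you can follow any prediction path by hand.
print(export_text(tree, feature_names=list(data.feature_names)))
```

Being able to trace a prediction from the root split to a leaf is what makes shallow trees one of the most interpretable model families.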

Random Forests are a popular type of decision forest model. Here, you can see a forest of trees classifying an example by voting on the outcome. Diagram: Mathieu Guillame-Bert et al.


Visualization for Explainability

Explainability is important if you must understand and explain your features' importance, loss function, and model parameters and hyper-parameters.

An Artificial Neuron in Action. Animation: Anddy Cabrera

“An artificial neuron simply hosts the mathematical computations. Like our neurons, it triggers when it encounters sufficient stimuli. The neuron combines input from the data with a set of coefficients, or weights, which either amplify or dampen that input, which thereby assigns significance to inputs for the task the algorithm is trying to learn." [30]

Backpropagation is a fundamental algorithm used to train artificial neural networks. It is essentially a computational method for calculating the gradient of the error function with respect to the network's weights. In simpler terms, it helps the network learn from its mistakes by adjusting its parameters to minimize the error between its predicted output and the actual output.
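Here is a minimal numeric sketch of this idea for a single sigmoid neuron, with the gradient worked out by the chain rule; the inputs, weights, target, and learning rate are made-up values for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example: two inputs, one neuron, squared-error loss.
x = np.array([0.5, -1.2])
w = np.array([0.8, 0.3])
b = 0.1
target = 1.0

# Forward pass.
z = np.dot(w, x) + b
y_hat = sigmoid(z)
loss = 0.5 * (y_hat - target) ** 2

# Backward pass (chain rule): dL/dw = (y_hat - target) * y_hat * (1 - y_hat) * x
dz = (y_hat - target) * y_hat * (1 - y_hat)
grad_w = dz * x
grad_b = dz

# One gradient-descent step on the parameters.
lr = 0.5
w -= lr * grad_w
b -= lr * grad_b
print(f"loss={loss:.4f}, grad_w={grad_w}, updated w={w}")
```

A full network repeats exactly this local computation layer by layer, propagating the error signal backward from the output to every weight.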

Deep Neural Networks (DNNs) are trained using large sets of labeled or unlabeled data and increasingly learn abstract features directly from the data without manual feature extraction. Traditional neural networks may contain around 2-3 hidden layers, while deep networks can have as many as 100-200 hidden layers.

Deep Learning Mimics Brain Learning Through Examples and Layers of Networks. Animation: Adatis

Although DNNs have high predictive power, they have low interpretability because deep networks are, by nature, black boxes whose inner workings are not fully explainable.

Neural Networks' Architectures: ANN, RNN, LSTM & CNN. Diagrams: A. Catherine Cabrera, and B. InterviewBit

Different neural networks have distinct architectures tailored to their functions and strengths. Here are descriptions of the major neural network architectures; a minimal Keras sketch follows the summary table below:

  • Artificial Neural Network (ANN): ANN is the foundation for other NN's architectures. ANNs are loosely inspired by the structure and function of the human brain. They consist of interconnected nodes called neurons, arranged in layers. Data is fed into the input layer, processed through hidden layers, and an output is generated. ANNs are powerful for various tasks such as function approximation, classification, and regression.
  • Recurrent Neural Network (RNN): RNNs are a special kind of ANN designed to handle sequential data such as text or speech. Unlike ANNs where data flows forward, RNNs have connections that loop back, allowing information to persist across steps. This is helpful for tasks such as language translation, speech recognition, and time series forecasting. However, RNNs can struggle with long-term dependencies in data.
  • Long Short-Term Memory (LSTM): LSTMs are a type of RNN specifically designed to address the long-term dependency problems of RNNs. LSTMs have internal mechanisms that can learn to remember information for longer periods, making them very effective for tasks such as machine translation, caption generation, and handwriting recognition.
  • Convolutional Neural Network (CNN): CNNs are another specialized type of ANN excelling at image and video analysis. CNNs use a specific architecture with convolutional layers that can automatically extract features from the data. This makes them very powerful for tasks such as image recognition, object detection, and image segmentation.

Here is a table summarizing the key differences:

Key Differences of Neural Networks' Architectures. Table: Gemini
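As referenced above, here is a minimal Keras sketch contrasting two of these architectures: a small CNN for images and an LSTM for sequences. The input shapes and layer sizes are illustrative assumptions, not a recommended architecture.

```python
import tensorflow as tf

# A small CNN for 28x28 grayscale images (e.g., digit classification).
cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(16, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# A small LSTM for sequences of 50 time steps with 8 features each.
lstm = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(50, 8)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

cnn.summary()   # convolutional feature extraction, then classification
lstm.summary()  # recurrent memory over the sequence, then a binary output
```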

ML Algorithm Cheat Sheet. Diagram: Microsoft


Interactive Visualization Tools

“ML is great until you have to explain it.” modelStudio is an R package that makes it easy to interactively explain ML models employing four techniques and explainable plots: 1) Feature Importance, 2) Break Down Plot, 3) Shapley Values, and 4) Partial Dependence.

Explain ML Models with R modelStudio. [31] Charts: Matt Dancho [32]

In essence: “Seeing Machines Learn”

Heat Maps of Neural Network Layers. Animation: TensorFlow Playground, Daniel Smilkov & Shan Carter [33]

"Fernanda Viégas and Martin Wattenberg started the OpenVis conference by opening up the black box of neural networks. A guided tour of the playground.tensorflow.org [34] revealed the beautiful process of neurons digesting wisely chosen features and learning from them in front of our eyes." [35]

Seeing Machines Think. Video: Fernanda Viégas and Martin Wattenberg [40]

Next, read the "Operations: ML Operations (MLOps), Continuous ML & AutoML " article at https://www.dhirubhai.net/pulse/ml-operations-mlops-continuous-automl-yair-rajwan-ms-dsc .

---------------------------------------------------------

[1] https://www.dhirubhai.net/pulse/machine-learning-101-which-ml-choose-yair-rajwan-ms-dsc

[2] https://www.dhirubhai.net/pulse/accuracy-bias-variance-tradeoff-yair-rajwan-ms-dsc

[3] https://www.dhirubhai.net/pulse/complexity-time-space-sample-yair-rajwan-ms-dsc

[4] https://getdigitaltech.com/whats-explainable-ai

[5] https://www.bmc.com/blogs/machine-learning-interpretability-vs-explainability

[6] https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

[7] https://towardsdatascience.com/how-to-measure-interpretability-d93237b23cd3

[8] https://www.bmc.com/blogs/machine-learning-interpretability-vs-explainability

[9] https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

[10] https://towardsdatascience.com/interperable-vs-explainable-machine-learning-1fa525e12f48

[11] https://blogs.commons.georgetown.edu/cctp-607-spring2019/category/final-project

[12] https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

[13] https://towardsdatascience.com/explainable-machine-learning-9d1ca0547ae0

[14] https://wires.onlinelibrary.wiley.com/doi/abs/10.1002/widm.1379

[15] https://towardsdatascience.com/lime-explain-machine-learning-predictions-af8f18189bfe

[16] https://dl.acm.org/doi/abs/10.1145/2939672.2939778

[17] https://www.dhirubhai.net/in/giorgio-visani

[18] https://imerit.net/blog/quality-labeled-data-all-pbm

[19] https://www.tensorflow.org/tfx/guide/tfdv#using_visualizations_to_check_your_data

[20] https://towardsdatascience.com/machine-learning-data-visualization-4c386fe3d971

[21] https://mobidev.biz/blog/unsupervised-machine-learning-improve-data-quality

[22] https://pair-code.github.io/understanding-umap

[23] https://towardsdatascience.com/gradient-descent-animation-1-simple-linear-regression-e49315b24672

[24] https://towardsdatascience.com/logistic-regression-explained-and-implemented-in-python-880955306060

[25] https://towardsdatascience.com/animations-of-logistic-regression-with-python-31f8c9cb420

[26] https://towardsdatascience.com/the-simplest-way-of-making-gifs-and-math-videos-with-python-aec41da74c6e

[27] https://www.r2d3.us/visual-intro-to-machine-learning-part-1

[28] https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python

[29] https://blog.tensorflow.org/2021/05/introducing-tensorflow-decision-forests.html

[30] https://www.mql5.com/en/articles/5486

[31] https://cran.r-project.org/web/packages/modelStudio/index.html

[32] https://www.business-science.io/r/2022/02/22/my-4-most-important-explainable-ai-visualizations-modelstudio.html

[33] https://playground.tensorflow.org

[34] https://medium.com/@UdacityINDIA/everything-cool-from-tensorflow-developer-summit-2018-7e22da4913de

[35] https://blog.interactivethings.com/notes-from-openvis-conference-2016-577c80cd7a01

[36] https://www.dhirubhai.net/pulse/ml-operations-mlops-continuous-automl-yair-rajwan-ms-dsc
