Probabilistic graphical models for Deep Learning Part-1 (Restricted Boltzmann Machines)

RBM: Restricted Boltzmann machines (RBMs) are undirected graphical models that can also be interpreted as two-layered stochastic neural networks. They are very useful for tasks such as (1) unsupervised learning and (2) feature extraction. RBMs have received a lot of attention after being proposed as building blocks of multi-layer learning architectures called deep belief networks (DBNs) [7, 8].

Relation with deterministic feed-forward neural networks: “The idea is that the hidden neurons extract relevant features from the observations. These features can serve as input to another RBM. By stacking RBMs in this way, one can learn features from features in the hope of arriving at a high-level representation” [10]. This is an important property: because of it, single as well as stacked RBMs can be reinterpreted as deterministic feed-forward neural networks.

A summary of some other basic characteristics of RBMs:

1.    Architectural features: An RBM contains one layer of hidden units and one layer of visible units. There are no connections between hidden units or between visible units (this is the restriction applied to the Boltzmann machine). The edges can be viewed as undirected or bi-directed, and the model forms a symmetric bipartite graph.

2.    RBMs as generative models: When used as a generative model, an RBM is used to draw samples from the distribution it has learned.

3.    Training of RBMs: The contrastive divergence (CD) algorithm is used to train an RBM [9]. The algorithm performs Gibbs sampling and is used inside a gradient descent procedure (similar to the way backpropagation is used inside such a procedure when training feed-forward neural nets) to compute the weight updates.
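To make this concrete, here is a minimal NumPy sketch of a single CD-1 parameter update for a binary-binary RBM. The parameter names (W, b, c) and the learning rate are illustrative assumptions, not taken from any particular library; this is a sketch rather than a production implementation.

```python
# Minimal CD-1 sketch for a binary-binary RBM (NumPy only); names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.01):
    """One contrastive-divergence (CD-1) step on a batch of binary visible vectors v0.

    W: (n_visible, n_hidden) weights, b: visible biases, c: hidden biases.
    """
    # Positive phase: hidden probabilities and samples given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # One Gibbs step: reconstruct the visibles, then recompute hidden probabilities.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)

    # Approximate gradient: <v h>_data - <v h>_model, averaged over the batch.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

In practice this update is applied repeatedly over mini-batches, exactly like a weight update inside an ordinary gradient-descent training loop.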

Basic facts behind the usefulness of RBMs in feature extraction, classification, etc.: An RBM comprises two types of variables: (1) a layer of visible variables, which correspond to the components of the input (the ‘visible layer’), and (2) a layer of hidden (or latent) variables, which capture dependencies between the visible neurons (the ‘hidden layer’). After training, the expected states of the hidden variables given an input can be interpreted as the (learned) features extracted from that input pattern. The dimensionality of the learned features equals the number of hidden units. Most RBM applications use these extracted features and the relation between the variables of the two layers. For example, let us consider a few applications given below. [See Refs 1-10]
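As a concrete illustration of this feature-extraction view, the sketch below computes the expected hidden states p(h_j = 1 | v) of an already trained RBM and treats them as features. The parameters W and c are assumed to come from a previously trained model (for example, the CD-1 sketch above); the names are illustrative.

```python
# Sketch: hidden-unit probabilities of a trained RBM used as extracted features.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def extract_features(V, W, c):
    """Expected hidden states p(h_j = 1 | v) for each row of V.

    V: (n_samples, n_visible) inputs; the output has shape (n_samples, n_hidden),
    so the dimensionality of the features equals the number of hidden units.
    """
    return sigmoid(V @ W + c)

# The resulting features can feed a downstream classifier or another stacked RBM.
```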

Classification: RBMs can be useful for classifying images, XML data, text, etc. If labeled training data is given and the RBM is trained on the joint distribution of inputs and labels, then there are two possibilities: (1) we can sample the missing label for a presented data point from the model distribution, or (2) we can assign a new data point to the class with the highest probability under the model [4].
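A hedged sketch of possibility (2): assuming the RBM was trained on the concatenation of an input vector and a one-hot label vector, a new input can be assigned to the class whose clamped label yields the lowest free energy, i.e. the highest unnormalised probability under the model. The function and parameter names below are illustrative, not from a specific implementation.

```python
# Sketch: classifying with an RBM trained on the joint vector [x, one_hot(y)].
import numpy as np

def free_energy(v, W, b, c):
    """F(v) = -b.v - sum_j log(1 + exp(c_j + v.W_j)) for a binary RBM."""
    return -(v @ b) - np.log1p(np.exp(v @ W + c)).sum(axis=-1)

def predict(x, W, b, c, n_classes):
    """Clamp each possible one-hot label next to the input and keep the label
    whose joint configuration has the lowest free energy."""
    energies = []
    for k in range(n_classes):
        y = np.zeros(n_classes)
        y[k] = 1.0
        v = np.concatenate([x, y])
        energies.append(free_energy(v, W, b, c))
    return int(np.argmin(energies))
```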

Imbalanced data problem: In an imbalanced data problem, one class dominates another. To address this, artificial examples are generated for the minority class using the Synthetic Minority Oversampling Technique (SMOTE). Gibbs sampling is then applied to each newly created example, and the result is labeled as the minority class and added to the training data [5, 6].
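A rough sketch of this idea, in the spirit of RBM-SMOTE [5]: a SMOTE-style interpolated minority example is refined with a few Gibbs steps through an RBM trained on the minority class, so that the synthetic point is pulled towards the learned distribution. The parameters, the number of Gibbs steps, and the function names are assumptions for illustration only.

```python
# Sketch: SMOTE-style interpolation followed by Gibbs refinement in a trained RBM.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_refine(v, W, b, c, k_steps=3):
    """Run k Gibbs steps starting from v; return the final visible probabilities."""
    for _ in range(k_steps):
        ph = sigmoid(v @ W + c)
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(h @ W.T + b)
        v = (rng.random(pv.shape) < pv).astype(float)
    return pv

def synthetic_minority_example(x, x_neighbour, W, b, c):
    """Interpolate between a minority example and one of its neighbours (SMOTE),
    then refine the result with Gibbs sampling; the returned example would be
    labelled as the minority class and added to the training data."""
    alpha = rng.random()
    synthetic = x + alpha * (x_neighbour - x)
    return gibbs_refine(synthetic, W, b, c)
```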

Noisy labels problem: This is one of the most common problems in classification: some of the examples in the training data carry incorrectly assigned labels. To correct those labels, an RBM is trained for each class separately. Each trained model is then used as an oracle to detect incorrectly labelled data, with the reconstruction error used to flag potentially mislabelled examples [5, 6].
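A hedged sketch of that label-cleaning step: one RBM per class acts as an oracle, and a training example whose reconstruction error under its own class's RBM is unusually high is flagged as potentially mislabelled. The per-class parameters and the threshold below are illustrative assumptions.

```python
# Sketch: flagging suspicious labels via per-class RBM reconstruction error.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(v, W, b, c):
    """Mean-field reconstruction error ||v - v_reconstructed||^2 for one example."""
    ph = sigmoid(v @ W + c)      # expected hidden states
    pv = sigmoid(ph @ W.T + b)   # expected reconstruction of the visibles
    return float(np.sum((v - pv) ** 2))

def flag_suspicious(v, label, rbms_per_class, threshold=1.0):
    """rbms_per_class[label] holds the (W, b, c) of the RBM trained on that class;
    the threshold is an arbitrary illustrative value."""
    W, b, c = rbms_per_class[label]
    return reconstruction_error(v, W, b, c) > threshold
```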

Unstructured data: Here the data is available only in unprocessed form: images, videos, documents, XML structures, etc. In such cases, an RBM is used as a domain-independent feature extractor that transforms the raw data into hidden-unit activations.

RBMs vs. Autoencoders

Which is better, an RBM or an autoencoder, and why? See Yoshua Bengio’s answer to roughly the same question below.

Does Yoshua Bengio prefer to use Restricted Boltzmann Machines or (denoising) Autoencoders as building blocks for deep networks? And why?

References:

  1. David H. Ackley, Geoffrey E. Hinton, and Terrence J. Sejnowski. A learning algorithm for Boltzmann machines. Cognitive Science, 9(1):147–169, 1985.
  2. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.
  3. G. Hinton and S. Osindero. A fast learning algorithm for deep belief nets. Neural Computation, 2006.
  4. Geoffrey Hinton (2010). A Practical Guide to Training Restricted Boltzmann Machines. UTML TR 2010–003, University of Toronto.
  5. Maciej Zieba, Jakub M. Tomczak, and Adam Gonczarek. RBM-SMOTE: Restricted Boltzmann Machines for Synthetic Minority Oversampling Technique. ACIIDS (1) 2015: 377–386.
  6. Jakub M. Tomczak and Maciej Zieba. Classification Restricted Boltzmann Machine for comprehensible credit scoring model. Expert Systems with Applications, 42(4): 1789–1796 (2015).
  7. Hinton, G. E. Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10): 428–434 (2007).
  8. Hinton, G. E., and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science, 313(5786): 504–507 (2006).
  9. Hinton, G. E. (2002). Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 14(8): 1771–1800.
  10. Fischer, Asja, and Christian Igel. An introduction to restricted Boltzmann machines. Iberoamerican Congress on Pattern Recognition. Springer, Berlin, Heidelberg, 2012.