BxD Primer Series: Markov Chain Neural Networks
Hey there!
Welcome to the BxD Primer Series, where we cover topics such as machine learning models, neural nets, GPT, ensemble models, and hyper-automation in a ‘one-post-one-topic’ format. Today’s post is on Markov Chain Neural Networks. Let’s get started:
The What:
Markov Chain Neural Networks (MCNNs) model the probabilistic relationship between sequential inputs. Unlike traditional neural networks that use fixed-size input vectors, MCNNs can handle variable-length input sequences by modeling the sequence as a Markov chain.
At a high level, an MCNN consists of two main components: a neural network and a Markov chain.
The Markov chain defines a set of states, where each state represents a particular input or feature in the sequence. It also defines a set of transition probabilities that determine the likelihood of moving from one state to another. These transition probabilities are learned from training data using maximum likelihood estimation or Bayesian inference. We have covered the core concepts of Markov chains in a previous edition; check here.
The Markov chain represents the probabilistic relationship between successive inputs in a sequence: the probability of transitioning from one state to another depends only on the current state, not on any previous states.
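For intuition, here is a minimal Python sketch of the maximum-likelihood estimate: count observed transitions between states and normalize each row into probabilities. The states and training sequences below are made up for illustration.

```python
import numpy as np

# Hypothetical states and training sequences (illustration only)
states = ["sunny", "rainy"]
state_idx = {s: i for i, s in enumerate(states)}
sequences = [
    ["sunny", "sunny", "rainy", "rainy", "sunny"],
    ["rainy", "rainy", "sunny", "sunny", "sunny"],
]

# Maximum likelihood estimate: count transitions, then normalize each row
counts = np.zeros((len(states), len(states)))
for seq in sequences:
    for prev, curr in zip(seq[:-1], seq[1:]):
        counts[state_idx[prev], state_idx[curr]] += 1

transition_matrix = counts / counts.sum(axis=1, keepdims=True)
print(transition_matrix)  # row i = P(next state | current state i)
```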
The neural network component of an MCNN maps the current state of the Markov chain to an output prediction. This mapping is learned by adjusting the weights of the neural network to minimize the difference between predicted and actual outputs on the training data.
During prediction, the Markov chain is used to estimate the probability of transitioning to each possible next state, and the neural network predicts the output given those estimated probabilities.
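To make this interplay concrete, here is a minimal sketch of one possible arrangement, assuming the current state's row of the transition matrix is fed to a small feed-forward network. The states, layer sizes, and weights below are all illustrative, not a definitive MCNN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed two-state chain (e.g. sunny/rainy) with an illustrative transition matrix
transition_matrix = np.array([[0.75, 0.25],
                              [0.50, 0.50]])

n_states, n_hidden, n_outputs = 2, 8, 2   # assumed sizes for illustration

# Randomly initialized weights; in practice they are trained to minimize the
# difference between predicted and actual outputs (e.g. with gradient descent)
W1 = rng.normal(scale=0.1, size=(n_states, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

def predict(transition_probs):
    """Map the current state's transition probabilities to an output distribution."""
    h = np.tanh(transition_probs @ W1 + b1)        # hidden layer
    logits = h @ W2 + b2
    return np.exp(logits) / np.exp(logits).sum()   # softmax over outputs

current_state = 0                                  # index of the current Markov state
print(predict(transition_matrix[current_state]))
```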
Order of Markov Chain:
In a first-order Markov chain, the probability of transitioning to the next state depends only on the current state, not on any previous states. This means the current state fully captures all relevant information needed to predict the next state.
For example, say we are trying to predict the weather. In a first-order model, the probability of tomorrow being sunny or rainy depends only on whether it is sunny or rainy today; we don't need to consider any previous days' weather to make the prediction.
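A tiny sketch of this weather example, with assumed transition probabilities, might look like:

```python
import numpy as np

# Assumed first-order transition probabilities (illustrative numbers)
#              to: sunny  rainy
P = np.array([[0.8, 0.2],    # from sunny
              [0.4, 0.6]])   # from rainy
states = ["sunny", "rainy"]

today = "sunny"
rng = np.random.default_rng(1)
# Tomorrow's weather depends only on today's state
tomorrow = rng.choice(states, p=P[states.index(today)])
print(f"Today is {today}; sampled forecast for tomorrow: {tomorrow}")
```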
In a higher-order Markov chain, the probability of transitioning to the next state depends on the current state as well as the k previous states, where k is the order of the Markov chain.
For example, say we are trying to predict the likelihood of a patient developing a certain disease. In a higher-order model, the probability of developing the disease tomorrow depends on the patient's current health status, as well as their health status over the previous k days.
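A minimal sketch of estimating a higher-order chain, keying transition counts on tuples of the k most recent states (the states and data below are made up):

```python
from collections import defaultdict

k = 2  # order of the chain: the next state depends on the last k states

# Hypothetical daily health-status sequence (illustration only)
sequence = ["healthy", "healthy", "sick", "sick", "healthy", "sick", "sick", "sick"]

# Count transitions from each k-tuple of recent states to the next state
counts = defaultdict(lambda: defaultdict(int))
for i in range(len(sequence) - k):
    context = tuple(sequence[i:i + k])
    nxt = sequence[i + k]
    counts[context][nxt] += 1

# Normalize counts into conditional probabilities P(next | last k states)
transitions = {
    ctx: {s: c / sum(nxts.values()) for s, c in nxts.items()}
    for ctx, nxts in counts.items()
}
print(transitions[("sick", "sick")])  # distribution over the next state
```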
The How:
Workings of a Markov Chain Neural Network:
Step 1: Define the Markov Chain:
Step 2: Network Initialization:
Step 3: State Initialization: Let x(0) = [x_1(0), x_2(0), ..., x_N(0)] be the initial state vector, where x_i(0) represents the initial (0’th) activation of the i’th neuron.
Step 4: State Update: At each time step t, compute the new state vector x(t) using the following update rule:
x(t) = f(W·x(t−1) + b)
where W is the N×N weight matrix initialized in Step 2, b is the bias vector, and f is the neuron activation function applied element-wise. (A code sketch of the full loop follows these steps.)
Step 5: Markov Chain Transition:
Step 6: Repeat steps 4 and 5 for a predefined number of time steps or until convergence.
Step 7: Output:
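Putting the steps together, here is a rough end-to-end sketch. The transition matrix, weights, tanh activation, and number of time steps are all assumptions made for illustration, not the post's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: define the Markov chain (assumed 3 states and transition matrix)
n_states = 3
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.4, 0.5]])

# Step 2: network initialization (one neuron per state, random weights and bias)
N = n_states
W = rng.normal(scale=0.1, size=(N, N))
b = np.zeros(N)

# Step 3: state initialization
x = rng.normal(size=N)          # x(0)
state = 0                       # initial Markov chain state

# Step 6: repeat steps 4 and 5 for a fixed number of time steps
for t in range(1, 21):
    # Step 4: state update, x(t) = f(W·x(t-1) + b) with f = tanh (assumed)
    x = np.tanh(W @ x + b)
    # Step 5: Markov chain transition, sampled from the current state's row
    state = rng.choice(n_states, p=P[state])

# Step 7: output - read off the final activations and chain state
print("final activations:", x)
print("final chain state:", state)
```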
Use Cases of Markov Chain Neural Networks:
Image Recognition:
Text classification and sentiment analysis:
Anomaly detection:
The goal here is to identify data points that deviate significantly from the norm in a dataset. Anomaly detection has many applications, such as detecting fraud in financial transactions, identifying defects in manufacturing processes, and monitoring the health of complex IT systems.
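One simple way the Markov chain component can support anomaly detection is by scoring how likely an observed sequence is under the learned transition probabilities and flagging low-likelihood sequences. A hedged sketch, with made-up states, probabilities, and threshold:

```python
import numpy as np

states = {"login": 0, "browse": 1, "purchase": 2}
# Assumed transition matrix learned from normal user sessions
P = np.array([[0.1, 0.8, 0.1],
              [0.1, 0.6, 0.3],
              [0.7, 0.2, 0.1]])

def sequence_log_likelihood(seq):
    """Sum of log transition probabilities along an observed sequence."""
    idx = [states[s] for s in seq]
    return sum(np.log(P[a, b]) for a, b in zip(idx[:-1], idx[1:]))

normal = ["login", "browse", "browse", "purchase"]
odd = ["purchase", "purchase", "purchase", "login"]

threshold = -4.0  # assumed cut-off, in practice chosen on validation data
for seq in (normal, odd):
    score = sequence_log_likelihood(seq)
    print(seq, round(score, 2), "ANOMALY" if score < threshold else "ok")
```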
Recommendation systems:
Time-series forecasting:
The Why:
Reasons for Using Markov Chain Neural Networks:
The Why Not:
Reasons for Not Using Markov Chain Neural Networks:
Time for you to support:
In the next edition, we will cover Hopfield Neural Networks.
Let us know your feedback!
Until then,
Have a great time!