Recurrent Neural Networks
Mounic Madiraju
An RNN is a class of artificial neural network (ANN) whose major applications are in Natural Language Processing (NLP) and speech recognition. These networks are designed to identify sequential characteristics in data and use the detected patterns to predict what comes next.
RNNs are used in deep learning and in the design of models that attempt to simulate neural activity in the human brain. They are especially useful when context is crucial to predicting an outcome, and they differ from other artificial neural networks in that they incorporate feedback loops to process sequences of data, and those sequences inform the final output. The final output can itself be a sequence. The loops allow information to persist, an effect usually described as the network holding memory.
Working:
An RNN keeps a memory of the past, and its current decisions depend on that past information. The network takes one or more vectors as input and generates one or more output vectors. As in simple neural networks, these outputs are affected by the weights applied to the inputs, but they are also influenced by a state vector that represents the context accumulated from previous inputs or outputs. So the output for a given input varies depending on the earlier inputs in the sequence.
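As a rough illustration (the names and sizes below are illustrative assumptions, not something from the original text), a single recurrence step can be sketched in NumPy: the new hidden state mixes the current input with the previous state, which is how earlier inputs influence later outputs.

```python
# A minimal sketch of one RNN step; shapes and initialization are assumptions.
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrence step: h_t depends on the current input and the past state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 8, 5
W_xh = rng.normal(size=(input_size, hidden_size)) * 0.1
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1
b_h = np.zeros(hidden_size)

h = np.zeros(hidden_size)                    # the "memory" starts empty
for t in range(seq_len):
    x_t = rng.normal(size=input_size)
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)    # h carries context forward to the next step
print(h.shape)                               # (8,)
```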
Parameter Sharing in RNN:
Parameter sharing means that the same weights are reused by many units. In a convolutional layer, for example, a single filter is applied across the whole previous layer, so every unit in the resulting feature map uses the same weights; this reuse is why the technique is called "parameter sharing".
In RNNs, the same weights are applied to each item of the input sequence, again and again, so the parameters are shared across time steps. If these parameters were not shared, the RNN would behave like a vanilla feed-forward network in which each input position requires its own weights. That would constrain the input length to be fixed, making it impossible to handle sequences whose length is unknown or varies.
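A small sketch of what sharing means in practice (again with made-up sizes): one set of weight matrices is reused at every time step, so the same network can process sequences of any length.

```python
# Parameter sharing sketch: the same W_xh, W_hh, b_h are applied at every position,
# so sequences of different lengths need no additional parameters.
import numpy as np

rng = np.random.default_rng(1)
input_size, hidden_size = 4, 8
W_xh = rng.normal(size=(input_size, hidden_size)) * 0.1
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1
b_h = np.zeros(hidden_size)

def run_rnn(sequence):
    h = np.zeros(hidden_size)
    for x_t in sequence:                          # one shared set of parameters,
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)  # applied at every time step
    return h

short_seq = rng.normal(size=(3, input_size))      # length 3
long_seq = rng.normal(size=(10, input_size))      # length 10
print(run_rnn(short_seq).shape, run_rnn(long_seq).shape)  # (8,) (8,)
```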
Elman networks and Jordan networks
These are also called "simple RNNs". An Elman network is a three-layer network augmented with a set of context units; the middle (hidden) layer is connected to these units with a fixed weight of one. A Jordan network, on the other hand, feeds its context units from the output layer rather than the hidden layer. The context units in these networks are also known as the state layer, and they connect to themselves recurrently.
In the case of Elman networks, the hidden state and output are computed as
h_t = σ_h(W_h x_t + U_h h_(t-1) + b_h) and y_t = σ_y(W_y h_t + b_y).
In the case of Jordan networks, the feedback comes from the previous output rather than the previous hidden state:
h_t = σ_h(W_h x_t + U_h y_(t-1) + b_h) and y_t = σ_y(W_y h_t + b_y),
where x_t is the input vector, h_t the hidden-layer vector, y_t the output vector, W, U and b the weight matrices and bias vector, and σ_h and σ_y the activation functions.
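To make the contrast concrete, here is a minimal sketch (sizes chosen arbitrarily, biases omitted) in which the Elman step feeds back the hidden state while the Jordan step feeds back the previous output.

```python
# Elman vs. Jordan recurrence: only the source of the context signal differs.
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hid, n_out = 3, 6, 2
W_h = rng.normal(size=(n_in, n_hid)) * 0.1
U_h = rng.normal(size=(n_hid, n_hid)) * 0.1   # Elman: hidden state -> hidden state
U_y = rng.normal(size=(n_out, n_hid)) * 0.1   # Jordan: previous output -> hidden state
W_y = rng.normal(size=(n_hid, n_out)) * 0.1

def elman_step(x_t, h_prev):
    h_t = np.tanh(x_t @ W_h + h_prev @ U_h)
    return h_t, h_t @ W_y                      # next context is the hidden state h_t

def jordan_step(x_t, y_prev):
    h_t = np.tanh(x_t @ W_h + y_prev @ U_y)
    return h_t, h_t @ W_y                      # next context is the output y_t

h, y = np.zeros(n_hid), np.zeros(n_out)
for x_t in rng.normal(size=(4, n_in)):
    h, _ = elman_step(x_t, h)                  # Elman carries h forward
    _, y = jordan_step(x_t, y)                 # Jordan carries y forward
```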
Hopfield Network
In this type of RNN, the connections are symmetric. Stationary inputs are required because this network does not process sequences of data. Convergence to a stable state is guaranteed.
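A toy sketch of the idea (the stored patterns here are arbitrary): symmetric Hebbian weights, bipolar states, and asynchronous updates that settle into a fixed point, typically one of the stored patterns.

```python
# Toy Hopfield network: store two patterns, then recover one from a noisy state.
import numpy as np

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
n = patterns.shape[1]
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)                          # symmetric weights, no self-connections

state = np.array([1, -1, 1, -1, -1, -1])        # noisy version of the first pattern
for _ in range(10):                             # a few asynchronous sweeps
    prev = state.copy()
    for i in range(n):                          # update one neuron at a time
        state[i] = 1 if W[i] @ state >= 0 else -1
    if np.array_equal(prev, state):             # settled into a fixed point
        break
print(state)
```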
IndRNN (Independently RNN)
These networks address the vanishing and exploding gradient problems of fully connected RNNs. Each neuron is independent of the others and receives only its own past state as context information. The backpropagated gradient can be regularized to avoid vanishing and exploding, so both short- and long-term memories can be kept. An IndRNN can be trained robustly with non-saturated nonlinear activation functions such as ReLU, and deep networks can be trained using skip connections.
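A minimal sketch of the IndRNN recurrence (sizes and initialization chosen arbitrarily): each hidden unit receives only its own previous value through an element-wise recurrent weight, combined with a ReLU activation.

```python
# IndRNN-style recurrence: u * h is element-wise, not a matrix product,
# so each hidden unit only sees its own past value.
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid, seq_len = 4, 8, 6
W = rng.normal(size=(n_in, n_hid)) * 0.1
u = rng.uniform(0.0, 1.0, size=n_hid)   # per-unit recurrent weight (can be constrained
b = np.zeros(n_hid)                      # to help control vanishing/exploding gradients)

h = np.zeros(n_hid)
for x_t in rng.normal(size=(seq_len, n_in)):
    h = np.maximum(0.0, x_t @ W + u * h + b)   # ReLU activation
print(h.shape)                                 # (8,)
```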
Continuous Time RNNs
A continuous-time recurrent neural network (CTRNN) uses a system of ordinary differential equations to model the effect of an incoming spike train on a neuron.
For a neuron i with activation y_i, the rate of change of activation is given by
τ_i · dy_i/dt = -y_i + Σ_j w_ji · σ(y_j − Θ_j) + I_i(t),
where τ_i is the neuron's time constant, w_ji is the weight of the connection from neuron j to neuron i, σ is a sigmoid nonlinearity, Θ_j is the bias of neuron j, and I_i(t) is the external input.
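As a rough illustration (the parameters below are chosen arbitrarily), the equation above can be simulated with simple Euler integration:

```python
# Euler integration of the CTRNN equation for a small network of 3 neurons.
import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
n, dt, steps = 3, 0.01, 1000
tau = np.array([1.0, 0.5, 2.0])        # time constants tau_i
W = rng.normal(size=(n, n))            # W[i, j] holds w_ji, the weight from neuron j to i
theta = np.zeros(n)                    # biases Theta_j
I = np.array([0.5, 0.0, 0.0])          # constant external input I_i

y = np.zeros(n)
for _ in range(steps):
    dydt = (-y + W @ sigma(y - theta) + I) / tau   # the differential equation above
    y = y + dt * dydt                              # Euler step
print(y)
```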
Libraries
There are a number of libraries available for building RNNs, the most common of which are:
- Apache Singa
- Caffe: Developed in C++, it supports both GPU and CPU and has wrappers for MATLAB and Python.
- Deeplearning4j: A library built for deep learning in Java and Scala. It runs on both CPU and GPU, allows the creation of custom layers, and integrates with Hadoop and Kafka.
- Microsoft Cognitive Toolkit (CNTK)
- PyTorch: Developed for Python, it offers strong GPU acceleration and supports tensors and dynamic neural networks (see the small sketch below).
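As an example of how little code a basic RNN takes with one of these libraries, here is a small PyTorch usage sketch (the sizes are arbitrary):

```python
# Running a batch of sequences through PyTorch's built-in nn.RNN layer.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=1, batch_first=True)
x = torch.randn(4, 7, 10)          # batch of 4 sequences, 7 time steps, 10 features each
output, h_n = rnn(x)               # output: hidden state at every step; h_n: final state
print(output.shape, h_n.shape)     # torch.Size([4, 7, 20]) torch.Size([1, 4, 20])
```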
Applications of RNNs
A few of the many applications of RNNs include robot control, machine translation, speech recognition and synthesis, anomaly detection in time series, music composition, action recognition, and prediction of medical care pathways.