The rationale behind the creation of Long Short-Term Memory (LSTM) networks
Long short-term memory (LSTM) networks are among the most widely used deep learning architectures for sequence learning tasks such as handwriting recognition, speech recognition, and time series prediction.
LSTM networks belong to the class of recurrent neural networks (RNNs). To understand RNNs, you first need a basic understanding of artificial neural networks (ANNs).
An artificial neural network (ANN) is a form of AI loosely inspired by the human brain and nervous system. ANNs can be trained on datasets to build predictive models, learning the intrinsic relationships in the data without those relationships being specified in advance. This makes them an efficient tool for revealing nonlinear relationships between inputs and outputs: unlike conceptual models, an ANN captures only the mathematical mapping from inputs to outputs, without requiring that mapping to be explicitly defined.
If you're new here, I suggest beginning with the previous post, which thoroughly explores some of the most popular deep learning algorithms.
The most commonly used ANN model, the feed-forward neural network, comprises three layers: an input layer, a hidden layer, and an output layer.
The ANN model can be mathematically formulated as:

y_k = f_2( Σ_{j=1}^{M} w_{jk} · f_1( Σ_{i=1}^{N} w_{ij} · x_i + b_j ) + b_k )

where x_i is the input value to node i, y_k is the output at node k, f_1 is the activation function (nonlinear) for the hidden layer and f_2 is the activation function (linear) for the output layer. N and M represent the number of neurons in the input and hidden layers, respectively. b_j and b_k are the biases of the jth neuron in the hidden layer and the kth neuron in the output layer. w_ij is the weight between input node i and hidden node j, and w_jk the weight between hidden node j and output node k.
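To make the formulation above concrete, here is a minimal NumPy sketch of a single forward pass through such a network. The layer sizes, random weights, and activation choices (tanh for the hidden layer, identity for the output layer) are illustrative assumptions, not values from any trained model.

```python
import numpy as np

# Illustrative layer sizes (assumptions chosen for the example)
N, M, K = 3, 5, 2                  # input, hidden, and output neurons

rng = np.random.default_rng(0)
x = rng.normal(size=N)             # input values x_i
W_ih = rng.normal(size=(M, N))     # weights w_ij between input and hidden nodes
b_h = rng.normal(size=M)           # hidden-layer biases b_j
W_ho = rng.normal(size=(K, M))     # weights w_jk between hidden and output nodes
b_o = rng.normal(size=K)           # output-layer biases b_k

f1 = np.tanh                       # nonlinear activation for the hidden layer
f2 = lambda z: z                   # linear activation for the output layer

h = f1(W_ih @ x + b_h)             # hidden-layer outputs
y = f2(W_ho @ h + b_o)             # output values y_k

print(y)
```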
Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks are both types of neural networks designed for sequential data processing, where the order of the data points matters.
The Long Short-Term Memory (LSTM) network was invented with the goal of addressing the vanishing gradients problem.
The vanishing gradients problem in Recurrent Neural Networks (RNNs) is a challenge that arises during the training process. It occurs when the gradients of the loss function with respect to the parameters diminish exponentially as they are backpropagated through time.
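As a rough numerical illustration of why this happens, the sketch below backpropagates an error signal through a toy RNN. The sequence length and the choice of rescaling the recurrent weights to a spectral norm below one are assumptions made purely to make the exponential shrinkage visible.

```python
import numpy as np

# Toy illustration: an error signal backpropagated through T time steps of a
# simple RNN h_t = tanh(W_hh @ h_{t-1}).
rng = np.random.default_rng(1)
T, hidden = 50, 4

W_hh = rng.normal(size=(hidden, hidden))
W_hh *= 0.9 / np.linalg.norm(W_hh, 2)   # rescale so the largest singular value is 0.9

# Forward pass: store the hidden states for the backward pass
h = rng.normal(size=hidden)
states = []
for _ in range(T):
    h = np.tanh(W_hh @ h)
    states.append(h)

# Backward pass: repeatedly multiply by the step-wise Jacobian
grad = np.ones(hidden)  # gradient of the loss w.r.t. the final hidden state
for t in reversed(range(T)):
    # Chain rule through tanh and the recurrent weights at step t
    grad = W_hh.T @ (grad * (1.0 - states[t] ** 2))
    if t % 10 == 0:
        print(f"after {T - t:2d} steps back: gradient norm = {np.linalg.norm(grad):.2e}")
```

Running this prints a gradient norm that shrinks by orders of magnitude as the signal travels further back in time, which is exactly the effect described above.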
The vanishing gradients problem is particularly problematic when training RNNs to capture long-term dependencies in sequential data. If the model cannot effectively update the parameters associated with distant time steps, it may struggle to remember relevant information over extended sequences. This limitation makes it difficult for traditional RNNs to excel in tasks that require the understanding of context or relationships between events occurring far apart in a sequence.
To address the vanishing gradients problem, more advanced architectures, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs), have been developed.
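For readers who want a first look before next week's deep dive, below is a minimal PyTorch sketch that runs a batch of dummy sequences through a single LSTM layer. The dimensions and sequence length are arbitrary assumptions; the gating mechanism that actually mitigates vanishing gradients will be covered in the follow-up post.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumptions chosen for the example)
batch_size, seq_len, input_size, hidden_size = 8, 20, 10, 32

# A single LSTM layer; batch_first=True makes the input shape
# (batch, sequence, features)
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, batch_first=True)

x = torch.randn(batch_size, seq_len, input_size)   # dummy sequential input
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (8, 20, 32): hidden state at every time step
print(h_n.shape)     # (1, 8, 32): final hidden state
print(c_n.shape)     # (1, 8, 32): final cell state
```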
Next week I will cover Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs).
To delve deeper into these topics, consider subscribing to this newsletter.