Why RNN?

RNN stands for RECURRENT NEURAL NETWORK.

RNN is a type of neural network that can remember things. It does this through connections between its nodes that loop back around to the same node, which lets the network keep track of what it has seen or heard in the past. That memory is helpful for tasks like machine translation or text generation. Imagine an RNN as a note being passed around a circle of friends: each friend adds something based on what is already written, and the note keeps going around. In the same way, the network remembers what came before, because each step loops back over what it already knows.
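The "looping back" idea can be sketched in a few lines. This is a minimal, untrained recurrent cell in NumPy (the names `W_x`, `W_h`, and the sizes are illustrative, not from any specific library): the same weights are reused at every time step, and the hidden state `h` carries information from earlier steps forward.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 3, 5

W_x = rng.normal(size=(hidden_size, input_size)) * 0.1   # input -> hidden
W_h = rng.normal(size=(hidden_size, hidden_size)) * 0.1  # hidden -> hidden (the "loop")
b = np.zeros(hidden_size)

inputs = rng.normal(size=(seq_len, input_size))  # one vector per time step
h = np.zeros(hidden_size)                        # memory starts empty

for x_t in inputs:
    # new memory = function of the current input AND everything seen so far
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print(h.shape)  # a fixed-size summary of the whole sequence
```

Note that the final `h` depends on every input in order; this is exactly the "note passed around the circle" from the analogy above.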

A question then arises:

Why are we not using an ANN or a CNN?

The answer lies in the type of data we are processing.

Sequential data: sequential data is data where the sequence or order of the data points matters. In simple words, the value of the current data point depends on the previous data points. Text is one example: if we change the order of the words, the contextual meaning of the sentence changes. Time series data is another, where the value at the current time depends on the values at previous times; stock prices are a practical case.

A sentence only has its proper meaning when the words are in the proper sequence, so keeping text in order matters. Sequential data and its ordering are therefore important in many applications, and modeling this kind of data is known as sequence modeling.

Then what's wrong with ANN and CNN?

Let's see the reasons why we can't use ANN and CNN for sequential modeling.

1. Fixed Input and Output Neurons

Once we fix the number of input and output neurons in an ANN or CNN, we can't change it across iterations. But in problems like machine translation, we can't know in advance how many words the translated output will contain.

Text translation

As you can see in the above image, I used Google Translate to translate text from English to Hindi. I passed in 7 English words but got back 10 Hindi words. This illustrates the point: in such scenarios the output length is never fixed, so we can't assign an exact number of output neurons.
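The fixed-size constraint is easy to see in code. This is a hypothetical sketch, not a real model: a plain feed-forward layer is just a weight matrix of fixed shape, so its output length is locked in when the layer is built.

```python
import numpy as np

in_dim, out_dim = 7, 7           # say, 7 input words -> 7 output slots
W = np.zeros((out_dim, in_dim))  # weight matrix fixed at construction time

x = np.ones(in_dim)
y = W @ x
print(y.shape)  # always 7 outputs

# A 10-word translation simply does not fit: there is no way to get a
# length-10 output from this layer without rebuilding it from scratch.
```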

2. Parameter Sharing

The convolution operation can share parameters because of its translation invariance property. If you slightly shift the words in a sentence while the overall meaning stays the same, it's like shifting small details in an image: convolution can still recognize the pattern because its parameters are shared, whereas a plain Artificial Neural Network struggles because it doesn't share parameters across positions.
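A rough parameter count shows why sharing matters for sequences. The numbers below are illustrative assumptions (sequence length 50, embedding size 100, hidden size 128), not from any real model: an RNN reuses one pair of weight matrices at every time step, while a feed-forward net over the flattened sequence needs separate weights for every position.

```python
seq_len, emb, hidden = 50, 100, 128

# RNN: W_x (hidden x emb) + W_h (hidden x hidden), shared across all 50 steps
rnn_params = hidden * emb + hidden * hidden

# ANN on the flattened sequence: one weight per (input position, hidden unit)
ann_params = (seq_len * emb) * hidden

print(rnn_params)  # 29184
print(ann_params)  # 640000 -- roughly 22x more, and it grows with seq_len
```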

3. Computations

Imagine you want to build a system that automatically tags each word in a sentence with its part of speech, like noun, verb, adjective, etc. To do this, you might use a technique called one-hot encoding.

One-hot encoding is like making a big table where each row represents a word in your vocabulary, and each column represents a possible part of speech (like noun, verb, etc.). When you see a word, you mark the column for its part of speech with a 1, and all the other columns stay at 0.

Now, if you have a large vocabulary and many possible parts of speech, that table becomes huge. Your input data grows very large because each word must be represented as a single 1 in its part-of-speech column, with 0s in every other column.

As a result, you end up with a massive table, which leads to a lot of computations and a lot of empty cells (sparse matrices). This can make your system slow and inefficient, especially when dealing with large amounts of text data.
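The one-hot idea above can be sketched with a toy vocabulary (the words and sentence are assumed example data):

```python
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # a vector of zeros with a single 1 at the word's vocabulary index
    vec = np.zeros(len(vocab))
    vec[index[word]] = 1.0
    return vec

sentence = ["the", "cat", "sat"]
X = np.stack([one_hot(w) for w in sentence])
print(X.shape)          # (3, 5): one row per word, one column per vocab entry
print((X == 0).mean())  # 0.8 -- most cells are already zero
```

With only 5 words, 80% of the cells are zero; with a realistic vocabulary of tens of thousands of words, almost every cell would be zero, which is the sparse, computation-heavy situation described above.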

4. Independence from Previous Outputs

When working with an ANN, we assume the prediction for one example is independent of the next, because each example is treated in isolation. But what if I want to predict the next word, or build a bot that takes previous outputs into consideration? In such scenarios, an RNN is the natural choice.
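This dependence on history can be demonstrated with a toy, untrained recurrent cell (all names, embeddings, and weights below are made-up illustrations): feeding the same word "bank" after two different histories yields two different hidden states, so the next prediction can differ, unlike an ANN that sees each input in isolation.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["river", "money", "bank"]
emb = {w: rng.normal(size=4) for w in vocab}  # toy word embeddings

W_x = rng.normal(size=(4, 4)) * 0.5
W_h = rng.normal(size=(4, 4)) * 0.5

def encode(words):
    # run the recurrent update over the word sequence
    h = np.zeros(4)
    for w in words:
        h = np.tanh(W_x @ emb[w] + W_h @ h)
    return h

h1 = encode(["river", "bank"])
h2 = encode(["money", "bank"])
print(np.allclose(h1, h2))  # False: same last word, different memory of context
```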

ANN vs RNN

Because of these problems, we needed a method that works in all of the above scenarios: the Recurrent Neural Network.






