Long Short-Term Memory (LSTM)

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) designed for handling sequences of data.

Imagine you're reading a story and want to understand the current sentence. Sometimes you need to remember information from earlier sentences to really get what's going on. That's exactly what an LSTM does: it carries relevant context forward as it reads.

It has three "gates" (a minimal sketch of the math follows this list):

  1. Forget Gate: decides what information from the previous cell state to keep and what to discard.
  2. Input Gate: decides what new information to add to the cell state.
  3. Output Gate: decides what the updated cell state should reveal as the next hidden state.
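
If you're curious what those gates actually compute, here's a minimal NumPy sketch of a single LSTM step. It's an illustration under my own naming (the weights W, U, b and the toy dimensions are mine, not TensorFlow's internal implementation):

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W, U, b stack the weights of all four internal transforms:
    # forget gate, input gate, candidate cell, output gate.
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b       # all pre-activations at once
    f = sigmoid(z[:n])                 # forget gate: what to keep from c_prev
    i = sigmoid(z[n:2*n])              # input gate: what new info to admit
    g = np.tanh(z[2*n:3*n])            # candidate cell contents
    o = sigmoid(z[3*n:])               # output gate: what to expose as h_t
    c_t = f * c_prev + i * g           # updated cell state
    h_t = o * np.tanh(c_t)             # new hidden state
    return h_t, c_t

# Toy dimensions: 1 input feature, 4 hidden units, random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in [2.0, 4.0, 6.0, 8.0, 10.0]:
    h, c = lstm_step(np.array([x]), h, c, W, U, b)
print(h)  # the hidden state after reading the whole sequence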

Confused???!!!

Let's say we have the number sequence [2, 4, 6, 8, 10] and we want to predict the next number.

  1. Forget Gate: the LSTM "forgets" that the sequence started with 2 and 4, keeping only 6, 8, and 10 in mind as the most important context.
  2. Input Gate: the LSTM decides what new information to let in. It spots the pattern of each number increasing by 2 and stores that pattern in the cell state.
  3. Output Gate: the LSTM decides what to expose as the next hidden state. Using the pattern it has learned, it predicts 12 as the next number.

Likewise, an LSTM reads, remembers, and predicts as it moves through a sequence of data.

Here we will implement an LSTM in Python with the popular TensorFlow library to predict the next number in a sequence.

  1. Generate Data: Create number sequence.
  2. Input and Target: Divide sequence for learning.
  3. Normalize Data: Scale between 0-1.
  4. Build LSTM: Create learning framework.
  5. Compile Model: Set learning settings.
  6. Train Model: Learn from data.
  7. Evaluate Model: Measure learning quality.
  8. Make Predictions: Guess next numbers.
  9. Denormalize Predictions: Rescale predictions.
  10. Print Comparison: Compare guesses and reality.

Come on, let's implement it!

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Generate example data
sequence_length = 10
data_size = 1000

# Generate a sequence of numbers
sequence = np.arange(data_size)

# Create input and target data
X = []
y = []
for i in range(data_size - sequence_length):
    X.append(sequence[i:i+sequence_length])
    y.append(sequence[i+sequence_length])

X = np.array(X)
y = np.array(y)

# Normalize the data
X = X / data_size
y = y / data_size

# Split the data into training and testing sets
split_ratio = 0.8
split_idx = int(split_ratio * len(X))
X_train = X[:split_idx]
y_train = y[:split_idx]
X_test = X[split_idx:]
y_test = y[split_idx:]

# Build the LSTM model
model = Sequential([
    LSTM(50, input_shape=(sequence_length, 1)),
    Dense(1)
])

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train.reshape(-1, sequence_length, 1), y_train, epochs=50, batch_size=32)

# Evaluate the model
loss = model.evaluate(X_test.reshape(-1, sequence_length, 1), y_test)
print("Test loss:", loss)

# Make predictions
predictions = model.predict(X_test.reshape(-1, sequence_length, 1))

# Denormalize the predictions
predictions_denormalized = predictions * data_size

# Print the first few predictions and actual values
for i in range(5):
    print("Predicted:", predictions_denormalized[i][0], "Actual:", y_test[i] * data_size)
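
Once trained, you can also query the model with a fresh window. A small usage sketch (the new_window name is my own; the true next value, 1000, sits just outside the normalized training range, so expect the prediction to be close rather than exact):

# Predict the number that follows [990, ..., 999]
new_window = np.arange(990, 1000)
new_input = (new_window / data_size).reshape(1, sequence_length, 1)
next_value = model.predict(new_input)[0][0] * data_size
print("Predicted number after 999:", next_value)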
