Long Short-Term Memory (LSTM)
Kiruthika Subramani
Innovating AI for a Better Tomorrow | AI Engineer | Google Developer Expert | Author | IBM Dual Champion | 200+ Global AI Talks | Master's Student at MILA
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) designed for handling sequences of data.
Imagine you're reading a story and you want to understand the current sentence. Sometimes you need to remember information from earlier sentences to really get what's going on. That's exactly what an LSTM does: it carries the important context forward as it reads.
It has three "gates" that control its memory:

- Forget gate: decides which old information to throw away.
- Input gate: decides which new information to store.
- Output gate: decides what to pass on as the output at each step.
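Under the hood, each gate is a small learned filter that looks at the previous hidden state and the current input and decides how much information flows through. If you like seeing the mechanics, here is a rough NumPy sketch of a single LSTM step; the weight and bias names are placeholders for illustration, not TensorFlow's internal API:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W and b hold one weight matrix / bias vector per gate, keyed "f", "i", "c", "o"
    z = np.concatenate([h_prev, x_t])        # previous memory + current input
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate: what to erase
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate: what new info to store
    c_hat = np.tanh(W["c"] @ z + b["c"])     # candidate memory content
    c_t = f_t * c_prev + i_t * c_hat         # updated cell state (long-term memory)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate: what to reveal
    h_t = o_t * np.tanh(c_t)                 # new hidden state (short-term output)
    return h_t, c_t

# Toy dimensions just to show the shapes line up
hidden, inputs = 3, 1
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(hidden, hidden + inputs)) for k in "fico"}
b = {k: np.zeros(hidden) for k in "fico"}
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(np.array([2.0]), h, c, W, b)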
Confused???!!!
Let's say we have the number sequence [2, 4, 6, 8, 10] and we want to predict the next number. An LSTM reads the numbers one at a time, keeps a running memory of the pattern (each number is 2 more than the last), and uses that memory to predict 12.
Likewise, an LSTM reads, remembers, and predicts as it moves through a sequence of data.
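In code terms, that example becomes a sliding window: the model sees a few consecutive numbers as input and the number right after them as the target. A tiny sketch (the window size of 4 is chosen just for illustration):

sequence = [2, 4, 6, 8, 10, 12]
window = 4  # illustrative window size

# Each input is a window of consecutive numbers, each target is the next one
for i in range(len(sequence) - window):
    print(sequence[i:i + window], "->", sequence[i + window])
# [2, 4, 6, 8] -> 10
# [4, 6, 8, 10] -> 12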
Here we will implement an LSTM in Python with the popular TensorFlow library to predict the next number in a sequence.
Come on, let's implement it!
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
# Generate example data
sequence_length = 10
data_size = 1000
# Generate a sequence of numbers
sequence = np.arange(data_size)
# Create input and target data
X = []
y = []
for i in range(data_size - sequence_length):
    X.append(sequence[i:i+sequence_length])
    y.append(sequence[i+sequence_length])
X = np.array(X)
y = np.array(y)
# Normalize the data
X = X / data_size
y = y / data_size
# Split the data into training and testing sets
split_ratio = 0.8
split_idx = int(split_ratio * len(X))
X_train = X[:split_idx]
y_train = y[:split_idx]
X_test = X[split_idx:]
y_test = y[split_idx:]
# Build the LSTM model
model = Sequential([
    LSTM(50, input_shape=(sequence_length, 1)),
    Dense(1)
])
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
# Train the model
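# Keras LSTMs expect input shaped (samples, timesteps, features), hence the reshape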
model.fit(X_train.reshape(-1, sequence_length, 1), y_train, epochs=50, batch_size=32)
# Evaluate the model
loss = model.evaluate(X_test.reshape(-1, sequence_length, 1), y_test)
print("Test loss:", loss)
# Make predictions
predictions = model.predict(X_test.reshape(-1, sequence_length, 1))
# Denormalize the predictions
predictions_denormalized = predictions * data_size
# Print the first few predictions and actual values
for i in range(5):
    print("Predicted:", predictions_denormalized[i][0], "Actual:", y_test[i] * data_size)
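Once trained, you can also ask the model for the number that comes right after the full sequence. A quick sanity check, as a sketch that reuses the variables above (1000 lies just outside the normalized training range, so expect the prediction to drift a little):

# Feed the model the last 10 numbers of the range and ask for the next one
next_window = np.arange(data_size - sequence_length, data_size)  # 990 ... 999
next_input = (next_window / data_size).reshape(1, sequence_length, 1)
predicted = model.predict(next_input)[0][0] * data_size
print("Predicted next number:", predicted, "(expected around 1000)")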