Long Short-Term Memory (LSTM)
Kiruthika Subramani
Innovating AI for a Better Tomorrow | AI Engineer | Google Developer Expert | Author | IBM Dual Champion | 200+ Global AI Talks | Master's Student at MILA
Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) designed for handling sequences of data.
Imagine you're reading a story and you want to understand the current sentence. Sometimes you need to remember information from earlier sentences to really get what's going on. That's exactly what an LSTM does: it carries the important context forward as it reads.
It has three "gates" that control its memory:

- Forget gate: decides which old information to throw away.
- Input gate: decides which new information to store.
- Output gate: decides what to pass on as the output at each step.
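Under the hood, each gate is a small learned filter that looks at the previous hidden state and the current input and decides how much information flows through. If you like seeing the mechanics, here is a rough NumPy sketch of a single LSTM step; the weight and bias names are placeholders for illustration, not TensorFlow's internal API:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W and b hold one weight matrix / bias vector per gate, keyed "f", "i", "c", "o"
    z = np.concatenate([h_prev, x_t])        # previous memory + current input
    f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate: what to erase
    i_t = sigmoid(W["i"] @ z + b["i"])       # input gate: what new info to store
    c_hat = np.tanh(W["c"] @ z + b["c"])     # candidate memory content
    c_t = f_t * c_prev + i_t * c_hat         # updated cell state (long-term memory)
    o_t = sigmoid(W["o"] @ z + b["o"])       # output gate: what to reveal
    h_t = o_t * np.tanh(c_t)                 # new hidden state (short-term output)
    return h_t, c_t

# Toy dimensions just to show the shapes line up
hidden, inputs = 3, 1
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(hidden, hidden + inputs)) for k in "fico"}
b = {k: np.zeros(hidden) for k in "fico"}
h, c = np.zeros(hidden), np.zeros(hidden)
h, c = lstm_step(np.array([2.0]), h, c, W, b)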
Confused???!!!
Let's say we have the number sequence [2, 4, 6, 8, 10] and we want to predict the next number. An LSTM reads the numbers one at a time, keeps a running memory of the pattern (each number is 2 more than the last), and uses that memory to predict 12.
Likewise, an LSTM reads, remembers, and predicts as it moves through a sequence of data.
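In code terms, that example becomes a sliding window: the model sees a few consecutive numbers as input and the number right after them as the target. A tiny sketch (the window size of 4 is chosen just for illustration):

sequence = [2, 4, 6, 8, 10, 12]
window = 4  # illustrative window size

# Each input is a window of consecutive numbers, each target is the next one
for i in range(len(sequence) - window):
    print(sequence[i:i + window], "->", sequence[i + window])
# [2, 4, 6, 8] -> 10
# [4, 6, 8, 10] -> 12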
Here we will implement an LSTM in Python with the popular TensorFlow library to predict the next number in a sequence.
Come on, let's implement it!
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
# Generate example data
sequence_length = 10
data_size = 1000
# Generate a sequence of numbers
sequence = np.arange(data_size)
# Create input and target data
X = []
y = []
for i in range(data_size - sequence_length):
    X.append(sequence[i:i+sequence_length])
    y.append(sequence[i+sequence_length])
X = np.array(X)
y = np.array(y)
# Normalize the data
X = X / data_size
y = y / data_size
# Split the data into training and testing sets
split_ratio = 0.8
split_idx = int(split_ratio * len(X))
X_train = X[:split_idx]
y_train = y[:split_idx]
X_test = X[split_idx:]
y_test = y[split_idx:]
# Build the LSTM model
model = Sequential([
    LSTM(50, input_shape=(sequence_length, 1)),
    Dense(1)
])
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
# Train the model
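# Keras LSTMs expect input shaped (samples, timesteps, features), hence the reshape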
model.fit(X_train.reshape(-1, sequence_length, 1), y_train, epochs=50, batch_size=32)
# Evaluate the model
loss = model.evaluate(X_test.reshape(-1, sequence_length, 1), y_test)
print("Test loss:", loss)
# Make predictions
predictions = model.predict(X_test.reshape(-1, sequence_length, 1))
# Denormalize the predictions
predictions_denormalized = predictions * data_size
# Print the first few predictions and actual values
for i in range(5):
    print("Predicted:", predictions_denormalized[i][0], "Actual:", y_test[i] * data_size)
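Once trained, you can also ask the model for the number that comes right after the full sequence. A quick sanity check, as a sketch that reuses the variables above (1000 lies just outside the normalized training range, so expect the prediction to drift a little):

# Feed the model the last 10 numbers of the range and ask for the next one
next_window = np.arange(data_size - sequence_length, data_size)  # 990 ... 999
next_input = (next_window / data_size).reshape(1, sequence_length, 1)
predicted = model.predict(next_input)[0][0] * data_size
print("Predicted next number:", predicted, "(expected around 1000)")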