Day 22 — Gated Recurrent Units (GRU)


  • Concept: Simplified LSTM.
  • Implementation: Update gate.
  • Evaluation: Performance, complexity.


CONCEPT

Gated Recurrent Units (GRUs) are a type of recurrent neural network (RNN) designed to mitigate the vanishing gradient problem that affects traditional RNNs. GRUs are similar to Long Short-Term Memory (LSTM) units but are simpler: they merge the cell state and hidden state and use two gates instead of three, so they have fewer parameters and are computationally more efficient.
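
To make the efficiency claim concrete, the snippet below (a small illustrative sketch, not part of the original notebook) compares the parameter counts of a GRU layer and an LSTM layer with the same number of units in Keras. Exact counts depend on Keras defaults, but the GRU comes out with roughly three quarters of the LSTM's parameters because it has three weight blocks (reset gate, update gate, candidate) instead of four.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, LSTM

# Same input shape and number of units for both layers
gru_model = Sequential([GRU(50, input_shape=(10, 1))])
lstm_model = Sequential([LSTM(50, input_shape=(10, 1))])

print('GRU parameters: ', gru_model.count_params())
print('LSTM parameters:', lstm_model.count_params())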


KEY FEATURES OF GRU

  1. Update Gate: Decides how much of the previous hidden state to keep and how much of the new candidate state to let in.
  2. Reset Gate: Decides how much of the previous hidden state to use when computing the new candidate state.
  3. Candidate Hidden State: Combines the current input with the reset-scaled previous state. Unlike an LSTM, a GRU has no separate memory cell; the update gate blends this candidate directly into the hidden state (the update rule is sketched after the Key Steps list below).


KEY STEPS

  1. Reset Gate: Determines how much of the previous hidden state is mixed with the new input when forming the candidate state.
  2. Update Gate: Determines how much of the previous hidden state to keep versus how much of the new candidate state to take.
  3. New State Calculation: Interpolates between the previous hidden state and the candidate state, weighted by the update gate (see the sketch after this list).
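
The gate arithmetic can be written out directly. The following is a minimal NumPy sketch of a single GRU step (the function and variable names are illustrative, not from the notebook); it uses the convention where the update gate z weights the previous state, which matches Keras, although some references swap the roles of z and (1 - z).

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    z = sigmoid(Wz @ x_t + Uz @ h_prev + bz)               # update gate: how much old state to keep
    r = sigmoid(Wr @ x_t + Ur @ h_prev + br)               # reset gate: how much old state feeds the candidate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev) + bh)   # candidate hidden state
    return z * h_prev + (1 - z) * h_tilde                  # interpolate old state and candidate

# Tiny demo with random weights: 3 input features, 4 hidden units
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
shapes = [(n_hidden, n_in), (n_hidden, n_hidden), (n_hidden,)] * 3
params = [rng.normal(scale=0.1, size=s) for s in shapes]
h_t = gru_step(rng.normal(size=n_in), np.zeros(n_hidden), *params)
print(h_t.shape)  # (4,)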


IMPLEMENTATION

Let’s implement a GRU for a sequence prediction problem using Keras.

# Import necessary libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense
from sklearn.preprocessing import MinMaxScaler

import warnings
warnings.simplefilter(action='ignore')

# Generate synthetic sequential data
data = np.sin(np.linspace(0, 100, 1000))

# Prepare the dataset: slide a window of `time_step` values and predict the next one
def create_dataset(data, time_step=1):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        a = data[i:(i + time_step)]
        X.append(a)
        y.append(data[i + time_step])
    return np.array(X), np.array(y)

# Scale the data to [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(data.reshape(-1, 1))  # Reshape to a single-feature column for scaling

# Create the dataset with timesteps
time_step = 10
X, y = create_dataset(data, time_step)

# Reshape X to (samples, time_steps, features) as expected by the GRU layer
X = X.reshape(X.shape[0], X.shape[1], 1)

# Split the data into train and test sets
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Create the GRU model
model = Sequential([
    GRU(50, input_shape=(time_step, 1)),
    Dense(1)
])

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=1, verbose=1)

# Evaluate the model
loss = model.evaluate(X_test, y_test, verbose=0)
print(f'Test Loss: {loss}')

# Predict the next value in the sequence
last_sequence = X_test[-1].reshape(1, time_step, 1)
predicted_value = model.predict(last_sequence)
predicted_value = scaler.inverse_transform(predicted_value)
print(f'Predicted Value: {predicted_value[0][0]}')


EXPLANATION OF THE CODE

  1. Data Generation: We generate synthetic sequential data using a sine function.
  2. Dataset Preparation: We create sequences of 10 time steps to predict the next value.
  3. Data Scaling: Normalize the data to the range [0, 1] using MinMaxScaler.
  4. Dataset Creation: Create the dataset with input sequences and corresponding labels.
  5. Train-Test Split: Split the data into training and test sets.
  6. Model Creation:

  • GRU Layer: A GRU layer with 50 units.
  • Dense Layer: A fully connected layer with a single output neuron for regression.

  7. Model Compilation: We compile the model with the Adam optimizer and mean squared error loss function.
  8. Model Training: Train the model for 50 epochs with a batch size of 1.
  9. Model Evaluation: Evaluate the model on the test set and print the loss.
  10. Prediction: Predict the next value in the sequence using the last sequence from the test set.
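
Note that the loss printed in step 9 is a mean squared error on the scaled data, so it is not directly interpretable in the units of the original sine wave. Here is a short sketch (reusing scaler, model, X_test, and y_test from the code above) that maps the test predictions back to the original scale and reports an RMSE:

from sklearn.metrics import mean_squared_error

# Invert the MinMax scaling so the error is expressed in the units of the original series
y_pred = scaler.inverse_transform(model.predict(X_test, verbose=0))
y_true = scaler.inverse_transform(y_test.reshape(-1, 1))
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(f'Test RMSE (original scale): {rmse:.4f}')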


ADVANCED FEATURES OF GRUs

  1. Bidirectional GRU: Processes the sequence in both forward and backward directions (see the sketch after the stacked example below).
  2. Stacked GRU: Uses multiple GRU layers to capture more complex patterns.
  3. Attention Mechanisms: Allows the model to focus on important parts of the sequence.
  4. Dropout Regularization: Prevents overfitting by randomly dropping units during training.
  5. Batch Normalization: Normalizes the inputs to each layer, improving training speed and stability.

# Example with Stacked GRU and Dropout
from tensorflow.keras.layers import Dropout

# Create the stacked GRU model
model = Sequential([
    GRU(50, return_sequences=True, input_shape=(time_step, 1)),
    Dropout(0.2),
    GRU(50),
    Dense(1)
])

# Compile, train, and evaluate the model (same as before)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=50, batch_size=1, verbose=1)
loss = model.evaluate(X_test, y_test, verbose=0)
print(f"Test Loss: {loss}")        


APPLICATIONS

GRUs are widely used in various fields such as:

  • Natural Language Processing (NLP): Language modeling, machine translation, text generation.
  • Time Series Analysis: Stock price prediction, weather forecasting, anomaly detection.
  • Speech Recognition: Transcribing spoken language into text.
  • Video Analysis: Activity recognition, video captioning.
  • Music Generation: Composing music by predicting sequences of notes.

GRUs’ ability to capture long-term dependencies while being computationally efficient makes them a popular choice for sequential data tasks.


Download the Jupyter Notebook file for Day 22 here.
