How to Build and Train a Sequential Model: Every AI Fresher Must Know

Traffic estimation is an essential application of AI that helps predict congestion and optimize urban mobility. In this guide, we build and train a Sequential model with TensorFlow and Keras to forecast traffic levels on five major roads from historical data.

What is a Sequential Model?

A Sequential Model is a type of deep learning model where layers are stacked in a linear fashion, meaning data flows from one layer to the next without loops or branches. It is commonly used for tasks like image classification, time-series forecasting, and natural language processing.

How Does Sequence Modeling Work?

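Sequence modeling trains AI systems on chronologically ordered data to capture patterns and make predictions over time. More specifically, sequence models like recurrent neural networks process inputs as sequences, with each data point conditioned on those preceding it. Note that sequence modeling is a task, whereas the Keras Sequential class only describes how layers are stacked; a Sequential model acts as a sequence model when, for example, it contains recurrent layers such as LSTM.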

The model iterates through data points, maintaining an encoded representation of the sequence history at each step. This sequential processing allows the model to learn complex time-based patterns like trends, seasonality, and long-range dependencies in data.

The sequence model is trained to make predictions by estimating the probability distribution over next values, given the sequence of past context. This modeling of ordered data as interdependent steps enables the model to develop a sense of continuity and dynamics within data.

By learning how events typically unfold, a sequence model can predict what is likely to follow from the observed history. This context is what distinguishes it from a model that assesses each data point independently. The quality of those predictions depends on training with sequences that are representative of the data seen at prediction time.
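To make the idea of "an encoded representation of the sequence history" concrete, here is a minimal sketch of a single-unit recurrent update in plain Python. The function name and the weights `w_x`, `w_h`, and `b` are hypothetical values chosen for illustration, not learned parameters; real layers such as LSTM use vectors, weight matrices, and gating.

```python
import math

def run_simple_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state.

    The weights are illustrative constants (an assumption), not trained values.
    """
    h = 0.0  # hidden state starts at zero: no history seen yet
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state,
        # which is how earlier values influence later steps.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Same final input (1.0), different histories -> different final states.
rising = run_simple_rnn([0.0, 0.5, 1.0])
falling = run_simple_rnn([2.0, 1.5, 1.0])
print(rising[-1], falling[-1])
```

Both sequences end with the same value, yet the final hidden states differ because each state carries a summary of what came before.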

Why is Sequence Modeling Important?

Sequence modeling is crucial for AI systems to understand data that unfolds over time. Unlike static data, temporal sequences have complex time-based patterns like trends, cycles, and lagged effects. By processing data as interdependent sequenced steps, models can learn these nuanced time dynamics rather than viewing data points in isolation.

This time-based conditioning lets models make more contextual and accurate predictions because they account for how the past leads to the future. Sequence modeling underpins AI advances in speech, text, video, forecasting, anomaly detection, and more.

Features of a Sequential Model:

  • Simplicity: Layers are added one after another.
  • Flexibility: Can include different types of layers like Dense, Convolutional, LSTM, etc.
  • Ease of use: Simple API for defining models in Keras and TensorFlow.

Example 1: A Basic Feedforward Neural Network

This example demonstrates a Sequential model for simple classification tasks:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(10,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
        

Explanation:

  • The first layer has 64 neurons with ReLU activation and expects 10 input features.
  • The second layer has 32 neurons.
  • The output layer has 1 neuron with a sigmoid activation for binary classification.
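As a quick check on the layer sizes listed above, the number of trainable parameters in each Dense layer can be worked out by hand: one weight per input-unit pair plus one bias per unit. The helper name `dense_params` is ours, introduced only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for a fully connected (Dense) layer."""
    return n_inputs * n_units + n_units

layer_sizes = [10, 64, 32, 1]  # input features, then units per Dense layer
counts = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
print(counts)       # [704, 2080, 33]
print(sum(counts))  # 2817
```

These figures should correspond to the per-layer and total parameter counts reported by model.summary() for Example 1.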

Example 2: Sequential Model for Time-Series Prediction

A simple LSTM-based Sequential model for time-series forecasting:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(10, 1)),
    LSTM(50),
    Dense(1)
])

model.compile(optimizer='adam', loss='mse')
model.summary()
        

Explanation:

  • Uses LSTM layers to process time-series data.
  • The final Dense layer outputs a single numerical prediction.
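The size of the two LSTM layers can also be estimated by hand. The sketch below assumes the standard Keras LSTM layout with biases enabled: four weight sets (three gates plus the candidate cell state), each with input weights, recurrent weights, and a bias. The helper names are hypothetical. With return_sequences=True the first layer emits an output of width 50 at every one of the 10 time steps, so the second LSTM sees 50 input features.

```python
def lstm_params(input_dim, units):
    """Four weight sets, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(n_inputs, n_units):
    """Weights plus biases for a fully connected layer."""
    return n_inputs * n_units + n_units

first_lstm = lstm_params(input_dim=1, units=50)    # input_shape=(10, 1): 1 feature per step
second_lstm = lstm_params(input_dim=50, units=50)  # fed by the 50 outputs of the first LSTM
output = dense_params(50, 1)                       # final single-value prediction
print(first_lstm, second_lstm, output)             # 10400 20200 51
print(first_lstm + second_lstm + output)           # 30651
```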

The Sequential API is an easy way to build deep learning models. It works best when data flows straight through a single stack of layers, with one input and one output.

Understanding the Dataset

The dataset (traffic-data.csv) contains traffic congestion levels for five roads (Road 1 to Road 5) over time. The goal is to train a model that predicts traffic levels for the next six time steps from the previous six observations.

How to Build and Train a Sequential Model for Traffic Estimation

Step-by-Step Model Development

1. Data Preparation

The dataset is first loaded and normalized:

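import pandas as pd

df = pd.read_csv('traffic-data.csv', index_col='date')  # one column per road
data = df.values
data = data - data.min(axis=0)  # shift each column so its minimum is 0
data = data / data.max(axis=0)  # scale each column so its maximum is 1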
        

This min-max normalization rescales each road's column to the range 0 to 1, which generally improves model stability and convergence.
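Because the model is trained on scaled values, its predictions come out on the 0 to 1 scale as well. The following is a minimal sketch, assuming NumPy and no constant columns (a column with zero range would cause a division by zero), of how the scaling statistics could be kept so that predictions can be mapped back to the original units. The function names are hypothetical and are not part of the program in this article.

```python
import numpy as np

def fit_min_max(data):
    """Return the per-column minimum and range used for min-max scaling."""
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    return col_min, col_range

def scale(data, col_min, col_range):
    """Map each column onto the range 0 to 1."""
    return (data - col_min) / col_range

def unscale(scaled, col_min, col_range):
    """Invert scale(): map values back to the original units."""
    return scaled * col_range + col_min

# First three rows of the sample data for Road 1 and Road 2.
raw = np.array([[35.0, 50.0], [38.0, 55.0], [40.0, 53.0]])
col_min, col_range = fit_min_max(raw)
scaled = scale(raw, col_min, col_range)
restored = unscale(scaled, col_min, col_range)
print(scaled.min(axis=0), scaled.max(axis=0))  # [0. 0.] [1. 1.]
```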

We define the number of past observations (n_past = 6) and future predictions (n_future = 6). The data is then split chronologically: the first half is used for training and the second half for validation.

2. Creating Windowed Datasets

To create meaningful inputs and targets, we use a function that slices the dataset into overlapping windows:

def windowed_dataset(series, batch_size, n_past, n_future):
    ds = Dataset.from_tensor_slices(series)
    ds = ds.window(size=n_past + n_future, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(n_past + n_future))
    ds = ds.map(lambda w: (w[:n_past], w[n_past:]))
    return ds.batch(batch_size).prefetch(1)
        

Each resulting example pairs six consecutive past observations (the input) with the six observations that follow them (the target).
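To see what the windowing produces without running TensorFlow, here is a rough NumPy equivalent (without batching or prefetching). The function name `make_windows` and the stand-in data are assumptions for illustration; the sketch also assumes the series is at least n_past + n_future rows long.

```python
import numpy as np

def make_windows(series, n_past, n_future):
    """Slice a (time, features) array into (inputs, targets) window pairs."""
    total = n_past + n_future
    n_windows = len(series) - total + 1  # shift of 1, incomplete windows dropped
    inputs = np.stack([series[i:i + n_past] for i in range(n_windows)])
    targets = np.stack([series[i + n_past:i + total] for i in range(n_windows)])
    return inputs, targets

# Stand-in data: 15 time steps, 5 features (values are just running indices).
series = np.arange(15 * 5, dtype=float).reshape(15, 5)
inputs, targets = make_windows(series, n_past=6, n_future=6)
print(inputs.shape, targets.shape)  # (4, 6, 5) (4, 6, 5)
```

A series of 15 rows yields only 15 - 12 + 1 = 4 windows, which shows how quickly short datasets run out of training examples when the window is long.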

3. Building the Model

A Sequential model is created using LSTMs (Long Short-Term Memory) for sequence learning. The architecture includes:

  • Bidirectional LSTM layers that read each input window in both the forward and backward directions.
  • Dense layer that produces n_features * n_future output values.
  • Reshape layer to arrange those values as n_future time steps of n_features each.

model = Sequential([
    Bidirectional(LSTM(32, return_sequences=True), input_shape=(n_past, n_features)),
    Bidirectional(LSTM(32)),
    Dense(n_features * n_future, activation='relu'),
    Reshape((n_future, n_features)),
])
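The way shapes change through this stack can be traced by hand. The sketch below is plain Python rather than Keras, and assumes the default Bidirectional merge mode (concatenation, which doubles the output width) and the five-road dataset used in this guide; the batch dimension is omitted.

```python
n_past, n_features, n_future = 6, 5, 6
units = 32

shapes = [("input", (n_past, n_features))]
# return_sequences=True keeps the time axis; concatenating both directions doubles the width.
shapes.append(("bidirectional_lstm_1", (n_past, 2 * units)))
# Without return_sequences only the final output of each direction is kept.
shapes.append(("bidirectional_lstm_2", (2 * units,)))
# One flat vector holding every road's value for every future step.
shapes.append(("dense", (n_features * n_future,)))
# Rearranged into n_future time steps with n_features values each.
shapes.append(("reshape", (n_future, n_features)))

for name, shape in shapes:
    print(name, shape)
```

The Dense layer must output exactly n_features * n_future values (30 here), otherwise the Reshape layer cannot arrange them into a (6, 5) forecast.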
        

4. Compiling and Training the Model

The model is compiled with Mean Squared Error (MSE) loss and the Adam optimizer:

model.compile(loss='mse', optimizer='adam', metrics=['mae'])
        

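Early stopping halts training when the validation Mean Absolute Error (MAE) has not improved by at least 0.005 for 10 consecutive epochs. Note that this callback only detects a stall; it does not enforce a fixed target such as a validation MAE of 0.15 or below: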

early_stopping = EarlyStopping(monitor='val_mae', mode='min', patience=10, verbose=1, min_delta=0.005)
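The stopping rule can be illustrated with a small simulation in plain Python. This is a simplified sketch of the patience and min_delta logic, not the Keras implementation, and the validation history is invented for the example.

```python
def stopping_epoch(val_mae_history, patience=10, min_delta=0.005):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last meaningful improvement
    for epoch, current in enumerate(val_mae_history, start=1):
        if current < best - min_delta:  # improved by more than min_delta
            best = current
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical history: improves for 5 epochs, then plateaus just below 0.20.
history = [0.40, 0.35, 0.30, 0.25, 0.20] + [0.199] * 20
print(stopping_epoch(history))  # 15
```

In this made-up run, training stops at epoch 15 with a best validation MAE of 0.20, because the later change of 0.001 is smaller than min_delta and therefore does not count as an improvement.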
        

Finally, the model is trained:

model.fit(train_set, epochs=1000, validation_data=valid_set, callbacks=[early_stopping])
        

5. Saving and Reloading the Model

After training, the model is saved and can be reloaded later for predictions:

model.save("traffic_model.h5")
saved_model = load_model("traffic_model.h5")
        

The complete program:

import os
import pandas as pd
from keras import Sequential
from keras.callbacks import EarlyStopping
from keras.layers import Bidirectional, LSTM, Dense, Reshape
from keras.saving import load_model
from tensorflow.data import Dataset
from tensorflow.random import set_seed

def traffic_model():
    # Load the dataset
    csv_file = 'traffic-data.csv'
    if not os.path.exists(csv_file):
        raise FileNotFoundError(f"Dataset {csv_file} not found!")
    
    df = pd.read_csv(csv_file, sep=',', index_col='date', header=0)
    
    # Normalize the data
    data = df.values
    data = data - data.min(axis=0)
    data = data / data.max(axis=0)
    
    n_features = len(df.columns)
    n_past = 6
    n_future = 6
    batch_size = 8
    
    # Set seed for reproducibility
    set_seed(1)
    
    # Split dataset into training and validation
    split_time = int(len(data) * 0.5)
    x_train = data[:split_time]
    x_valid = data[split_time:]
    
    train_set = windowed_dataset(x_train, batch_size, n_past, n_future)
    valid_set = windowed_dataset(x_valid, batch_size, n_past, n_future)
    
    # Define the Sequential model
    model = Sequential([
        Bidirectional(LSTM(32, return_sequences=True), input_shape=(n_past, n_features)),
        Bidirectional(LSTM(32)),
        Dense(n_features * n_future, activation='relu'),
        Reshape((n_future, n_features)),
    ])
    
    # Compile the model
    model.compile(loss='mse', optimizer='adam', metrics=['mae'])
    
    # Set up early stopping callback
    early_stopping = EarlyStopping(monitor='val_mae', mode='min', patience=10, verbose=1, min_delta=0.005)
    
    # Train the model
    model.fit(train_set, epochs=1000, validation_data=valid_set, callbacks=[early_stopping])
    
    return model

def windowed_dataset(series, batch_size, n_past, n_future):
    ds = Dataset.from_tensor_slices(series)
    ds = ds.window(size=n_past + n_future, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(n_past + n_future))
    ds = ds.map(lambda w: (w[:n_past], w[n_past:]))
    return ds.batch(batch_size).prefetch(1)

# Train and save the model
if __name__ == '__main__':
    my_model = traffic_model()
    my_model.save("traffic_model.h5")
    saved_model = load_model("traffic_model.h5")
    saved_model.summary()
        

The data used (traffic-data.csv)

The dataset contains one row per date with the traffic congestion level of each road on a scale from 0 to 100.

date,Road 1,Road 2,Road 3,Road 4,Road 5
2024-03-01,35,50,40,60,45
2024-03-02,38,55,42,58,47
2024-03-03,40,53,45,62,50
2024-03-04,45,60,48,65,55
2024-03-05,50,62,52,67,58
2024-03-06,55,65,54,70,60
2024-03-07,52,63,50,68,57
2024-03-08,48,60,46,65,52
2024-03-09,42,58,44,62,50
2024-03-10,38,55,41,60,47
2024-03-11,35,50,39,58,45
2024-03-12,30,48,35,55,42
2024-03-13,28,45,33,53,40
2024-03-14,25,43,30,50,38
2024-03-15,22,40,28,48,35
2024-03-16,20,38,26,45,33
2024-03-17,18,35,25,43,30
2024-03-18,22,40,27,48,33
2024-03-19,28,45,30,52,37
2024-03-20,30,48,33,55,40
2024-03-21,35,50,38,58,42
2024-03-22,38,53,40,60,45
2024-03-23,42,57,44,62,48
2024-03-24,45,60,47,65,52
2024-03-25,50,63,50,68,55
2024-03-26,55,65,54,70,60
2024-03-27,52,63,50,68,57
2024-03-28,48,60,46,65,52
2024-03-29,42,58,44,62,50
2024-03-30,38,55,41,60,47        

Conclusion

This approach provides a solid foundation for time-series forecasting using deep learning. By combining LSTM and Bidirectional layers, the model can learn temporal dependencies in traffic patterns, which makes it a useful starting point for AI developers working on intelligent transportation systems. The 30-row sample shown here is only large enough to demonstrate the workflow; a realistic forecast needs considerably more history.
