Harnessing AWS Serverless Architecture for Cost-Effective Machine Learning: A Case Study on Car Price Prediction

In today's fast-paced technological landscape, the integration of machine learning (ML) into cloud architectures is not just a trend but a substantial lever for competitive advantage. This article delves into a project that not only showcases ML deployment but does so through an economical, serverless approach using AWS cloud services.

Project Overview

The GitHub project, Car Price Prediction - MLOps on AWS, demonstrates the use of AWS's serverless stack for cost-effective production of ML systems through the predictive analysis of car prices based on various attributes such as mileage, age, and make. The core of this project revolves around AWS Lambda, API Gateway, and other AWS services, aligning with best practices in MLOps to automate and monitor all steps of ML system construction.

Cost-Effective Serverless Solutions

One of the significant challenges in deploying machine learning models is managing costs, especially with the infrastructure needed to support data processing and model predictions. By adopting a serverless architecture, we capitalize on the 'pay-as-you-go' model—paying only for the compute time we consume without bearing the cost of idle computational resources. This approach is not only cost-efficient but also scales automatically with the application's usage patterns, making it exceptionally advantageous for systems with unpredictable loads.

AWS Lambda and API Gateway: A Dynamic Duo

AWS Lambda has been instrumental in running our code in response to events, such as changes in data or new input data, effectively eliminating the need to manage servers. Furthermore, AWS API Gateway acts as the front door to our Lambda functions, enabling us to create, publish, maintain, and secure APIs at any scale. This integration facilitates the building of robust, scalable, and secure API endpoints that handle our ML model's inference requests efficiently.

Commercial and Operational Advantages

The serverless architecture not only reduces operational costs but also simplifies deployment and scalability. It allows businesses to deploy ML models that can adapt quickly to varying loads without the need for forecasting traffic. For startups and established companies alike, this means quicker go-to-market times and lower upfront investments, making innovative projects more feasible and budget-friendly.

Secure and Manage API Integrations

Using AWS API Gateway in conjunction with Lambda has allowed us to enhance the security and manageability of our applications. API Gateway supports various mechanisms for controlling access to our APIs, including throttling, authentication, and authorization practices through AWS Identity and Access Management (IAM). These features ensure that our endpoints remain secure against unauthorized access, providing peace of mind along with cutting-edge technology.

Technical Implementation

Git Repository Structure

https://github.com/VLTSankalpa/car-price-pred-aws-mlops

├── README.md
├── data
│   └── dataset.npz              # Raw and preprocessed data
├── images
│   └── kde.png                  # Images used in README documentation
├── lambda-ct-pipeline
│   └── ct_lambda_function.py    # Lambda function for continuous training pipeline
├── lambda-model-endpoint
│   ├── Dockerfile               # Dockerfile for building Lambda deployment image
│   ├── main.py                  # Main script for Lambda function initialization
│   └── model_endpoint_lambda_function.py  # Lambda function for model predictions
├── model
│   ├── finalized_linear_model.pkl  # Saved final linear model
│   ├── label_encoder.pkl           # Label encoder for categorical data preprocessing
│   ├── model.py                    # Script for model training and evaluation
│   ├── onehot_encoder.pkl          # One-hot encoder for categorical data preprocessing
│   ├── scaler.pkl                  # Scaler object for numerical data normalization
│   └── train.csv                   # Training dataset
└── notebooks
    └── development-notebook.ipynb  # Jupyter notebooks on Model Preparation for Deployment including EDA, data visualization, data preprocessing, and model code refinements        

Directories and Files

  • /notebooks: Jupyter notebooks on Model Preparation for Deployment including EDA, data visualization, data preprocessing, and model code refinements.
  • /model: Contains all trained machine learning models and their corresponding encoders, along with the training dataset.
      ◦ finalized_linear_model.pkl: The serialized final linear regression model ready for predictions.
      ◦ label_encoder.pkl, onehot_encoder.pkl, scaler.pkl: Serialized preprocessing encoders.
      ◦ model.py: The initial model training Python code provided.
  • /lambda-model-endpoint: Contains the model-serving Lambda function and its deployment assets.
      ◦ Dockerfile: Defines the Docker container used to deploy the Lambda function.
      ◦ test.py: Unit test for the model endpoint Lambda function.
      ◦ model_endpoint_lambda_function.py: Implements the Lambda function that serves model predictions.
  • /lambda-ct-pipeline: Holds the AWS Lambda function for continuous training of the machine learning model.
  • /data: Contains raw and preprocessed datasets used in model training.
  • /images: Includes images used within the README documentation to explain concepts or results.

Setting Up the Development Environment

To ensure the reproducibility and efficiency of our machine learning project, I established a robust local development environment on a Mac M1 Pro, utilizing Miniconda for environment management. This setup was crucial for maintaining consistency across development and production environments, particularly given the complex nature of predictive modeling and data analysis. Here's how I configured the local environment:

Creating an Isolated Conda Environment: To prevent any conflicts between package dependencies, I created a dedicated Conda environment named car-price-pred-mlops. This isolated environment ensures that all necessary Python packages are managed effectively.

conda create --name car-price-pred-mlops python        

Activating the Conda Environment: By activating this environment, I ensured that all subsequent Python and command-line operations were encapsulated within this defined scope.

conda activate car-price-pred-mlops        

Installing Jupyter Notebook: Jupyter Notebook, an indispensable tool for interactive coding and data visualization, was installed to facilitate the exploration and visualization of data as well as the iterative development of machine learning models.

conda install jupyter        

Configuring the IPython Kernel: This step involved installing the IPython kernel, which is essential for running Python code in Jupyter. I also registered this kernel under a specific display name to ensure it was readily identifiable and selectable within the Jupyter interface.

conda install ipykernel        
python -m ipykernel install --user --name car-price-pred-mlops --display-name "Car Price Prediction MLOps"        

Launching Jupyter Notebook: Finally, the Jupyter Notebook server was initiated, allowing for the management and execution of development notebooks directly from the browser.

jupyter notebook        

Integrating Additional Tools and Services

  • Docker: Essential for creating reproducible environments that mimic production settings, Docker was installed to containerize our functions and services. Detailed instructions for Docker installation are available on its official website.
  • AWS Command Line Interface (CLI): To interact directly with AWS services from the command line, I installed and configured the AWS CLI. This tool simplifies tasks such as deploying applications and managing cloud resources.

aws configure        

  • Python Libraries: A suite of Python libraries necessary for data handling, analysis, and machine learning was installed via a requirements file, ensuring all developers could synchronize their environments easily.

pip install -r requirements.txt        

This meticulous setup not only streamlined our development process but also ensured that our transition from a local testing environment to AWS cloud deployment was seamless and error-free.

Dataset

https://car-price-pred-mlops.s3.ap-south-1.amazonaws.com/train.csv
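For quick experimentation, the training data can also be pulled straight from this S3 URL (a minimal sketch, assuming the object remains publicly readable; the notebook itself reads the local copy at ../model/train.csv):

import pandas as pd

# Load the training data directly from the public S3 URL
url = "https://car-price-pred-mlops.s3.ap-south-1.amazonaws.com/train.csv"
df = pd.read_csv(url)
print(df.shape)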

Step 1: Model Preparation for Deployment

The journey to deploying our Car Price Prediction model on AWS began with extensive exploratory data analysis (EDA) and data visualization. This foundational step was crucial for refining our approach and enhancing the Python scripts provided. Through thorough analysis, I gained a deep understanding of the dataset, which facilitated the identification and implementation of necessary data preprocessing steps such as normalization and encoding.

After optimizing the data, I proceeded to train and evaluate the linear regression model. Upon achieving satisfactory results, I serialized the model and its encoders, then uploaded them to AWS S3. This pivotal step ensured that our model could be seamlessly integrated and executed within an AWS Lambda function.

Exploratory Data Analysis (EDA)

In the EDA phase, I utilized a suite of standard templates developed to streamline this process. This practice is essential for ensuring that the dataset is clean, well-understood, and primed for feature engineering and model development. Our EDA objectives included:

  • Listing Columns: To understand the features available within the dataset.
  • Analyzing Dataset Shape: To determine the size and scope of the data.
  • Reviewing Data Types: To ascertain the necessary data conversions.
  • Identifying Unique Values: To detect anomalies or irregularities.
  • Converting Data Types: To adjust specific columns for proper analysis.
  • Handling Missing Values: To address gaps in the dataset.
  • Generating Summary Statistics: To gain insights into the data's distribution and central tendencies.

!pip install --quiet pandas numpy matplotlib seaborn statsmodels scipy scikit-learn boto3
import pickle
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from scipy import stats
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

# Confirming that the libraries have been imported correctly
print("Libraries have been successfully imported!")

# Load the dataset
df = pd.read_csv("../model/train.csv")

# Display the first few rows of the dataframe
df.head(10)

# Display the columns in the DataFrame
df.columns.tolist()

# Drop the leftover index column
df = df.drop('Unnamed: 0', axis=1)

# Display data types of columns
df.dtypes

# check for all unique values in each column
for column in df.columns:
    print(f"Unique values in '{column}':", df[column].unique()[:20])
    print("Number of unique values:", len(df[column].unique()), "\n\n")

# Check for missing values after conversion
missing_values = df.isnull().sum()
print("Missing Values After Conversion:\n", missing_values)

# Display summary statistics for numerical columns
df.describe()

# recheck data types and non-null counts
df.info()        

Insights from EDA

We categorized variables as either categorical or numerical based on their attributes and relevance to the model:

Categorical Variables

  • Fuel_Type: Categories like 'Diesel', 'Petrol', and 'CNG', significant for pricing and performance.
  • Doors: Number of doors, treated as categorical to reflect different car body styles.
  • Automatic: Binary indicator for transmission type.
  • MetallicCol: Binary indicator of metallic paint finish.

Numerical Variables

  • Kilometers: Reflects mileage, affecting the car's value and usage.
  • HorsePower: Engine power, impacting performance.
  • CC: Engine capacity, quantifying engine size.
  • Wt: Car weight, relevant to dynamics and efficiency.
  • SellingPrice: The target variable, indicating the car's selling price.
  • Age: Car age, numerically significant as it impacts value and condition.

Data Visualization

Data visualization played a key role in identifying patterns, relationships, and trends within the dataset. We employed various visualization techniques to provide a comprehensive exploration of the data, aiding in understanding its dynamics, especially in relation to car pricing. These techniques included:

  • Kernel Density Estimate (KDE) Plots: For understanding numerical data distribution.
  • Q-Q Plots: To assess distribution normality.
  • Histograms: Ideal for observing data distribution and shape.
  • Boxplots: For a graphical representation of data through quartiles, highlighting outliers.
  • Scatter Plots: To explore correlations between variables.
  • Heatmaps: For visualizing correlations among multiple variables.
  • Count Plots: To visualize distributions of categorical data.

Detailed Visualization Insights

# Enhance default plot aesthetics with Seaborn
sns.set(style="whitegrid")

# Identify numerical columns
numerical_columns = ['Kilometeres', 'HorsePower', 'CC', 'Wt', 'SellingPrice', 'Age'] 

# Calculate the number of subplot rows needed
n = len(numerical_columns)
n_rows = n // 4 + (n % 4 > 0)

# Create a high-resolution figure
fig, axes = plt.subplots(n_rows, 4, figsize=(20, 5 * n_rows), dpi=120)
fig.tight_layout(pad=5.0)

# Flatten the axes array for easier iteration
axes = axes.flatten()

# Loop through the numerical columns to create KDE plots
for i, column in enumerate(numerical_columns):
    sns.kdeplot(data=df, x=column, ax=axes[i], fill=True)
    axes[i].set_title(f'KDE Plot of {column}', fontsize=10)
    axes[i].tick_params(axis='both', which='major', labelsize=8)

# Hide any unused subplot areas
for j in range(i + 1, len(axes)):
    fig.delaxes(axes[j])

plt.subplots_adjust(top=0.9)  # Adjust the top padding
plt.suptitle('KDE Plots for Numerical Columns', fontsize=14, y=1.02)  # Add a main title and adjust its position
plt.show()
        

Through matplotlib and seaborn, we enhanced visual representations to provide deeper insights into each variable. Our observations from KDE plots revealed:

  1. Kilometers: A right-skewed distribution, showing lower mileage for most cars.
  2. HorsePower: A multi-modal distribution, indicating common horsepower ratings.
  3. CC (Engine Size): Multiple peaks suggesting prevalent engine sizes, skewed towards smaller sizes.
  4. Wt (Weight): Normally distributed around a central value.
  5. SellingPrice: Right-skewed, indicating a concentration of cars in the lower price range.
  6. Age: Right-skewed, suggesting a prevalence of newer cars.

These visual and analytical explorations were essential for advancing to the subsequent stages of model training and deployment, ensuring our model was robust and ready for real-world application.

Data Preprocessing

Normalization

To handle potential scale discrepancies among these numerical features, we use the MinMaxScaler from scikit-learn. This scaler transforms each feature to a range between 0 and 1, maintaining the distribution but aligning the scales. This is crucial as it prevents attributes with larger ranges from dominating those with smaller ranges, which is important for many machine learning algorithms.

numerical_columns = ['Kilometeres', 'HorsePower', 'CC', 'Wt', 'Age'] 
categorical_columns = ['Fuel_Type', 'Doors', 'Automatic', 'MetallicCol']
label_column = 'SellingPrice'

# Normalize numerical columns to scale the data
scaler = MinMaxScaler()
df[numerical_columns] = scaler.fit_transform(df[numerical_columns])        

Label Encoding

For the Doors feature, which is ordinal, we apply label encoding. This approach converts the categorical labels into a single integer column, preserving the order, which is appropriate for ordinal data.

label_encoder = LabelEncoder()
df['Doors'] = label_encoder.fit_transform(df['Doors'])        

One-Hot Encoding

The Fuel_Type feature is treated with one-hot encoding, which is essential for nominal categorical data. This method transforms each categorical value into a new binary column, ensuring that the model interprets these attributes correctly without any implicit ordering.

# OneHotEncoder for 'Fuel_Type'
encoder = OneHotEncoder(drop=None)  # Keep all Fuel_Type categories (no column is dropped)
encoded_features = encoder.fit_transform(df[['Fuel_Type']])        

Feature Transformation

After encoding, we handle the transformation from sparse to dense formats. Many machine learning algorithms require a dense matrix format, so we convert the sparse matrix obtained from one-hot encoding into a dense format. This is performed using the .toarray() method, which is necessary to integrate these features into the main DataFrame seamlessly.

# Convert to dense format if you need a dense matrix instead of a sparse one
encoded_features_dense = encoded_features.toarray()  # Converts the sparse matrix to a dense array

# Get feature names for the new columns
columns = encoder.get_feature_names_out(['Fuel_Type'])

# Create a DataFrame with the encoded features
encoded_df = pd.DataFrame(encoded_features_dense, columns=columns)

# Concatenate with the original DataFrame
df = pd.concat([df.drop(['Fuel_Type'], axis=1), encoded_df], axis=1)

# 'Automatic' and 'MetallicCol' are already binary, no need to encode further.
# However, if you want to ensure they are of type 'category', you can do:
df['Automatic'] = df['Automatic'].astype('category')
df['MetallicCol'] = df['MetallicCol'].astype('category')        

Integration with Original DataFrame

The newly created dense matrix columns are named according to the unique values in Fuel_Type and then concatenated back to the original DataFrame. Columns derived from Fuel_Type are added, and the original Fuel_Type column is dropped to avoid redundancy.

Final Adjustments

For binary categorical features like Automatic and MetallicCol, which are already in a binary format, we explicitly cast them to a 'category' type to ensure consistency in data types across the DataFrame. This step is important for some types of statistical analysis and modeling in Python.

Preparing the Training Data

This code performs the following operations:

  • Splits the data into feature (X) and label (y) arrays.
  • Uses train_test_split twice to create a train set (80% of the data), a validation set (10%), and a test set (10%).
  • Saves the training, validation, and test sets to an .npz file, which can then be loaded for training.

# Identify features and label
X = df.drop(['SellingPrice'], axis=1)  # Features
y = df['SellingPrice']  # Label

# First split into training and temporary sets (temp will become validation and testing)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.2, random_state=42)

# Split the temporary set into validation and test sets
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Now we have X_train, y_train, X_val, y_val, X_test, and y_test

# Save the arrays as .npz file
np.savez('../data/dataset.npz', X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, X_test=X_test, y_test=y_test)

# Confirm the file has been saved
print("Arrays saved as dataset.npz")        

Model Training

The provided linear regression script was improved with code for training and evaluating the model: the model is trained on the training set and evaluated on the validation set.

# Create a linear regression object
model = LinearRegression()

# Train the model using the training set
model.fit(X_train, y_train)        

Model Evaluation

To evaluate the performance of the trained model, the following metrics are calculated using the validation set:

  • Mean Squared Error (MSE): Represents the average of the squares of the errors—i.e., the average squared difference between the estimated values and the actual value.
  • R-Squared (R2): Provides an indication of goodness of fit and therefore a measure of how well unseen samples are likely to be predicted by the model.
  • Mean Absolute Error (MAE): Measures the average magnitude of the errors in a set of predictions, without considering their direction.
  • Mean Absolute Percentage Error (MAPE): Measures the accuracy as a percentage, and is commonly used to forecast error in predictive modeling.
  • Root Mean Squared Error (RMSE): The square root of the mean of the squared errors; RMSE is a good measure of how accurately the model predicts the response.

# Predict on the validation set
y_val_pred = model.predict(X_val)

# Calculate MSE
mse_val = mean_squared_error(y_val, y_val_pred)
# Calculate R-squared
r2_val = r2_score(y_val, y_val_pred)
# Calculate MAE
mae_val = mean_absolute_error(y_val, y_val_pred)
# Calculate MAPE - handle division-by-zero explicitly by dropping infinite ratios
mape_val = (np.abs((y_val - y_val_pred) / y_val).replace([np.inf, -np.inf], np.nan)).mean() * 100
# Calculate RMSE (square root of MSE)
rmse_val = mean_squared_error(y_val, y_val_pred, squared=False)

print(f"Validation MSE: {mse_val}")
print(f"Validation R^2: {r2_val}")
print(f"Validation MAE: {mae_val}")
print(f"Validation MAPE: {mape_val}%")
print(f"Validation RMSE: {rmse_val}")        

Saving the Encoders and Model (Training and Prediction Consistency)

To maintain consistency in data preprocessing between training and prediction phases, it is essential to serialize and save the encoders and model after training. This ensures that the exact preprocessing steps used during training are applied during prediction.

Serialization Process:

  • MinMaxScaler, LabelEncoder, and OneHotEncoder are saved using Python’s pickle module, which serializes Python objects into binary format.
  • The linear regression model is also serialized post-training.

import pickle

# Save the scaler
with open('../model/scaler.pkl', 'wb') as f:
    pickle.dump(scaler, f)

# Save the label encoder
with open('../model/label_encoder.pkl', 'wb') as f:
    pickle.dump(label_encoder, f)

# Save the OneHot encoder
with open('../model/onehot_encoder.pkl', 'wb') as f:
    pickle.dump(encoder, f)

# Save the trained linear regression model
with open('../model/finalized_linear_model.pkl', 'wb') as f:
    pickle.dump(model, f)

Loading and Using Encoders for New Data

When making predictions with new data, the saved encoders and model are loaded back into the environment. This guarantees that the new data undergoes identical transformations as the training data, providing accurate and consistent predictions.

# Load the scaler
with open('../model/scaler.pkl', 'rb') as f:
    loaded_scaler = pickle.load(f)

# Load the label encoder
with open('../model/label_encoder.pkl', 'rb') as f:
    loaded_label_encoder = pickle.load(f)

# Load the OneHot encoder
with open('../model/onehot_encoder.pkl', 'rb') as f:
    loaded_onehot_encoder = pickle.load(f)
        

Uploading Serialized Files to AWS S3

For the AWS Lambda function to access the model and encoders, they are uploaded to an AWS S3 bucket. This provides a scalable and secure storage solution accessible by the Lambda function.

Upload Commands:

# Upload the serialized model and encoders to S3
aws s3 cp ../model/scaler.pkl s3://car-price-pred-mlops/scaler.pkl
aws s3 cp ../model/label_encoder.pkl s3://car-price-pred-mlops/label_encoder.pkl
aws s3 cp ../model/onehot_encoder.pkl s3://car-price-pred-mlops/onehot_encoder.pkl
aws s3 cp ../model/finalized_linear_model.pkl s3://car-price-pred-mlops/finalized_linear_model.pkl        

Confirm Upload:

# List files in the S3 bucket to confirm upload
aws s3 ls s3://car-price-pred-mlops        

By following these steps, the model and encoders are effectively serialized, stored, and made ready for deployment. The AWS Lambda function can retrieve these files from S3, ensuring that the model predictions are based on the same preprocessing logic as was used during model training.

Step 2: Deployment on AWS Lambda

This section details the deployment process of an AWS Lambda function designed to predict car prices using a trained model stored on AWS S3. The function processes input data in JSON format, applies necessary preprocessing, and outputs the predicted selling price.

Architecture Overview

  • AWS Lambda: Hosts the Python-based prediction function.
  • Amazon S3: Stores serialized machine learning models and preprocessors.
  • Amazon ECR (Elastic Container Registry): Stores Docker images configured to run the Lambda function.
  • AWS IAM: Manages permissions for Lambda function to access AWS resources.

Deployment Steps

AWS Lambda and Deployment Constraints

AWS Lambda functions are powerful tools for running serverless applications, which means they execute code in response to events and automatically manage the computing resources required. However, when deploying Lambda functions, developers must consider certain constraints, particularly related to the deployment package size. AWS Lambda limits the uncompressed package size to 250 MB across all the function's layers. This includes all the code and its dependencies.

For Python-based Lambda functions, which often require numerous libraries (especially data science projects using libraries like pandas and sklearn), this size limitation can be quickly reached. When the required libraries exceed this size limit, developers must find alternative ways to deploy their applications.

Using Docker Containers for Lambda

One effective solution to the package size limitation is the use of Docker containers. AWS Lambda supports container images as a way to package and deploy functions. By using Docker, you can create a container image that includes the Lambda function and all its dependencies, regardless of size, as long as the total container image size does not exceed 10 GB.

This approach is beneficial for machine learning models that rely on heavy libraries for data processing and model inference. Docker not only helps circumvent the size limitations but also provides a consistent environment from development to production, reducing the chances of encountering "works on my machine" issues.

Python Script Development for Lambda Function

The provided Python script is a Lambda function designed to serve as an endpoint for a machine learning model that predicts car prices. Here's a breakdown of how the script works:

Imports and Dependencies: The script begins by importing necessary Python modules including json for handling JSON data, boto3 for AWS services interaction, pickle for object serialization, and pandas along with sklearn preprocessing tools.

Helper Function - load_from_s3: A helper function is defined to facilitate the loading of serialized objects (like ML models and encoders) from an S3 bucket. This function uses boto3 to access an object in S3, reads it into bytes, and deserializes it using pickle.

Lambda Handler Function: lambda_handler is the main function that AWS Lambda calls when the function is invoked. This function performs several key tasks:

  • Loading Model and Encoders: It loads a pre-trained machine learning model and data preprocessing encoders from an S3 bucket.
  • Data Preprocessing: Incoming JSON data, which represents new car attributes, is converted into a pandas DataFrame. Numerical data is scaled, and categorical data is transformed using label encoding and one-hot encoding.
  • Model Prediction: After preprocessing, the function uses the loaded model to predict the car price and constructs a response containing the predicted price.

JSON Response: Finally, the function wraps the prediction in a JSON structure and returns it, making it suitable for integration with web applications or other services that might consume this endpoint.

import json
import boto3
import pickle
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, LabelEncoder, OneHotEncoder

def load_from_s3(bucket, object_key):
    """
    Loads a serialized object from an Amazon S3 bucket.

    Parameters:
        bucket (str): The name of the S3 bucket.
        object_key (str): The key of the object within the S3 bucket.

    Returns:
        object: The deserialized object from S3.
    """
    s3_client = boto3.client('s3')
    response = s3_client.get_object(Bucket=bucket, Key=object_key)
    object_bytes = response['Body'].read()
    return pickle.loads(object_bytes)

def lambda_handler(event, context):
    """
    Handles incoming requests to the Lambda function.

    Parameters:
        event (dict): Contains all the information about the incoming request.
        context (LambdaContext): Provides runtime information to your handler.

    Returns:
        dict: The response object with statusCode and the body.
    """
    # Define the bucket where models and encoders are stored.
    BUCKET = 'car-price-pred-mlops'

    # Load the machine learning model and preprocessing encoders from S3
    model = load_from_s3(BUCKET, 'finalized_linear_model.pkl')
    scaler = load_from_s3(BUCKET, 'scaler.pkl')
    label_encoder = load_from_s3(BUCKET, 'label_encoder.pkl')
    onehot_encoder = load_from_s3(BUCKET, 'onehot_encoder.pkl')

    # Decode the incoming JSON payload
    car_data = json.loads(event['body'])
    df = pd.DataFrame([car_data])

    # Log initial data
    print("Initial DataFrame:", df)

    # Define the columns that need preprocessing
    numerical_columns = ['Kilometeres', 'HorsePower', 'CC', 'Wt', 'Age']
    categorical_columns = ['Fuel_Type', 'Doors', 'Automatic', 'MetallicCol']

    # Apply normalization to numerical columns using MinMaxScaler
    df[numerical_columns] = scaler.transform(df[numerical_columns])

    # Encode 'Doors' column using LabelEncoder
    df['Doors'] = label_encoder.transform(df['Doors'])

    # Apply OneHotEncoder to the 'Fuel_Type' column and integrate results into the DataFrame
    encoded_features = onehot_encoder.transform(df[['Fuel_Type']])
    encoded_features_df = pd.DataFrame(encoded_features.toarray(), 
                                       columns=onehot_encoder.get_feature_names_out(['Fuel_Type']))
    df = pd.concat([df.drop('Fuel_Type', axis=1), encoded_features_df], axis=1)

    # Ensure binary columns are treated as categorical types
    df['Automatic'] = df['Automatic'].astype('category')
    df['MetallicCol'] = df['MetallicCol'].astype('category')

    # Log processed data
    print("Processed DataFrame:", df)

    # Perform model prediction
    X = df.drop('SellingPrice', axis=1, errors='ignore')  # Exclude the target variable if present
    prediction = model.predict(X)

    # Prepare the JSON response containing the prediction
    response = {
        'statusCode': 200,
        'body': json.dumps({'predicted_price': prediction.tolist()})
    }

    return response        

1. Prepare Docker Environment

  • Create Dockerfile:

# Use the AWS provided base image for Python 3.8
FROM public.ecr.aws/lambda/python:3.8

# Copy function code and any additional files
COPY . ${LAMBDA_TASK_ROOT}

# Install OS packages if necessary
RUN yum install -y gcc-c++

# Install Python dependencies
RUN pip install --no-cache-dir boto3 pandas scikit-learn

# Set the CMD to your handler (this could be the file name and the function handler)
CMD ["model_endpoint_lambda_function.lambda_handler"]        

The Dockerfile above is designed for deploying a Python-based AWS Lambda function in a Docker container. Here's a brief explanation of each step:

  1. Base Image: FROM public.ecr.aws/lambda/python:3.8 This line specifies the base image to use, which is an AWS-provided image optimized for Python 3.8 Lambda functions. It includes the necessary environment to run a Lambda function.
  2. Copying Code: COPY . ${LAMBDA_TASK_ROOT} This command copies all the files in the current directory (where the Dockerfile is located) into the container's Lambda task root directory. The ${LAMBDA_TASK_ROOT} variable refers to the default directory where the Lambda function code is executed.
  3. Installing OS Packages: RUN yum install -y gcc-c++ This installs necessary operating system packages using Yum, the package manager for Amazon Linux. Here, gcc-c++ is installed, which might be required for compiling Python packages that have C++ extensions.
  4. Installing Python Dependencies: RUN pip install --no-cache-dir boto3 pandas scikit-learn Installs the required Python libraries (boto3, pandas, scikit-learn) using pip. The --no-cache-dir option reduces the size of the image by not storing pip's download cache.
  5. Setting the Command: CMD ["model_endpoint_lambda_function.lambda_handler"] Sets the default command to execute when the container starts, which is invoking the lambda_handler function from the model_endpoint_lambda_function Python file. This is the entry point of your Lambda function.

  • Authenticate Docker to AWS ECR:

aws ecr get-login-password --region ap-south-1 | docker login --username AWS --password-stdin 637423276370.dkr.ecr.ap-south-1.amazonaws.com        

  • Create a Repository in AWS ECR:

aws ecr create-repository --repository-name lambda-function-repo --region ap-south-1

  • Build and Tag the Docker Image:

docker build -t lambda-function-image .
docker tag lambda-function-image:latest 637423276370.dkr.ecr.ap-south-1.amazonaws.com/lambda-function-repo:latest

  • Push the Docker Image to ECR:

docker push 637423276370.dkr.ecr.ap-south-1.amazonaws.com/lambda-function-repo:latest        

2. Deploy Lambda Function

  • Create Lambda Function:

aws lambda create-function --function-name model-endpoint-v2 \
      --package-type Image \
      --code ImageUri=637423276370.dkr.ecr.ap-south-1.amazonaws.com/lambda-function-repo:latest \
      --role arn:aws:iam::637423276370:role/model-endpoint-lambda \
      --region ap-south-1 \
      --architectures arm64 \
      --timeout 120 \
      --memory-size 1024        

3. Test Lambda Function

  • Invoke the Lambda Function with Sample Data:

aws lambda invoke \
      --function-name model-endpoint-v2 \
      --payload '{"body": "{\"Kilometeres\": 45000, \"Doors\": 2, \"Automatic\": 0, \"HorsePower\": 110, \"MetallicCol\": 1, \"CC\": 1500, \"Wt\": 950, \"Age\": 2, \"Fuel_Type\": \"Diesel\"}"}' \
      response.json        
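The same smoke test can be run from Python with boto3, which sidesteps any CLI payload-encoding quirks (a sketch, assuming local AWS credentials with lambda:InvokeFunction permission on model-endpoint-v2):

import json
import boto3

lambda_client = boto3.client('lambda', region_name='ap-south-1')

# Build the same proxy-style event that API Gateway will send to the function
event = {"body": json.dumps({
    "Kilometeres": 45000, "Doors": 2, "Automatic": 0, "HorsePower": 110,
    "MetallicCol": 1, "CC": 1500, "Wt": 950, "Age": 2, "Fuel_Type": "Diesel"
})}

response = lambda_client.invoke(
    FunctionName='model-endpoint-v2',
    Payload=json.dumps(event),
)
print(json.loads(response['Payload'].read()))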

4. Configure Concurrency for Scalability

  • Set Reserved Concurrency:

aws lambda put-function-concurrency --function-name model-endpoint-v2 --reserved-concurrent-executions 100        

This setup ensures that the Lambda function can handle concurrent requests efficiently, maintaining performance during peak times.

Function Logic and Operations

  • Read Model and Encoders: The Lambda function begins by loading the serialized model and preprocessors from an S3 bucket.
  • Data Preprocessing: It then preprocesses incoming JSON data using the same methods (scaling and encoding) used during model training.
  • Prediction: The model makes predictions based on the preprocessed data.
  • Response: The function packages the predicted selling price into a JSON response.

Testing the Function Locally

  • A Python script simulates the environment and tests the Lambda function locally, ensuring the function operates as expected before deployment.
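A minimal harness along these lines can exercise the handler before any image is pushed (a sketch, assuming model_endpoint_lambda_function.py is importable from the current directory and the shell's AWS credentials can read the S3 bucket; the repository's own test script may differ):

import json
from model_endpoint_lambda_function import lambda_handler

# Mimic the API Gateway proxy event the deployed function will receive
event = {"body": json.dumps({
    "Kilometeres": 45000, "Doors": 2, "Automatic": 0, "HorsePower": 110,
    "MetallicCol": 1, "CC": 1500, "Wt": 950, "Age": 2, "Fuel_Type": "Diesel"
})}

# The handler shown above does not use the context object, so None is sufficient here
result = lambda_handler(event, context=None)
print(result['statusCode'], result['body'])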

Step 3: Monitoring and Observability

To effectively monitor and observe the AWS Lambda function's performance and behavior, it is crucial to integrate it with AWS CloudWatch for metrics, logs, and alerts. This setup provides visibility into the function's operation, helps identify performance bottlenecks, and alerts you to potential issues.

1. Enable CloudWatch Logs for Lambda Function

AWS Lambda automatically monitors functions, reporting metrics through Amazon CloudWatch. We just have to ensure logging is enabled in the Lambda function’s IAM role. This role needs permission to write logs to CloudWatch. The necessary policy (AWSLambdaBasicExecutionRole) includes permissions for logs creation.

  • The print statements in the Lambda Python function send their output to CloudWatch under the /aws/lambda/model-endpoint-v2 log group (see the sketch below for retrieving them).
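Those log lines can be pulled back programmatically for a quick check (a sketch, assuming credentials with CloudWatch Logs read access):

import boto3

logs = boto3.client('logs', region_name='ap-south-1')

# Fetch the most recent events from the function's log group
events = logs.filter_log_events(
    logGroupName='/aws/lambda/model-endpoint-v2',
    limit=20,
)
for e in events['events']:
    print(e['timestamp'], e['message'].rstrip())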

2. Monitor Execution Time and Invocation Frequency

  • CloudWatch Metrics: AWS Lambda automatically sends these metrics to CloudWatch (a sketch for querying them follows this list):
      ◦ Duration: Measures the elapsed runtime of your Lambda function in milliseconds.
      ◦ Invocations: Counts each time a function is invoked in response to an event or invocation API call.
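These built-in metrics can be queried with boto3, for example to see the average duration over the last day (a sketch under the same credential assumptions as above):

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client('cloudwatch', region_name='ap-south-1')

# Average Duration (ms) for the function over the last 24 hours, in 1-hour buckets
stats = cloudwatch.get_metric_statistics(
    Namespace='AWS/Lambda',
    MetricName='Duration',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'model-endpoint-v2'}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
    Period=3600,
    Statistics=['Average'],
)
for point in sorted(stats['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], round(point['Average'], 1), 'ms')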

3. Monitor Model Inference Errors

  • Custom Metrics: If your model throws specific errors (e.g., inference errors), you might want to log these explicitly and create custom CloudWatch metrics using these logs.
  • Implement Error Handling in Lambda Code:

import logging
import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)
cloudwatch = boto3.client('cloudwatch')

def lambda_handler(event, context):
    try:
        # Your model inference code goes here
        ...
    except Exception as e:
        logger.error("Model inference failed: %s", str(e))
        cloudwatch.put_metric_data(
            MetricData=[
                {
                    'MetricName': 'ModelInferenceErrors',
                    'Dimensions': [
                        {'Name': 'FunctionName', 'Value': context.function_name}
                    ],
                    'Unit': 'Count',
                    'Value': 1
                },
            ],
            Namespace='MyApp/Lambda'
        )
        raise

4. Set Up CloudWatch Alerts

  • Create CloudWatch Alarms: Use these to get notified about issues such as high latency or rising error rates (a programmatic sketch follows below).
      ◦ Go to the CloudWatch console → Alarms → Create alarm.
      ◦ Select the metric (e.g., Duration, Errors), specify the threshold (e.g., Duration > 3000 ms), and set the period over which it is measured.
      ◦ Configure actions to notify you via SNS (Simple Notification Service) when the alarm state is triggered.
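The same alarm can also be created programmatically; here is a sketch of a Duration alarm with boto3 (the SNS topic ARN is a placeholder you would replace with your own):

import boto3

cloudwatch = boto3.client('cloudwatch', region_name='ap-south-1')

# Alarm when average duration exceeds 3000 ms for two consecutive 5-minute periods
cloudwatch.put_metric_alarm(
    AlarmName='model-endpoint-v2-high-duration',
    Namespace='AWS/Lambda',
    MetricName='Duration',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'model-endpoint-v2'}],
    Statistic='Average',
    Period=300,
    EvaluationPeriods=2,
    Threshold=3000,
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:ap-south-1:123456789012:my-alerts-topic'],  # placeholder topic ARN
)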

Step 4: API Gateway and Security

In this step, I set up AWS API Gateway to accept JSON data, pass it to my AWS Lambda function for predictions, return the prediction results as JSON in the API response, and secure the endpoint using API keys. Below, I document the manual configuration step by step.

Step 1: Create API Gateway

  1. Log into the AWS Management Console and navigate to the API Gateway service.
  2. Create a New API:
      ◦ Choose REST API and click on Build.
      ◦ Select New API, provide a name (e.g., "CarPricePredictionAPI"), and set the endpoint type to Regional.
      ◦ Click on Create API.

Step 2: Create Resource and Method

  1. Create a Resource:
      ◦ Under the newly created API, select Create Resource.
      ◦ Enter a resource name, predict, and check Enable API Gateway CORS if necessary.
      ◦ Click on Create Resource.
  2. Create a POST Method:
      ◦ Select the new resource and click on Create Method.
      ◦ Select the method type POST.
      ◦ For Integration type, select Lambda Function and enable Use Lambda Proxy integration.
      ◦ Select your Lambda function, model-endpoint-v2.
      ◦ Click on Create Method.

Step 3: Define and Enable Request Validation

Create a Model for Input Validation:

  • Under Models, click Create Model.
  • Name the model CarRequestModel, set Content Type to application/json, and define the schema based on your JSON structure as follows:

{
    "$schema": "https://json-schema.org/draft-04/schema#",
    "title": "Car Input",
    "type": "object",
    "properties": {
      "body": {
        "type": "string"
      }
    },
    "required": ["body"]
  }        

  • Click Create.

Assign the Model to the POST Method:

  • Go to your POST method and select Edit on Method Request.
  • Under Method request settings > Request validator, select Validate body.
  • Under Request body, click Add model, set Content Type to application/json, and select the created model, CarRequestModel.
  • Click on Save.

Step 4: Deploy API and Configure Stage

  1. Deploy the API: Click on Deploy API.
  2. Select New Stage, give it the name prod, and click on Deploy.
  3. Note the Invoke URL provided after deployment for later use.

Step 5: Secure API with API Keys

  1. Create an API Key: Go to API Keys in the left navigation menu, click on Create API Key, name the key, choose Auto generate, and save it. Note down the API key value for client use.
  2. Require API Keys for the POST Method: Go to your POST method, select Edit on Method Request, set API Key Required to true under Method request settings, and click on Save.
  3. Create a Usage Plan and Associate the API Key: Go to Usage Plans in the left navigation menu and click on Create usage plan. Name the plan and set throttling and quota as needed (e.g., a rate of 20 requests per second, a burst of 10 requests, and a quota of 100 requests per month). Click Create usage plan, then open the created plan and associate your API stage via Add stage.

  • Go to the API Keys tab in the plan, click on Add API Key, and select your created key to associate it with the usage plan (a boto3 sketch of the same flow follows).
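For teams that prefer scripting over the console, the key and usage-plan wiring can also be done with boto3 (a sketch; the key and plan names are illustrative, and the REST API ID is taken from the invoke URL below):

import boto3

apigw = boto3.client('apigateway', region_name='ap-south-1')

# Create an API key with an auto-generated value
key = apigw.create_api_key(name='car-price-client-key', enabled=True)

# Create a usage plan bound to the deployed prod stage
plan = apigw.create_usage_plan(
    name='car-price-basic-plan',
    throttle={'rateLimit': 20.0, 'burstLimit': 10},
    quota={'limit': 100, 'period': 'MONTH'},
    apiStages=[{'apiId': 'bnar8ox2ge', 'stage': 'prod'}],  # REST API ID from the invoke URL
)

# Attach the key to the usage plan
apigw.create_usage_plan_key(
    usagePlanId=plan['id'],
    keyId=key['id'],
    keyType='API_KEY',
)
print('API key value:', key['value'])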

Step 6: Test Your API

  • Using cURL:

curl -X POST https://bnar8ox2ge.execute-api.ap-south-1.amazonaws.com/prod/predict \
  -H "Content-Type: application/json" \
  -H "x-api-key: bFZ7JTbTUMRgvIAY4BC45Dxb6wo61TD3sgIP5670" \
  -d '{
    "body": "{\"Kilometeres\": 323002, \"Doors\": 4, \"Automatic\": 1, \"HorsePower\": 110, \"MetallicCol\": 1, \"CC\": 1500, \"Wt\": 950, \"Age\": 2, \"Fuel_Type\": \"Diesel\"}"
  }'        
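The same call can be made from Python with the requests library (a sketch; substitute your own invoke URL and API key):

import json
import requests

url = "https://bnar8ox2ge.execute-api.ap-south-1.amazonaws.com/prod/predict"
headers = {
    "Content-Type": "application/json",
    "x-api-key": "<your-api-key>",
}
# The request model expects a 'body' field holding the car attributes as a JSON string
payload = {"body": json.dumps({
    "Kilometeres": 323002, "Doors": 4, "Automatic": 1, "HorsePower": 110,
    "MetallicCol": 1, "CC": 1500, "Wt": 950, "Age": 2, "Fuel_Type": "Diesel"
})}

response = requests.post(url, headers=headers, json=payload, timeout=30)
print(response.status_code, response.json())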

Step 7: Monitor and Maintain

  • Use CloudWatch for monitoring and logging API calls.
  • Regularly update and review API security settings and usage plans.

Conclusion

This project not only underscores the technical feasibility of integrating machine learning with cloud-based serverless computing but also highlights the strategic economic benefits of this approach. Businesses looking to implement similar technologies can draw from the insights provided in this case study to optimize their operations in cost, scale, and efficiency.

By leveraging AWS's serverless technologies, companies can turn the challenge of digital transformation into a strategic advantage, ensuring they stay agile and responsive in a rapidly evolving market landscape.
