
The Cheapest Way to Deploy an AI Model on AWS?

Georges Awono

Cloud Architect for Data & AI Platforms - Transforming Business Goals to Technical Strategies

Published: December 4, 2024

Imagine you run a small e-commerce company and want to integrate image recognition into your workflow. For example, you might want to analyze product images uploaded by customers to categorize them or detect defects in photos. The challenge? You need this AI model to be accessible, efficient, and, most importantly, cheap, because you have a very low budget!

Today we are going to explore an architecture that lets you deploy your AI model for less than $2 per month!

Without further ado, here is our final architecture:

Amazon S3: To store your trained AI model as a physical file.
AWS Lambda: To load the model, feed it input data, and return predictions without needing a dedicated server.
Amazon API Gateway: To create a REST API that allows users to interact with the model and get predictions.

Cheap Serverless Architecture of AI Model on AWS

Let’s dive deeper into the architecture and its components.

From Notebook to S3: Prepare and Serialize (Save) Your AI Model

The story starts with your data scientist developing the AI model in a Jupyter notebook. Once the data scientist has validated the model, you want it live, ready to make predictions on the photos your users upload.

The first step is to serialize the model, in other words to save it to a file.

Model serialization is the process of converting a trained machine learning model into a format that can be saved as a physical file. This step is crucial for deploying your model because it allows you to store the model’s structure, parameters, and weights in a way that other applications can load and use later.

Common serialization formats include pickle and joblib for Python-based models. These formats capture what is needed to reconstruct the model, provided the same libraries are installed where it is loaded, which makes it easy to transfer and deploy in production environments. For example, after training an image recognition model in a Jupyter notebook, you would serialize it into a file and upload it to a storage service like Amazon S3, where it becomes accessible for inference tasks. Serialization ensures that your model is portable, reusable, and ready to be integrated into deployment workflows.

Upload to S3: Once serialized, upload the model file to an Amazon S3 bucket. This serves as your central storage, where it’s accessible to the Lambda function when needed.
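The serialize-then-upload step can be sketched as follows. This is a minimal illustration, not the author's actual code: the "model" is a plain dictionary of parameters standing in for a real trained model, and the bucket and key names you would pass are hypothetical. The `s3_client` argument is assumed to be a boto3 S3 client, whose `upload_file(Filename, Bucket, Key)` method performs the upload.

```python
import pickle
from pathlib import Path

# Hypothetical stand-in for a trained model: a dictionary of parameters.
# A real model (e.g. a scikit-learn estimator) is pickled the same way, but
# the library that defines it must be installed wherever it is loaded.
TRAINED_MODEL = {"threshold": 0.5, "labels": ["dark", "bright"]}


def serialize_model(model: object, path: Path) -> Path:
    """Write the model to disk with pickle and return the file path."""
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def deserialize_model(path: Path) -> object:
    """Load a pickled model. Only unpickle files you trust."""
    with path.open("rb") as f:
        return pickle.load(f)


def upload_model(s3_client, path: Path, bucket: str, key: str) -> None:
    """Upload the serialized file to S3 (s3_client: a boto3 S3 client)."""
    s3_client.upload_file(str(path), bucket, key)
```

With boto3 installed and AWS credentials configured, a call such as `upload_model(boto3.client("s3"), serialize_model(TRAINED_MODEL, Path("model.pkl")), "my-model-bucket", "models/model.pkl")` would place the file in the (hypothetical) bucket.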

Amazon API Gateway: Create RESTful APIs easily

API Gateway is an AWS service that allows us to build APIs very easily. It exposes a RESTful API endpoint where our users can send requests (like uploading an image) and receive predictions.

API Gateway creates RESTful APIs that:

Are HTTP-based
Enable stateless client-server communication
Implement standard HTTP methods such as GET, POST, PUT, PATCH and DELETE.

API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls. These tasks include traffic management, authorization and access control, monitoring, and API version management.

What I like most about API Gateway is its native Lambda integration. In a few clicks, you can create an API, create a route, and assign a Lambda function to handle the request. For example, here I have created an API named “mytestapi”, integrated with a Lambda function named “MyLambdaToCallAIModel”. Each time a user sends a POST request to my API at the “/predict” path, my Lambda function is called and receives the payload of the user’s POST request (here, an image) as input.
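From the client side, calling the “/predict” route could look like the sketch below. The payload format is an assumption made for illustration (a JSON body carrying the image as base64 under a hypothetical `image_b64` field); the article does not specify how the image is encoded, and the base URL would be whatever invoke URL API Gateway assigns to your stage.

```python
import base64
import json
import urllib.request


def build_predict_request(api_base_url: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request to the /predict route with a base64-encoded image."""
    # Assumed payload shape: {"image_b64": "<base64 string>"}
    payload = json.dumps(
        {"image_b64": base64.b64encode(image_bytes).decode("ascii")}
    )
    return urllib.request.Request(
        url=api_base_url.rstrip("/") + "/predict",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.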

API Gateway lets you add caching to your APIs by provisioning a cache and setting its size in gigabytes. This means responses from your endpoints can be cached, reducing the number of calls to your backend and improving the speed of API requests.

You can also use API Gateway to implement throttling. Throttling lets you limit how many requests your API accepts, expressed as a steady-state rate (requests per second) and a burst capacity. If a limit is exceeded, users receive a “429 Too Many Requests” response. This helps protect your backend from being overwhelmed by high traffic.
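To make the rate and burst limits concrete, here is a small token-bucket sketch, the general technique that AWS documents as the basis of API Gateway throttling. It is an illustration of the idea only, not API Gateway's implementation; the class and function names are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request passes."""
        now = self.clock()
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def status_for(bucket: TokenBucket) -> int:
    """HTTP status a throttled endpoint would return for the next request."""
    return 200 if bucket.allow() else 429
```

With a rate of 1 request per second and a burst of 2, two back-to-back requests pass, a third is rejected with 429, and one more passes after a second has elapsed.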

AWS Lambda: Serverless Model Execution

AWS Lambda allows you to run your AI model without provisioning or managing servers. Here’s how it works in this architecture:

Fetch the Model: When triggered, the Lambda function retrieves the model file from S3.
Run Inference: The function loads the model into memory and processes the input data (e.g., an image) to generate predictions.
Output Results: Lambda returns the prediction results to the API Gateway.
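The three steps above can be sketched as a Lambda handler. This is a minimal illustration under several assumptions: the model is the pickled parameter dictionary used as a stand-in elsewhere, the request body is JSON with a hypothetical `image_b64` field, the environment variable names `MODEL_BUCKET` and `MODEL_KEY` are invented for the example, and `predict` is a toy placeholder for real inference. The event and response shapes follow the API Gateway Lambda proxy integration format.

```python
import base64
import json
import pickle

_MODEL = None  # cached so warm invocations skip the S3 download


def load_model(s3_client, bucket: str, key: str):
    """Fetch the pickled model from S3 once, then reuse it from memory."""
    global _MODEL
    if _MODEL is None:
        data = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        _MODEL = pickle.loads(data)  # only unpickle objects you trust
    return _MODEL


def predict(model: dict, image_bytes: bytes) -> str:
    """Toy inference: mean byte value as a brightness proxy (placeholder)."""
    brightness = sum(image_bytes) / (255 * max(len(image_bytes), 1))
    return "bright" if brightness >= model["threshold"] else "dark"


def handle(event: dict, s3_client, bucket: str, key: str) -> dict:
    """Parse the request, run inference, and build a proxy-style response."""
    try:
        body = event.get("body") or ""
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        image_bytes = base64.b64decode(json.loads(body)["image_b64"])
    except (KeyError, TypeError, ValueError):
        return {"statusCode": 400,
                "body": json.dumps({"error": "expected JSON with image_b64"})}
    label = predict(load_model(s3_client, bucket, key), image_bytes)
    return {"statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"prediction": label})}


def lambda_handler(event, context):
    """Entry point configured in Lambda; wires in the real S3 client."""
    import os
    import boto3  # bundled with the Lambda Python runtime
    return handle(event, boto3.client("s3"),
                  os.environ["MODEL_BUCKET"], os.environ["MODEL_KEY"])
```

Keeping the model in a module-level variable means only the first request served by a fresh execution environment pays for the S3 download, which shortens the billed time of later requests.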

The key benefits of using AWS Lambda are its pay-as-you-go pricing model (you are charged only for the compute time you use) and the absence of infrastructure to manage, since Lambda scales up and down with demand.

Final Architecture

Here is a final view of our AI model deployment architecture:

Storage: Amazon S3 ensures your model is stored securely at a low cost, with no unnecessary overhead.
Compute: AWS Lambda only charges for the exact time your model runs, eliminating idle costs.
User Interaction: API Gateway provides an affordable way to expose your model to users, charging only for API calls made.

Together, these services allow you to deploy an AI model with minimal upfront cost and a pay-as-you-go model that scales with your business needs.
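To see how a pay-as-you-go bill adds up, the sketch below estimates a monthly cost from request volume. The unit prices are assumptions used only for illustration (they resemble published on-demand list prices for one region, ignore the free tier, and may be out of date); check the current AWS pricing pages before relying on any figure. The workload numbers in the usage note are hypothetical too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Assumed unit prices in USD; replace with current values for your region."""
    lambda_per_gb_second: float = 0.0000166667
    lambda_per_million_requests: float = 0.20
    api_gateway_per_million_requests: float = 3.50
    s3_per_gb_month: float = 0.023


def monthly_cost(requests: int, seconds_per_request: float, memory_gb: float,
                 model_size_gb: float, prices: Prices = Prices()) -> float:
    """Sum Lambda compute, Lambda requests, API Gateway calls, and S3 storage."""
    compute = requests * seconds_per_request * memory_gb * prices.lambda_per_gb_second
    lambda_requests = requests / 1_000_000 * prices.lambda_per_million_requests
    api_calls = requests / 1_000_000 * prices.api_gateway_per_million_requests
    storage = model_size_gb * prices.s3_per_gb_month
    return compute + lambda_requests + api_calls + storage
```

For example, `monthly_cost(requests=10_000, seconds_per_request=1.0, memory_gb=1.0, model_size_gb=0.1)` models 10,000 predictions a month at one second each with 1 GB of memory and a 100 MB model file; under the assumed prices, Lambda compute time is the largest term.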

This architecture is perfect for businesses looking to integrate machine learning into their workflows without breaking the bank. Whether you’re building an image recognition system or any other AI-powered service, this approach ensures you get started quickly, efficiently, and affordably.

#Cloud #AI #AWS #APIGateway #S3 #Lambda
