The Cheapest Way to Deploy an AI Model on AWS


Imagine you run a small e-commerce company and want to integrate image recognition into your workflow. For example, you might want to analyze product images uploaded by customers to categorize them or detect defects in photos. The challenge? The AI model needs to be accessible and efficient and, most importantly, it has to fit a very low budget!

Today we are going to explore an architecture that lets you deploy your AI model for less than $2 per month!

Without further ado, here is our final architecture:

  • Amazon S3: To store your trained AI model as a physical file.
  • AWS Lambda: To load the model, feed it input data, and return predictions without needing a dedicated server.
  • Amazon API Gateway: To create a REST API that allows users to interact with the model and get predictions.


[Figure: Cheap serverless architecture of an AI model on AWS]

Let’s dive deeper into the architecture and its components.

From Notebook to S3: Prepare and Serialize (Save) Your AI Model

The story starts with your data scientist developing the AI model in a Jupyter notebook. Once the data scientist has validated the model, you want it live, ready to make predictions on the photos your users upload.

The first step is to serialize the model, in other words to save it as a file.

Model serialization is the process of converting a trained machine learning model into a format that can be saved as a physical file. This step is crucial for deploying your model because it allows you to store the model’s structure, parameters, and weights in a way that other applications can load and use later.

Common serialization formats include pickle and joblib for Python-based models. These formats capture what is needed to reconstruct the model, provided the same libraries are installed where the model is loaded, which makes it easy to transfer and deploy in production environments. For example, after training an image recognition model in a Jupyter notebook, you would serialize it into a file and upload it to a storage service like Amazon S3, where it becomes accessible for inference tasks. Serialization ensures that your model is portable, reusable, and ready to be integrated into deployment workflows.

Upload to S3: Once the model is serialized, upload the file to an Amazon S3 bucket. The bucket serves as your central storage, which the Lambda function can read from when needed.
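To make the two steps concrete, here is a minimal sketch that serializes a model with pickle and uploads the file with boto3. The function names, the bucket name and the object key are hypothetical, and the upload assumes boto3 is installed and AWS credentials are configured.

```python
import pickle
from pathlib import Path


def serialize_model(model, path):
    """Write the trained model object to disk with pickle and return the path."""
    path = Path(path)
    with path.open("wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def upload_model(path, bucket, key):
    """Upload the serialized file to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so serialization works without the AWS SDK

    boto3.client("s3").upload_file(str(path), bucket, key)
```

In a notebook you would call something like `serialize_model(model, "model.pkl")` followed by `upload_model("model.pkl", "my-model-bucket", "models/model.pkl")`, replacing the bucket and key with your own. For scikit-learn models, `joblib.dump` can be used in place of `pickle.dump`.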

Amazon API Gateway: Create RESTful APIs easily

API Gateway is an AWS service that allows us to build APIs very easily. It exposes a RESTful API endpoint where our users can send requests (like uploading an image) and receive predictions.

API Gateway creates RESTful APIs that:

  • Are HTTP-based
  • Enable stateless client-server communication
  • Implement standard HTTP methods such as GET, POST, PUT, PATCH and DELETE.
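From the client's side, calling such an API is a plain HTTP request. The sketch below builds a POST request carrying an image, using only the Python standard library. The URL, the `/predict` path and the `image/jpeg` content type are assumptions for illustration; your endpoint will differ.

```python
import urllib.request


def build_predict_request(url, image_bytes):
    """Build a POST request that sends raw image bytes to the prediction endpoint."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg"},  # assumed content type
        method="POST",
    )


# Hypothetical endpoint; API Gateway assigns the real hostname when you deploy.
request = build_predict_request(
    "https://example.execute-api.us-east-1.amazonaws.com/predict",
    b"\xff\xd8\xff",  # stand-in for the bytes of a JPEG file
)
# Sending it would look like: urllib.request.urlopen(request).read()
```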

API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls. These tasks include traffic management, authorization and access control, monitoring, and API version management.

What I like the most about API Gateway is its native Lambda integration. In a few clicks, you can create an API, create a route and assign a Lambda function to handle the request. For example, here, I have created an API named “mytestapi”, integrated with a Lambda function named “MyLambdaToCallAIModel”. Each time a user sends a POST request to the “/predict” path of my API, my Lambda function is called and receives the payload of the request (here an image) as input.
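On the Lambda side, the request arrives as an event dictionary. The sketch below assumes the Lambda proxy integration, in which the payload is in `event["body"]` and binary content is base64-encoded with `isBase64Encoded` set to true. The helper name and the response fields are illustrative, and the model call is left as a placeholder.

```python
import base64
import json


def extract_image_bytes(event):
    """Return the raw request body from a Lambda proxy integration event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")


def lambda_handler(event, context):
    image_bytes = extract_image_bytes(event)
    # Placeholder: a real handler would run the model on image_bytes here.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"received_bytes": len(image_bytes)}),
    }
```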


[Figure: Amazon API Gateway]

API Gateway lets you add caching to your APIs by provisioning a cache and setting its size in gigabytes. This means responses from your endpoints can be cached, reducing the number of calls to your backend and improving the speed of API requests.

You can also use API Gateway to implement throttling. Throttling lets you limit how many requests your API accepts, both as a steady-state rate (requests per second) and as a burst size. If a limit is exceeded, users receive a “429 Too Many Requests” response. This helps protect your backend from being overwhelmed by high traffic.
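To build intuition for how a steady-state rate and a burst limit interact, here is a toy token bucket, the model AWS uses to describe its throttling. It is an illustration only, not API Gateway's actual implementation, and the class name and numbers are made up.

```python
class TokenBucket:
    """Toy throttle: `rate` tokens refill per second, up to `burst` tokens."""

    def __init__(self, rate, burst):
        self.rate = rate            # steady-state requests per second
        self.burst = burst          # bucket capacity, i.e. largest allowed spike
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous request, in seconds

    def allow(self, now):
        """Return 200 if a request arriving at time `now` is accepted, else 429."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429


bucket = TokenBucket(rate=2, burst=3)
print([bucket.allow(0.0) for _ in range(5)])  # [200, 200, 200, 429, 429]
print(bucket.allow(1.0))                      # 200, the bucket refilled after one second
```

Five requests arriving at the same instant exhaust the burst of three, so the last two are rejected. One second later the bucket has refilled enough to accept traffic again.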

AWS Lambda: Serverless Model Execution

AWS Lambda allows you to run your AI model without provisioning or managing servers. Here’s how it works in this architecture:

  • Fetch the Model: When triggered, the Lambda function retrieves the model file from S3.
  • Run Inference: The function loads the model into memory and processes the input data (e.g., an image) to generate predictions.
  • Output Results: Lambda returns the prediction results to the API Gateway.
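The three steps above can be sketched as a single handler. This is one possible layout, not the only one: the environment variable names, the default bucket and key, and the scikit-learn-style `predict()` call are assumptions. The model is kept in a module-level variable so that warm invocations skip the S3 download.

```python
import base64
import json
import os
import pickle
import tempfile

MODEL_BUCKET = os.environ.get("MODEL_BUCKET", "my-model-bucket")  # hypothetical
MODEL_KEY = os.environ.get("MODEL_KEY", "models/model.pkl")       # hypothetical
LOCAL_PATH = os.path.join(tempfile.gettempdir(), "model.pkl")     # /tmp on Lambda

_model = None  # reused across invocations while the execution environment is warm


def download_model(bucket, key, dest):
    """Fetch the serialized model from S3 (requires boto3 and IAM read access)."""
    import boto3

    boto3.client("s3").download_file(bucket, key, dest)


def get_model(fetch=download_model):
    """Download and unpickle the model once, then serve it from memory."""
    global _model
    if _model is None:
        if not os.path.exists(LOCAL_PATH):
            fetch(MODEL_BUCKET, MODEL_KEY, LOCAL_PATH)
        with open(LOCAL_PATH, "rb") as f:
            _model = pickle.load(f)  # only unpickle files from a bucket you control
    return _model


def lambda_handler(event, context):
    model = get_model()
    image_bytes = base64.b64decode(event["body"])  # assumes a base64-encoded body
    # Assumption: the model exposes predict(); real code would first convert
    # image_bytes into whatever input format the model was trained on.
    prediction = model.predict([image_bytes])
    return {"statusCode": 200, "body": json.dumps({"prediction": str(prediction)})}
```

Caching the model this way means only cold starts pay for the S3 download and deserialization. The model and its dependencies also have to fit within the function's memory and package size limits.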

The key benefits of AWS Lambda are its pay-as-you-go pricing (you are only charged for the compute time you use) and the absence of infrastructure to manage: Lambda scales up and down with demand.

Final Architecture

Here is a final view of our AI model deployment architecture:

  • Storage: Amazon S3 ensures your model is stored securely at a low cost, with no unnecessary overhead.
  • Compute: AWS Lambda only charges for the exact time your model runs, eliminating idle costs.
  • User Interaction: API Gateway provides an affordable way to expose your model to users, charging only for API calls made.


[Figure: Final architecture]

Together, these services allow you to deploy an AI model with minimal upfront cost and a pay-as-you-go model that scales with your business needs.
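To sanity-check the budget for your own traffic, a back-of-the-envelope estimate helps. The default prices below are assumptions based on commonly published us-east-1 list prices and may be out of date, so check the current AWS pricing pages. The sketch also ignores the free tier, data transfer and S3 request charges.

```python
def monthly_cost(requests, avg_seconds, memory_gb, model_size_gb,
                 lambda_per_million=0.20,           # assumed USD per 1M invocations
                 lambda_per_gb_second=0.0000166667, # assumed USD per GB-second
                 api_per_million=3.50,              # assumed USD per 1M REST API calls
                 s3_per_gb_month=0.023):            # assumed USD per GB stored per month
    """Estimate the monthly bill in USD for the S3 + Lambda + API Gateway setup."""
    costs = {
        "lambda_requests": requests / 1_000_000 * lambda_per_million,
        "lambda_compute": requests * avg_seconds * memory_gb * lambda_per_gb_second,
        "api_gateway": requests / 1_000_000 * api_per_million,
        "s3_storage": model_size_gb * s3_per_gb_month,
    }
    costs["total"] = sum(costs.values())
    return costs


# Example workload: 10,000 predictions a month, 1 second each at 1 GB of memory,
# with a 100 MB model file.
estimate = monthly_cost(requests=10_000, avg_seconds=1.0, memory_gb=1.0, model_size_gb=0.1)
print(f"Estimated total: ${estimate['total']:.2f} per month")
```

With these assumed prices, the example workload comes to roughly $0.21 per month, and Lambda compute time is the largest line item. Increase `requests` or `avg_seconds` to see where your own usage crosses the $2 mark.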

This architecture is perfect for businesses looking to integrate machine learning into their workflows without breaking the bank. Whether you’re building an image recognition system or any other AI-powered service, this approach ensures you get started quickly, efficiently, and affordably.


#Cloud #AI #AWS #APIGateway #S3 #Lambda



More articles by Georges Awono