AWS Lambda & RDS in VPC: The Best Practice
Introduction
Serverless is an excellent concept of cloud computing. It simplifies cloud application development by allowing the developers to focus on delivering business logic without worrying about computing capacity planning and ongoing maintenance.
AWS Lambda is one of the essential services when building a serverless application on AWS. While most of the serverless application tutorials use managed NoSQL database services such as DynamoDB, it is still a common scenario for Lambda to connect to an RDS (Relational Database Service) database instance.
To start the discussion, I will use Lambda functions to develop serverless APIs that read CSV from an S3 bucket and write the data to the MySQL database table.
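The write path can be sketched as follows. This is a minimal illustration, not the article's actual code: the function name `csv_to_insert` is hypothetical, the CSV is assumed to carry a header row whose names match the table columns, and the `%s` placeholder style assumes a MySQL driver such as PyMySQL.

```python
import csv
import io


def csv_to_insert(csv_text, table):
    """Turn CSV text with a header row into a parameterised INSERT and row tuples."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Identifiers are interpolated into the SQL, so the table and header
    # names must come from a trusted source; only the values are parameterised.
    columns = ", ".join(f"`{name.strip()}`" for name in header)
    placeholders = ", ".join(["%s"] * len(header))
    sql = f"INSERT INTO `{table}` ({columns}) VALUES ({placeholders})"
    # Skip blank lines so they don't become empty rows.
    rows = [tuple(row) for row in reader if row]
    return sql, rows
```

Inside the Lambda handler, the result could then be passed to a DB-API cursor, for example `cursor.executemany(sql, rows)`.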
The most straightforward way for a Lambda function to access an RDS would be to allow RDS to be publicly accessible. When the RDS is publicly accessible, Lambda can directly connect to the RDS. By default, AWS deploys Lambda into an AWS-managed VPC, which will have internet access.
I start by designing the architecture as below.
For testing or non-production applications, this model is okay. But there are security risks in allowing RDS to be publicly accessible. Imagine the RDS credentials are accidentally hardcoded in the code and published to a public git repository. Anyone with the RDS credentials can then connect to the RDS over the internet.
Therefore, placing the RDS instance in public subnets and enabling public access is generally not considered a best practice.
Best Practice of Deploying RDS
Data is undoubtedly one of the most critical assets of modern business, and security is among the top priorities when considering enterprise solutions. Many whitepapers and knowledge base articles describe best practices for RDS deployments, such as Trend's AWS RDS Best Practices, Security best practices for Amazon RDS and Security in Amazon RDS.
One of the standard best practice suggestions is to deploy an RDS instance into isolated subnets within a VPC.
Other best practices to consider include the following:
Following the RDS deployment best practice, I revised the architecture drawing. I deployed the RDS instance in the isolated subnets within a VPC. I also used Secrets Manager to store the database credentials in this revision.
If I deploy this revision and try to access the API endpoint, I should encounter a Lambda function timeout error.
The root cause of the timeout error is that the Lambda function failed to connect to the RDS in the VPC. The RDS instance is now placed inside an isolated subnet, so a Lambda function outside the VPC cannot connect to it directly. You can compare this to reaching an on-premises server in an enterprise network (LAN) that sits behind a firewall. Without proper routing and forwarding configuration, traffic from the internet cannot reach the server.
Placing the RDS instance inside the isolated subnet is a good security measure, but it breaks the connection between the RDS and the other software components. Under such circumstances, what would be the best practice for the Lambda function to access the RDS?
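When diagnosing this kind of timeout, it can help to fail fast instead of waiting for the Lambda timeout to expire. The sketch below is a hypothetical helper (`can_reach` is not part of the original solution) that attempts a plain TCP connection to the database endpoint with a short timeout.

```python
import socket


def can_reach(host, port, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False
```

Calling it with the RDS endpoint and port 3306 from a function outside the VPC would be expected to return False in the deployment described above.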
Lambda within a VPC
Will placing the Lambda in the VPC allow it to connect to the RDS? The quick answer is yes.
AWS allows assigning a VPC (tenancy VPC) to the Lambda function. However, while it is good practice to put the RDS in an isolated subnet in a VPC, putting Lambda functions in the VPC is somewhat controversial. Some knowledge base articles suggest that it's best practice NOT to put the Lambda function in a VPC unless the function must access other resources in the VPC. A Lambda function is a short-lived computing instance. It is most powerful when used together with other AWS services. Confining the Lambda function within a network boundary like a traditional virtual computer brings little benefit. We will discuss this later.
To allow the Lambda function to connect to the RDS instance in the VPC, I modified the architecture drawing to move the Lambda function inside the same VPC where the RDS instance is deployed. However, redeploying this revision will not fix the API access error.
By default, Lambda functions are deployed with no VPC attached, and AWS manages their network, which has internet access. Lambda functions attached to a tenancy VPC, however, receive only private IP addresses. Even when such a function is attached to a public subnet, it has no public IP address, so it can't use an internet gateway to access the internet.
As a result, while assigning the same VPC as the RDS to Lambda allows the Lambda functions to connect to the RDS instance, it also isolates the Lambda function from the internet. By default, AWS-managed services are reached through their public endpoints. In my solution, S3 and Secrets Manager become unreachable from the Lambda function after assigning the VPC to it.
Although the internet connection from Lambda in the tenancy VPC can't be built automatically, two approaches can help bridge the Lambda function with other AWS services or even access the internet.
We can deploy a NAT (Network Address Translation) gateway if full internet access is required, but if the Lambda functions only need to access specific AWS-managed services, we can use VPC Endpoints.
Connect Lambda to the Internet via NAT
Using NAT devices to route internet traffic is common in enterprise network topology. NAT is a ‘classic’ traffic forwarding technique. It requires a public IP address to be attached to the NAT gateway. The private IP addresses in the local subnets are mapped to that public IP address, which allows computers in the local subnets to access the internet.
I revised the architecture by using NAT to connect the Lambda function to the internet.
This deployment should fix the S3 and Secrets Manager access issues. However, in my application the Lambda function only needs to access the S3 and Secrets Manager services. In this scenario, using NAT is not the most cost-effective solution.
AWS NAT Gateway is priced based on the gateway usage hours plus the data the gateway processes. Let’s take the Sydney region as an example, with 1 NAT gateway deployed and 1 TB of data transferred per month:
730 hours in a month x 0.059 USD = 43.07 USD (Gateway usage hourly cost)
1,000 GB per month x 0.059 USD = 59.00 USD (NAT Gateway data processing cost)
43.07 USD + 59.00 USD = 102.07 USD (NAT Gateway processing and month hours)
1 NAT Gateways x 102.07 USD = 102.07 USD (Total NAT Gateway usage and data processing cost)
Total NAT Gateway usage and data processing cost (monthly): 102.07 USD
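The same arithmetic can be written as a small function to try other data volumes. This is a sketch only: the rates are the Sydney figures quoted above and may change, and the function name is illustrative.

```python
HOURS_PER_MONTH = 730
NAT_HOURLY_USD = 0.059   # per gateway-hour, figure quoted above
NAT_PER_GB_USD = 0.059   # per GB processed, figure quoted above


def nat_gateway_monthly_cost(data_gb, gateways=1):
    """Monthly NAT Gateway cost: hourly charge per gateway plus data processing."""
    hourly = gateways * HOURS_PER_MONTH * NAT_HOURLY_USD
    processing = data_gb * NAT_PER_GB_USD
    return round(hourly + processing, 2)
```

For example, `nat_gateway_monthly_cost(1000)` reproduces the 102.07 USD total, and raising `data_gb` shows how quickly the data processing part dominates.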
Note that NAT Gateway charges for all the data it processes. Suppose my APIs become an essential component of a busy ETL pipeline that reads terabytes of data from the S3 bucket every month. That traffic will incur significant data processing charges!
In my application, I only use specific AWS-managed services. To avoid the S3 data travelling via the NAT Gateway, I should use VPC Endpoints to access AWS-managed services.
Connect Lambda to the AWS Services via VPC Endpoints
A VPC endpoint enables a private connection between AWS services and the VPC. It allows instances within a VPC to connect to AWS services without traversing the internet.
Accessing AWS-managed services via VPC Endpoints is a highly recommended approach by many technical blogs. Of course, if the Lambda function must access the internet, it will still require a NAT Gateway.
There are two types of VPC Endpoints: gateway endpoints and interface endpoints. Gateway endpoints are only used for S3 and DynamoDB services. There is no additional charge for using the gateway endpoint.
For all the other available AWS-managed services, interface endpoints are used to link instances and functions inside the VPC to them. An interface endpoint is billed based on usage hours and the amount of data it processes. But compared to a NAT gateway, deploying one VPC endpoint is significantly cheaper.
Using the Sydney region again as an example: for 1 VPC interface endpoint deployed with 1TB of data transferred per month, the cost breakdown is as follows:
1 VPC endpoints x 1 ENIs per VPC endpoint x 730 hours in a month x 0.013 USD = 9.49 USD (Monthly cost for endpoint ENI)
Monthly cost for Interface endpoints: 9.49 USD
Tiered price for: 1000 GB
1000 GB x 0.0100000000 USD = 10.00 USD
Total tier cost = 10.0000 USD (PrivateLink data processing cost)
Total data processing cost: 10 USD
9.49 USD + 10 USD = 19.49 USD (Total PrivateLink Cost)
Total PrivateLink endpoints and data processing cost (monthly): 19.49 USD
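As a sketch, the breakdown above reduces to the function below. It assumes the flat 0.01 USD per GB rate shown above (the real price is tiered) and the Sydney hourly rate quoted above; the function and parameter names are illustrative.

```python
HOURS_PER_MONTH = 730
ENDPOINT_ENI_HOURLY_USD = 0.013  # per endpoint ENI per hour, figure quoted above
ENDPOINT_PER_GB_USD = 0.01       # first pricing tier, figure quoted above


def interface_endpoint_monthly_cost(data_gb, endpoints=1, enis_per_endpoint=1):
    """Monthly interface endpoint cost: ENI hours plus data processing."""
    eni_hours = endpoints * enis_per_endpoint * HOURS_PER_MONTH * ENDPOINT_ENI_HOURLY_USD
    processing = data_gb * ENDPOINT_PER_GB_USD
    return round(eni_hours + processing, 2)
```

`interface_endpoint_monthly_cost(1000)` gives the 19.49 USD total. An interface endpoint gets one ENI per subnet it is attached to, so `enis_per_endpoint` grows when the endpoint spans several Availability Zones.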
I revised the architecture drawing to switch from NAT to VPC Endpoints. The new deployment creates one gateway endpoint for S3 and one interface endpoint for Secrets Manager. After redeploying the serverless application, the API should work properly.
To use a specific AWS-managed service within the VPC, one interface endpoint per service per VPC needs to be created. The interface endpoints must be kept alive during the application's lifetime; otherwise, the application will encounter unexpected errors.
When I review my solution cost, a charge of around ten dollars a month to reach a single secret stored in Secrets Manager doesn’t sound like a good justification for using the interface endpoint. If each serverless application creates its own VPC and interface endpoints, it would be a big waste of investment. To use the interface endpoints more efficiently, it is good practice to group the applications into one VPC so they can share the endpoints.
It gets more annoying if the Lambda functions need to access several AWS-managed services, because I have to create a separate interface endpoint for each service my application relies on. When the application requires ten or more interface endpoints, the cost becomes comparable to the NAT deployment.
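The two endpoints could be described as request parameters for the EC2 `create_vpc_endpoint` API, as in the sketch below. The helper name `endpoint_requests` and all resource IDs are hypothetical, and the original deployment may well have used an infrastructure-as-code tool instead.

```python
def endpoint_requests(region, vpc_id, route_table_ids, subnet_ids, security_group_ids):
    """Build keyword arguments for two EC2 create_vpc_endpoint calls."""
    s3_gateway = {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Gateway endpoints are wired in through route tables.
        "RouteTableIds": route_table_ids,
    }
    secrets_interface = {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.secretsmanager",
        # Interface endpoints place an ENI in each listed subnet.
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": security_group_ids,
        # Private DNS lets the default service hostname resolve to the endpoint.
        "PrivateDnsEnabled": True,
    }
    return [s3_gateway, secrets_interface]
```

With boto3 available, each dictionary could be passed as `boto3.client("ec2").create_vpc_endpoint(**kwargs)`.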
10 VPC endpoints x 1 ENIs per VPC endpoint x 730 hours in a month x 0.013 USD = 94.90 USD (Monthly cost for endpoint ENI)
Monthly cost for Interface endpoints: 94.90 USD
Tiered price for: 1000 GB
1000 GB x 0.0100000000 USD = 10.00 USD
Total tier cost = 10.0000 USD (PrivateLink data processing cost)
Total data processing cost: 10 USD
94.90 USD + 10 USD = 104.90 USD (Total PrivateLink Cost)
Total PrivateLink endpoints and data processing cost (monthly): 104.90 USD
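To see where the two options cross over, the sketch below searches for the smallest number of interface endpoints whose monthly cost exceeds a single NAT gateway. It assumes the Sydney rates quoted in this article, one ENI per endpoint, and that the same data volume would flow through either option; the function name is illustrative.

```python
HOURS_PER_MONTH = 730


def break_even_endpoints(data_gb):
    """Smallest endpoint count at which interface endpoints cost more than one NAT gateway."""
    nat_cost = HOURS_PER_MONTH * 0.059 + data_gb * 0.059
    count = 1
    # Each endpoint adds a fixed hourly charge; data processing is paid once.
    while count * HOURS_PER_MONTH * 0.013 + data_gb * 0.01 <= nat_cost:
        count += 1
    return count
```

With 1,000 GB per month the function returns 10, which matches the figure above. The crossover moves to a higher endpoint count as the data volume grows, because NAT charges more per GB.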
Therefore, it might be worth exploring options that minimise the use of AWS-managed services. In my application, I avoided using Secrets Manager by enabling RDS IAM authentication.
But hang on. Although it is sensible to be aware of the solution cost, I am now steering the solution away from AWS-managed services instead of leveraging them. It doesn’t smell good!
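With IAM authentication, the function requests a short-lived token and uses it in place of the password. The sketch below assumes a boto3 RDS client is passed in (its `generate_db_auth_token` method is the real call) and a PyMySQL-style `connect` that accepts these keyword arguments; `build_connect_kwargs` and `ca_bundle_path` are hypothetical names.

```python
def build_connect_kwargs(rds_client, host, port, user, region, database, ca_bundle_path):
    """Build MySQL connection arguments that use an IAM auth token as the password."""
    # The token is short-lived, so generate a fresh one for each new connection.
    token = rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": token,
        "database": database,
        # IAM authentication requires an SSL/TLS connection to the database.
        "ssl": {"ca": ca_bundle_path},
    }
```

In the Lambda function this might be used as `pymysql.connect(**build_connect_kwargs(boto3.client("rds"), ...))`, with the function's execution role granted the `rds-db:connect` permission for that database user.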
Rethinking
Looking back on the journey, I started by looking for a solution to allow the Lambda functions to connect to the RDS in the VPC. I placed the Lambda in the same VPC as the RDS to enable connectivity between the two. The solution then gradually evolved so that the remaining parts of the application could work properly in the new deployment environment.
A Lambda function is a short-lived virtual computer instance, and unlike a classic cloud virtual computer, it doesn’t have full access to the underlying computing resources. Therefore, to maximise the power of the Lambda function, it should be used with other AWS services.
AWS offers more than 200 services. Normally, a cloud user, including a Lambda function, accesses AWS-managed services via the internet. Once a Lambda function is moved into the tenancy VPC, it is isolated from the rest of the AWS services. To build the connection from a Lambda function to an AWS-managed service, I must ‘plug in’ a private link for each required service.
Now, let me recap the notes in this AWS knowledgebase article:
"It's a best practice to not put your Lambda function in an Amazon VPC unless the function must access other resources in the VPC."
This best practice rule makes much more sense now!
As for my situation, it is a valid case that my Lambda function must access the RDS in the VPC. Removing the VPC assignment from Lambda deployment will bring me back to the original point where the database connection is broken.
Let's change the angle of viewing the problem. Instead of isolating Lambda from most AWS services, why don't I segment the lambda functions that need to be in a VPC from the rest of the functions?
Mixed Lambda Deployment
I split the Lambda function into two Lambda functions, with the new Lambda function containing the code to access RDS in the VPC. The idea is to differentiate the Lambda functions that need to access RDS from other functions and only place those functions (that need to access RDS) in the VPC. For illustration purposes, I keep the code to read CSV from S3 in the original Lambda function.
The solution was updated as in the diagram. I added an SNS (Simple Notification Service) resource to allow the CSV reading function to invoke the RDS writing function. The CSV reading function is deployed with no tenancy VPC attachment. The RDS writing function is placed inside the same VPC as the RDS. RDS IAM authentication replaces username and password access, so there is no need to call Secrets Manager.
The revised solution aims to place only a small part of the Lambda functions in the VPC while ensuring the rest can be deployed with no tenancy VPC attachment. It is a balanced solution that achieves what I planned to do.
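The handoff between the two functions might look like the sketch below. It assumes the CSV rows are sent as a JSON message body (SNS messages are limited to 256 KB, so large files would need batching) and that a boto3 SNS client is passed in; the function names and message shape are hypothetical.

```python
import json


def publish_rows(sns_client, topic_arn, rows):
    """CSV reading function (no VPC): publish parsed rows to the SNS topic."""
    sns_client.publish(TopicArn=topic_arn, Message=json.dumps({"rows": rows}))


def rows_from_sns_event(event):
    """RDS writing function (in the VPC): extract rows from the SNS-triggered event."""
    rows = []
    for record in event.get("Records", []):
        # SNS delivers the published message body as a string under Sns.Message.
        payload = json.loads(record["Sns"]["Message"])
        rows.extend(payload["rows"])
    return rows
```

Only the second function needs network access to the database, so only that function has to be attached to the VPC.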
To summarise my approach, I want to extend the quote above:
It's a best practice to not put your Lambda function in an Amazon VPC unless the function must access other resources in the VPC. If you have to do so, try to minimise the scope of the Lambda function that has to be put in the VPC.
Conclusion
AWS Lambda is a versatile tool for building low-cost, powerful serverless applications. When dealing with the interoperability issue between Lambda and RDS, following the best practice rules is recommended if you can.
As a quick checklist, the best practice rules that can be applied to RDS and Lambda when dealing with VPC include: