Demystifying Proxies || Managed Reverse Proxy Architecture in AWS
In computer networking, a proxy is a component that acts as an intermediary between a client requesting a resource and the server providing that resource. Depending on how it is configured, it can improve privacy, security, and performance in the process.
Instead of connecting directly to a server that can fulfill a request for a resource, such as a file or web page, the client directs the request to the proxy server, which evaluates the request and performs the required network transactions. This serves as a method to simplify or control the complexity of the request, or provide additional benefits such as load balancing, privacy, or security. Proxies were devised to add structure and encapsulation to distributed systems. A proxy server thus functions on behalf of the client when requesting service, potentially masking the true origin of the request to the resource server.
Distinguishing Between Forward Proxy and Reverse Proxy:-
A reverse proxy server is a powerful tool that acts as an intermediary between clients and backend servers, such as web servers or application servers. Unlike a traditional forward proxy, which sits between clients and the internet, a reverse proxy sits between clients and one or more servers.
When a client makes a request, the reverse proxy server forwards that request to the appropriate backend server on behalf of the client. It then takes the response from the backend server and sends it back to the client. This process effectively hides the backend server’s identity and internal structure from the clients.
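The request flow described above can be sketched with Python's standard library alone. This is a minimal illustration, not a production proxy: it handles only GET, the class and function names (BackendHandler, make_proxy_handler, demo) are made up for this example, and both servers run on ephemeral localhost ports.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores any system proxy settings, so localhost calls stay local.
DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class BackendHandler(BaseHTTPRequestHandler):
    """Stand-in backend server; the client never talks to it directly."""

    def do_GET(self):
        body = f"backend saw {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


def make_proxy_handler(backend_url):
    """Build a handler class that forwards every GET to backend_url."""

    class ReverseProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Forward the request to the backend on the client's behalf.
            with DIRECT.open(backend_url + self.path) as upstream:
                status = upstream.status
                body = upstream.read()
            # Relay the backend's response back to the client.
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass

    return ReverseProxyHandler


def start(handler):
    """Start a server on a free localhost port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def demo():
    backend = start(BackendHandler)
    proxy = start(make_proxy_handler(f"http://127.0.0.1:{backend.server_port}"))
    try:
        # The client only knows the proxy's address, never the backend's.
        with DIRECT.open(f"http://127.0.0.1:{proxy.server_port}/hello") as resp:
            return resp.read().decode()
    finally:
        for server in (proxy, backend):
            server.shutdown()
            server.server_close()
```

Calling demo() returns the backend's response text even though the client only ever connected to the proxy's port.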
Reverse proxies offer several benefits, including enhanced security by protecting the backend servers from direct exposure to the internet, load balancing to distribute client requests across multiple servers, and caching to improve performance by serving cached content to the clients.
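Two of these benefits, load balancing and caching, can be illustrated in a few lines. The sketch below is a toy model under simplifying assumptions: backends are plain Python callables, requests are distributed round-robin, and cached entries never expire. The class name TinyProxy is hypothetical.

```python
from itertools import cycle


class TinyProxy:
    """Toy model of a reverse proxy with round-robin balancing and a cache."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)  # endless round-robin iterator
        self._cache = {}

    def get(self, path):
        # Serve from cache when possible, so no backend is contacted.
        if path in self._cache:
            return self._cache[path]
        # Otherwise pick the next backend in turn and remember its answer.
        backend = next(self._next_backend)
        response = backend(path)
        self._cache[path] = response
        return response


proxy = TinyProxy([lambda p: f"a:{p}", lambda p: f"b:{p}"])
first = proxy.get("/x")   # handled by backend "a"
second = proxy.get("/y")  # handled by backend "b"
cached = proxy.get("/x")  # served from the cache, no backend call
third = proxy.get("/z")   # back to backend "a"
```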
Some of the key use cases of reverse proxies include:-
With the trend toward autonomous teams and microservice-style architectures, web frontend tiers are challenged to become more flexible and to integrate different components with independent architectures and technology stacks. Two scenarios are prominent:
What these scenarios have in common is that they consist of loosely coupled components that are hidden from the end user behind a common interface. Often, a reverse proxy serves content from a single entry domain but retrieves the content from different origins.
In the example in the figure above, the client addresses one specific domain name, and depending on the path prefix, the reverse proxy retrieves the content from an on-premises webserver, from a webserver running on EC2, or from Amazon S3 static website hosting, represented in the figure by the prefixes /Alok, /home, and /office, respectively. If we forward the path to the webserver without the path prefix, the component does not need to know which prefix it runs under, and the prefix can be changed at any time without impacting the component, thus making the component context-unaware.
Traditionally, a reverse proxy tier would run with rewrite rules to different origins (e.g. Apache or NGINX). In this post we look into managed alternatives in AWS that take away the heavy lifting of running and scaling the proxy infrastructure.
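The prefix-based routing with prefix removal can be sketched as a small pure function. The three prefixes come from the figure; the origin hostnames are placeholders invented for this illustration, and the matching rule (whole path segments only) is an assumption of the sketch.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    origin: str  # where the proxy fetches the content from
    path: str    # path sent to the origin, with the prefix removed


# Prefixes from the figure; the origin hostnames are hypothetical placeholders.
ORIGINS = {
    "/Alok": "https://onprem.example",
    "/home": "https://ec2.example",
    "/office": "https://s3-site.example",
}


def route(path: str) -> Optional[Route]:
    """Pick an origin by path prefix and strip that prefix from the path."""
    for prefix, origin in ORIGINS.items():
        # Match whole path segments only, so /homepage does not match /home.
        if path == prefix or path.startswith(prefix + "/"):
            return Route(origin, path[len(prefix):] or "/")
    return None  # no origin is responsible for this path
```

For example, route("/home/index.html") yields the EC2 placeholder origin with the path /index.html, so the component behind it never sees the /home prefix.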
Application Load Balancer
An Application Load Balancer can serve as a reverse proxy. Let's look at the configuration steps.
Step: 1
In the EC2 console, scroll to the bottom and click on Load balancers.
Step: 3
Create a Target Group for your instance
Step: 4
Name your target group and click ‘Create’. Default settings are fine here.
Step: 5
Select the target group you just created and click ‘Edit’.
Step: 6
Select a running instance you want to register with your target group (in this case, it's a WordPress blog) and click ‘Add to registered’.
After it shows up under Registered targets, click ‘Save’.
Step: 7
Now, go back to your load balancers and select the one you want to use as the proxy. Click the Listeners tab and click “View/edit rules” on the HTTPS:443 listener you configured while setting up SSL.
Step: 8
Tap on the + icon and click “Insert Rule”
Step: 9
Add the following 2 rules to proxy the blog server every time a user hits the blog path:
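The two rules themselves are not reproduced in the text. As an assumption for illustration only, a typical pair would match the bare path and everything beneath it (here /blog and /blog/*) and forward both to the target group created earlier. The sketch below only builds the parameter dictionaries in the shape that boto3's elbv2 create_rule call accepts; the function name and the ARNs are placeholders, and nothing is sent to AWS.

```python
def blog_rules(listener_arn, target_group_arn, path="/blog"):
    """Build parameters for two path-based forwarding rules (assumed patterns)."""
    patterns = [path, f"{path}/*"]  # the bare path, then everything under it
    return [
        {
            "ListenerArn": listener_arn,
            "Priority": priority,  # lower numbers are evaluated first
            "Conditions": [{"Field": "path-pattern", "Values": [pattern]}],
            "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
        }
        for priority, pattern in enumerate(patterns, start=1)
    ]


# Placeholder ARNs; replace them with the values from your own account.
rules = blog_rules("arn:aws:placeholder:listener", "arn:aws:placeholder:targetgroup")

# With boto3 installed and credentials configured, each dictionary could be
# passed on as: boto3.client("elbv2").create_rule(**rule)
```

A forward action on its own passes the request path through unchanged, so in this setup the blog server would still receive paths that start with /blog.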
AWS Amplify Console
The AWS Amplify Console provides a Git-based workflow for hosting full stack serverless web apps with continuous deployment. Amplify Console also offers a rewrites and redirects feature, which can be used for forwarding incoming requests with different path patterns to different origins (see Figure 2).
Figure 2: Dashboard, AWS Amplify Console (rewrites and redirects feature)
Note: In Figure 2, <*> stands for a wildcard that matches any pattern. Target addresses must be HTTPS (no HTTP allowed).
This architectural option is the simplest to set up and manage, and is the best approach for teams looking for the least management effort. AWS Amplify Console offers a simple interface for mapping incoming patterns to target addresses. It also makes it easy to serve additional static content if needed. The trade-off is that configuration options are limited, so more complex scenarios cannot be implemented.
If you want to rewrite paths to remove the path prefix, you can accomplish this by using the wildcard pattern. The source address would contain the path prefix, but the target address would omit the prefix as seen in Figure 2.
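Amplify rewrites can also be edited as JSON instead of through the form. The sketch below builds rules in that JSON shape (source, target, status "200" for a rewrite, condition) and adds a small local simulation of the <*> substitution so the effect is visible. The helper names, the prefixes used, and the target hostnames are assumptions for this example, and apply_rule only imitates the matching for illustration; it is not Amplify's implementation.

```python
import json

WILDCARD = "<*>"


def rewrite_rule(prefix, origin):
    """Build one rewrite rule that drops `prefix` when forwarding to `origin`."""
    if not origin.startswith("https://"):
        raise ValueError("target addresses must be HTTPS")
    return {
        "source": f"{prefix}/{WILDCARD}",  # incoming pattern, with prefix
        "target": f"{origin}/{WILDCARD}",  # target address, prefix omitted
        "status": "200",                   # 200 means rewrite, not redirect
        "condition": None,
    }


def apply_rule(rule, path):
    """Local simulation of the wildcard substitution (illustration only)."""
    source_prefix = rule["source"][: -len(WILDCARD)]
    if not path.startswith(source_prefix):
        return None
    rest = path[len(source_prefix):]
    return rule["target"].replace(WILDCARD, rest)


# Hypothetical origins for two of the prefixes used earlier in this post.
rules = [
    rewrite_rule("/home", "https://ec2.example.com"),
    rewrite_rule("/office", "https://static.example.com"),
]
rules_json = json.dumps(rules, indent=2)  # text for the JSON rule editor
```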
When comparing pricing with the other approaches, pay particular attention to outgoing traffic: at higher volumes, this can get expensive.
Amazon API Gateway
Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. API Gateway’s REST API type allows users to set up HTTP proxy integrations, which can be used for forwarding incoming requests with different path patterns to different origin servers according to the API specifications (Figure 3).
Figure 3: Dashboard, Amazon API Gateway (HTTP proxy integration)
Note: In Figure 3, {proxy+} and {proxy} refer to the same wildcard: {proxy+} is the greedy path variable in the resource path that captures the rest of the incoming path, and {proxy} is the placeholder for that captured value in the target URL.
API Gateway, in comparison to Amplify Console, is better suited when a higher degree of customization is needed. API Gateway offers multiple customization and monitoring features, such as custom gateway responses and dashboard monitoring.
Similar to Amplify Console, API Gateway provides a feature to rewrite paths and thus remove context from the path using the {proxy} wildcard.
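One way to express such a proxy resource is as an OpenAPI fragment with API Gateway's integration extension, which can then be imported into a REST API. The sketch below builds that fragment as a Python dictionary. The function name, the /home prefix, and the origin hostname are assumptions for illustration, and a complete definition would need more than this fragment.

```python
def proxy_resource(prefix, origin):
    """Build an OpenAPI path entry that proxies prefix/* to origin/*."""
    return {
        # {proxy+} greedily captures everything after the prefix.
        f"{prefix}/{{proxy+}}": {
            "x-amazon-apigateway-any-method": {
                "parameters": [
                    {"name": "proxy", "in": "path", "required": True, "type": "string"}
                ],
                "x-amazon-apigateway-integration": {
                    "type": "http_proxy",
                    "httpMethod": "ANY",
                    # {proxy} re-inserts the captured part, without the prefix.
                    "uri": f"{origin}/{{proxy}}",
                    "requestParameters": {
                        "integration.request.path.proxy": "method.request.path.proxy"
                    },
                },
            }
        }
    }


# Hypothetical origin for the /home prefix.
fragment = proxy_resource("/home", "https://ec2.example.com")
```

Because the prefix appears only in the resource path and not in the integration URI, the origin receives the request without it.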
API Gateway REST API pricing is based on the number of API calls as well as any external data transfers. External data transfers are charged at the EC2 data transfer rate.
Note: The HTTP integration type in API Gateway REST APIs does not support forwarding trailing slashes. If this is needed for your application, consider other integration types such as AWS Lambda integration or AWS service integration.
Amazon CloudFront and AWS Lambda@Edge
Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds. CloudFront is able to route incoming requests with different path patterns to different origins or origin groups by configuring its cache behavior rules (Figure 4).
Figure 4: Dashboard, CloudFront (Cache Behavior)
Additionally, Amazon CloudFront allows for integration with AWS Lambda@Edge functions. Lambda@Edge runs your code in response to events generated by CloudFront. In this scenario we can use Lambda@Edge to change the path pattern before forwarding a request to the origin, thus removing the context.
This approach offers the most control over caching behavior and customization. Being able to add your own custom code through a custom Lambda function adds an entirely new range of possibilities when processing your request. This enables you to do everything from simple HTTP request and response processing at the edge to more advanced functionality, such as website security, real-time image transformation, intelligent bot mitigation, and search engine optimization.
Amazon CloudFront is charged by request and by Lambda@Edge invocation. Outbound data transfer is charged at the CloudFront regional data transfer out rate.
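A Lambda@Edge function for this purpose could look roughly like the sketch below. It assumes the function is attached to the origin request event, so CloudFront has already picked the cache behavior from the original path, and it reuses the prefixes from the earlier figure. The handler name and the prefix list are assumptions for this example.

```python
# Path prefixes to strip before the request reaches the origin (assumed values).
PREFIXES = ("/Alok", "/home", "/office")


def handler(event, context):
    """Origin-request handler that removes the routing prefix from the URI."""
    # CloudFront passes the request inside the first record of the event.
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]
    for prefix in PREFIXES:
        # Match whole path segments only, so /homepage is left untouched.
        if uri == prefix or uri.startswith(prefix + "/"):
            request["uri"] = uri[len(prefix):] or "/"
            break
    # Returning the request tells CloudFront to continue on to the origin.
    return request
```

The function can be tried locally by passing a dictionary shaped like the CloudFront event, for example one whose request URI is /home/index.html, which comes back as /index.html.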
Conclusion
In this article we have seen four approaches to implementing a reverse proxy pattern using managed services from AWS: Application Load Balancer, AWS Amplify Console, Amazon API Gateway, and Amazon CloudFront.
--Alok Saraswat
-- References and credits: Amazon Web Services documentation and blogs