Demystifying Proxies || Managed Reverse Proxy Architecture in AWS

In computer networking, a proxy is a component that acts as an intermediary between a client requesting a resource and the server providing that resource. It improves privacy, security, and performance in the process.

Instead of connecting directly to a server that can fulfill a request for a resource, such as a file or web page, the client directs the request to the proxy server, which evaluates the request and performs the required network transactions. This serves as a method to simplify or control the complexity of the request, or provide additional benefits such as load balancing, privacy, or security. Proxies were devised to add structure and encapsulation to distributed systems. A proxy server thus functions on behalf of the client when requesting service, potentially masking the true origin of the request to the resource server.

Distinguishing Between Forward Proxies and Reverse Proxies

Differences between Forward & Reverse Proxies

A reverse proxy server is a powerful tool that acts as an intermediary between clients and backend servers, such as web servers or application servers. Unlike a traditional forward proxy, which sits between clients and the internet, a reverse proxy sits between clients and one or more servers.

When a client makes a request, the reverse proxy server forwards that request to the appropriate backend server on behalf of the client. It then takes the response from the backend server and sends it back to the client. This process effectively hides the backend server’s identity and internal structure from the clients.
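The forwarding decision described above can be sketched in a few lines of Python. This is a minimal illustration, not a production proxy; the backend hostnames and prefixes are placeholders:

```python
# Minimal sketch of a reverse proxy's forwarding decision: pick a backend
# origin based on the request path. Hostnames and routes are hypothetical.
ROUTES = {
    "/blog": "http://blog-backend.internal:8080",
    "/api":  "http://api-backend.internal:9000",
}
DEFAULT_ORIGIN = "http://web-backend.internal:8000"

def resolve_origin(path: str) -> str:
    """Return the backend URL a reverse proxy would forward `path` to."""
    for prefix, origin in ROUTES.items():
        # Match the prefix exactly, or as a leading path segment.
        if path == prefix or path.startswith(prefix + "/"):
            return origin + path
    return DEFAULT_ORIGIN + path
```

The client only ever sees the proxy's own address; which of the three backends actually produced the response stays hidden.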

Reverse proxies offer several benefits, including enhanced security by protecting the backend servers from direct exposure to the internet, load balancing to distribute client requests across multiple servers, and caching to improve performance by serving cached content to the clients.

Some of the key use cases of reverse proxies include:

  • Load Balancing: Reverse proxies can distribute incoming client requests across multiple backend servers. This helps distribute the load, prevent server overloading, and ensure better resource utilization. Load balancing enhances the overall performance and responsiveness of the application.
  • SSL Termination: Reverse proxies can handle SSL/TLS encryption and decryption on behalf of backend servers. This offloads the resource-intensive SSL processing from backend servers, reducing their workload and simplifying certificate management.
  • Caching: Reverse proxies can store frequently requested resources in cache memory. When clients request the same resources, the reverse proxy serves the cached content directly, reducing server load and improving response times.
  • Web Acceleration: By caching static content and compressing data, reverse proxies can accelerate the loading of web pages for clients, resulting in a smoother user experience.
  • Security and DDoS Protection: Reverse proxies act as a protective barrier between the internet and backend servers. They can filter and block malicious traffic, protect against Distributed Denial of Service (DDoS) attacks, and hide the backend server’s real IP address to prevent direct attacks.
  • Web Application Firewall (WAF): Reverse proxies can act as a WAF, inspecting incoming traffic for potential threats, such as SQL injection, cross-site scripting (XSS), and other malicious activities. They help in safeguarding web applications from common vulnerabilities.
  • Single Point of Entry: Reverse proxies provide a single entry point for external clients to access multiple backend servers. This simplifies the network architecture and allows for easier management and scaling of services.
  • Protocol Conversion: Reverse proxies can translate requests from one protocol to another. For example, they can convert HTTP requests to WebSocket or other application-specific protocols, facilitating communication between clients and servers.
  • Content Compression and Optimization: Reverse proxies can compress outgoing content before sending it to clients, reducing data transfer size and improving page load times.
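To make the caching use case above concrete, here is a minimal sketch of how a reverse proxy might serve repeated requests from an in-memory cache with a time-to-live; `fetch_from_origin` is a stand-in for the real backend request:

```python
import time

# path -> (expiry timestamp, response body); a stand-in for real cache storage.
CACHE: dict[str, tuple[float, str]] = {}
TTL_SECONDS = 60.0

def fetch_from_origin(path: str) -> str:
    # Placeholder for an actual HTTP request to the backend server.
    return f"content for {path}"

def serve(path: str) -> str:
    now = time.monotonic()
    entry = CACHE.get(path)
    if entry and entry[0] > now:
        return entry[1]                      # cache hit: origin is not contacted
    body = fetch_from_origin(path)
    CACHE[path] = (now + TTL_SECONDS, body)  # cache miss: store with a TTL
    return body
```

Real proxies also honor `Cache-Control` headers and vary cache keys by headers or query strings, but the core idea is the same: repeated requests are answered without touching the origin.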

With the trend toward autonomous teams and microservice-style architectures, web frontend tiers are challenged to become more flexible and to integrate components with independent architectures and technology stacks. Two scenarios are prominent:

  • Micro-frontends, where a single-page application is composed of components owned by different teams
  • Web portals, where a landing page and subsections of the web presence are owned by different teams; in the following, we refer to these as components as well

What these scenarios have in common is that they consist of loosely coupled components that are seamlessly hidden from the end user behind a common interface. Often, a reverse proxy serves content from a single entry domain but retrieves the content from different origins.

The example in the figure above uses one specific domain name and, depending on the path prefix, retrieves the content from an on-premises webserver, a webserver running on EC2, or Amazon S3 static hosting, represented in the figure by the prefixes /Alok, /home, and /office, respectively. If we forward the path to the webserver without the path prefix, the component does not need to know which prefix it runs under, and the prefix can be changed at any time without impacting the component, thus making the component context-unaware.
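The prefix-stripping step can be expressed as a small helper, shown here as a sketch with `/home` standing in for any of the prefixes above:

```python
def strip_prefix(path: str, prefix: str) -> str:
    """Remove a routing prefix before forwarding, so the backend stays context-unaware."""
    if path == prefix:
        return "/"                  # bare prefix maps to the backend's root
    if path.startswith(prefix + "/"):
        return path[len(prefix):]   # drop only the leading prefix segment
    return path                     # path did not carry this prefix
```

Because the backend only ever sees the stripped path, the prefix can be renamed in the proxy configuration without any change to the component itself.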

Traditionally, a reverse proxy tier would run with rewrite rules to different origins (e.g., Apache HTTP Server or Nginx). In this post we look into managed alternatives in AWS that take away the heavy lifting of running and scaling the proxy infrastructure.
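For comparison, a self-managed Nginx equivalent of the scenario above might look like the following sketch: one entry domain, three origins selected by path prefix. The hostnames are placeholders; the trailing slash in each `proxy_pass` makes Nginx replace the matched location prefix, which is exactly the prefix-stripping behavior we want:

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    location /Alok/ {
        proxy_pass https://onprem.example.internal/;   # on-premises webserver
    }
    location /home/ {
        proxy_pass http://ec2-origin.example.internal/; # webserver on EC2
    }
    location /office/ {
        proxy_pass https://my-bucket.s3-website-eu-west-1.amazonaws.com/; # S3 static hosting
    }
}
```

Running this yourself means owning patching, scaling, and high availability of the proxy tier, which is the heavy lifting the managed options below remove.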

Application Load Balancer

An Application Load Balancer can serve as a reverse proxy. Let's look at the configuration steps.

Step: 1

On the EC2 console, scroll down in the navigation pane and click on Load Balancers.

Step: 3

Create a Target Group for your instance

Step: 4

Name your target group and click ‘Create’. Default settings are fine here.

Step: 5

Select the target group you just created and click ‘Edit’.

Step: 6

Select a running instance you want to register with your target group (in this case, it's a WordPress blog) and click ‘Add to registered’.

After it shows up in the Registered targets, click on ‘Save’.

Step: 7

Now, go back to your load balancers and select the one you want to proxy. Click on Listeners tab and click on “View/edit rules” of the HTTPS: 443 listener you configured while setting up SSL.

Step: 8

Tap on the + icon and click “Insert Rule”.

Step: 9

Add the following 2 rules to proxy the blog server every time a user hits the blog path:

  • IF Path is /blog THEN Forward to example-blog
  • IF Path is /blog/* THEN Forward to example-blog
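The matching semantics of those two listener rules can be mimicked in a few lines, to make explicit why both are needed (`example-blog` is the hypothetical target group from the steps above; ALB path conditions support the wildcards `*` and `?`, which `fnmatch` approximates):

```python
import fnmatch

# Ordered rules, evaluated top to bottom, like ALB listener rule priorities.
RULES = [
    ("/blog",   "example-blog"),   # exact path, no trailing segment
    ("/blog/*", "example-blog"),   # anything below /blog/
]
DEFAULT_ACTION = "default-target-group"  # the listener's default action

def match_rule(path: str) -> str:
    """Return the target group the listener would forward `path` to."""
    for pattern, target_group in RULES:
        if fnmatch.fnmatch(path, pattern):
            return target_group
    return DEFAULT_ACTION
```

Without the first rule, a request for exactly `/blog` would fall through to the default action, because `/blog/*` only matches paths with a segment after the slash.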

AWS Amplify Console

The AWS Amplify Console provides a Git-based workflow for hosting full stack serverless web apps with continuous deployment. Amplify Console also offers a rewrites and redirects feature, which can be used for forwarding incoming requests with different path patterns to different origins (see Figure 2).

Figure-2

Figure 2: Dashboard, AWS Amplify Console (rewrites and redirects feature)

Note: In Figure 2, <*> stands for a wildcard that matches any pattern. Target addresses must be HTTPS (no HTTP allowed).

This architectural option is the simplest to set up and manage, and is the best approach for teams looking for the least management effort. AWS Amplify Console offers a simple interface for easily mapping incoming patterns to target addresses. It also makes it easy to serve additional static content if needed. Configuration options are limited, however, and more complex scenarios cannot be implemented.

If you want to rewrite paths to remove the path prefix, you can accomplish this by using the wildcard pattern. The source address would contain the path prefix, but the target address would omit the prefix as seen in Figure 2.
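As a sketch, the rewrite rules from Figure 2 could look like the following in the Amplify Console's JSON editor for redirects and rewrites (the target hostnames are placeholders; status "200" makes the rule a rewrite rather than a redirect):

```json
[
  {
    "source": "/home/<*>",
    "target": "https://ec2-origin.example.com/<*>",
    "status": "200"
  },
  {
    "source": "/office/<*>",
    "target": "https://my-bucket.s3.amazonaws.com/<*>",
    "status": "200"
  }
]
```

Because the `<*>` wildcard captures everything after the prefix and the target reuses only that capture, the prefix is stripped before the request reaches the origin.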

When comparing pricing with the other approaches, it is important to look at the outgoing traffic; with higher volumes, this can get expensive.

Amazon API Gateway

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. API Gateway’s REST API type allows users to set up HTTP proxy integrations, which can be used for forwarding incoming requests with different path patterns to different origin servers according to the API specifications (Figure 3).

Figure-3

Figure 3: Dashboard, Amazon API Gateway (HTTP proxy integration)

Note: In Figure 3, {proxy+} and {proxy} stand for the same wildcard pattern.

API Gateway, in comparison to Amplify Console, is better suited when a higher degree of customization is needed. API Gateway offers multiple customization and monitoring features, such as custom gateway responses and dashboard monitoring.

Similar to Amplify Console, API Gateway provides a feature to rewrite paths and thus remove context from the path using the {proxy} wildcard.
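In an OpenAPI definition, such an HTTP proxy integration could be sketched as follows, using API Gateway's `x-amazon-apigateway-integration` extension. The origin hostname is a placeholder; because the resource is `/home/{proxy+}` but the integration URI references only `{proxy}`, the `/home` prefix is removed before the request reaches the origin:

```yaml
paths:
  /home/{proxy+}:
    x-amazon-apigateway-any-method:
      parameters:
        - name: proxy
          in: path
          required: true
          schema:
            type: string
      x-amazon-apigateway-integration:
        type: http_proxy
        httpMethod: ANY
        uri: "http://ec2-origin.example.com/{proxy}"
        requestParameters:
          integration.request.path.proxy: method.request.path.proxy
```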

API Gateway REST API pricing is based on the number of API calls as well as any external data transfers. External data transfers are charged at the EC2 data transfer rate.

Note: The HTTP integration type in API Gateway REST APIs does not support forwarding trailing slashes. If this is needed for your application, consider other integration types such as AWS Lambda integration or AWS service integration.

Amazon CloudFront and AWS Lambda@Edge

Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds. CloudFront is able to route incoming requests with different path patterns to different origins or origin groups by configuring its cache behavior rules (Figure 4).

Figure 4

Figure 4: Dashboard, CloudFront (Cache Behavior)

Additionally, Amazon CloudFront allows for integration with AWS Lambda@Edge functions. Lambda@Edge runs your code in response to events generated by CloudFront. In this scenario we can use Lambda@Edge to change the path pattern before forwarding a request to the origin and thus removing the context.
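A minimal Lambda@Edge origin-request handler for this, written in Python, might look like the sketch below. It rewrites the request URI to remove a routing prefix before CloudFront forwards the request to the origin; `/home` is a placeholder prefix:

```python
# Sketch of a Lambda@Edge origin-request handler that strips a routing
# prefix from the URI, keeping the backend context-unaware.
PREFIX = "/home"

def handler(event, context):
    # CloudFront passes the request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]
    if uri == PREFIX:
        request["uri"] = "/"
    elif uri.startswith(PREFIX + "/"):
        request["uri"] = uri[len(PREFIX):]
    # Returning the (modified) request lets CloudFront continue to the origin.
    return request
```

Attached to the cache behavior for the `/home/*` path pattern as an origin-request trigger, this runs only on cache misses, so the rewrite cost is not paid on every viewer request.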

This approach offers the most control over caching behavior and customization. Being able to add your own custom code through a Lambda function opens an entire new range of possibilities when processing requests. This enables everything from simple HTTP request and response processing at the edge to more advanced functionality, such as website security, real-time image transformation, intelligent bot mitigation, and search engine optimization.

Amazon CloudFront is charged per request and per Lambda@Edge invocation. Outbound data traffic is charged at the CloudFront regional data transfer out pricing.

Conclusion

In this article we have seen four approaches to implementing a reverse proxy pattern using managed services from AWS: Application Load Balancer, AWS Amplify Console, Amazon API Gateway, and Amazon CloudFront with Lambda@Edge.


--Alok Saraswat

-- References and credits: Amazon Web Services documentation and blogs
