You're Killin' Me, AWS! Part II
David Hazar
Certified SANS Instructor | IANS Faculty Member | Consultant | Founder | Public Speaker
To ingress, or not to ingress, that is my question
If you missed part one of my series on the "Load Balancer Club Sandwich", you may want to check that out first, as it will save me from having to repeat myself too much.
In this article, I want to discuss what I think is an exciting, newer service from the folks at Amazon Web Services (AWS). The service is Amazon VPC Lattice (https://docs.aws.amazon.com/vpc-lattice/latest/ug/what-is-vpc-lattice.html), and while some may disagree with my assessment, if I had to come up with a tagline for the service, I would choose either "Kubernetes for the Cloud" or "Kubernetes, it's not just for containers anymore".
I use this comparison because you create services, attach them to a network of services, and then use "auth policies" to designate which services can talk to which. Services in the same service network can talk to one another as long as the auth policy allows it. While service networks are not exactly the same as namespaces, and auth policies are not quite like network policies, it seems like a good enough comparison to me. There are also ways to attach a service to multiple service networks if cross-network access is needed.
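To make the comparison a little more concrete, here is a minimal sketch of what attaching an auth policy to a service network might look like with boto3. The service network ID, account number, and role name below are hypothetical, and the service network's auth type would need to be set to AWS_IAM for the policy to be enforced:

import json
import boto3

lattice = boto3.client("vpc-lattice")

# only allow a specific role to invoke services in this service network
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/orders-client-role"},
        "Action": "vpc-lattice-svcs:Invoke",
        "Resource": "*",
    }],
}

lattice.put_auth_policy(
    resourceIdentifier="sn-0123456789abcdef0",  # hypothetical service network ID
    policy=json.dumps(policy),
)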
So, why not just use Kubernetes? Well, the great thing about VPC Lattice is it covers more than just Kubernetes. Here are the non-Kubernetes targets that can back services in VPC Lattice: EC2 instances, IP addresses, Lambda functions, and application load balancers. These targets, along with any pods that need to communicate with one another, can all be registered to the same service network.
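As a rough sketch of what registering one of these targets might look like (the names and ARN below are hypothetical), a Lambda function can be wrapped in a Lattice target group roughly like this:

import boto3

lattice = boto3.client("vpc-lattice")

# create a target group of type LAMBDA (other types include INSTANCE, IP, and ALB)
tg = lattice.create_target_group(name="orders-fn-tg", type="LAMBDA")

# register the function itself as the target
lattice.register_targets(
    targetGroupIdentifier=tg["id"],
    targets=[{"id": "arn:aws:lambda:us-east-1:111122223333:function:orders"}],
)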
You may be asking yourself, so what? I can already set up connectivity between all these things. How is this different? Well, what if your Lambda function is in one virtual private cloud (VPC), your application load balancer (ALB) is in another, and your Kubernetes pod is in yet another? VPC Lattice doesn't care. Once you get them all connected to the same service network, with the help of the AWS Resource Access Manager (RAM) service, where they exist in the cloud no longer matters.
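For cross-account setups, the service network itself is what you share through RAM. A minimal sketch (the ARN and account IDs are hypothetical) might look like this:

import boto3

ram = boto3.client("ram")

# share the service network with another account so its VPCs and services
# can be associated with it
ram.create_resource_share(
    name="shared-service-network",
    resourceArns=[
        "arn:aws:vpc-lattice:us-east-1:111122223333:servicenetwork/sn-0123456789abcdef0"
    ],
    principals=["444455556666"],  # the consuming account
    allowExternalPrincipals=False,
)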
VPC Lattice is an abstraction layer on top of your cloud environment. Forget peering, hub and spoke, overlapping IPs, etc.; none of that matters (well, kind of). As long as these services only need to communicate over HTTP, HTTPS, or gRPC, they only need the service network. Each service gets its own unique, service network-specific fully-qualified domain name (FQDN), but this doesn't resolve to any IP address defined for any of your subnets or VPCs. So, to what address space does it resolve? Let me show you:
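Here is a quick way to check from an instance in an associated VPC; the service FQDN below is made up, but the address range that comes back is the interesting part:

import socket

# hypothetical Lattice-generated service FQDN
fqdn = "orders-0123456789abcdef0.1a2b3c4d.vpc-lattice-svcs.us-east-1.on.aws"

addresses = {info[4][0] for info in socket.getaddrinfo(fqdn, 443)}
print(addresses)  # e.g. {'169.254.171.32'} -- a link-local address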
Wait, 169.254.171.#, that can't be right, can it? That is the magic of VPC Lattice. Traffic sent to these link-local addresses is routed to the service network behind the scenes without the need to set up any additional routing, peering, etc. You do, however, need to set up your security groups to allow traffic to and from the Lattice service network as needed. This is easily accomplished by referencing the managed prefix lists com.amazonaws.{region}.vpc-lattice and com.amazonaws.{region}.ipv6.vpc-lattice as the source or destination in your rules.
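For example, with boto3 you can look up the managed prefix list and allow inbound HTTPS from the service network on a workload's security group (the security group ID below is hypothetical):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# find the managed prefix list for VPC Lattice in this region
pls = ec2.describe_managed_prefix_lists(
    Filters=[{"Name": "prefix-list-name", "Values": ["com.amazonaws.us-east-1.vpc-lattice"]}]
)
prefix_list_id = pls["PrefixLists"][0]["PrefixListId"]

# allow inbound HTTPS from the Lattice service network
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "PrefixListIds": [{"PrefixListId": prefix_list_id, "Description": "from VPC Lattice"}],
    }],
)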
So, what's the problem? Well . . . this works great for service-to-service communication over HTTP, HTTPS, or gRPC (no WebSocket support :-( ) within your cloud environment. But what about external users of your applications? Wait . . . you want actual people to connect to your applications? Um . . . duh!
Ok, so what are my options? How do I enable ingress traffic to my public-facing applications via the service network? There have to be options, right? Gag . . . yeah, there are some options . . . I guess.
The first, and probably best (and I use this word hesitantly), option is to set up a fleet of proxies behind a load balancer and configure them to translate the external DNS name for each public-facing service to the internal service FQDN. The other option involves Lambda. For cost reasons, in our lab environment for SEC549: Cloud Security Architecture, I chose the Lambda function. However, that means every time one of our sites is accessed, the Lambda function is invoked 10 times just to render the homepage. Some of the pages will be cached after that, but those invocations would add up in a real environment.
For those interested, here is the code I am using, which is a modified version of a sample from AWS found here (https://github.com/aws-samples/amazon-vpc-lattice-secure-apis/blob/main/api/src/client/fn.py). I am essentially just replacing the first "." character in the FQDN with "-service." to translate it to the service's FQDN on the service network, and I had to make some modifications to the example to handle images and other content types through this makeshift proxy (please go with option 1 in production).
import json
import os
import requests
import urllib.parse
import logging
import base64
import botocore.session
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

logger = logging.getLogger()
logger.setLevel("INFO")

# initialization: environment variables
region = os.environ.get("AWS_REGION", "us-east-1")

# initialization: boto
session = botocore.session.get_session()


# helper functions
def build_response(output):
    # pass through any headers returned by the backend service
    headers = output["headers"] if "headers" in output else {}
    # lambda proxy integration: base64-encode the body so images and other
    # binary content types survive the trip back through the load balancer
    body = output.get("content", output.get("reason", "").encode())
    response = {
        "isBase64Encoded": True,
        "statusCode": output["status_code"],
        "headers": dict(headers),
        "body": base64.b64encode(body).decode("utf-8")
    }
    return response


def parse_flag(event, flag):
    response = False
    if flag in event and event[flag]:
        response = True
    return response


def send_request(event, add_sigv4=False, debug=False):
    headers = event["headers"]
    # translate the external FQDN to the Lattice service FQDN by replacing
    # the first "." with "-service."
    headers["host"] = headers["host"].replace(".", "-service.", 1)
    logger.info(headers)
    params = event.get("queryStringParameters") or {}
    querystring = f"?{urllib.parse.urlencode(params)}" if params else ""
    body = event.get("body") or ""
    endpoint = f"https://{headers['host'] + event['path'] + querystring}" if "endpoint" not in body else body["endpoint"]
    logger.info(endpoint)
    method = "GET" if "httpMethod" not in event else event["httpMethod"]
    logger.info(method)
    data = "" if not body else json.dumps(body)
    logger.info(str(data))
    request = AWSRequest(method=method, url=endpoint, data=data, headers=headers)
    request.context["payload_signing_enabled"] = False
    if add_sigv4:
        if debug:
            print(json.dumps({"message": "sigv4 signing the request"}))
        # sign with the Lambda role's credentials so a Lattice auth policy
        # can evaluate the caller's identity
        sigv4 = SigV4Auth(session.get_credentials(), "vpc-lattice-svcs", region)
        sigv4.add_auth(request)
    timeout = 5
    output = {}
    try:
        if debug:
            print(json.dumps({"endpoint": endpoint}))
        prepped = request.prepare()
        # throws requests.exceptions.ReadTimeout, requests.exceptions.ConnectionError
        if method == "POST":
            response = requests.post(prepped.url, headers=prepped.headers, data=data, timeout=timeout)
        elif method == "DELETE":
            response = requests.delete(prepped.url, headers=prepped.headers, timeout=timeout)
        else:
            response = requests.get(prepped.url, headers=prepped.headers, timeout=timeout)
        # response is of type requests.models.Response
        if response.status_code == 200:
            output = {
                "status_code": response.status_code,
                "headers": response.headers,
                "content": response.content
            }
        else:
            output = {
                "status_code": response.status_code,
                "reason": response.reason
            }
    except requests.exceptions.ReadTimeout:
        output = {
            "status_code": 504,
            "reason": f"request to vpc lattice backend timed out ({timeout} seconds)"
        }
    except requests.exceptions.ConnectionError:
        output = {
            "status_code": 504,
            "reason": "connection reset by peer and aborted"
        }
    return output


def lambda_handler(event, context):
    body = event.get("body") or {}
    enable_sigv4 = parse_flag(body, "sigv4")
    enable_debug = parse_flag(body, "debug")
    logger.info(json.dumps({
        "enable_sigv4": enable_sigv4,
        "enable_debug": enable_debug
    }))
    output = send_request(event, add_sigv4=enable_sigv4, debug=enable_debug)
    response = build_response(output)
    return response
For those who read my last article, you probably already know where this is going, right? Why not let us target VPC Lattice services from the ALB? It would make things so much easier. Wait, but if we could just target an FQDN, wouldn't that work? Sure, but sadly, we cannot. I am hopeful, though, as I do appreciate AWS's velocity when it comes to change and new features compared to some other providers. I won't name names here, but browse through my articles and you might find a reference to a feature in another provider that was 4 years in the making.
One final note: VPCs, internet gateways, route tables, etc. do not go away completely. After all, your instances, pods, etc. still need to access the Internet. You can also still handle ingress the same way you always have, but now you have to worry about many of the things we were trying to avoid by using VPC Lattice.
VPC Lattice definitely reduces the need for peering, solves the IP address overlap problem, and reduces the complexity of routing, which may simplify our approach to application connectivity. It also gives you layer 7 policy controls, versus the layer 3/4 controls you get with network access control lists (NACLs) and security groups. If they could just solve the ingress problem, I would be a happy consumer! In the process, they might just make it so I can go back to buying sandwiches instead of club sandwiches (see Part I).