Network security and AWS Transit Gateway

There are a few ways to improve your network security using AWS Transit Gateway. If you run AWS across multiple accounts or multiple regions, you should probably be using a transit gateway.

What is a transit gateway?

AWS Transit Gateway is a network hub. Instead of creating individual VPC peering connections, you can connect multiple VPCs to a single transit gateway and have it serve as the hub for your networking.

Transit gateways have many great features and can simplify your network design. The features include:

  • Route traffic between several VPCs, including DNS resolution
  • Include edge connectivity like site-to-site multi-path VPN
  • Central monitoring
  • Peering of multiple transit gateways

The main cost is based on the amount of data processed. There is also a small per-attachment cost.

By using the advanced routing functionality you can inspect all the traffic traversing your transit gateway by sending it via a security appliance.

There is also a new feature that I have been requesting for ages. You can now reference security groups in other VPCs across a transit gateway. This is an essential feature and it's great that it is finally included. More on this below...

How to inspect VPC to VPC traffic

With AWS Transit Gateway it is possible to route all the traffic through a security appliance.

Figure: AWS Transit Gateway traffic inspection using a hub and spoke model (credit: AWS)

Look at this example. It is straight from the AWS documentation, from the prescriptive guidance on VPC to VPC traffic inspection with a transit gateway.

It uses a hub and spoke model. The transit gateway (centre) is the hub. The VPCs on the left are the spokes. The VPC on the right contains the security appliance. One of the great features is that traffic can stay within the same Availability Zone across this flow. This reduces latency and cost. Other AZs can be used in the event of a failure.

The basic flow is as follows:

  1. Traffic originates from an application hosted on an EC2 instance in VPC 1 and is destined for a separate application in VPC 2 hosted on another EC2 instance.
  2. Traffic is directed according to the VPC route table. Because it does not match a local route (in this case anything 10.1.x.x) it is directed to the transit gateway.
  3. In the transit gateway there are 2 route tables: one associated with the spoke VPCs and one associated with the appliance VPC. The route table associated with the spoke VPCs directs all traffic to the appliance VPC.
  4. The next hop is the appliance VPC. Because the appliance VPC attachment has appliance mode turned on, the transit gateway determines which Transit Gateway elastic network interface to forward the traffic to, based on the 4-tuple of the IP packet.
  5. Traffic is sent to the next hop based on the subnet route table. All traffic is directed to a Gateway Load Balancer endpoint.
  6. The traffic is sent to Gateway Load Balancer endpoint 1.
  7. The Gateway Load Balancer endpoint is logically connected to the Gateway Load Balancer using AWS PrivateLink. The Gateway Load Balancer forwards traffic for inspection.
  8. Traffic can be inspected and then modified, rejected or passed on depending on the firewall policy.
  9. After each packet has been inspected, it is sent back to the Gateway Load Balancer endpoint via the Gateway Load Balancer in the appliance VPC.
  10. At the Gateway Load Balancer endpoint, the packet is sent to the transit gateway based on the appliance subnet route table.
  11. After the packet arrives at the transit gateway, it is routed based on the appliance route table. It is routed to the destination 10.2.0.0/16 network.
  12. The packet is transmitted to the destination in VPC2 via the Transit Gateway ENI.

The whole process is followed in reverse for return traffic.
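
The routing decisions in the steps above can be sketched in a few lines of Python. This is only an illustration of longest-prefix route matching, not how AWS implements it. The route table contents and target names below are assumptions made up for the example (including the default route pointing at the transit gateway); only the 10.1.x.x and 10.2.0.0/16 ranges come from the walkthrough.

```python
from ipaddress import ip_address, ip_network


def next_hop(route_table, destination):
    """Return the target of the most specific (longest-prefix) matching route."""
    dest = ip_address(destination)
    matches = [
        (ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical route tables mirroring the walkthrough (names are illustrative).
vpc1_routes = {"10.1.0.0/16": "local", "0.0.0.0/0": "transit-gateway"}
tgw_spoke_routes = {"0.0.0.0/0": "appliance-vpc-attachment"}
tgw_appliance_routes = {
    "10.1.0.0/16": "vpc1-attachment",
    "10.2.0.0/16": "vpc2-attachment",
}


def trace(destination):
    """List the routing decisions for a packet leaving an instance in VPC 1."""
    hops = [next_hop(vpc1_routes, destination)]
    if hops[0] == "transit-gateway":
        # Spoke route table: everything goes to the appliance VPC first.
        hops.append(next_hop(tgw_spoke_routes, destination))
        # Appliance route table: after inspection, route to the real destination.
        hops.append(next_hop(tgw_appliance_routes, destination))
    return hops


print(trace("10.2.0.15"))  # ['transit-gateway', 'appliance-vpc-attachment', 'vpc2-attachment']
print(trace("10.1.0.20"))  # ['local']
```

Note how the same destination is looked up twice in the transit gateway, once in each route table. That split is what forces the detour through the appliance VPC.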

There are a couple of other technical points to note:

  • The Gateway Load Balancer creates a GENEVE tunnel to the firewall. GENEVE is an encapsulation protocol that uses UDP port 6081. The Gateway Load Balancer operates at layer 3 of the OSI network model, so it forwards traffic for any port to the target. There are multiple security appliances to choose from but they must support this protocol.
  • The Gateway Load Balancer uses a hashing algorithm based on the IP addresses of the flow and selects a firewall from the appliance pool for the duration of the flow.
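
To see why hashing on the flow matters, here is a minimal Python sketch. It is not the algorithm AWS uses, which the article does not describe. It simply shows the general idea: if the hash input is the same for both directions of a connection, the request and the response land on the same firewall, which a stateful appliance needs. The function names, firewall names and addresses are all made up for the example.

```python
import hashlib


def flow_key(src_ip, src_port, dst_ip, dst_port):
    """Order the two endpoints so both directions of a flow give the same key."""
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{first[0]}:{first[1]}|{second[0]}:{second[1]}"


def pick_target(targets, src_ip, src_port, dst_ip, dst_port):
    """Pick one target from the pool deterministically from the flow key."""
    key = flow_key(src_ip, src_port, dst_ip, dst_port)
    digest = hashlib.sha256(key.encode()).digest()
    return targets[int.from_bytes(digest[:8], "big") % len(targets)]


# Hypothetical firewall pool and flow.
firewalls = ["firewall-a", "firewall-b", "firewall-c"]
forward = pick_target(firewalls, "10.1.0.20", 49152, "10.2.0.15", 443)
reverse = pick_target(firewalls, "10.2.0.15", 443, "10.1.0.20", 49152)

print(forward == reverse)  # True
```

The same reasoning applies to appliance mode in step 4 of the flow: the transit gateway has to keep both directions of a flow on the same path, otherwise the firewall only sees half of the conversation.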

This can be used as a simple firewall that controls traffic based on origin/destination rules (based on protocol, IP and port) or it can be used to implement deep packet inspection. Deep packet inspection also involves breaking secure TLS communication by having a trusted certificate on the firewall. Traffic can then be decrypted, inspected and re-encrypted.

It is hard to implement IP/port based firewall rules in a transit gateway due to the nature of dynamic cloud networking.

Why inspect WAN traffic?

This allows complete scanning of all traffic crossing from one VPC to another. It is a pattern often used in organisations that require complete scanning of East-West traffic.

Note: In networking parlance, North-South traffic refers to traffic that enters or leaves the WAN. East-West traffic refers to traffic that stays within the WAN.
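
As a rough illustration of that distinction, the Python sketch below classifies a flow by checking whether both ends sit inside the internal address space. The address ranges are assumptions for the example (they reuse the two VPC ranges from the inspection walkthrough); a real WAN would have its own list.

```python
from ipaddress import ip_address, ip_network

# Hypothetical address ranges that make up the internal WAN.
WAN_RANGES = [ip_network("10.1.0.0/16"), ip_network("10.2.0.0/16")]


def is_internal(ip):
    """True if the address belongs to one of the internal ranges."""
    addr = ip_address(ip)
    return any(addr in net for net in WAN_RANGES)


def classify(src, dst):
    """East-west if both ends are internal, otherwise north-south."""
    return "east-west" if is_internal(src) and is_internal(dst) else "north-south"


print(classify("10.1.0.20", "10.2.0.15"))    # east-west
print(classify("10.1.0.20", "203.0.113.7"))  # north-south
```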

Many organisations employ some network traffic scanning at the boundary (North-South traffic). As part of a defence in depth strategy it is necessary to include some internal checks as well as perimeter security.

What are the issues?

There is a problem with the design of scanning all the traffic that flows between VPCs. The major issue is that it is a bit arbitrary.

In a modern AWS architecture I would encourage the use of multiple AWS accounts to separate concerns. This helps mitigate any breach and limits the blast radius of any intrusion. It is a very good idea, as writing complex IAM policies is error prone. It is also useful for separating administrative control or billing. I like to split workloads into separate AWS accounts. Separate accounts should also be used to separate production and development environments. I have also seen separate AWS accounts used to split types of workload, like putting all the containerised workloads in one account, EC2 in another and Lambda workloads in a 3rd.

In an AWS world you can create multiple VPCs when you want, but you are also forced to create separate VPCs when you cross account or region boundaries.

The big problem is defining workloads and where you need to insert security controls. If you use a container platform it would be mad to deploy a Kubernetes cluster for each application. Many of your container workloads will sit in one cluster in a single VPC. Generally you want to insert the controls between independent workloads to monitor East-West traffic. But in a modern cloud environment this does not equate to a traditional data centre design.

  • It is hard to create a set of IP based rules in a dynamic cloud environment where IP addresses can be non-sequential and short lived.
  • It can be hard to insert the controls where you need them. You may want to monitor the traffic between 2 specific apps but they may be in the same VPC. You may not want to monitor the traffic between a specific pair of apps, or an app and its database, as it is too chatty and you can't afford the latency.

What is the solution?

As with all things security, there is no one-size-fits-all approach. You have to model threats and decide what risks you are really trying to mitigate.

Often other approaches like Zero Trust are a better idea. If you use strong 2-way authentication and encryption at every stage you can secure internal traffic.

Controls inside container clusters are important to secure internal traffic. Network access within container platforms is often overlooked. Within a cluster, strong controls like RBAC and pod security policies stop attackers moving sideways through the network. Using a service mesh, where applicable, can help manage the interactions of components and improve security.

The other important item is observability. You need to make sure observability is implemented where you need it.

Improved security group support (NEW)

This is probably one of the most requested features in AWS history (I don't have any stats on that, but I was told by AWS there had been a lot of requests). It is such an important feature and it has taken a while to be delivered. I have been asking for it pretty much since Transit Gateway was launched. I was also told (unofficially) by AWS that it took so long because it involved some major changes to the networking stack and had some major dependencies. The same feature was already available with VPC peering.

I have had client projects where I have not been able to use transit gateways because this feature was missing. We ended up with lots of VPC peering and a much more complex network design!

Anyway - on to the feature. You can now reference security groups in other VPCs (and in other accounts) when the VPCs are connected via a transit gateway. This is a feature you have to specifically enable when you create a transit gateway.

aws ec2 create-transit-gateway \
  --description MyTransitGateway \
  --options SecurityGroupReferencingSupport=enable

This can also be turned on and off on each attachment. Whereas it defaults to off on the transit gateway, it is on by default for attachments. You can explicitly control it when creating the attachment:

aws ec2 create-transit-gateway-vpc-attachment \
  --transit-gateway-id [TGW-ID] \
  --vpc-id [VPC-ID] \
  --subnet-ids [SUBNET-ID] \
  --options SecurityGroupReferencingSupport=enable

It can also be turned on later by modifying the transit gateway or attachment.

Suppose you have 2 applications across 2 VPCs connected by a transit gateway, set up as follows:

  • Application 1 is an EC2 instance in VPC1 and has a security group SG1
  • Application 2 is another EC2 instance in VPC2 and has a security group SG2
  • SG1 and SG2 do not restrict outbound traffic but only allow specified inbound traffic

If you want Application 1 to send an API request to Application 2, you need to add a specific ingress rule to SG2 to allow that traffic. When you create this rule you can reference SG1 as the source.

aws ec2 authorize-security-group-ingress \
  --group-id [SG2-ID] \
  --protocol tcp --port 443 \
  --source-group [SG1-ID] \
  --group-owner [ACCOUNT 1 ID]

Security groups outside the default VPC have to be referenced by ID rather than by name. If the VPCs are in different accounts, add the --group-owner parameter to reference the account that owns SG1. Otherwise, leave this parameter out. TCP port 443 is used for HTTPS traffic.
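
To make the behaviour of a security group reference concrete, here is a small Python model. It is a sketch of the concept only, not of how AWS evaluates rules internally, and the class names, instance names and IP addresses are invented for the example. The point it shows is that a rule whose source is SG1 follows group membership rather than IP addresses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IngressRule:
    protocol: str
    port: int
    source_group: str  # a security group reference instead of a CIDR range


@dataclass
class Instance:
    name: str
    ip: str
    security_groups: set = field(default_factory=set)


def allows(rules, source, protocol, port):
    """True if any rule matches the protocol, port and one of the source's groups."""
    return any(
        rule.protocol == protocol
        and rule.port == port
        and rule.source_group in source.security_groups
        for rule in rules
    )


# SG2 allows HTTPS from anything that is a member of SG1.
sg2_ingress = [IngressRule("tcp", 443, "SG1")]

app1 = Instance("application-1", "10.1.0.20", {"SG1"})
other = Instance("other-workload", "10.1.0.99", {"SG3"})

print(allows(sg2_ingress, app1, "tcp", 443))   # True
print(allows(sg2_ingress, other, "tcp", 443))  # False

# The instance is replaced and gets a new IP: the rule still applies.
app1.ip = "10.1.7.42"
print(allows(sg2_ingress, app1, "tcp", 443))   # True
```

This is why the feature matters in a dynamic environment: nothing in the rule has to change when instances come and go or their addresses move.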

A couple of important notes:

  • This currently only works for inbound rules and not outbound rules
  • This only works within a single region
  • This does not currently work for AWS Local Zones or Outposts
  • Multicast is not currently supported

Another important note is that this will not work with the traffic inspection described above. It is an either/or choice at the moment.

This feature has been available for VPC peering for some time but has been missing for transit gateways.

Conclusion

It is almost impossible for anyone to design your security for you. You need to understand your application, regulatory requirements, your threat model and your organisation's appetite for risk. Network security is part of a package of security measures and it also depends on what other controls you have in place. Having said all of that, my preferred approach is:

  • Use security groups for all east-west traffic
  • Use inspection if required for north-south traffic
  • Use controls within any container platform for internal traffic
  • Encrypt all traffic (including internal traffic)
  • As much as possible follow zero trust principles. Even if you can't have a complete zero trust approach, do as much as you can.
  • Use VPC flow logs for monitoring traffic
  • Design your AWS account structure to split up workloads and different concerns as much as possible

Unless you absolutely need to, I would try to avoid having a security appliance scanning east-west traffic. Not only will it break the new security group functionality, it can also add an unacceptable amount of cost and latency and force security compromises in other areas.

One of the most important tips I have for anybody is to build a relationship with your security team. They may not be particularly experienced with cloud or AWS. You need to explain the trade-offs and take them on a journey to help them understand the risks.

