HA, multi-region, low latency app infrastructure on AWS using Terraform (Part-1)
Hi all. Good day!
In this article, I’ve shown how to create a highly available, multi-region, low-latency application infrastructure on AWS from scratch using the popular IaC tool Terraform.
I’ve created an IAM instance profile, security groups, a Launch Template, an Auto Scaling group, an Application Load Balancer, AWS CloudFront, and ACM certificates.
So let's get started.
In the Terraform provider blocks, I have kept N. Virginia (us-east-1) as the default region and added Singapore (ap-southeast-1) as a second, aliased provider (i.e. the backup region).
provider "aws" {
  region = "us-east-1"
}
provider "aws" {
  region = "ap-southeast-1"
  alias  = "Singapore"
}
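For completeness, a Terraform configuration like this also usually carries a terraform settings block pinning the AWS provider. This block is not shown in the original article; the version constraint below is an assumption, a minimal sketch only:

```hcl
# Sketch of a terraform settings block (not in the original article);
# the version constraint is an assumption.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```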
Here I have used the existing default VPC and subnets in both the us-east-1 (N. Virginia) and ap-southeast-1 (Singapore) regions, two of my existing S3 buckets (one per region) to store static content, and my Route 53 public hosted zone. So I have created data source blocks for them.
data "aws_vpc" "myvpc" {
  id = var.vpc_id
}
data "aws_subnets" "mysubnets" {
  filter {
    name   = "vpc-id" # Don't change this name, or TF will fail
    values = [var.vpc_id]
  }
}
data "aws_subnet" "mysubnet" {
  for_each = toset(data.aws_subnets.mysubnets.ids)
  id       = each.value
}
data "aws_s3_bucket" "mybucket" {
  bucket = var.my_custom_bucket
}
# Route-53 data source
data "aws_route53_zone" "pub_domain_zone" {
  name         = var.route53_domain_name
  private_zone = false
}
# Singapore region
data "aws_vpc" "myvpc_Singapore" {
  provider = aws.Singapore
  id       = var.vpc_id_Singapore
}
data "aws_subnets" "mysubnets_Singapore" {
  provider = aws.Singapore
  filter {
    name   = "vpc-id" # Don't change this name, or TF will fail
    values = [var.vpc_id_Singapore]
  }
}
data "aws_subnet" "mysubnet_Singapore" {
  provider = aws.Singapore
  for_each = toset(data.aws_subnets.mysubnets_Singapore.ids)
  id       = each.value
}
data "aws_s3_bucket" "mybucket2" {
  provider = aws.Singapore
  bucket   = var.my_second_bucket
}
First, I created an IAM instance profile for the Launch Template. For it, I created an AWS IAM role whose assume-role policy (written with jsonencode) trusts ‘ec2.amazonaws.com’, then attached an Amazon-managed policy to the role. You can create and attach your own policy as well. Since AWS IAM is a global service, the same IAM entities can be used with resources in both regions.
resource "aws_iam_instance_profile" "ltemplateprofile" {
  name = "${var.launchtemplate_name}-profile"
  role = aws_iam_role.ltrole.name
}
resource "aws_iam_role" "ltrole" {
  name = "${var.launchtemplate_name}-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Sid    = ""
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      },
    ]
  })
  tags = {
    tag-key = "launch_template-role"
  }
}
resource "aws_iam_role_policy_attachment" "ltrole_attachment" {
  role       = aws_iam_role.ltrole.name
  policy_arn = var.launch_template_role_policy_arn
}
Next, I created two security groups, one for the Auto Scaling group and one for the Application Load Balancer. The ASG security group only allows traffic coming from the ALB security group.
resource "aws_security_group" "alb_sg" {
  vpc_id = data.aws_vpc.myvpc.id
  ingress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  tags = {
    Name = "ALB-SG"
  }
}
resource "aws_security_group" "asg_sg" {
  depends_on  = [aws_security_group.alb_sg]
  vpc_id      = data.aws_vpc.myvpc.id
  description = "Security group for Autoscaling group"
  ingress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    security_groups = [aws_security_group.alb_sg.id]
    description     = "Allow from ALB SG"
  }
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
  tags = {
    Name = "ASG-SG"
  }
}
Now I created the Launch Template resource block. Here I added the depends_on meta-argument so that Terraform creates the IAM instance profile and the ASG security group before it starts creating the Launch Template. I set the capacity reservation preference to ‘open’, selected the IAM instance profile, image ID (AMI ID), shutdown behaviour (‘terminate’), instance type, key pair (optional), the ASG security group, tags, and finally the base64-encoded userdata.sh. I have used variables and resource references throughout.
resource "aws_launch_template" "utpal-lt" {
  depends_on = [aws_iam_instance_profile.ltemplateprofile, aws_security_group.asg_sg]
  name       = "lt-${var.launchtemplate_name}"
  capacity_reservation_specification {
    capacity_reservation_preference = "open"
  }
  iam_instance_profile {
    name = aws_iam_instance_profile.ltemplateprofile.name
  }
  image_id                             = var.imageid
  instance_initiated_shutdown_behavior = "terminate"
  instance_type                        = var.instancetype
  key_name                             = var.keypairname
  vpc_security_group_ids               = [aws_security_group.asg_sg.id]
  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "utpal-lt"
    }
  }
  user_data = filebase64("${path.module}/userdata/userdata.sh")
}
Next, I created a resource block for an AWS placement group with the ‘partition’ strategy (set in the tfvars file), then an Auto Scaling group resource block, again with the depends_on meta-argument.
I added its name, max and min sizes, and desired capacity; selected the Launch Template (latest version) and the placement group via resource references; and set the health check type (‘ELB’, in the tfvars file) and grace period. I selected the subnets from the data source in the vpc_zone_identifier argument and added tags. A lifecycle meta-argument ignores changes to the max and min sizes and desired capacity.
I added an instance_refresh block with the rolling strategy, 50% as the minimum healthy percentage, and tags as the trigger. With this block, whenever any configuration change occurs, a new version of the Launch Template is created, or any tag changes, the ASG performs a rolling update at the specified healthy percentage. (Later I added the ALB target group ARN after creating and configuring the ALB.)
Then I added a resource block for the Auto Scaling policy (‘TargetTrackingScaling’ in the tfvars file) with average CPU utilization as the predefined metric.
resource "aws_placement_group" "asg_placement_group" {
  name     = "utpal-pg"
  strategy = lower(var.asg_placement_group_strategy)
}
locals {
  app_suffix = element(split(".", var.cloudfront_subdomain_name), 0)
}
resource "aws_autoscaling_group" "utpal-asg" {
  depends_on       = [aws_launch_template.utpal-lt, aws_placement_group.asg_placement_group, aws_lb.my_alb]
  name             = var.asg_name
  max_size         = var.asg_max_size
  min_size         = var.asg_min_size
  desired_capacity = var.asg_desired_capacity
  launch_template {
    id      = aws_launch_template.utpal-lt.id
    version = aws_launch_template.utpal-lt.latest_version
  }
  health_check_grace_period = 60
  health_check_type         = var.asg_health_check_type
  placement_group           = aws_placement_group.asg_placement_group.id
  vpc_zone_identifier       = [for s in data.aws_subnet.mysubnet : s.id]
  target_group_arns         = [aws_lb_target_group.alb_tg.arn]
  lifecycle {
    ignore_changes = [desired_capacity, max_size, min_size]
  }
  tag {
    key                 = "Name"
    value               = "${local.app_suffix}-LTV.${aws_launch_template.utpal-lt.latest_version}"
    propagate_at_launch = true
  }
  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
    triggers = ["tag"]
  }
}
resource "aws_autoscaling_policy" "asg-policy-alb" {
  name                   = "utpal-asg-policy"
  autoscaling_group_name = aws_autoscaling_group.utpal-asg.name
  policy_type            = var.asg_policy_type
  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}
Next, I created the Load Balancer resource block with the depends_on meta-argument. It is an internet-facing Application Load Balancer. I selected the security group and subnets, and added a name and tag.
Then, in the target group resource block, I added the name; selected the VPC and target type (‘instance’ in the tfvars file); enabled an HTTP health check with ‘/’ as the path; and set a 60-second deregistration delay (in the tfvars file).
resource "aws_lb" "my_alb" {
  depends_on         = [aws_security_group.alb_sg]
  name               = "${var.lb_sufix}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = [for s in data.aws_subnet.mysubnet : s.id]
  # access_logs {
  #   bucket  = var.alb_log_bucket_name
  #   prefix  = var.alb_log_prefix
  #   enabled = var.enable_alb_logs
  # }
  tags = {
    Environment = "Dev"
  }
}
resource "aws_lb_target_group" "alb_tg" {
  name        = "${var.lb_sufix}-alb-tg"
  vpc_id      = data.aws_vpc.myvpc.id
  target_type = var.lb_target_type
  port        = 80
  protocol    = "HTTP"
  health_check {
    enabled  = true
    protocol = "HTTP"
    path     = "/"
  }
  deregistration_delay = var.lb_deregistration_delay
}
Next, I added a load balancer listener resource block that depends on an ACM certificate, and selected the load balancer ARN. The listener uses the HTTPS protocol with the ‘ELBSecurityPolicy-2016-08’ SSL policy to negotiate connection requests.
Since I want the listener to accept traffic only from my CloudFront distribution and deny all other traffic with a 403 status code and a custom error message, I configured the default_action block as shown.
Then I added a load balancer listener rule resource block as a second rule with priority 1, and selected the listener ARN. The condition block checks a custom HTTP header name and value. (Note: treat this HTTP header name and value as sensitive, like a password, and use the HTTPS listener protocol.) The rule forwards traffic carrying this custom HTTP header to the target group. Later I added the same header to the CloudFront origin block.
resource "aws_lb_listener" "alb_listener" {
  depends_on        = [aws_acm_certificate.alb_acm_cert]
  load_balancer_arn = aws_lb.my_alb.arn
  protocol          = "HTTPS"
  port              = 443
  ssl_policy        = "ELBSecurityPolicy-2016-08"
  certificate_arn   = aws_acm_certificate.alb_acm_cert.arn
  default_action {
    type = "fixed-response"
    fixed_response {
      content_type = "text/plain"
      message_body = "You are not allowed to view this app"
      status_code  = "403"
    }
  }
}
resource "aws_lb_listener_rule" "withcfheader" {
  listener_arn = aws_lb_listener.alb_listener.arn
  priority     = 1
  condition {
    http_header {
      http_header_name = var.custom_header_name1    # Sensitive
      values           = [var.custom_header_value1] # Sensitive
    }
  }
  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.alb_tg.arn
  }
  tags = {
    OnlyAccept = "fromCloudFront"
  }
}
I have created an AWS ACM certificate resource block for the ALB listener. It uses DNS validation and a ‘create before destroy’ lifecycle rule.
As part of the DNS validation, a Route 53 record resource block adds the ACM certificate’s record name, value, and type to my Route 53 public zone with a 60-second TTL. An ACM certificate validation resource block then validates it. I also created an aws_route53_record block so that an alias record is added for the Application Load Balancer.
resource "aws_acm_certificate" "alb_acm_cert" {
  domain_name       = var.alb_subdomain_name
  validation_method = "DNS"
  tags = {
    Environment = "Dev"
  }
  lifecycle {
    create_before_destroy = true
  }
}
resource "aws_route53_record" "alb_acm_cert_domain_record" {
  for_each = {
    for dvo in aws_acm_certificate.alb_acm_cert.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      record = dvo.resource_record_value
      type   = dvo.resource_record_type
    }
  }
  allow_overwrite = true
  name            = each.value.name
  records         = [each.value.record]
  ttl             = 60
  type            = each.value.type
  zone_id         = data.aws_route53_zone.pub_domain_zone.zone_id
}
resource "aws_acm_certificate_validation" "alb_acm_cert_validation" {
  certificate_arn         = aws_acm_certificate.alb_acm_cert.arn
  validation_record_fqdns = [for record in aws_route53_record.alb_acm_cert_domain_record : record.fqdn]
}
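The text above mentions an alias record for the Application Load Balancer, but its resource block isn’t shown. A sketch of what it might look like, assuming the record name comes from var.alb_subdomain_name; the resource name and the evaluate_target_health setting are my assumptions:

```hcl
# Sketch of the ALB alias record mentioned in the text (block not shown in
# the original); resource name and evaluate_target_health are assumptions.
resource "aws_route53_record" "alb_alias" {
  zone_id = data.aws_route53_zone.pub_domain_zone.zone_id
  name    = var.alb_subdomain_name
  type    = "A"
  alias {
    name                   = aws_lb.my_alb.dns_name    # ALB DNS name
    zone_id                = aws_lb.my_alb.zone_id     # ALB hosted zone
    evaluate_target_health = true
  }
}
```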
With that, the infrastructure code for the primary region (N. Virginia) is ready.
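For reference, the variables used throughout could be declared along these lines. This is a partial variables.tf sketch; the article does not show its variables file, so the types (and the example values in comments) are assumptions:

```hcl
# Partial variables.tf sketch for the variables referenced above;
# types and example values are assumptions, not from the article.
variable "vpc_id" { type = string }
variable "vpc_id_Singapore" { type = string }
variable "my_custom_bucket" { type = string }
variable "my_second_bucket" { type = string }
variable "route53_domain_name" { type = string }
variable "launchtemplate_name" { type = string }
variable "launch_template_role_policy_arn" { type = string }
variable "imageid" { type = string }
variable "instancetype" { type = string }
variable "keypairname" { type = string }
variable "asg_placement_group_strategy" { type = string } # e.g. "partition"
variable "asg_name" { type = string }
variable "asg_max_size" { type = number }
variable "asg_min_size" { type = number }
variable "asg_desired_capacity" { type = number }
variable "asg_health_check_type" { type = string } # e.g. "ELB"
variable "asg_policy_type" { type = string } # e.g. "TargetTrackingScaling"
variable "cloudfront_subdomain_name" { type = string }
variable "alb_subdomain_name" { type = string }
variable "lb_sufix" { type = string }
variable "lb_target_type" { type = string } # e.g. "instance"
variable "lb_deregistration_delay" { type = number } # e.g. 60
variable "custom_header_name1" {
  type      = string
  sensitive = true
}
variable "custom_header_value1" {
  type      = string
  sensitive = true
}
```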
(To be continued, Next part link)
Please click here to get my other articles.
Thanks for reading the article. Please check out my other articles on LinkedIn too. And a humble request: I'm looking for a new job and would appreciate your support. I have 5.5+ years of experience with AWS, Azure, Azure DevOps, Terraform, Kubernetes, etc. I am currently serving as a DevOps Engineer at Accenture.