Automated website hosting using Terraform || EFS for persistent storage
Khushi Thareja
Aspiring DevOps-Cloud Architect | RHCE v8 | 4x Redhat Certified | 3x Microsoft Certified
Terraform is a great tool for creating cloud infrastructure as code. It beats clicking around the AWS console and reduces the risk of human error. If our infrastructure contains hundreds of resources, why create them manually every time? Terraform defines infrastructure as code to manage its full lifecycle: create new resources, manage existing ones, and destroy those no longer needed. It also makes it easy to reuse configurations for similar infrastructure, helping you avoid mistakes and save time. So why not use Terraform? This article walks through automating the launch of a website on the AWS Cloud and integrating the instance with EFS for persistent storage. Let us look at the steps one by one for a better understanding of the infrastructure.
- Since we want to use a file system, we first create the VPC, subnet, route table, and route table association dynamically, then launch the file system and attach it to our EC2 instance.
- Next, we create an EC2 instance on AWS and install all the required software inside it.
- Next, we pull the code pushed by the developer to GitHub and clone it into the required folder inside our EC2 instance.
- We create an S3 bucket to store all the static data such as images, videos, and documents.
- This content is distributed to edge locations using another AWS service, CloudFront, so that no customer across the globe faces latency.
- The final webpage is displayed in the browser automatically.
One thing to note here is why we use a file system at all: persistent storage. Unlike an EBS volume, which is tied to a single Availability Zone, an EFS file system can be mounted from any Availability Zone in the region. So if the instance is ever relaunched elsewhere, for example by an auto scaler placing it in a different subnet or Availability Zone, we do not lose our data.
Let's start building the code.
#provider
provider "aws" {
  profile = "khushi"
  region  = "ap-south-1"
}
The provider block specifies the cloud provider we are going to use. It is important because Terraform downloads the plugins for that particular provider, and these plugins are what make Terraform intelligent about the provider's APIs. Profile specifies which account you are logging in with; Terraform picks up the credentials for that account from your local system.
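These plugins are fetched when you run terraform init. On Terraform 0.13 or later you can also pin the provider source and version explicitly. This is an optional sketch and the version constraint shown is only an example, not something from the original setup:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"   # official AWS provider
      version = "~> 3.0"          # example constraint; pin to whatever you have tested
    }
  }
}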
# vpc
resource "aws_vpc" "tf_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "task2-tf-vpc"
  }
}
aws_vpc is the resource used to create a VPC in AWS. cidr_block is a required parameter and specifies the IP range for the VPC. enable_dns_support is an optional boolean flag to enable/disable DNS support in the VPC; it defaults to true. enable_dns_hostnames is also optional and enables/disables DNS hostnames in the VPC; it defaults to false.
# subnet
resource "aws_subnet" "tf_subnet" {
  depends_on = [
    aws_vpc.tf_vpc,
  ]

  vpc_id                  = aws_vpc.tf_vpc.id
  availability_zone       = "ap-south-1a"
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true

  tags = {
    Name = "task2-tf-subnet"
  }
}
depends_on is a meta-argument that declares that this resource depends on another resource. If we do not declare dependencies that Terraform cannot infer on its own, resources may be created in the wrong order and the apply can fail or hang. Now, let's look at the parameters we used. vpc_id specifies the VPC in which we want to create our subnet; referencing aws_vpc.tf_vpc.id lets us point at the VPC created in the previous step dynamically. availability_zone specifies the AZ in which we want the subnet. cidr_block specifies the subnet IP range; remember that the subnet range must fall inside the VPC range. map_public_ip_on_launch is an optional parameter: setting it to true means instances launched into the subnet are assigned a public IP address, and it defaults to false.
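If you prefer not to hard-code the subnet range, Terraform's cidrsubnet() function can derive it from the VPC's CIDR block. This is just an optional variant of the cidr_block argument above; the local name subnet_cidr is mine:

locals {
  # cidrsubnet("10.0.0.0/16", 8, 1) evaluates to "10.0.1.0/24"
  subnet_cidr = cidrsubnet(aws_vpc.tf_vpc.cidr_block, 8, 1)
}

The subnet resource could then use cidr_block = local.subnet_cidr, so the subnet always stays inside the VPC range even if the VPC CIDR changes.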
# internet gateway
resource "aws_internet_gateway" "tf_ig" {
  depends_on = [
    aws_vpc.tf_vpc,
  ]

  vpc_id = aws_vpc.tf_vpc.id

  tags = {
    Name = "task2-tf-ig"
  }
}
What is an internet gateway? An internet gateway is a virtual router you can add to enable direct connectivity to the internet. Resources that need to use the gateway for internet access must be in a public subnet and have a public IP address. Each public subnet that needs to use the internet gateway must have a route in its route table that specifies the gateway as the target. This internet gateway resource takes vpc_id as a parameter, which places the gateway in the VPC we just created.
# route table
resource "aws_route_table" "tf_route" {
  depends_on = [
    aws_vpc.tf_vpc,
  ]

  vpc_id = aws_vpc.tf_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.tf_ig.id
  }

  tags = {
    Name = "task2-tf-route"
  }
}
A route table contains a set of rules, called routes, that are used to determine where network traffic from your subnet or gateway is directed. Again we provided the vpc_id and gateway_id for the same purpose as before.
# route association
resource "aws_route_table_association" "tf_assoc" {
  depends_on = [
    aws_subnet.tf_subnet,
  ]

  subnet_id      = aws_subnet.tf_subnet.id
  route_table_id = aws_route_table.tf_route.id
}
Route table association is the association between a route table and a subnet, internet gateway, or virtual private gateway.
Next, we come to creating and attaching our file system.
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It brings plenty of advantages: shared file storage, scalable performance, dynamic elasticity, full management by AWS, cost-effectiveness, and security and compliance features.
#securitygroup
resource "aws_security_group" "tf_efs_sg" {
  name        = "tf_efs_sg"
  description = "Communication-efs"
  vpc_id      = aws_vpc.tf_vpc.id

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "tf-task2-sg"
  }
}
Before creating the file system, we create a security group that will be used by our other resources. For vpc_id we again reference the VPC we created, because the security group should belong to our self-created VPC. Egress controls the outbound traffic, and here it is open on all ports; ingress is the traffic coming into our instance. Since the EFS mount target uses the same security group to communicate with the EC2 instance, you either allow the required ports (NFS uses 2049) or open all ports in both ingress and egress. Allowing all ports in ingress is not recommended; a tighter variant is sketched below.
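Since opening every ingress port is discouraged, a tighter alternative is to allow only what this setup actually needs: SSH (22) and HTTP (80) from anywhere, and NFS (2049) from inside the VPC for the EFS traffic. This is a minimal sketch; the resource name tf_efs_sg_strict and the choice of ports are mine, so adjust them to your requirements.

#securitygroup (stricter variant)
resource "aws_security_group" "tf_efs_sg_strict" {
  name        = "tf_efs_sg_strict"
  description = "SSH, HTTP and NFS only"
  vpc_id      = aws_vpc.tf_vpc.id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "NFS for the EFS mount target"
    from_port   = 2049
    to_port     = 2049
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]   # only from inside the VPC
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "tf-task2-sg-strict"
  }
}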
# create efs
resource "aws_efs_file_system" "tf_efs" {
  creation_token = "tf-EFS-task2"

  tags = {
    Name = "awsEFS"
  }
}
In the EFS resource you just need to specify a creation_token, which must be unique across your file systems, and a Name tag.
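Optionally, and not part of the original setup, the same resource can also encrypt data at rest and move cold files to the Infrequent Access storage class. This is a sketch of a variant of the resource above, not an additional file system:

resource "aws_efs_file_system" "tf_efs" {
  creation_token = "tf-EFS-task2"
  encrypted      = true   # encrypt data at rest with the default KMS key

  lifecycle_policy {
    transition_to_ia = "AFTER_30_DAYS"   # files not read for 30 days move to EFS IA
  }

  tags = {
    Name = "awsEFS"
  }
}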
# mount efs
resource "aws_efs_mount_target" "tf_mount" {
  depends_on = [
    aws_efs_file_system.tf_efs,
    aws_subnet.tf_subnet,
    aws_security_group.tf_efs_sg,
  ]

  file_system_id  = aws_efs_file_system.tf_efs.id
  subnet_id       = aws_subnet.tf_subnet.id
  security_groups = [aws_security_group.tf_efs_sg.id]
}
This block creates a mount target for the EFS file system in our subnet and attaches the security group we created in the previous steps; the EC2 instance will reach the file system through this mount target.
# access point efs
resource "aws_efs_access_point" "efs_access" {
  depends_on = [
    aws_efs_file_system.tf_efs,
  ]

  file_system_id = aws_efs_file_system.tf_efs.id
}
Amazon EFS access points are application-specific entry points into an EFS file system that make it easier to manage application access to shared datasets. We just need to specify the file system ID and this resource creates the access point for us.
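An access point can also enforce a POSIX identity and scope clients to a sub-directory of the file system. The block below is an optional sketch of those arguments on the same resource; the /webroot path and the uid/gid values (48 is the apache user on Amazon Linux) are my assumptions, so adjust them to your own setup.

resource "aws_efs_access_point" "efs_access" {
  file_system_id = aws_efs_file_system.tf_efs.id

  # All I/O through this access point acts as this user/group.
  posix_user {
    uid = 48   # apache on Amazon Linux; change to match your web server user
    gid = 48
  }

  # Clients mounting via this access point see /webroot as the file system root.
  root_directory {
    path = "/webroot"
    creation_info {
      owner_uid   = 48
      owner_gid   = 48
      permissions = "0755"
    }
  }
}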
#ec2 instance launch
resource "aws_instance" "tf_task2_ec2_webserver" {
  depends_on = [
    aws_vpc.tf_vpc,
    aws_subnet.tf_subnet,
    aws_efs_file_system.tf_efs,
  ]

  ami             = "ami-08706cb5f68222d09"
  instance_type   = "t2.micro"
  subnet_id       = aws_subnet.tf_subnet.id
  security_groups = [aws_security_group.tf_efs_sg.id]
  key_name        = "mykey"

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("C:/Users/HP/Downloads/mykey.pem")
    host        = self.public_ip   # self avoids referring to the resource by its own address
  }

  provisioner "remote-exec" {
    inline = [
      "sudo su <<END",
      "yum install git php httpd amazon-efs-utils -y",
      "rm -rf /var/www/html/*",
      "/usr/sbin/httpd",
      "efs_id=${aws_efs_file_system.tf_efs.id}",
      "mount -t efs $efs_id:/ /var/www/html",
      "git clone https://github.com/khushi20218/cloud1.git /var/www/html/",
      "END",
    ]
  }

  tags = {
    Name = "tf_task2_ec2_webserver"
  }
}
This launches the EC2 instance on which our website will run. Amazon provides many OS images, known as AMIs (Amazon Machine Images); the unique AMI ID can be taken from the management console.
Instance type: we specify the instance type according to our requirements, such as the number of vCPUs and the amount of RAM.
The key used here is a pre-created key pair stored on our local system. We then establish an SSH connection to the instance so that we can run commands on the remote system and install all the required software.
Provisioners run in two ways: remote-exec (on the remote system) and local-exec (on our own local machine). Here we use remote-exec to install httpd, git, and php, mount the EFS file system on /var/www/html, and clone the website code into it.
(Note that these commands are specific to the OS we used, i.e. Amazon Linux. If you use another OS, adjust the commands accordingly.)
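As an alternative to remote-exec, the same bootstrap could be handed to cloud-init through user_data, which runs as root on first boot, so no SSH connection or sudo workaround is needed. This is a sketch under that assumption, a trimmed variant of the instance resource above rather than an additional resource:

resource "aws_instance" "tf_task2_ec2_webserver" {
  depends_on = [
    aws_efs_mount_target.tf_mount,   # the mount target must exist before the script mounts EFS
  ]

  ami           = "ami-08706cb5f68222d09"
  instance_type = "t2.micro"
  subnet_id     = aws_subnet.tf_subnet.id
  key_name      = "mykey"

  # cloud-init executes this script as root on first boot.
  user_data = <<-EOF
    #!/bin/bash
    yum install -y git php httpd amazon-efs-utils
    mount -t efs ${aws_efs_file_system.tf_efs.id}:/ /var/www/html
    git clone https://github.com/khushi20218/cloud1.git /var/www/html/
    systemctl enable --now httpd
  EOF

  tags = {
    Name = "tf_task2_ec2_webserver"
  }
}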
# s3 bucket
resource "aws_s3_bucket" "tf_s3bucket" {
  bucket = "098web1bucket2"
  acl    = "public-read"
  region = "ap-south-1"

  tags = {
    Name = "098web1bucket2"
  }
}
We create an S3 bucket to hold the static data of the website; serving it through CloudFront in the next step ensures that clients anywhere in the world can access it with low latency.
# adding object to s3
resource "aws_s3_bucket_object" "tf_s3_image-upload" {
  depends_on = [
    aws_s3_bucket.tf_s3bucket,
  ]

  bucket = aws_s3_bucket.tf_s3bucket.bucket
  key    = "download.jpg"
  source = "C:/Users/HP/Documents/cloud/download.jpg"
  acl    = "public-read"
}
Now we upload the object (i.e. the static data) to the S3 bucket we just created. key is the name the object gets inside the bucket, and source is the local path of the file to be uploaded.
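If there are many static files, they could all be uploaded from one resource using for_each with the fileset() function instead of one resource per file. This is an optional sketch; the folder path reuses the one above, the resource name tf_s3_static is mine, and content types are left to the S3 default:

# Upload every file in the folder; each.value is the file name relative to the folder.
resource "aws_s3_bucket_object" "tf_s3_static" {
  for_each = fileset("C:/Users/HP/Documents/cloud", "*")

  bucket = aws_s3_bucket.tf_s3bucket.bucket
  key    = each.value
  source = "C:/Users/HP/Documents/cloud/${each.value}"
  acl    = "public-read"
}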
# cloudfront variable
variable "oid" {
  type    = string
  default = "S3-"
}

locals {
  s3_origin_id = "${var.oid}${aws_s3_bucket.tf_s3bucket.id}"
}

# cloudfront distribution
resource "aws_cloudfront_distribution" "tf_s3_distribution" {
  depends_on = [
    aws_s3_bucket_object.tf_s3_image-upload,
  ]

  origin {
    domain_name = aws_s3_bucket.tf_s3bucket.bucket_regional_domain_name
    origin_id   = local.s3_origin_id
  }

  enabled = true

  default_cache_behavior {
    allowed_methods  = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods   = ["GET", "HEAD"]
    target_origin_id = local.s3_origin_id

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }

    viewer_protocol_policy = "allow-all"
    min_ttl                = 0
    default_ttl            = 3600
    max_ttl                = 86400
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("C:/Users/HP/Downloads/mykey.pem")
    host        = aws_instance.tf_task2_ec2_webserver.public_ip
  }

  # self.domain_name avoids referring to the distribution by its own address.
  provisioner "remote-exec" {
    inline = [
      "sudo su <<END",
      "sudo echo \"<img src='https://${self.domain_name}/${aws_s3_bucket_object.tf_s3_image-upload.key}' height='200' width='200' >\" >> /var/www/html/index.php",
      "END",
    ]
  }
}
CloudFront is the AWS service that caches our data in small data centres (edge locations) around the world to achieve low latency. Keep the following points in mind while configuring CloudFront:
- Specify the domain name and the origin id.
- Set the default_cache_behavior which is a required block of code.
- Set the viewer_protocol_policy along with the minimum, default, and maximum TTLs.
- Set any restrictions if required (whitelist & blacklist).
- Set cloudfront_default_certificate = true inside the viewer_certificate block.
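One optional hardening step, not part of the original setup, is to front the bucket with an origin access identity so the objects no longer need a public-read ACL. A sketch of the extra resource (the name tf_oai is mine):

# Identity that CloudFront uses to fetch objects from the bucket.
resource "aws_cloudfront_origin_access_identity" "tf_oai" {
  comment = "Access to 098web1bucket2 via CloudFront only"
}

Inside the distribution's origin block you would then add an s3_origin_config block whose origin_access_identity is aws_cloudfront_origin_access_identity.tf_oai.cloudfront_access_identity_path, and attach a bucket policy granting that identity s3:GetObject on the bucket.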
Now we need to put the URL provided by CloudFront into the code provided by the developer so that the client can see the image.
To append to a file owned by root we need root privileges, and by default we are logged in as ec2-user. Logging in directly as root is possible from the GUI or CLI, but not directly from Terraform. Switching user on the fly with sudo su - root spawns a child shell; that works fine when typed on the local system, but when issued from a remote provisioner, here Terraform's remote-exec, it fails to get the child shell. The working solution is to run a shell with sudo and feed it the commands through a heredoc.
The echo command appends an <img> tag, pointing at the CloudFront domain name and the uploaded object key, to the end of /var/www/html/index.php, so the final webpage the client sees loads the image through CloudFront.
# opening via chrome
resource "null_resource" "website" {
  depends_on = [
    aws_cloudfront_distribution.tf_s3_distribution,
  ]

  provisioner "local-exec" {
    # The instance serves plain HTTP (no TLS), so open the site over http.
    command = "start chrome http://${aws_instance.tf_task2_ec2_webserver.public_ip}/"
  }
}
This is an extra convenience: as soon as the code finishes applying, Chrome automatically opens the public URL of the website, which is a quick way to verify that the site is up. Now, let's see how Terraform builds this code.
Run terraform init once in the folder containing the code so that Terraform downloads the provider plugins, then continue with terraform apply. Apply first shows a plan of the resources to be added, changed, or destroyed, then reads every file with the .tf extension in the current folder and builds the infrastructure, creating the requested resources one by one. (Note that this requires a stable internet connection, so make sure you have one!) After the apply is complete, we can check through the GUI whether all the requested resources have been created.
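Optionally, output blocks can print the important values straight to the terminal after apply, so you don't have to look them up in the console. This is a small optional addition and the output names are mine:

output "cloudfront_domain_name" {
  value = aws_cloudfront_distribution.tf_s3_distribution.domain_name
}

output "webserver_public_ip" {
  value = aws_instance.tf_task2_ec2_webserver.public_ip
}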
And as soon as the apply is complete, our website opens in Chrome, exactly as requested in our last resource!
That's all! Do leave your valuable feedback. For any queries or corrections, feel free to contact me.