CLOUD COMPUTING PART-2
Ashutosh Pandey
DevOps Engineer @ Orange Business | Navigating the World of Containers
AUTOMATING THE AWS INFRASTRUCTURE WITH EFS SERVICE USING TERRAFORM
Hello and welcome to my second article on cloud computing. I hope you liked the previous one, where I covered some cloud basics and showed how to automate infrastructure creation on the public cloud AWS with the Infrastructure as Code tool Terraform. In this article I am extending that setup with an AWS storage service called Amazon Elastic File System (EFS). What's new in this service? Let's find out!
Cloud computing vendors provide many types of services, and each has its own strengths and use cases, so rather than talking about disadvantages it is better to ask which service fits which workload. Before we get to the EFS service, let me introduce a few storage concepts.
What is Storage?
Every computer, mobile device and server generates data every second, and each of them needs some kind of storage to keep that information, either temporarily or permanently. The storage can be a device inside the machine or attached externally.
Companies and organisations generate massive amounts of data, often big data, and that data has to live somewhere. One option is to buy storage appliances from vendors such as Dell EMC or Hitachi. These appliances come with excellent maintenance and support, but they also need storage engineers to run them, and small companies and startups usually cannot afford that kind of investment (my first cloud computing article explains in more detail why startups opt for the cloud). Beyond the hardware cost, these appliances ship with proprietary, closed-source software: we cannot customise it for our own use cases, we have to buy licences, and we keep paying maintenance charges. That is why startup companies rarely take the risk of buying such large appliances.
They can instead use open-source software-defined storage (SDS) such as Hadoop, Ceph or GlusterFS. These tools are best in class for managing storage clusters and have large communities behind them, but we still need hardware underneath, and as we have seen, startups and small companies cannot invest in buying, maintaining, distributing and updating that hardware. This is where the cloud comes in: cloud providers offer practically every type of service an industry needs, sold free or on demand so customers pay only for what they use, a model known as pay-as-you-go.
Now let's talk about storage in terms of the cloud. A cloud offering that provides storage is known as Storage as a Service (STaaS).
There are three types of Storage service available in cloud :-
- Block Storage ( as a service )
- Object Storage ( as a service )
- File storage/File system ( as a service )
Block Storage:-
Block storage splits data into blocks and stores them as separate pieces. We can create partitions, format them with a file system such as FAT32, NTFS, EXT3 or EXT4, mount them on a directory and store our files there, for example a pen drive or an internal or external hard disk drive. Block storage is also the only storage type on which we can install an operating system.
- On the public AWS cloud, block storage as a service is EBS (Elastic Block Store).
- On the private cloud OpenStack, block storage as a service is Cinder.
Object Storage:-
Object storage only lets us upload and download whole objects. It follows the Dropbox/Google Drive model, where users upload and download files accessed directly over the HTTP/HTTPS protocol; here an object simply means a file.
- On the public AWS cloud, object storage as a service is Amazon S3 (Simple Storage Service).
- On the private cloud OpenStack, object storage as a service is Swift.
File Storage:-
File storage shares a specific folder/directory over the network, typically via the Network File System (NFS) protocol; a device dedicated to this is also known as a network-attached storage (NAS) device.
- On the public AWS cloud, file storage as a service is Amazon Elastic File System (EFS).
- On the private cloud OpenStack, file storage as a service is Manila.
Now that we have seen the various types of storage and how they work, note that all of them are available on clouds such as AWS, GCP and Azure.
But these storage types come with some challenges, so let me explain the challenges of two of them with examples.
Challenges:-
Block Storage
The block storage provided by AWS is Elastic Block Store (EBS). An EBS volume can be attached to only one instance at a time, not to multiple instances, much like a pen drive cannot be plugged into two systems at once. EBS therefore works per instance: if you have multiple servers, each server needs its own EBS volume. That makes horizontal scaling hard to manage. If we have to change the code of a web server and multiple instances are running the same code in parallel, we have to change the code on every instance manually. From a management point of view that is a bad practice, so if we have a scale-out requirement, EBS alone will not help us.
A brief explanation of block storage with an example:
Suppose we have a 1 TB file locally and open it in a text editor. The editor does not load the entire 1 TB into RAM. Whenever you read a file or run a program, data is loaded from the hard disk into RAM before you see the file or its output. So if a file is bigger than your limited RAM, how can you still open it? Technically the whole file can never fit in RAM, yet you can still view it, because only the initial frame/portion is loaded; as you scroll, the remaining parts are loaded on demand. This capability is provided by the file system. When we create a partition and format it, the file system keeps track of every part of a file, say a 4 GB one, and links all of its blocks together. Because a normal block device such as an HDD is formatted with a file system, we can open or watch a huge file without loading it completely into RAM, and this is only possible when a file system is present.
Object Storage
The object storage provided by AWS is S3 (Simple Storage Service), and it is a fantastic service. Unlike EBS, an S3 bucket can be accessed from multiple instances, so one bucket can serve many servers. We do not need to create a separate EBS volume per instance; one S3 bucket is enough, which saves cost, and if a developer changes the web page code, it only has to be updated in one place. As soon as the developer pushes the change, every instance connected to the bucket gets the latest content. So you can think of S3 as centralised storage, but even in this setup S3 has a challenge, which I will explain with an example.
Suppose you create a bucket on S3, put some code in it and sync it to your instance so the instance gets the data locally. The challenge with S3 is that we can only upload, download and move files in a bucket; can we edit a file that lives on S3? The answer is no. Suppose a 1 TB file sits in S3 and we only want to add two lines of code to it. S3 gives us no option or text editor to edit the file in place, because S3 has no file system, so we cannot open even a small portion of an object. The only choice is to download the whole file, edit it locally (which works because our local disk has a file system) and upload it back to the bucket. For a file that large this is costly and time consuming, and AWS also charges for data transfer. That is why we generally keep static content on S3, content that changes very rarely.
Conclusion:-
AWS EBS: can be attached to only one instance at a time.
AWS S3: centralised and easy to manage, but we cannot open or edit files in place because it lacks a file system.
Now the last type, file storage/file system, comes into play. This storage can meet our requirements; let's find out how.
The first requirement is a centralised storage system mounted on multiple EC2 instances. Why centralised? Because, as explained above, when the developer changes the code, all instances should fetch the updated code from a single location.
The second requirement concerns large files, say 10 GB or 20 GB, that our instances need for operations such as reading and writing. We do not want to transfer the entire file to the instance; the instance should retrieve only the part it needs from the centralised storage, which therefore must have a file system. This is exactly what the NFS (Network File System) protocol does: it manages files and the file system over a network.
To build such a system we need an NFS server (file server) where we put our code/data and share the stored folders/files with the connected instances over the network. Whenever an instance uses one of those files, the NFS server serves the file system over the network. That is NFS in a nutshell.
Every storage type has its own use cases. In this project we are going to implement File System as a Service, and the AWS service that provides it is Elastic File System.
AWS ELASTIC FILE SYSTEM (EFS)
AWS Elastic File System is a service from Amazon Web Services that gives us a cloud-based file system as a service (file storage as a service), so we do not need to set up or manage any infrastructure for it.
AWS EFS is designed to be used as a common data source for workloads and running applications, and it can be connected to thousands of EC2 instances across multiple Availability Zones. EFS is elastic: if the workload or traffic suddenly grows, the storage expands automatically, and when traffic drops, the storage shrinks again, without any need to provision storage in advance. It can scale up to petabytes.
Benefits:-
- Elastic storage capacity
- Fully managed and scales automatically as you add or remove files
- No practical limit on storage size
- Higher availability and durability, because data is stored redundantly across multiple Availability Zones
- No upfront fee with AWS EFS; you pay only for what you use
To know more about the EFS service you can visit here.
Now I am going to start the practical part. Before going further, we have to set up the environment.
- Create an account on AWS and create an IAM user in the AWS Management Console.
- Download Terraform from here, install it for your OS and add it to your environment PATH.
- Download and install AWS CLI v2 from here, and add it to your environment PATH as well.
- Configure your AWS IAM profile in the AWS CLI.
How to configure the AWS IAM profile in the AWS CLI is explained in my previous article, check it out :)
These are the steps we are going to perform:
We're going to write the infrastructure as code using Terraform.
- Create a VPC (Virtual Private Cloud), an Internet Gateway, a Route Table, subnets in the VPC and the associations between them.
- Create a key pair and a security group which allows port 80 for HTTP, 443 for HTTPS, 22 for SSH and 2049 for NFS.
- Launch EC2 instances.
- Create a volume with AWS EFS, attach it in the dedicated subnets, then mount that volume into the document root directory of the web server, /var/www/html.
- Developers have uploaded the code into a GitHub repo, which also contains some images.
- Copy the GitHub repo code into /var/www/html.
- Create an S3 bucket, copy/deploy the images from the GitHub repo into the S3 bucket and change their permission to publicly readable.
- Create a CloudFront distribution using the S3 bucket (which contains the images) and use the CloudFront URL to update the code in /var/www/html.
Step-1:
Creating a Workspace
I create a workspace where I manage all the files: a folder on the Desktop named "cloud2". Then I create a file in Visual Studio Code named "cloud2.tf"; the .tf extension denotes a Terraform file.
Step-2:
Giving Provider
The first thing to do in the file is define a provider. In Terraform, providers are plugins that let us talk to a specific set of APIs, so we have to tell Terraform which platform, cloud or service to contact, such as AWS or GCP. Here I use "aws" and set the region to "ap-south-1" (Mumbai).
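A minimal sketch of the provider block; the profile name is a placeholder for whatever IAM profile you configured with the AWS CLI:

provider "aws" {
  region  = "ap-south-1"    # Mumbai region
  profile = "myprofile"     # hypothetical CLI profile name; replace with your own
}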
Step-3:
Let's add all the variables we are going to use while creating our resources (a sketch of the definitions follows the list):
- A CIDR block for the VPC
- CIDR blocks for Subnet1 and Subnet2, which are subsets of the VPC CIDR block
- Two Availability Zones used to create our subnets
- The instance type of the EC2 instances
- The AMI used to create the EC2 instances
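A hedged sketch of these variable definitions. The names vpc_cidr, subnet_cidr1/2 and availability_zone1/2 match the ones used later in this article; the CIDR values, the instance type and AMI variable names, and the AMI ID itself are placeholders, so substitute your own values:

variable "vpc_cidr" {
  default = "192.168.0.0/16"      # CIDR block for the VPC
}

variable "subnet_cidr1" {
  default = "192.168.0.0/24"      # CIDR block for Subnet 1 (subset of the VPC CIDR)
}

variable "subnet_cidr2" {
  default = "192.168.1.0/24"      # CIDR block for Subnet 2 (subset of the VPC CIDR)
}

variable "availability_zone1" {
  default = "ap-south-1a"         # Availability Zone for Subnet 1
}

variable "availability_zone2" {
  default = "ap-south-1b"         # Availability Zone for Subnet 2
}

variable "instance_type" {
  default = "t2.micro"            # EC2 instance type
}

variable "ami_id" {
  default = "ami-xxxxxxxxxxxx"    # placeholder: Amazon Linux 2 AMI ID for ap-south-1
}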
Now that the variables are defined, let us use them while creating our resources one by one. We start by creating the VPC with the defined CIDR block.
Step-4:
Creating VPC (Virtual Private Cloud)
What is VPC ?
Amazon Web Services provides a private, virtualised network environment where we can launch AWS resources inside a virtual network that we define. We control all the parts needed to configure a network: the VPC's IP address space (chosen from ranges we select), subnets, route tables, security groups at the subnet and instance level, network gateways and so on. Keep in mind that a VPC is not a data-centre network; it is a software-defined network (SDN). It also helps us organise our EC2 instances, configure their network connectivity, and use network access control lists for inbound and outbound filtering at the instance level.
Features of VPC:
- You can have multiple VPCs in a region (the default limit is 5).
- There can be one internet gateway per VPC.
VPC Connectivity options:
- Connecting user networks to AWS VPC
- Connecting AWS VPC to another AWS VPC
- Connecting the internal user with AWS VPC.
Let's Create --------->
Explanation of each key and value:

resource "aws_vpc" "project2_vpc" {          # resource that creates the AWS VPC
  cidr_block           = "${var.vpc_cidr}"   # IPv4 address range for the VPC in Classless Inter-Domain Routing (CIDR) form
  enable_dns_support   = true                # enable DNS resolution inside the VPC
  enable_dns_hostnames = true                # give each instance a DNS hostname along with its IP address, via Amazon Route 53
  instance_tenancy     = "default"           # instances launched into the VPC run on shared hardware by default

  tags = {
    Name = "project2-vpc"                    # tag for our VPC resource
  }
}
Step-5:
Creating Subnet
A subnet is a segment/subset of the VPC CIDR block inside a given Availability Zone. Each subnet has its own CIDR block where we can place a group of our resources, and each subnet lives in exactly one Availability Zone.
Here I create two subnets in two different Availability Zones, and in these subnets I will launch two instances.
Explanation of each key and value:

resource "aws_subnet" "project2_subnet1" {                 # resource that creates a subnet
  availability_zone       = "${var.availability_zone1}"    # Availability Zone from the variables defined above
  vpc_id                  = "${aws_vpc.project2_vpc.id}"   # ID of the VPC in which the subnet is created
  cidr_block              = "${var.subnet_cidr1}"          # IP address range of the subnet
  map_public_ip_on_launch = "true"                         # give a public IP to instances at launch

  tags = {
    Name = "project2-subnet1a"                             # unique tag for this resource
  }

  depends_on = [ aws_vpc.project2_vpc ]                    # rely on the previous VPC resource
}

( The same block, with subnet_cidr2 and availability_zone2, is used for Subnet 2. )
Step-6:
Creating Internet Gateway
An Internet Gateway is the gate where data stops on its way to or from other networks. It is a logical connection between an Amazon Virtual Private Cloud (VPC) and the internet, allowing resources within your VPC to access the internet, and vice versa.
If a VPC does not have an Internet Gateway, the resources in the VPC cannot be accessed from the internet.
In brief, the internet gateway is the router that takes the network packets from our EC2 instances inside the subnets and forwards them to the public internet. If a subnet's traffic is routed to an internet gateway via a route table, the subnet is known as a public subnet, and all the resources and instances inside it can access the internet through that gateway.
- An internet gateway supports IPv4 and IPv6 traffic.
Explanation of each key and value:

resource "aws_internet_gateway" "project2_internet_gateway" {   # resource that creates the Internet Gateway
  vpc_id = "${aws_vpc.project2_vpc.id}"                          # VPC ID, to attach the Internet Gateway to the VPC

  tags = {
    Name = "project2-ig"                                         # unique tag for this gateway resource
  }

  depends_on = [ aws_vpc.project2_vpc ]                          # rely on the VPC resource
}
Step-7:
Create a Routing Table
A route table is a routing information base, a data table stored in a router or network host that lists the routes to particular network destinations. Its entries tell the router where to send traffic so it reaches the desired destination; in a VPC, route tables control where network traffic is directed. Each subnet in your VPC must be associated with a route table, which controls the routing for that subnet (the subnet route table). You can explicitly associate a subnet with a particular route table; otherwise, the subnet is implicitly associated with the main route table.
Explanation of each key and value:

resource "aws_route_table" "project2_route_table" {                       # resource that creates the route table
  vpc_id = "${aws_vpc.project2_vpc.id}"                                    # VPC ID

  route {
    cidr_block = "0.0.0.0/0"                                               # destination 0.0.0.0/0 represents all IPv4 addresses, i.e. traffic to anywhere on the internet
    gateway_id = "${aws_internet_gateway.project2_internet_gateway.id}"    # route that traffic through the internet gateway created above
  }

  tags = {
    Name = "project2-route-table"                                          # unique tag for this resource
  }

  depends_on = [ aws_vpc.project2_vpc ]                                    # rely on the VPC resource
}
Step-8:
Creating an association between route table and subnet
Once our route table is created, we need to associate it with the subnets to make them public. Keep in mind that a subnet can be associated with only one route table at a time, but we can associate multiple subnets with the same route table, as you can see below.
You can optionally associate a route table with an internet gateway or a virtual private gateway (gateway route table). This enables you to specify routing rules for inbound traffic that enters your VPC through the gateway.
resource "aws_route_table_association" "project2_rta1" { ( Create Resource For Route Association Table with subnet optionally you can use for gateway too ) subnet_id = "${aws_subnet.project2_subnet1.id}" ( Give Subnet1 ID ) route_table_id = "${aws_route_table.project2_route_table.id}" ( Route tables to control where network traffic is directed as i already Created above so here i give it's ID ) depends_on = [ aws_subnet.project2_subnet1, ] } ( Rely on Subnet1 Resource ) [ Same for Subnet 2 ]
Note: I am not going to explain every key, value and attribute in this article; if you want to know more, visit my previous article.
Step-9:
Creating a Security Group
Now that our networking setup is ready, we need a firewall rule/security group configuration for our instances. I allow inbound traffic on the NFS port (2049), on HTTP (port 80) and HTTPS (port 443) so clients can reach the websites hosted on my instances, and on SSH (port 22) for remote login.
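A sketch of this security group; the resource name project2_sg is my assumption, and it is reused by the instance and EFS mount-target sketches later in this article:

resource "aws_security_group" "project2_sg" {
  name   = "project2-sg"
  vpc_id = "${aws_vpc.project2_vpc.id}"

  # SSH for remote login
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # HTTP for the website
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # HTTPS for the website
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # NFS, so the instances can talk to the EFS mount targets
  ingress {
    from_port   = 2049
    to_port     = 2049
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "project2-sg"
  }
}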
Step-10:
Create a Key-Pair
Here we generate a key using the tls_private_key resource, create an AWS key pair from it with Terraform, and save the key locally in my workspace with the .pem extension. It is used to log in to the instances.
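A hedged sketch of this step, assuming the resource names project2_key and project2_keypair; the local_file resource writes the private key into the workspace:

resource "tls_private_key" "project2_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "aws_key_pair" "project2_keypair" {
  key_name   = "project2-key"
  public_key = "${tls_private_key.project2_key.public_key_openssh}"
}

# save the private key locally as a .pem file for SSH login
resource "local_file" "project2_key_pem" {
  content  = "${tls_private_key.project2_key.private_key_pem}"
  filename = "project2-key.pem"
}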
Step-11:
Create an EC2 Instance
Here we create the AWS instances with instance type "t2.micro", using the Amazon Linux 2 AMI (HVM), SSD Volume Type. After a successful launch, Terraform connects to the instance over SSH using the "remote-exec" provisioner; once the connection is established, several commands run to install software such as the Apache web server, the PHP interpreter, Git and some utilities, and then the web server service is started and enabled.
We launch two instances in two different subnets.
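A hedged sketch of one of the two instances; the resource name web1, the exact package list and the connection details are assumptions based on the description above (the second instance is identical except that it uses subnet 2):

resource "aws_instance" "web1" {
  ami                    = "${var.ami_id}"
  instance_type          = "${var.instance_type}"
  subnet_id              = "${aws_subnet.project2_subnet1.id}"
  vpc_security_group_ids = ["${aws_security_group.project2_sg.id}"]
  key_name               = "${aws_key_pair.project2_keypair.key_name}"

  # SSH connection used by the provisioner below
  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = "${tls_private_key.project2_key.private_key_pem}"
    host        = "${self.public_ip}"
  }

  # install the web server, PHP, Git and the EFS mount helper, then start and enable Apache
  provisioner "remote-exec" {
    inline = [
      "sudo yum install -y httpd php git amazon-efs-utils",
      "sudo systemctl start httpd",
      "sudo systemctl enable httpd",
    ]
  }

  tags = {
    Name = "project2-web1"
  }
}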
Step-12:
Create an EFS Volume
We have to create the EFS volume (file system).
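A minimal sketch of the file system resource, assuming the name project2_efs:

resource "aws_efs_file_system" "project2_efs" {
  creation_token = "project2-efs"    # any unique string, used to make creation idempotent

  tags = {
    Name = "project2-efs"
  }
}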
Step-13:
Mounting our EFS to Network/Subnet
After successfully creating the AWS EFS file system, we mount it into the subnets to which our EC2 instances are connected.
If you want your EC2 instances to access the file system, you must create mount targets in your VPC. Each mount target has the following properties:
- The mount target ID and the subnet ID in which it is created
- The file system ID for which it is created
- An IP address at which the file system may be mounted
- The VPC security groups and the mount target state
We mount the EFS file system in both subnets so that both instances, running in different subnets, can access it.
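A sketch of the two mount targets, one per subnet, reusing the security group from Step-9 (the resource names mount1 and mount2 are my assumptions):

resource "aws_efs_mount_target" "mount1" {
  file_system_id  = "${aws_efs_file_system.project2_efs.id}"
  subnet_id       = "${aws_subnet.project2_subnet1.id}"
  security_groups = ["${aws_security_group.project2_sg.id}"]    # must allow port 2049
}

resource "aws_efs_mount_target" "mount2" {
  file_system_id  = "${aws_efs_file_system.project2_efs.id}"
  subnet_id       = "${aws_subnet.project2_subnet2.id}"
  security_groups = ["${aws_security_group.project2_sg.id}"]
}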
Step-14:
Mount the AWS EFS on our instances permanently, download the web page from GitHub and copy it to the document root directory of the web server
By adding an entry for the file system to the /etc/fstab file on Linux, the mount becomes permanent, so after an instance restart the file system stays mounted (or is mounted automatically).
On the first instance, which is running in Subnet 1a.
On the second instance, which is running in Subnet 1b.
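A hedged sketch of this step for the first instance, written as a null_resource with a remote-exec provisioner. The fstab entry relies on the amazon-efs-utils mount helper installed in Step-11, and the GitHub URL is a placeholder for the developers' repo; the second instance only needs the fstab entry and the mount, because the cloned code on EFS is shared:

resource "null_resource" "mount_and_deploy_web1" {
  depends_on = [ aws_efs_mount_target.mount1, aws_instance.web1 ]

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = "${tls_private_key.project2_key.private_key_pem}"
    host        = "${aws_instance.web1.public_ip}"
  }

  provisioner "remote-exec" {
    inline = [
      # make the EFS mount permanent via /etc/fstab, then mount it on the document root
      "echo '${aws_efs_file_system.project2_efs.id}:/ /var/www/html efs _netdev,tls 0 0' | sudo tee -a /etc/fstab",
      "sudo mount -a",
      # clone the web page code into the (empty) document root; placeholder repo URL
      "sudo rm -rf /var/www/html/*",
      "sudo git clone https://github.com/<your-user>/<your-repo>.git /var/www/html/",
    ]
  }
}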
Step-15:
Creating AWS S3-Bucket
Create the AWS S3 bucket and change its permission to publicly readable. S3 is AWS's highly scalable object storage.
- Every S3 bucket name must be unique across all existing buckets.
- Bucket names may contain only lowercase letters, numbers, hyphens and dots.
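A minimal sketch of the bucket; the bucket name below is hypothetical and must be replaced with a globally unique one:

resource "aws_s3_bucket" "project2_bucket" {
  bucket = "project2-web-images-demo"    # hypothetical, must be globally unique
  acl    = "public-read"

  tags = {
    Name = "project2-bucket"
  }
}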
Step-16:
Give Bucket Public Access Policy
We apply a bucket policy to our bucket to allow public access to the Amazon S3 resources. By default, new buckets, access points and objects do not allow public access.
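A sketch of a public-read bucket policy attached to the bucket created above:

resource "aws_s3_bucket_policy" "project2_bucket_policy" {
  bucket = "${aws_s3_bucket.project2_bucket.id}"

  # allow anyone to read (GetObject) every object in the bucket
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "${aws_s3_bucket.project2_bucket.arn}/*"
    }
  ]
}
POLICY
}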
Step-17:
Uploading the image/object to the S3 bucket
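A sketch of the upload, assuming the image taken from the GitHub repo sits in the workspace as image1.png (a hypothetical file name):

resource "aws_s3_bucket_object" "project2_image" {
  bucket       = "${aws_s3_bucket.project2_bucket.id}"
  key          = "image1.png"    # object key in the bucket
  source       = "image1.png"    # local path to the image in the workspace
  acl          = "public-read"
  content_type = "image/png"
}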
Step-18:
Creating a AWS CloudFront Distribution
We create a CloudFront distribution, which gives us a CDN (Content Delivery Network) for delivering the content faster.
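A sketch of a minimal distribution in front of the bucket; the origin ID is just an arbitrary label:

resource "aws_cloudfront_distribution" "project2_cf" {
  enabled = true

  # the S3 bucket that holds the images is the origin
  origin {
    domain_name = "${aws_s3_bucket.project2_bucket.bucket_regional_domain_name}"
    origin_id   = "S3-project2-bucket"
  }

  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "S3-project2-bucket"
    viewer_protocol_policy = "redirect-to-https"

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}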
Step-19:
Saving the AWS CloudFront distribution domain name locally
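A sketch of saving the distribution's domain name into a local file (and printing it as a Terraform output):

output "cloudfront_domain_name" {
  value = "${aws_cloudfront_distribution.project2_cf.domain_name}"
}

# write the domain name into a file in the workspace
resource "local_file" "cloudfront_domain" {
  content  = "${aws_cloudfront_distribution.project2_cf.domain_name}"
  filename = "cloudfront_domain.txt"
}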
Step-20:
Updating/modifying website code
We modify the website code by injecting the CloudFront distribution URL on both instances, for fast and smooth delivery of objects to the clients.
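A hedged sketch of this step as another null_resource. It assumes the web page (here index.html, a hypothetical file name) contains a placeholder string IMAGE_URL_PLACEHOLDER that gets replaced with the CloudFront URL, so adapt the sed command to your actual code; because /var/www/html is the shared EFS volume, running it once updates both instances:

resource "null_resource" "inject_cloudfront_url" {
  depends_on = [ aws_cloudfront_distribution.project2_cf, null_resource.mount_and_deploy_web1 ]

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = "${tls_private_key.project2_key.private_key_pem}"
    host        = "${aws_instance.web1.public_ip}"
  }

  provisioner "remote-exec" {
    inline = [
      # replace the placeholder in the page with the CloudFront URL of the image
      "sudo sed -i 's|IMAGE_URL_PLACEHOLDER|https://${aws_cloudfront_distribution.project2_cf.domain_name}/image1.png|g' /var/www/html/index.html",
    ]
  }
}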
Step-21:
Opening my web-page in Chrome browser automatically
Some Terraform commands are required to follow best practice and set up the whole configuration:
Step-22:
Initialising the Terraform Plugins
The terraform init command downloads the necessary plugins from the internet, based on what is written in the code in the working directory.
Step-23:
The terraform validate command
We check the Terraform file we created, "cloud2.tf", with the terraform validate command. This command validates the syntax of the Terraform files; if a syntax error is found, it is displayed on the terminal.
Note: it shows some warnings because I am using an old version of Terraform (0.11), so some syntax may not be recognised, but it will still work.
Step-24:
The terraform plan command
When you execute terraform plan, Terraform scans all *.tf files in your directory and creates an execution plan, which lets you see which actions Terraform will perform before making any changes.
Step-25:
Final Command For Creating the infrastructure over the cloud
The terraform apply command
Now we have the desired state described, so we can execute the plan, which will create the infrastructure in the cloud. There are two ways to apply it: terraform apply (which asks for a yes/no confirmation) and terraform apply -auto-approve (which runs without asking).
That's all: our infrastructure has been created successfully and our website is deployed.
Now we have the AWS EFS service configured, so any time we want to scale out our web server instances we can easily attach EFS to the new instances.
Step-26:
Destroy the Entire Infrastructure
As we created the entire infrastructure on the AWS cloud with the terraform apply command, there is one destructive command which can tear the whole infrastructure down in one go.
We use the terraform destroy command to destroy the Terraform-managed infrastructure.
Thank-you Guys for Reading my Article
Hope my article helps; leave your valuable feedback. For any queries or suggestions, feel free to ask.
GitHub URL