Automating AWS EC2 Infrastructure with Terraform: A Step-by-Step Guide (Part 1)
Let's build our stack with infrastructure as code (IaC). We will use Terraform to automate EC2 infrastructure on AWS: creating a cloud machine and deploying a web server, a database, and a machine learning model, among other tasks. We'll start by creating a client machine, customizing a Docker image, and working on AWS. Then, we'll automate our first resource and explore what IaC makes possible.
Creating EC2 Resource
To begin, we will create a resource in AWS, specifically an EC2 resource. In cloud computing, everything is a resource: compute, storage, networking, and so on. All we need to do is configure a resource to perform a task, and a cloud provider offers many kinds of resources to choose from.
Clicking Services in the AWS console shows several categories, and each category lists its available services.
Within an AWS service, you can create various resources. One way to reach a service is through this Services menu.
Another option, which I particularly like, is to use the search box. In this case, simply type "EC2" in the search box, and we'll go directly to what we need.
Let's select the first option, Virtual Servers in the Cloud. Imagine I need to configure a server to train a machine learning engine or to run a pipeline with PySpark.
I will need a machine, right? I will need a computational resource, and EC2 can be an option. When we select EC2, we are directed to the EC2 console.
This console, by itself, is already a universe. Entering the console of any AWS service feels like entering a parallel universe.
The menu on the left side offers a wide range of options, so I won't go into detail here. My goal is to get to Terraform: I will go directly to resource configuration, show how to do it manually in the console, and then automate the same steps using Terraform with infrastructure as code.
Notice in the center of our console, under Resources, that I have 0 instances running. Now, I will create one by clicking the yellow Launch instance button.
We are directed to the instance creation panel, and the first step will be to name it. I will name it pandata-server here.
After that, I need to choose my operating system. I will use the first option, Amazon Linux, a Linux distribution maintained by AWS.
Notice that when you choose this option, you get what AWS calls an AMI, or Amazon Machine Image. It is an image with a pre-configured operating system; AWS simply takes this image and places it on a server for you. So, we will leave the first option, the Amazon Linux 2023 AMI, selected.
Later, when I automate via Terraform, I will return here to capture the ID of this AMI, because Terraform needs that ID to know which image to launch.
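As a rough sketch of what capturing the AMI can look like, and assuming the AWS provider for Terraform is already configured, the ID can either be copied from the console by hand or looked up with a data source. The name pattern below is my assumption about how Amazon Linux 2023 images are named, and the label al2023 is just a name I chose for illustration; check both against what the console shows in your region.

```hcl
# Look up the most recent Amazon Linux 2023 AMI published by Amazon.
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  # Assumed naming pattern for AL2023 images; verify it in the console.
  filter {
    name   = "name"
    values = ["al2023-ami-2023*-x86_64"]
  }

  # Match the 64-bit (x86) architecture.
  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}

# Expose the ID so it can be inspected after `terraform apply`.
output "al2023_ami_id" {
  value = data.aws_ami.al2023.id
}
```

A lookup like this avoids hard-coding an ID, which matters because AMI IDs differ from one region to another.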
Next, it will ask for the architecture. We will leave it as 64-bit (x86) and then select the Instance Type, which is the hardware. So, first, we choose the software, which is the operating system, and then we choose the hardware.
Only one instance type is marked as eligible for the free tier. If you scroll through the list, you will see that the options seem almost endless. What changes from one to another is the hardware configuration: RAM, CPU, GPU, and disk (HDD or SSD). Choose the hardware configuration based on the workload you need to run.
Check that each hardware configuration is compatible with the AMI you will use, whether it is an AMI with Windows, Linux, macOS, etc. Then, you choose exactly which instance type you want. The range of options is truly vast: today, AWS has over a hundred of them, each documented with its intended use and its cost. Among them, t2.micro is eligible for the free tier.
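To make the "software first, then hardware" idea concrete, here is a minimal sketch of how these choices might map onto a Terraform resource. It assumes the AWS provider is already configured; the resource label pandata_server and the variable ami_id are names I picked for illustration, and the AMI ID itself is left as an input because it depends on your region.

```hcl
# AMI ID copied from the console; it varies by region, so it is not hard-coded.
variable "ami_id" {
  type        = string
  description = "ID of the Amazon Linux 2023 AMI shown in the launch panel"
}

resource "aws_instance" "pandata_server" {
  ami           = var.ami_id # the software: operating system image
  instance_type = "t2.micro" # the hardware: free-tier eligible type

  tags = {
    Name = "pandata-server" # the name typed into the console
  }
}
```

The three arguments correspond one-to-one with the first three steps in the console: name, AMI, and instance type.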
After choosing the hardware, we have the choice of the Key Pair (login). Since this is a cloud server, I might want to access the server, but for now, we won't worry about that because I am just creating the resource. Later, I will create the credentials to access AWS.
For a machine to communicate with others, we need a computer network. So, everything you do in a physical environment must now be done in the cloud.
The instance is placed in a Virtual Private Cloud (VPC), which has a subnet in an availability zone, along with various other settings. Here you can have an IP address assigned automatically and attach a security group that specifies which ports can be accessed and what traffic will be allowed for your EC2 resource.
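As an illustration of the security group idea, the sketch below allows inbound SSH and all outbound traffic. It assumes the account still has a default VPC in the chosen region; the names pandata_sg and pandata-sg are hypothetical, and opening port 22 to every address is only for demonstration.

```hcl
# Use the account's default VPC (assumption: it exists in this region).
data "aws_vpc" "default" {
  default = true
}

resource "aws_security_group" "pandata_sg" {
  name        = "pandata-sg"
  description = "Example rules for pandata-server"
  vpc_id      = data.aws_vpc.default.id

  # Allow inbound SSH. In practice, restrict cidr_blocks to your own IP.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

An instance would then reference this group through its vpc_security_group_ids argument, for example vpc_security_group_ids = [aws_security_group.pandata_sg.id].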
Therefore, if I do the procedure manually, everything we have covered so far must be done step by step: defining the name, operating system, instance type, key pair for remote access, and network configuration, all customized. However, we will see how to automate all of this with Terraform. With Terraform, everything works and works very well!
Continuing, now we need, of course, to configure storage. After all, will we store anything? Install a tool? Place files? Possibly store the result of a pipeline execution? Train a machine learning model and save it to disk? We need to consider all of this as well.
The default size of 8 GB falls within the free tier; storage beyond the free-tier allowance is billed. When navigating AWS, always read the details so that the options you select do not lead to unexpected charges.
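In Terraform, the storage step could be expressed roughly as follows. This is a sketch under assumptions: the AWS provider is configured, the labels with_storage and ami_id are hypothetical, and gp3 is my guess at a sensible SSD volume type rather than something stated above.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI ID copied from the console"
}

resource "aws_instance" "with_storage" {
  ami           = var.ami_id
  instance_type = "t2.micro"

  # Root disk: 8 GiB, matching the default shown in the console.
  root_block_device {
    volume_size = 8     # size in GiB
    volume_type = "gp3" # assumed general-purpose SSD type
  }
}
```

Raising volume_size is all it takes to request more space, which is also why it is worth reviewing before applying.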
When you create an EC2 instance, you are essentially configuring a machine. In the advanced options, for example, you can include scripts to be executed during initialization, configure a security profile for the instance, and set up automatic recovery in case of a failure to prevent any loss.
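The initialization script is a good example of something that translates directly into Terraform. The sketch below passes a short shell script through the user_data argument; the script installs and starts the Apache web server, which is only my choice of example. As before, the AWS provider is assumed to be configured and the labels with_startup_script and ami_id are hypothetical.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI ID copied from the console"
}

resource "aws_instance" "with_startup_script" {
  ami           = var.ami_id
  instance_type = "t2.micro"

  # Script executed once, on first boot (example: install a web server).
  user_data = <<-EOF
    #!/bin/bash
    dnf install -y httpd
    systemctl enable --now httpd
  EOF
}
```

The script uses dnf because that is the package manager on Amazon Linux 2023; a different AMI would likely need different commands.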
When you are satisfied with your configuration, review the Summary to ensure everything is according to your preferences. In the Summary, you can also set the number of instances to launch several at once with the same hardware configuration, but we will also see how to do this with Terraform. In this case, we will create just one instance and launch it.
Our instance on AWS has been successfully created. Just below, we have the option to view all instances, which opens the Instances panel showing the instance and its current state.
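For comparison, launching several identical instances in Terraform can be sketched with the count meta-argument. The number 2 is arbitrary, the labels pandata_servers and ami_id are hypothetical, and the AWS provider is assumed to be configured.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI ID copied from the console"
}

resource "aws_instance" "pandata_servers" {
  count         = 2 # number of identical instances to create
  ami           = var.ami_id
  instance_type = "t2.micro"

  tags = {
    # count.index starts at 0: pandata-server-0, pandata-server-1
    Name = "pandata-server-${count.index}"
  }
}
```

Changing count and applying again adds or removes instances to match the new number.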
Why use a server?
The instance may take a few minutes to start, and once it is running, you will have your server operating in the cloud. This server can be used to train a machine learning model, execute a data engineering pipeline, process data and then train the model, run an ETL pipeline to load data to a specific destination, or any other task you can imagine.
You can configure a web server, set up a database, or run any other service you need. AWS has several other services available as well, so we need to evaluate which one offers the best cost-benefit ratio.
Terraform
Having shown the manual route, I will now turn to Terraform, which does the same work with just a few commands. There are commands to create the infrastructure and commands to delete it. With Terraform, our job is to create the automation script.
Once created, I can execute the script as many times as I want to create and delete the resource, thus automating the entire process. Let's now move away from the manual process and understand how automation works with Terraform and infrastructure as code.
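As a preview, the skeleton of such a script might look like the sketch below. The region and the provider version constraint are assumptions on my part, and AWS credentials are expected to be available in the environment. The commands in the comments are the standard Terraform workflow for creating and deleting what the script describes.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # illustrative constraint; pin what you have tested
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region; use the one you work in
}

# Typical workflow, run from the directory containing this file:
#   terraform init      # download the AWS provider
#   terraform plan      # preview what would change
#   terraform apply     # create the resources
#   terraform destroy   # delete them again
```

Resource blocks describing the instance, its network, and its storage would be added to this file before running apply.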
Conclusion
In conclusion, we have explored the process of creating and configuring an EC2 instance on AWS, covering every step from selecting the operating system and hardware to configuring the network and storage. By understanding the manual setup, we can better appreciate the power of automation with tools like Terraform.
In the next article, we will return with a detailed solution using Terraform, demonstrating how to streamline and automate the entire process, making infrastructure management more efficient and reproducible. Stay tuned for a hands-on guide to infrastructure as code with Terraform.
Thank you.