Automating AWS EC2 Infrastructure with Terraform: A Step-by-Step Guide (Part 1)

Leonardo A.

Data Analyst

Published: June 20, 2024

Let's build our stack using infrastructure as code (IaC). We will use Terraform to automate EC2 infrastructure on AWS: creating a cloud machine and deploying a web server, a database, and a machine learning model, among other tasks. We'll start by creating a client machine, customizing a Docker image, and working on AWS. Then, we'll automate our first resource and explore the possibilities of IaC.

Creating EC2 Resource

To begin, we will create a resource in AWS, specifically an EC2 resource. In cloud computing, everything is a resource: computation, storage, processing, and so on. All we need to do is configure a resource to perform a task, and a cloud provider offers many kinds of resources to choose from.

Clicking Services shows several categories, and each category lists its available services.

Within an AWS service, you can create various resources. One option is to reach the service through this menu.

Another option, which I particularly like, is to use the search box. In this case, simply type "EC2" in the search box, and we'll go directly to what we need.

Let's select the first option, Virtual Servers in the Cloud. Imagine I need to configure a server to train a machine learning model or to run a pipeline with PySpark.

I will need a machine, right? I will need a computational resource, and EC2 can be an option. When we select EC2, we are directed to the EC2 console.

This console, by itself, is already a universe. Entering the console of any AWS service feels like entering a parallel universe.

The menu on the left side offers a wide range of options, so I won't go into detail here. My goal is to get to Terraform: go directly to resource configuration, show how it is done manually, and then automate it with Terraform as infrastructure as code.

Notice in the center of our console, under Resources, that I have 0 instances running. Now, I will create one by clicking the yellow Launch instance button.

We are directed to the instance creation panel, and the first step will be to name it. I will name it pandata-server here.

After that, I need to choose my operating system. I will use the first option, Amazon Linux, which is Amazon's own Linux distribution.

Notice that when you choose this option, you get what AWS calls an AMI, or Amazon Machine Image. It is an image with a pre-configured operating system; AWS simply takes this image and places it on a server for you. So, we will leave the first option, the Amazon Linux 2023 AMI, selected.

Later, when I automate via Terraform, I will return here to capture the ID of this AMI. This is how we automate with Terraform.
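As a rough preview of where that AMI ID ends up, here is a minimal sketch in Python that builds a Terraform configuration in Terraform's JSON syntax (the content of a file such as main.tf.json). The AMI ID, the region, the t2.micro instance type, and the resource label pandata_server are placeholders chosen for illustration, not values from the console; the actual Terraform setup is left for the next article.

```python
import json

# Placeholder only: copy the real AMI ID from the EC2 console for your region.
AMI_ID = "ami-00000000000000000"


def build_config(ami_id: str, instance_type: str = "t2.micro") -> dict:
    """Return a Terraform configuration (JSON syntax) describing one EC2 instance."""
    return {
        "provider": {"aws": {"region": "us-east-1"}},  # assumed region
        "resource": {
            "aws_instance": {
                "pandata_server": {  # hypothetical resource label
                    "ami": ami_id,
                    "instance_type": instance_type,
                    "tags": {"Name": "pandata-server"},
                }
            }
        },
    }


if __name__ == "__main__":
    # Saved as main.tf.json, this output could be read by Terraform.
    print(json.dumps(build_config(AMI_ID), indent=2))
```

The same three choices made in the console (name, AMI, instance type) appear here as plain fields, which is what makes them repeatable.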

Next, it will ask for the architecture. We will leave it as 64-bit (x86) and then select the Instance Type, which is the hardware. So, first, you choose the software, which is the operating system, and then we choose the hardware.

Only one instance type in this list is marked as free-tier eligible. If you scroll through the list, you will see that it is very long. What changes from one type to another is the hardware configuration: RAM, CPU, GPU, disk space, and whether storage is HDD or SSD. You will have to choose the hardware configuration based on what your workload needs.

Each hardware configuration must be checked against the AMI you will use, whether it is an AMI with Windows, Linux, macOS, etc. Then you choose exactly which instance type you want. The range is vast: AWS offers well over a hundred instance types, and over time it keeps refining the hardware options and their respective costs. Among all of them, t2.micro is the one eligible for the free tier here.

After choosing the hardware, we have the choice of the Key Pair (login). Since this is a cloud server, I might want to access the server, but for now, we won't worry about that because I am just creating the resource. Later, I will create the credentials to access AWS.

For a machine to communicate with others, we need a computer network. So, everything you do in a physical environment must now be done in the cloud.

The instance is placed in a Virtual Private Cloud (VPC), which has a subnet in an availability zone, and an IP address can be assigned automatically. In addition, you assign a security group, which specifies which ports can be reached and what traffic is allowed to your EC2 instance.
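To make the idea of a security group concrete, here is a small Python model of how inbound rules behave. It is only an illustration: the IngressRule class, the is_allowed function, and the example rules are hypothetical names invented here, and the IP ranges are documentation addresses. It relies on one fact about security groups: they are allow-lists, so inbound traffic is dropped unless some rule matches.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    protocol: str   # "tcp" or "udp"
    from_port: int  # first port in the allowed range
    to_port: int    # last port in the allowed range
    cidr: str       # source address range allowed to connect


def is_allowed(rules, protocol: str, port: int, source_ip: str) -> bool:
    """Return True only if at least one rule matches the incoming connection."""
    src = ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and src in ip_network(rule.cidr)
        for rule in rules
    )


# Example rules: SSH from one office range, HTTP from anywhere.
RULES = [
    IngressRule("tcp", 22, 22, "203.0.113.0/24"),
    IngressRule("tcp", 80, 80, "0.0.0.0/0"),
]

if __name__ == "__main__":
    print(is_allowed(RULES, "tcp", 80, "198.51.100.7"))  # HTTP is open to all
    print(is_allowed(RULES, "tcp", 22, "198.51.100.7"))  # SSH is restricted
```

Keeping SSH limited to a known range while leaving HTTP open is a common pattern for a small web server, but the right rules depend on what the instance is for.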

Therefore, if I do the procedure manually, everything we have covered so far must be done step by step: defining the name, operating system, hardware, key pair for remote access, and network configuration, all customized. With Terraform, we will see how to automate all of this, and it works very well.

Continuing, now we need, of course, to configure storage. After all, will we store anything? Install a tool? Place files? Possibly store the result of a pipeline execution? Train a machine learning model and save it to disk? We need to consider all of this as well.

The default size of 8 GB is included in the free tier; storage beyond the free-tier allowance is billed. When navigating AWS, always read the details of each option so that the choices you make do not lead to unexpected charges.
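The storage choice can be written down in the same way. The sketch below, in Python, builds the arguments of an aws_instance in Terraform's JSON syntax with a root_block_device block. The function name, the placeholder AMI ID, the gp3 volume type, and the 8 GB minimum (taken from the default in this walkthrough) are assumptions for illustration.

```python
import json

MIN_SIZE_GB = 8  # assumed minimum: the default root volume size in this walkthrough


def instance_with_root_volume(size_gb: int = 8, volume_type: str = "gp3") -> dict:
    """Return aws_instance arguments with an explicit root volume size."""
    if size_gb < MIN_SIZE_GB:
        raise ValueError(f"root volume must be at least {MIN_SIZE_GB} GB")
    return {
        "ami": "ami-00000000000000000",  # placeholder AMI ID
        "instance_type": "t2.micro",
        "root_block_device": {
            "volume_size": size_gb,       # size in GB
            "volume_type": volume_type,
        },
    }


if __name__ == "__main__":
    print(json.dumps(instance_with_root_volume(8), indent=2))
```

Making the size an explicit parameter means a larger, billable volume is a visible decision in the script rather than a default picked up by accident.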

When you create an EC2 instance, you are essentially configuring a machine. For example, you can include scripts to be executed during initialization, configure a security profile for the instance, and set up automatic recovery in case of a failure to prevent any loss. In the advanced options, you also have the possibility to configure these details.
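As one example of an initialization script, the Python sketch below attaches a short shell script to an aws_instance through its user_data argument, again using Terraform's JSON syntax. The script contents (installing and starting the Apache httpd web server with dnf on Amazon Linux 2023), the function name, and the placeholder AMI ID are illustrative assumptions, not steps from the console walkthrough.

```python
import json

# Illustrative startup script: install and start a web server on first boot.
STARTUP_SCRIPT = "\n".join(
    [
        "#!/bin/bash",
        "dnf install -y httpd",
        "systemctl enable --now httpd",
    ]
) + "\n"


def instance_with_user_data(script: str = STARTUP_SCRIPT) -> dict:
    """Return aws_instance arguments that run `script` when the instance starts."""
    return {
        "ami": "ami-00000000000000000",  # placeholder AMI ID
        "instance_type": "t2.micro",
        "user_data": script,             # passed to the instance at launch
    }


if __name__ == "__main__":
    print(json.dumps(instance_with_user_data(), indent=2))
```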

When you are satisfied with your configuration, review the Summary to ensure everything matches your preferences. The Summary also lets you set the number of instances, so you can launch several at once with the same configuration; we will see how to do this with Terraform as well. In this case, we will create just one instance and launch it.
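For comparison, launching several identical instances in Terraform is usually done with the count meta-argument. The Python sketch below builds such a configuration in Terraform's JSON syntax; the function name, resource label, and placeholder AMI ID are assumptions for illustration. Note that "${count.index}" is left as literal text for Terraform to interpolate and is not a Python expression.

```python
import json


def identical_instances(count: int) -> dict:
    """Return a Terraform JSON configuration for `count` identical EC2 instances."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "resource": {
            "aws_instance": {
                "pandata_server": {  # hypothetical resource label
                    "count": count,
                    "ami": "ami-00000000000000000",  # placeholder AMI ID
                    "instance_type": "t2.micro",
                    # Terraform replaces ${count.index} with 0, 1, 2, ...
                    "tags": {"Name": "pandata-server-${count.index}"},
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(identical_instances(3), indent=2))
```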

Our instance on AWS has been successfully created. Just below the confirmation message, we have the option to view all instances. Selecting it opens the Instances page, which shows the current state of the instance.

Why use a server?

The instance may take a few minutes to start, and once it is running, you will have your server operating in the cloud. This server can be used to train a machine learning model, execute a data engineering pipeline, process data and then train the model, run an ETL pipeline to load data to a specific destination, or any other task you can imagine.

You can configure a web server, set up a database, or run any other service you need. AWS has several other services available as well, so we need to evaluate which one offers the best cost-benefit ratio for each task.

Terraform

After showing all of this, now I will demonstrate how to use Terraform with just a few commands. There are commands to create the infrastructure and commands to delete it. With Terraform, our job is to create the automation script.

Once created, I can execute the script as many times as I want to create and delete the resource, thus automating the entire process. Let's now move away from the manual process and understand the automation process with Terraform in infrastructure as code.
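The create and delete commands mentioned above are part of the standard Terraform CLI workflow: init, plan, apply, and destroy. The Python sketch below lists them in order and wraps them in a small helper. The WORKFLOW dictionary and the run function are hypothetical names for illustration, and running them assumes the terraform CLI is installed and a configuration exists in the working directory.

```python
import shutil
import subprocess

# The usual order of a Terraform session.
WORKFLOW = {
    "init": ["terraform", "init"],        # download providers, prepare the directory
    "plan": ["terraform", "plan"],        # preview what would change
    "apply": ["terraform", "apply"],      # create or update the infrastructure
    "destroy": ["terraform", "destroy"],  # delete what the configuration created
}


def run(step: str, workdir: str = ".") -> int:
    """Run one workflow step in `workdir` and return the CLI's exit code."""
    if shutil.which("terraform") is None:
        raise RuntimeError("terraform CLI not found on PATH")
    return subprocess.run(WORKFLOW[step], cwd=workdir, check=False).returncode


if __name__ == "__main__":
    for name, command in WORKFLOW.items():
        print(f"{name}: {' '.join(command)}")
```

By default, apply and destroy ask for confirmation before changing anything, which is a useful safety net while learning.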

Conclusion

In conclusion, we have explored the process of creating and configuring an EC2 instance on AWS, covering every step from selecting the operating system and hardware to configuring the network and storage. By understanding the manual setup, we can better appreciate the power of automation with tools like Terraform.

In the next article, we will return with a detailed solution using Terraform, demonstrating how to streamline and automate the entire process, making infrastructure management more efficient and reproducible. Stay tuned for a hands-on guide to infrastructure as code with Terraform.

Thank you.
