Cloud Cost Optimization with HashiCorp Terraform: A Simple Guide

Cloud computing has changed how companies manage their IT infrastructure, and it's now more important than ever to keep an eye on costs. Traditionally, finance teams managed budgets, but with the rise of on-demand cloud services, engineers are stepping into the role of cloud financial controllers. So, how do you manage this shift effectively?

This guide will help you understand how to use HashiCorp Terraform to optimize your cloud costs. By the end, you'll know how to define roles, automate processes, and implement a strategy for cloud cost optimization.

What You'll Learn

  1. Roles and Responsibilities (RASCI Model): We'll show you how to assign responsibilities within your team for managing cloud costs and overall cloud infrastructure.
  2. Visualizing Cost Management in Terraform: Understand how cloud cost management fits into a Terraform provisioning workflow.
  3. Planning and Forecasting Costs: Learn how to plan and forecast your cloud costs using Terraform.
  4. Integrating Cost Optimization Tools: Get step-by-step instructions on how to integrate and use cloud vendor and third-party cost optimization tools within Terraform workflows.
  5. Using Sentinel for Cost Control: Discover how to use Terraform’s policy as code framework, Sentinel, to automatically prevent overspending by setting rules around costs, instance types, and tags.

A Survey of Cloud Waste

With the continuous shift to consumption-based cost models from Cloud Service Providers (CSPs), you pay for what you use, but you also pay for what you provision and don’t use. Without a process for continuous governance and optimization, the potential for waste is huge.

A recent cloud spending survey found that:

  • 45% of organizations reported being over budget for their cloud spending.
  • More than 55% of respondents either use cumbersome manual processes or simply do not implement changes to optimize their cloud resources.
  • 34.15% of respondents believe they can save up to 25% of their cloud spend and 14.51% believe they can save up to 50%. Even worse, 27.46% said, “I don’t know.”

First, let’s unpack why there is an opportunity and then get to the execution.

The Role of Cloud Governance Teams

When moving to the cloud, many organizations establish governance models where a team, often called the Cloud Center of Excellence, oversees strategy, architecture, operations, and costs. These teams typically include IT management, cloud technical specialists, and finance professionals. Finance is responsible for cost planning, migration financial forecasting, and optimization.

However, financial teams often say, “We need to get a handle on costs, savings, forecasting, etc.” but lack direct control over these costs. It’s now engineers who manage both infrastructure and costs directly.

The Financial Paradigm Shift

The shift to cloud services brings a new financial paradigm:

  • Engineers are Responsible for Costs: Engineers now manage operations and costs, using tools to automate and directly control expenses.
  • Complex Cost Planning: Cost planning and estimation for running cloud workloads are not easily understood or forecasted by finance teams.
  • New Budgeting Challenges: Traditional financial budgeting and on-premises hardware demand planning don’t account for the cost variability in consumption-based cloud models.

Finance lacks control in two primary areas:

  1. Pre-Provisioning: Limited governance and control during the resource provisioning phase.
  2. Post-Provisioning: Limited governance and control in enforcing infrastructure changes for cost savings.

Planning, Optimization, and Governance

Now the next question: how can engineers use Terraform at each level of the cloud cost management process to deliver value and minimize additional work? To get started, see how the visualization illustrates Terraform’s place in the cloud cost management lifecycle, starting at the top with the “Planning” phase.

To summarize the steps:

  1. Identify Workloads: Start by identifying workloads that are migrating to the cloud.
  2. Create Terraform Configuration: Define your infrastructure as code with Terraform configuration files.
  3. Cost Estimation: Run terraform plan to perform cost estimation with integrated third-party tools.
  4. Provision Resources: Run terraform apply to provision the resources.
  5. Optimization Recommendations: Once provisioned, workloads will run, and vendor tools will provide optimization recommendations.
  6. Integrate Recommendations: Integrate a vendor’s optimization recommendations into Terraform and/or your CI/CD pipeline.
  7. Implement Sentinel Policies: Investigate and implement Terraform’s Sentinel policies for cost and security controls.
  8. Update and Apply: Update Terraform configuration and run plan & apply.
  9. Optimized Resources: Newly optimized and compliant resources are now provisioned.
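Step 7’s cost controls can be expressed as a short Sentinel policy. The sketch below uses the tfrun import available in HCP Terraform and Terraform Enterprise; the $500 limit is an arbitrary example value:

```sentinel
import "tfrun"
import "decimal"

# Maximum allowed estimated monthly spend for this workspace.
limit = decimal.new(500)

# Pass only when the run's estimated monthly cost stays under the limit.
# The rule applies only when a cost estimate is available for the run.
main = rule when tfrun.cost_estimate is not null {
    decimal.new(tfrun.cost_estimate.proposed_monthly_cost).less_than(limit)
}
```

Attached as a soft-mandatory policy, this flags overspend for review; as hard-mandatory, it blocks the apply outright.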

Planning — Pre-Migration and Ongoing Cost Forecasting

Cloud migrations require a multi-point assessment to determine if it makes sense to move an application/workload to the cloud. Primary factors for the assessment are:

  • Architecture
  • Business Case
  • Estimated Cost for the Move
  • Ongoing Utilization Costs: Budgeted/forecasted for the next 1–3 years on average

Since engineers are now taking on some of these responsibilities, it makes sense to handle them with engineering tools, and Terraform gives engineers a way to do exactly that.

By using Terraform configuration files as the standard definition of how an application/workload’s cost is estimated, you can use the HCP Terraform and Terraform Enterprise APIs to automatically supply finance with estimated cloud cost data, or use Terraform’s user interface to give finance direct access to review costs. Doing this helps eliminate many slower oversight processes.

Planning Recommendations:

  • Standardize Cost Planning: Use Terraform configuration files as the standard definition of cloud cost planning and forecasting across AWS, Azure, and GCP, and expose this information via the Terraform API, or through role-based access controls in the Terraform user interface, to give financial personas a self-service workflow.
  • Extract Usable Data: Many organizations conduct planning within Excel, Google Sheets, and Web-based tools. To make data usable within these systems, use Terraform’s Cost Estimates API to extract the data.
  • Use Terraform Modules: Define standard units of infrastructure for high-level cost assessments and cloud demand planning. For example, define a set of modules for a standard Java application, so module A + B + C = $X per month. This helps quickly assess potential application run costs before defining actual Terraform configuration files.
  • Understand Financial Growth: Use Terraform to understand application/workload financial growth over time, i.e., cloud sprawl costs.
  • Align Naming Conventions: Structurally align Terraform Organization, Workspace, and Resource naming conventions to the financial budgeting/forecasting process.

Basic Patterns for Consuming Optimization Recommendations

To establish a mechanism for Terraform to access optimization recommendations, we see several common patterns:

  1. Manual Workflow: Review optimization recommendations from the provider's portal and manually update Terraform files. This is the least efficient but a necessary starting point for creating a feedback loop for optimization.
  2. File Workflow: Create a mechanism where optimization recommendations are imported into a local repository via a scheduled process (usually daily). For instance, Densify customers use a script to export recommendations into a densify.auto.tfvars file, which is then stored in a local repository. The Terraform lookup function can then be used to reference specific optimization updates set as variables.
  3. API Workflow: Create a mechanism for optimization recommendations to be extracted directly from the vendor and stored within an accessible data repository using Terraform’s http data_source functionality to perform the dataset import reference.
  4. Ticketing Workflow: Similar to the file and API workflows, but some organizations insert an intermediary step where optimization recommendations first go to a change control system like ServiceNow or Jira. These systems have built-in workflow and approval logic, where a flag is set for acceptable change, which is then passed as a variable to be consumed later in the process.
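For the API workflow, Terraform’s built-in http data source can pull a recommendations dataset at plan time. A minimal sketch, in which the URL is a placeholder for your vendor’s export endpoint:

```hcl
# Pull the latest optimization recommendations from a vendor endpoint at plan time.
data "http" "optimization_recommendations" {
  url = "https://optimization.example.com/recommendations.json"
}

# On Terraform 0.12+, jsondecode() turns the JSON response into a usable map:
# locals {
#   recs = jsondecode(data.http.optimization_recommendations.body)
# }
```

This keeps the feedback loop inside Terraform itself: every plan fetches the current recommendations, with no intermediate file to sync.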

Optimization as Code: Terraform Code Update Examples

To optimize resources effectively, it's important to maintain key pieces of resource data as variables. Optimization tools provide recommendations for resources such as compute, database, and storage. Here's how you can set up your Terraform configuration to use these recommendations.

At a minimum, you should maintain three variables: the new recommendations, a fallback for the current state, and a unique resource ID. In the Densify example below, these are densify_recommendations, densify_fallback, and densify_unique_id.

For example, using Densify, you can find the Densify Terraform module via the Terraform Registry and the Densify-dev GitHub repo.

Step 1: Define Variables

# Terraform 0.11-style quoted types, matching the interpolation syntax used below.
variable "densify_recommendations" {
  description = "Map of maps generated by the Densify Terraform Forwarder. Contains all of the systems with the settings needed to provide details for tagging as Self-Aware and Self-Optimization."
  type        = "map"
}

variable "densify_unique_id" {
  description = "Unique ID that both Terraform and Densify can use to track the systems."
  type        = "string"
}

variable "densify_fallback" {
  description = "Fallback map of settings used for new infrastructure or systems that are missing sizing details from Densify."
  type        = "map"
}

Step 2: Update Terraform Code with Variables and Logic

Use the lookup function to check for optimization recommendations in the local file densify.auto.tfvars.

locals {
  temp_map     = "${merge(map(var.densify_unique_id, var.densify_fallback), var.densify_recommendations)}"
  densify_spec = "${local.temp_map[var.densify_unique_id]}"

  cur_type = "${lookup(local.densify_spec, "currentType", "na")}"
  rec_type = "${lookup(local.densify_spec, "recommendedType", "na")}"
  savings  = "${lookup(local.densify_spec, "savingsEstimate", "na")}" # key name assumed; use the key emitted by your Densify forwarder
}

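Once the locals above resolve a recommendation, it can drive the resource definition directly, falling back to the current type when Densify has no recommendation for the system. A minimal sketch in Terraform 0.11 syntax (the aws_instance resource, AMI variable, and tag are illustrative, not part of the Densify module):

```hcl
# Use the recommended instance type when one exists; otherwise keep the
# current type so terraform plan reports no change for this resource.
resource "aws_instance" "app" {
  ami           = "${var.ami_id}" # illustrative variable
  instance_type = "${local.rec_type != "na" ? local.rec_type : local.cur_type}"

  tags {
    Name = "${var.densify_unique_id}"
  }
}
```

When a new densify.auto.tfvars arrives, the next plan and apply cycle picks up the recommended type automatically, closing the optimization feedback loop described above.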