The coldest Monday with a $1 million cloud bill: Terraform to the rescue

At HashiDays this year, Prerit Munjal, Software Architect and Educator at KubeCloud, explained how a cryptomining attack led to an overhaul of the startup's cost management systems with the help of Terraform.

Imagine this "Cold Monday": you're working at a small startup and you wake up to a $1M+ bill from Google App Engine, accrued over the weekend. What caused it? For KubeCloud, it was a compromised service key that led to a cryptomining attack. The attack changed the company's entire product velocity and led to the development of an in-house product that uses Terraform to implement quotas and manage metrics.

What was the issue?

Prerit explains the issue in the video of his HashiDays talk.

While this talk doesn't cover practices for preventing service key theft (hint: check out our 5 best practices for secrets management), it does cover the other guardrails you should have in place for cloud cost monitoring and limits.

These were the issues that KubeCloud faced:

  • Poor cleanup processes for long-running resources (see our video on Terraform ephemeral workspaces, which allow customers to set timeouts for automatically destroying non-production resources, eliminating the need for manual clean-up, reducing infrastructure costs, and streamlining workspace management)
  • Risky complexity due to the usage of many shell scripts
  • No org-wide restrictions on resource count and type (check out our policy as code framework, Sentinel)

KubeCloud's first steps included setting billing alerts and converting some of those shell scripts to Terraform code, but the big fix was quotas.
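As a rough sketch of what a billing alert in Terraform might look like (this is not KubeCloud's actual configuration), the example below assumes Google Cloud and the google provider's google_billing_budget resource. The variable name, display name, budget amount, and thresholds are hypothetical placeholders.

```hcl
# Hypothetical input: the billing account to watch.
variable "billing_account_id" {
  type        = string
  description = "ID of the billing account the budget applies to (assumed)"
}

# A monthly budget with alert thresholds. The amount and percentages
# are illustrative values only.
resource "google_billing_budget" "monthly_spend_alert" {
  billing_account = var.billing_account_id
  display_name    = "monthly-spend-alert"

  amount {
    specified_amount {
      currency_code = "USD"
      units         = "1000"
    }
  }

  # Notify at 50%, 90%, and 100% of the budgeted amount.
  threshold_rules {
    threshold_percent = 0.5
  }
  threshold_rules {
    threshold_percent = 0.9
  }
  threshold_rules {
    threshold_percent = 1.0
  }
}
```

A budget like this only sends notifications; it does not stop spending on its own, which is why alerts alone were a first step and not the whole fix.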

Quotas and Terraform

KubeCloud started building a large matrix of cloud vendor quotas, but it needed a way to manage all of those quotas automatically. Enter Terraform.

While using Terraform as a state engine isn't a typical use case, it worked very effectively, essentially creating an in-house dashboard for cloud cost quota management and visibility.

Here's an example main.tf for this quota managing use case in Terraform:

[Image: main.tf for this quota managing use case in Terraform]
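As a rough illustration only (not KubeCloud's actual main.tf), a quota matrix could be expressed as a Terraform map and applied with one resource per row. The sketch assumes Google Cloud and the google-beta provider's google_service_usage_consumer_quota_override resource. The variable names, the default row, and its value are hypothetical, and the exact metric and limit names should be checked against the service's quota documentation.

```hcl
terraform {
  required_providers {
    google-beta = {
      source = "hashicorp/google-beta"
    }
  }
}

# Hypothetical input: the project whose quotas are being capped.
variable "project_id" {
  type        = string
  description = "Project to apply quota overrides to (assumed)"
}

# Hypothetical quota matrix: one row per quota to cap.
variable "quota_overrides" {
  type = map(object({
    service = string
    metric  = string
    limit   = string
    value   = string
  }))
  default = {
    cpus_per_region = {
      service = "compute.googleapis.com"
      metric  = "compute.googleapis.com/cpus"
      limit   = "/project/region"
      value   = "24" # illustrative cap, not a recommendation
    }
  }
}

# One override per row of the matrix; Terraform state records what is applied.
resource "google_service_usage_consumer_quota_override" "cap" {
  provider = google-beta
  for_each = var.quota_overrides

  project        = var.project_id
  service        = each.value.service
  metric         = urlencode(each.value.metric)
  limit          = urlencode(each.value.limit)
  override_value = each.value.value

  # Skip the provider-side safety check on large quota decreases.
  force = true
}
```

With a layout like this, adding or changing a row in the map and running terraform plan would show the proposed quota change before it is applied.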

As KubeCloud's story shows, cloud mismanagement can become costly (ahem, fast), but the right tools will help you regain control. Infrastructure as code solutions like Terraform provide a critical safety net that keeps cloud environments cost-effective and resilient. Investing in automated infrastructure management is a move toward financial stability, operational efficiency, and long-term scalability.

Check out our white paper on how Terraform can help secure your infrastructure.

