The cloud isn't infinite: a step-by-step guide to killing unused resources

Cloud is not "pay for what you use"; it's "pay for what you forgot to turn off."

But like most things in life, it's funny until it's about you. I'm frugal by nature and never pleased to see something go to waste, be it physical or digital.

And thus, I declared war on useless, tagless resources. The hunt for a solution led me down some interesting paths. This was not a recent struggle, nor was I the only one affected by it. In some instances, people were struggling to clean up nameless EC2 instances appearing out of nowhere, but luckily, that was not my issue. Mine was straightforward: cost optimization by deleting resources with no tags. Why tagless resources, you ask? Because most of the time, these resources are the result of unplanned, hasty workloads launched directly from the console. That makes them harder to clean up after the fact, and they are often left for dead, or undead, in a zombie-like state.

From proprietary to open source, the solutions were many, and I spent quite some time battling with a few of them (mostly the free and open source ones). But I had a few imperative requirements:

  • I wanted my solution to be Serverless
  • I wanted metrics and reporting
  • I have workloads running on a few cloud providers so I wanted a cloud hybrid solution
  • And last, I wanted a solution that covers as many resources as possible, across cloud providers

And so, after a few weeks of bloodshed, I settled on Cloud Custodian.

Cloud Custodian is a rules engine for managing public cloud accounts and resources. It allows users to define policies to enable a well-managed cloud infrastructure that's both secure and cost optimized. It consolidates many of the ad hoc scripts organizations have into a lightweight and flexible tool, with unified metrics and reporting.

It seemed to have everything I needed, and the README.md was a great place to start, but I had yet to make it completely Serverless. Still, nothing can stand in the face of strong will (and a few double shots of espresso, mostly the espresso though). So in this blog post I will provide a step-by-step guide on how to set up a "tag-compliance policy" using cloud-custodian on AWS Lambda, making it completely Serverless. This policy will check for a specific tag on running EC2 instances and, if it is missing, perform the defined action: in our case, STOP the instance.

Prerequisites

Although Cloud Custodian is cloud-hybrid, my post will only cover AWS. So you'll need IAM access that lets you create policies, roles and users.

1- IAM Role:

The Lambda function will require access to perform actions on EC2 instances. This can be achieved by attaching an IAM role to it.

So first, create an IAM role named "custodian-tag-role" and attach the AmazonEC2FullAccess policy to it.
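
For a Lambda function to assume this role, the role's trust relationship also has to allow the Lambda service principal. Here is a minimal sketch of such a trust document, built in Python. The helper name `lambda_trust_policy` is hypothetical and not part of Cloud Custodian; the printed JSON is what you would paste into the role's trust relationship.

```python
import json


def lambda_trust_policy():
    """Return a trust policy that lets the AWS Lambda service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Service principal used by AWS Lambda when it assumes the role.
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be copied into the IAM console.
    print(json.dumps(lambda_trust_policy(), indent=2))
```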

2- IAM User:

Second, to create the Lambda function from your local machine, you need programmatic access to AWS. For this, create an IAM user "custodian-user" and grant the following permissions/policies:

  1. AWSLambdaFullAccess (AWS managed policy)
  2. CloudWatchFullAccess (AWS managed policy)
  3. IAMPassRole (custom policy):
{
 "Version": "2012-10-17",
 "Statement": [
 {
   "Effect": "Allow",
   "Action": "iam:PassRole",
   "Resource": "arn:aws:iam::XXXXXXXX:role/custodian-tag-role"
 }]
}

Load the new user's access key and secret key into your environment:

export AWS_ACCESS_KEY_ID=<ACCESS_KEY>
export AWS_SECRET_ACCESS_KEY=<SECRET_KEY>
export AWS_DEFAULT_REGION=<REGION>
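
A forgotten export is an easy mistake to make here, so a quick check before going further can save some head-scratching. This is only a small sketch; the function name `missing_aws_vars` is my own and hypothetical, and it only checks that the three variables above are set, not that the credentials are valid.

```python
import os

# The three variables exported in the step above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All AWS environment variables are set.")
```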

Also, copy the new user's ARN and add it to the trust relationship of the IAM role created earlier, "custodian-tag-role".

3- Install Cloud Custodian:

Use pip for the installation:

pip install c7n

Once the installation is complete, create a policy document. A Cloud Custodian policy is a YAML document that defines the filters and actions to apply to cloud resources. Example policies are available in the Cloud Custodian documentation. Here is the one we will be using for our example:

policies:
- name: owner-tag-compliance
  mode:
    type: periodic
    schedule: rate(1 hour)
    role: arn:aws:iam::XXXXXX:role/custodian-tag-role
  resource: ec2
  description: |
    Stop running instances that do not meet the tag compliance policy,
    i.e. instances that are missing the Owner tag.
  filters:
    - State.Name: running
    - "tag:Owner": absent
  actions:
    - stop

This policy will create a Lambda function with the role "custodian-tag-role" and a CloudWatch rule that triggers it every hour to check EC2 resources against the following filters:

  1. The instance is in running state
  2. The “Owner” tag is absent
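
To make the two filters concrete, here is a rough Python approximation of the check they express. It is not Cloud Custodian's actual implementation; the function name `is_non_compliant` and the sample instances are hypothetical, with the sample data shaped like entries returned by the EC2 DescribeInstances API.

```python
def is_non_compliant(instance, tag_key="Owner"):
    """True if the instance is running and has no tag named tag_key."""
    running = instance.get("State", {}).get("Name") == "running"
    # Instances without any tags have no "Tags" key at all.
    tag_keys = {tag["Key"] for tag in instance.get("Tags", [])}
    return running and tag_key not in tag_keys


# Hypothetical sample data for illustration only.
instances = [
    {"InstanceId": "i-aaa", "State": {"Name": "running"},
     "Tags": [{"Key": "Owner", "Value": "someone"}]},
    {"InstanceId": "i-bbb", "State": {"Name": "running"},
     "Tags": [{"Key": "Name", "Value": "test"}]},
    {"InstanceId": "i-ccc", "State": {"Name": "stopped"}},
    {"InstanceId": "i-ddd", "State": {"Name": "running"}},
]

to_stop = [i["InstanceId"] for i in instances if is_non_compliant(i)]
print(to_stop)  # ['i-bbb', 'i-ddd']
```

The stopped instance is skipped by the first filter, and the tagged one by the second, which mirrors how the filters in the policy are combined.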

Instances matching these criteria will be stopped. Save the file as policy.yml and run the following command to execute it:

custodian run -s . policy.yml

Go back to the AWS console and you should find a new Lambda function that runs every hour, checks the filters and STOPS any EC2 instance that doesn't comply.

And that's all there is to it. To see the other commands custodian offers:

custodian -h
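
Two of those are handy for a policy that stops instances: `custodian validate`, which checks the policy file, and the `--dryrun` flag of `custodian run`, which should report matching resources without executing the actions. Below is a small Python sketch that chains the two. The helper names `custodian_commands` and `check_policy` are hypothetical wrappers of my own, and the sketch assumes the `custodian` executable is on your PATH.

```python
import subprocess


def custodian_commands(policy_file="policy.yml", output_dir="."):
    """Build argv lists for a validate step followed by a dry run."""
    return [
        ["custodian", "validate", policy_file],
        ["custodian", "run", "--dryrun", "-s", output_dir, policy_file],
    ]


def check_policy(policy_file="policy.yml", output_dir="."):
    """Run each command in order; raises CalledProcessError on the first failure."""
    for argv in custodian_commands(policy_file, output_dir):
        subprocess.run(argv, check=True)
```

Calling `check_policy()` from the directory that holds policy.yml runs both steps and stops at the first one that fails.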

