AWS / TF public warning
Image: car pile-up, from ABC News

Today I found out about two "fun" things that conspired against me.

The first is the Terraform resource aws_iam_policy_attachment.

I normally work on large AWS projects and rely on infrastructure as code (IaC) to manage them. I use a range of IaC products depending on the client. On large projects it is infeasible to have a single IaC stack manage the whole environment, so it is important to manage different logical elements using different stacks.

In walks my Friday nemesis: aws_iam_policy_attachment.

One of my colleagues had used this resource to attach an AWS managed policy used for an EKS cluster. The Terraform docs for it carry this big warning:

BIG WARNING

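This resource is a very dangerous tool if you have multiple stacks. It manages the attachments of a policy exclusively: any user, role, or group that has the policy attached but is not listed in the resource gets the policy detached when the resource is applied. So you can easily remove a policy from another part of your system. Everything is happy where you are working, but something else is on fire.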
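To make the failure mode concrete, here is a minimal sketch of two stacks colliding. The role names, resource labels, and the choice of AmazonEKS_CNI_Policy are assumptions for illustration, not the actual code from the incident.

```hcl
# Stack A (hypothetical): uses the exclusive attachment resource.
# Terraform treats the "roles" list as the complete set of roles
# allowed to have this policy attached, account-wide.
resource "aws_iam_policy_attachment" "cni" {
  name       = "eks-cni-attachment" # hypothetical attachment name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  roles      = ["stack-a-node-role"] # hypothetical role name
}

# Stack B (hypothetical, separate state file): attaches the same
# managed policy to its own node role.
resource "aws_iam_role_policy_attachment" "cni" {
  role       = "stack-b-node-role" # hypothetical role name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}
```

Under these assumptions, the next apply of Stack A sees stack-b-node-role attached to the policy, does not find it in its own roles list, and detaches it. Stack B's plan looks clean until someone runs it again, by which point its nodes have already lost the permission.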

The second fun thing, which I have never played with before, is the EKS CNI containers. There is a reason I have never played with them: they have always "just worked". Unfortunately, if you remove the CNI role policy in AWS, the CNI containers die and all other pods fail to start.

Running kubectl get pods -n kube-system gives you a list of the aws-node pods that house the CNI containers. If there is an issue, the pod reports 0/1 ready. The container dies with exit code 137 (128 + 9, meaning it was killed with SIGKILL, which is what the out-of-memory killer typically produces). However, inspecting the container shows that it was not killed by OOM. The last line in the log is "Checking for IPAM connectivity ...", which is the third line of the normal start-up process.

If anyone can point me to a useful way of debugging this, let me know!

So you have been warned:

  1. DON'T USE aws_iam_policy_attachment unless that is really what you want. Use one of the non-exclusive resources instead, such as aws_iam_role_policy_attachment.
  2. If your CNI containers keep being killed, check your IAM permissions.
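As a rough sketch of the safer pattern from point 1, the stack that owns a role can attach the policy to that role only. The role name, resource labels, and trust policy below are assumptions for illustration; adapt them to your own node role.

```hcl
# Hypothetical node role owned by this stack.
resource "aws_iam_role" "node" {
  name = "example-eks-node-role" # hypothetical role name

  # Trust policy letting EC2 instances (the worker nodes) assume the role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Non-exclusive attachment: manages only the link between this one
# role and the policy, and leaves every other attachment alone.
resource "aws_iam_role_policy_attachment" "node_cni" {
  role       = aws_iam_role.node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
}
```

Because each aws_iam_role_policy_attachment tracks a single role and policy pair, other stacks can attach the same managed policy to their own roles without any of them detaching the others.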


Hopefully I can help save someone else from a "Fun Friday".
