FinOps and Automation
Erik Norman
CEO | FinOps Lead | Certified FinOps practitioner | TPM | ITAM Forum member | Cloud migration specialist
You wouldn’t optimize a resource for $50 in annualized savings, right? The effort would probably cost you more than you save. But hey, with automation you can, especially if you have plenty of such resources. Automating your simple tasks also frees up time to deal with the complex ones. I’ve worked for a corporation with 45K AWS accounts. Trust me, at that scale automation wasn’t optional.
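To make the $50 argument concrete, here is a minimal sketch of the break-even arithmetic. Only the $50 figure comes from the text; the hourly rate, the effort estimates and the resource count are hypothetical assumptions, so plug in your own numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class Scenario:
    annual_saving_per_resource: float  # the $50 from the text
    resource_count: int                # assumption: how many such resources you have
    manual_hours_per_resource: float   # assumption: effort to fix one resource by hand
    automation_build_hours: float      # assumption: one-off effort to build the automation
    hourly_rate: float                 # assumption: loaded engineering cost per hour


def manual_net_saving(s: Scenario) -> float:
    """First-year saving minus the cost of fixing every resource by hand."""
    effort_cost = s.resource_count * s.manual_hours_per_resource * s.hourly_rate
    return s.resource_count * s.annual_saving_per_resource - effort_cost


def automated_net_saving(s: Scenario) -> float:
    """First-year saving minus the one-off cost of building the automation."""
    build_cost = s.automation_build_hours * s.hourly_rate
    return s.resource_count * s.annual_saving_per_resource - build_cost


def break_even_resources(s: Scenario) -> int:
    """Smallest resource count at which the automation pays for itself in year one."""
    build_cost = s.automation_build_hours * s.hourly_rate
    return math.ceil(build_cost / s.annual_saving_per_resource)


example = Scenario(
    annual_saving_per_resource=50,
    resource_count=1000,
    manual_hours_per_resource=1,
    automation_build_hours=40,
    hourly_rate=80,
)
print("manual:", manual_net_saving(example))
print("automated:", automated_net_saving(example))
print("break-even resource count:", break_even_resources(example))
```

With these assumed numbers, the manual approach loses money on every resource, while the automation pays for itself after a few dozen resources. The sketch ignores maintenance of the automation, which you would want to add for a real business case.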
Automation vs. manual tasks
We all know automation can save plenty of time and money, and it’s generally worth it whenever you have to deal with anything repetitive. Automating tasks can also be fun, like this guy who created scripts to automatically text his wife whenever he was working late. Do you have any other fun scripts to share? Please add a comment, I’d love to hear about them.
Automation for FinOps practitioners
Let’s see if we can identify the elephant in the room. All these tasks, bar one, have only an indirect impact on your cloud costs. Yup: however useful and important these tasks are, only one will directly affect your bill. But should you really use automation to implement cost fixes as well? What are the pros and cons? If you ask ChatGPT, you will get a decent but very generic answer with enough food for thought. However, a couple of crucial aspects are missing from that answer: one is a huge advantage; the others, well, aren’t. Let’s dive deeper.
The BIG advantage of Automation in FinOps
As stated in The State of FinOps 2023, the biggest challenge for FinOps practitioners is getting engineers to take action. I always say: “Sending recommendations is as effective as sending love letters. It only works if the counterpart is already positively inclined.”
Now, automation cannot tackle this problem directly. You will still need to work on changing company culture, introducing new KPIs for stakeholders and teams, gamification and so on. That cannot be automated. However, automation can help you work around the symptom of engineers not taking action: rather than sending a recommendation to right-size or re-type resources, you can have automation implement the suggested changes. In my experience, most companies can gain ~20% cost savings this way. So, what are we waiting for? Well, there are other aspects we need to talk about first.
Automation and risk
Automating your cost reports isn't risky, so we'll narrow the scope and focus on the risks of automatically implementing cost fixes.
The automation will only be as good as its implementation and its input. If it is going to directly affect your cloud resources, it had better be good. Or should it be smart? Yes, you might add some AI to make it smarter, but you can also convince AI that 2 + 5 = 8 because your wife is always right. IMO, you should focus on making your automation safe.
Think about right-sizing resources: you follow best practices, monitor performance metrics for the longest available time span (typically 3 months), and create an automation script or configuration to size your instance or cluster accordingly. You even err on the side of caution, leaving plenty of headroom. Right? Well, a 3-month window can easily miss a seasonal peak, so let’s say that Amazon customers wouldn’t be happy during Black Friday and similar seasonal events. As a FinOps practitioner, you need to be able to distinguish what can and what cannot be safely automated. That means weighing the risks involved, alignment with company strategy, and non-prod vs. prod vs. office workloads, and knowing when to opt for a hybrid solution, i.e. automate the recommendations but vet and approve them manually; the execution could then still be automated.
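The hybrid approach described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the size ladder, the 30% headroom default, the `seasonal_peak_multiplier` parameter and the `resize` callback are all hypothetical assumptions, and the sketch only ever recommends downsizing.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Sequence, Set

# Assumption: a simplified ladder of available vCPU sizes.
SIZES = (2, 4, 8, 16, 32, 64)


@dataclass
class Recommendation:
    resource_id: str
    current_vcpus: int
    target_vcpus: int
    needs_manual_approval: bool


def recommend_size(
    resource_id: str,
    current_vcpus: int,
    cpu_util_samples: Sequence[float],      # utilization as fractions (0..1) of the current size
    environment: str,                       # assumption: "prod" or anything else
    headroom: float = 0.3,                  # assumption: 30% safety margin
    seasonal_peak_multiplier: float = 1.0,  # expected peak vs. the observed window
) -> Optional[Recommendation]:
    """Recommend a smaller size, or None if downsizing is not safe."""
    observed_peak = max(cpu_util_samples)
    # Scale the observed peak by known seasonal events the window may have missed.
    needed = current_vcpus * observed_peak * seasonal_peak_multiplier * (1 + headroom)
    target = next((size for size in SIZES if size >= needed), SIZES[-1])
    if target >= current_vcpus:
        return None
    # Production changes are vetted by a human; non-prod can be applied directly.
    return Recommendation(resource_id, current_vcpus, target, environment == "prod")


def apply_recommendations(
    recommendations: Iterable[Recommendation],
    approved_ids: Set[str],
    resize: Callable[[str, int], None],     # hypothetical callback into your cloud tooling
) -> List[str]:
    """Execute every recommendation that needs no approval or has been approved."""
    applied = []
    for rec in recommendations:
        if rec.needs_manual_approval and rec.resource_id not in approved_ids:
            continue
        resize(rec.resource_id, rec.target_vcpus)
        applied.append(rec.resource_id)
    return applied
```

A quick usage example: a dev instance at 16 vCPUs with an observed peak of 25% is recommended down to 8 vCPUs and applied automatically, while the same instance in prod waits for approval. If you tell the sketch that the seasonal peak is three times the observed one, it recommends nothing at all, which is exactly the Black Friday case.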
Automation and IaC
I love IaC. You can specify your infrastructure in a template and then deploy it. Make some changes, deploy again. Wonderful. However, what is the correct approach when a cost optimization recommends some changes to the infrastructure? In my experience, IaC templates are created during development and rarely modified until the next big update. Cloud costs are dynamic and require dynamic handling. Should you ask the engineering teams to constantly update the templates and re-deploy? Or make the changes first and update later? In short: allow some drift or stick to 100% compliance?
In my experience, a good solution is to tolerate a small amount of drift. Keep your engineers focused on more productive tasks, and update the templates when feasible. This allows you to maximize cost savings and keep opportunity costs low.
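One way to make "a small amount of drift" explicit is to classify the differences between the template and the live resource. The sketch below is only an illustration under assumptions: the attribute names, the set of cost-tunable attributes and the tolerance of two drifted attributes are hypothetical, and fetching the declared and live values from your IaC tool is left out.

```python
from typing import Any, Dict, Tuple

# Assumption: attributes a cost optimization is allowed to change ahead of the template.
COST_TUNABLE = {"instance_type", "volume_size_gb", "desired_capacity"}


def find_drift(declared: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each drifted attribute to its (declared, actual) pair."""
    keys = declared.keys() | actual.keys()
    return {
        key: (declared.get(key), actual.get(key))
        for key in keys
        if declared.get(key) != actual.get(key)
    }


def classify_drift(declared: Dict[str, Any], actual: Dict[str, Any], max_tolerated: int = 2) -> str:
    """Return "in_sync", "tolerated" or "needs_reconcile"."""
    drift = find_drift(declared, actual)
    if not drift:
        return "in_sync"
    # Tolerate a few cost-related changes; anything else goes back to engineering.
    if set(drift) <= COST_TUNABLE and len(drift) <= max_tolerated:
        return "tolerated"
    return "needs_reconcile"


template = {"instance_type": "m5.2xlarge", "volume_size_gb": 500, "subnet": "private-a"}
live = {"instance_type": "m5.xlarge", "volume_size_gb": 500, "subnet": "private-a"}
print(classify_drift(template, live))
```

Tolerated drift can then be collected and folded back into the templates at the next convenient update, while anything classified as needing reconciliation is raised immediately.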
Where to start, what should you automate first?
I know it’s tempting to look at Storage and Compute as the biggest cost factors, and try to reach for the biggest cost savings. However, that is risky. If you optimize too eagerly, your customers will experience service degradation or failure, and no cost optimization is worth losing customers. Check out this amazing talk on the subject.
Conclusion
I’m an advocate for automation, especially within the realm of FinOps. It reduces your workload, so you can do what you’re best at: analyzing complex data and delivering valuable insights. It also allows you to go after cost optimization opportunities at scale and make a significant dent in your cloud bill.
Erik Norman, this is really helpful. It's common for FinOps content on LinkedIn to be repetitive. I often agree with the content, but it does not provide any new insights or practices. This is different, and it will help our team communicate with stakeholder groups about automation. Thank you.
Nice article Erik, I am also a fan of finops automation ;)
Fun fact: I just published this article mentioning the risks of optimizing cost too eagerly, and the LinkedIn website was down within seconds.