Reducing cloud costs - what worked for us

Background

Over the last two years I have worked on a number of COGS (cost of goods sold) reduction initiatives, in other words reducing cloud cost, for our cloud services. In discussions with friends in the industry I have seen time and again that many find the topic interesting and intriguing.

The purpose of this post is to share the key learnings with others.

Note:

1) Given that this is a public post, I have not gone into details specific to my organization. Instead I have tried to distill the learnings into principles and ideas that are broadly applicable and useful to most people.

2) The work we did and the success we achieved were a team effort by a group of highly committed and capable individuals spread across multiple geos. No one "star performer" can show the stamina and achieve the results that a motivated team can.

Without further ado let's dive right in.

Two approaches to saving costs

  • Correction: Fix the past
  • Prevention: Secure the future

We explore both categories in turn.

Principle 1: You can't change what you can't see. Getting to the source of truth.

The first and foremost thing to get in place, if you don't have it already, is dashboards that show consolidated monthly costs of services and components based on real data. It should be possible to drill down and see which components contribute most to the cost of a service. It should also be possible to filter by geo and deployment type (prod, staging, etc.).

Note: It's common to share infra across services, in which case there needs to be a clearly defined, proportional and transparent model for attributing the cost of that infra to its consumers.
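To make the idea of proportional attribution concrete, here is a minimal sketch. It assumes each consumer's usage of the shared infra can be measured in a single unit (CPU-hours here); the function name, service names and numbers are all made up for illustration and are not from our setup.

```python
def attribute_shared_cost(total_cost, usage_by_consumer):
    """Split a shared infra bill across consumers in proportion to their usage."""
    total_usage = sum(usage_by_consumer.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive to attribute cost")
    return {
        consumer: total_cost * usage / total_usage
        for consumer, usage in usage_by_consumer.items()
    }


# Hypothetical example: a shared cluster costing 9000 per month,
# attributed by CPU-hours consumed by three made-up services.
shares = attribute_shared_cost(
    9000.0, {"search": 600, "billing": 300, "reports": 100}
)
for consumer, cost in shares.items():
    print(f"{consumer}: {cost:.2f}")
```

Whatever usage measure you pick, the important part is that every consumer can see the formula and reproduce their own share.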

Principle 2: Make the dashboard public. The power of gamification.

When everyone (within the org) can see what everyone else is contributing to the COGS, it unleashes a very powerful dynamic: gamification. A scoreboard that everyone can see motivates you to play your 'A' game. The satisfaction of seeing the results of your work on the dashboard (costs going down) is immense.

Principle 3: Choose your battles wisely. Don't sweat the small stuff.

Focus on the top 3-5 contributors to cost at any given point, based on what the dashboards show. It's easy to get sucked into areas which are exciting to pursue but yield low returns. This is where the dashboard acts as a compass and points you toward what to go after for maximum ROI.
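As a rough illustration of what "top contributors" means in practice, the sketch below ranks components by monthly cost and shows the cumulative share of the total they account for. The component names and figures are invented for the example; in real life this data would come from your cost dashboard.

```python
def top_contributors(monthly_costs, n=5):
    """Return the n most expensive components as (name, cost, cumulative_share)."""
    total = sum(monthly_costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    ranked = sorted(monthly_costs.items(), key=lambda item: item[1], reverse=True)
    result = []
    cumulative = 0.0
    for name, cost in ranked[:n]:
        cumulative += cost
        result.append((name, cost, cumulative / total))
    return result


# Hypothetical monthly costs per component (illustrative numbers only).
costs = {
    "database": 42000,
    "api_gateway": 30000,
    "logging": 15000,
    "compute": 8000,
    "storage": 3000,
    "dns": 1500,
    "cdn": 500,
}
top3 = top_contributors(costs, n=3)
for name, cost, share in top3:
    print(f"{name}: {cost} (cumulative {share:.0%})")
```

If a handful of components already cover most of the bill, as in this made-up data, that is where the effort should go first.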

Principle 4: Data driven decisions.

Savings come primarily in two ways: down-sizing (right-sizing) resources, or eliminating them altogether. Making decisions is always hard unless you have the right data to guide you.

To right-size you need to understand the usage of existing resources, historical and current traffic volumes, and expected growth. This is a lot of data to gather and analyze, but without it it's impossible to make meaningful decisions.

To deprecate features you again need data, this time on active use by paying customers. This can be an eye-opening exercise because it shows you how customers use your product relative to the cost you are incurring.
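Here is one simplified way the right-sizing inputs above could be combined into a number. It assumes capacity scales linearly with node count and that you have picked a target peak utilization that leaves enough headroom; both are assumptions you would need to validate for your own workload, and the function and parameter names are hypothetical.

```python
import math


def recommended_node_count(current_nodes, peak_utilization, expected_growth,
                           target_utilization=0.6):
    """Estimate nodes needed so the projected peak runs at the target utilization.

    peak_utilization: observed peak as a fraction of current capacity (0 to 1).
    expected_growth: expected traffic growth as a fraction (0.25 means 25%).
    target_utilization: desired peak utilization after resizing (assumed value).
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Capacity needed at the projected peak, measured in "fully used" nodes.
    required = current_nodes * peak_utilization * (1 + expected_growth)
    return max(1, math.ceil(required / target_utilization))


# Hypothetical cluster: 24 nodes, 35% utilized at peak, 25% growth expected.
nodes = recommended_node_count(24, 0.35, 0.25)
print(nodes)  # 18
```

In this made-up case the cluster could shrink from 24 to 18 nodes and still absorb the expected growth with headroom.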

Fun fact: When we did this exercise we found (for the first time) that a "cool" feature we had built was costing us $12 per API call and there was hardly any active use of it.
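The arithmetic behind a finding like this is simple: divide what a feature costs per month by how often it is actually called. The sketch below does that for a few features and sorts the worst offenders to the top. Feature names and figures are invented for illustration and are not our real numbers.

```python
def rank_by_cost_per_call(features):
    """Rank features by cost per API call, most expensive first.

    features maps a feature name to (monthly_cost, monthly_calls).
    Features with zero calls get an infinite unit cost so they sort first.
    """
    ranked = []
    for name, (monthly_cost, monthly_calls) in features.items():
        if monthly_calls > 0:
            unit_cost = monthly_cost / monthly_calls
        else:
            unit_cost = float("inf")
        ranked.append((name, unit_cost))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical features: (monthly cost, monthly API calls).
ranking = rank_by_cost_per_call({
    "search": (5000.0, 1_000_000),
    "fancy_export": (3000.0, 200),
    "legacy_sync": (800.0, 0),
})
for name, unit_cost in ranking:
    print(f"{name}: {unit_cost:.4f} per call")
```

A feature that is paid for but never called shows up at the very top, which is exactly the kind of candidate you want to look at first.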

Principle 5: No one size fits all. Horses for courses.

The topic of right-sizing and optimizing is vast and probably deserves a post of its own. Some of the things we did include: reducing nodes in clusters, changing plans, changing node types, reserving resources, negotiating pricing with vendors based on actual usage, tuning the number of always-on functions, replacing costly databases with cheaper ones, managing log volumes and retention, eliminating unused features, sharing infrastructure where it made sense for better utilization, redesigning, and so on.

Tip: Start with the assumption that you have over-engineered and there is resource wastage, and you will most likely find it.

Principle 6: Not always a zero-sum game. It can be a win-win.

It's natural to think that by reducing costs we may lose some other desirable attribute. While such tradeoffs do exist, that's not always the case. Sometimes you can reduce COGS and get better in the process. A case in point is an API where we reduced the COGS (of the API gateway) by 90% and improved the performance of a customer script that uses the API by 6X.

So far we have looked at how to reduce the COGS of what already exists. It is equally important to learn how to be cost-optimal from the beginning.

Principle 7: Prevention is better than cure. The easiest way to fix a problem is by not creating it.

COGS review is now a critical part of our design review process. COGS estimation and justification is something we do even before writing the first line of code. In the review process, COGS gets the same weight as the functional requirements. If we can't develop a feature in a cost-effective way, we would rather not build it.
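A design-time COGS estimate does not need to be sophisticated to be useful. The sketch below is one possible shape for it, not our actual review template: it sums unit price times expected units for each component and compares the result with a budget expressed as a fraction of expected revenue. All names, prices and the 25% ratio are hypothetical.

```python
def estimate_monthly_cogs(components):
    """Sum unit_price * expected_units over (name, unit_price, expected_units)."""
    return sum(unit_price * units for _name, unit_price, units in components)


def within_budget(estimated_cogs, expected_revenue, max_cogs_ratio):
    """True if the estimate stays under the allowed share of expected revenue."""
    return estimated_cogs <= expected_revenue * max_cogs_ratio


# Hypothetical design for a new feature (illustrative prices and volumes).
design = [
    ("compute_nodes", 150.0, 4),     # price per node per month, node count
    ("database", 400.0, 1),          # managed database plan
    ("log_storage_gb", 0.5, 200),    # price per GB per month, GB retained
]
estimate = estimate_monthly_cogs(design)
approved = within_budget(estimate, expected_revenue=5000.0, max_cogs_ratio=0.25)
print(estimate, approved)
```

Even a back-of-the-envelope estimate like this gives reviewers something concrete to challenge before any code is written.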

Architects debating the cost aspects of design with passion is a sight to behold and music to the ears. Especially if you have gone through the grind of controlling the costs of an existing service in production with active customers.

This brings us to the end of what I had to share, and I think that's quite a lot to digest and assimilate.

I will end the post with an insightful comment one of our execs made [paraphrased]:

"Optimizing costs is being respectful of the customer because costs get passed to the customer eventually."

~S~

#workstories #cloudcost


Phil Wiffen

People and Engineering advocate | Engineering Manager

5 months ago

Some great stuff in here. I'm going to share this with our engineering leadership team. Thanks for sharing, Subu
