The Flawed Premise of Cost Optimisation
As more organisations move their IT workloads to the Cloud, they soon discover that public Cloud is, in fact, not inherently cheaper than self-hosting. Whilst it may be simpler, and probably more reliable, there is also an increased potential for waste and inefficiency. Even when it is not initially obvious, the trend over time is for costs to increase unless they are actively managed.
The initial response to increased costs is typically a direct approach of targeting the most expensive workloads for reduced spend. However, such an approach does not necessarily consider other factors at play, such as resource (under-)utilisation, premium services, and ultimately the "value for money" proposition. Often the more expensive services are either niche or priced according to the value they can potentially provide, so whilst they may appear more expensive on the surface they can actually result in a cheaper solution overall. A managed database service, for example, may cost more per hour than a self-managed equivalent, yet remove the patching, backup and failover effort that would otherwise have to be paid for elsewhere. Focusing on monthly spend alone, without considering the overall impact, does not tell the full story.
Architectural Efficiency
The premise that focusing on the biggest costs will save the most money does not always lead to the best strategy. A more effective cost optimisation approach is to constantly challenge the efficiency of architectural design. There is often a tendency to design Cloud architectures in the same way as self-hosted environments. Provisioning multiple segregated non-production environments is common (e.g. dev, sit, svt, uat), with many of the resources underutilised most of the time. Whilst some savings can be achieved by simply shutting down environments when not in use, it is worth questioning whether this is the most efficient arrangement of resources in a Cloud environment. Cloud-native architectures are more likely to consider elasticity of resources as a key principle of design and avoid static environments wherever possible.
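As an illustration of the simpler stop-when-idle tactic, the sketch below stops running EC2 instances that carry a hypothetical Environment tag marking them as non-production. The tag values, region and trigger mechanism are assumptions made for the example, not a recommended implementation.

```python
# Minimal sketch: stop tagged non-production EC2 instances outside business hours.
# Assumes instances carry an "Environment" tag; tag values and region are illustrative.
import boto3

NON_PROD_ENVIRONMENTS = ["dev", "sit", "svt", "uat"]

def stop_non_prod_instances(region="ap-southeast-2"):
    ec2 = boto3.client("ec2", region_name=region)

    # Find running instances whose Environment tag marks them as non-production.
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Environment", "Values": NON_PROD_ENVIRONMENTS},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )

    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]

    if instance_ids:
        # Stopping (rather than terminating) keeps the environment available for the next working day.
        ec2.stop_instances(InstanceIds=instance_ids)

    return instance_ids

if __name__ == "__main__":
    stopped = stop_non_prod_instances()
    print(f"Stopped {len(stopped)} non-production instances")
```

In practice something like this would run on a schedule (a scheduled Lambda function, for instance), but the broader point stands: an architecture designed around elasticity would avoid needing most of these static environments in the first place.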
Resource Optimisation
Because of the focus on minimising cost, we often measure the extent of the problem (and the potential gains or losses) in financial terms: billing and reporting are based on overall spend as a function of elapsed time. When the conversation changes to focus on efficiency, these fiscal reports no longer provide the best measurement of improvement over time. As many Cloud services are billed according to the time utilised (or requests, data transferred, etc.), it makes more sense to report on total resource-hours and use this as a measure of efficiency. A similar approach can be applied to API requests, data transfer or other usage metrics that contribute to the final costs. Some Cloud providers can generate reports based on these metrics, and even raise alerts when expected thresholds are exceeded.
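As a rough sketch of what usage-based reporting can look like, the example below queries the AWS Cost Explorer API for monthly UsageQuantity grouped by service, rather than dollars spent. The date range and grouping are illustrative assumptions, and in practice the numbers are more meaningful when filtered to a specific usage type, since UsageQuantity aggregates mixed units.

```python
# Minimal sketch: report monthly usage quantities (e.g. instance-hours) per service
# from the AWS Cost Explorer API, rather than dollars spent.
# The date range and grouping are illustrative assumptions.
import boto3

def monthly_usage_by_service(start="2024-01-01", end="2024-07-01"):
    ce = boto3.client("ce")  # Cost Explorer

    response = ce.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="MONTHLY",
        Metrics=["UsageQuantity"],  # usage, not spend
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )

    # Flatten the response into (month, service, usage) rows for trend reporting.
    rows = []
    for period in response["ResultsByTime"]:
        month = period["TimePeriod"]["Start"]
        for group in period["Groups"]:
            service = group["Keys"][0]
            usage = float(group["Metrics"]["UsageQuantity"]["Amount"])
            rows.append((month, service, usage))
    return rows

if __name__ == "__main__":
    for month, service, usage in monthly_usage_by_service():
        print(f"{month}  {service}: {usage:.1f}")
```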
By focusing primarily on cost optimisation, we are at risk of hiding much of the detail required to improve the efficiency of architectural design and resource utilisation. Ultimately, if the resource-hours, API requests, data transfers and so on are reduced over time, you know the architecture is becoming more efficient, and as a side-effect the cost associated with those resources will also fall. But perhaps more important than the method of optimisation is the realisation that this is not an occasional activity, but something that needs to be measured and proactively managed all the time.
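To make that ongoing measurement concrete, the small sketch below tracks total resource-hours month over month and flags any month where usage grows beyond a tolerance. The sample figures and the 5% tolerance are purely illustrative; the same check works just as well for API requests or data transferred.

```python
# Minimal sketch: track resource-hours month over month and flag regressions.
# The sample data and the 5% tolerance are illustrative assumptions.
SAMPLE_USAGE = {
    "2024-01": 7200.0,  # total resource-hours for the month
    "2024-02": 6900.0,
    "2024-03": 7400.0,
}

def flag_regressions(usage_by_month, tolerance=0.05):
    """Return months where resource-hours grew by more than the tolerance."""
    months = sorted(usage_by_month)
    regressions = []
    for previous, current in zip(months, months[1:]):
        change = (usage_by_month[current] - usage_by_month[previous]) / usage_by_month[previous]
        if change > tolerance:
            regressions.append((current, change))
    return regressions

if __name__ == "__main__":
    for month, change in flag_regressions(SAMPLE_USAGE):
        print(f"{month}: resource-hours up {change:.0%} on the previous month")
```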