Cloud Cost Management, Optimization & Savings Strategies
Brandon Pfeffer, CMA
Strategic Finance | Corporate Finance | Operations | FP&A | M&A | Financial Modeling | Strategic Planning | Treasury | Start-ups | Private Equity | Budgeting | Cloud FinOps | Analytics | Pricing | [email protected]
Introduction
Companies operating in the cloud face a constant challenge of trying to control cloud cost growth while maintaining optimal levels of operational performance for internal applications and customers. The goal of cloud cost management is to align costs with actual needs without compromising on service quality or performance, typically by limiting expenses such as overprovisioned resources, unused instances, or inefficient architecture. It is a balancing act between keeping costs down and providing the appropriate cloud resources to maintain peak performance, fuel growth, and ensure compliance and data security.
A crucial benefit of cloud computing is the ability to add servers, storage, and networking capacity quickly and easily in response to usage demands. Cloud cost optimization helps companies control cloud costs and improve budgeting, forecasting, and IT performance. Best practices for cloud cost optimization include setting strict budgets and using automated tools to identify and adjust cloud resources in real time. A variety of strategies can be used to reduce cloud costs, and they are best implemented gradually and in conjunction with each other. (Please see my other article on Cloud Data Transfer & Storage Cost Reduction Strategies.)
The 6 Rs of Cloud Migration
To better understand cloud cost savings strategies, it is important to understand how applications are migrated to the cloud in the first place. During the cloud migration process, cloud operations teams need to evaluate various options and decide on the most effective method given operational, technical, engineering, design, time, financial, and personnel constraints. There are several major ways to migrate applications to the cloud, commonly referred to as the six Rs of cloud migration, which can be summarized as follows:
Cloud Costs Saving Strategies
When cloud resources are first migrated and deployed, their capacity requirements are often not fully known or well understood. This uncertainty leads companies to overprovision CPU, storage, networking, and other cloud resource capacity to ensure there will not be any performance issues. As a company's cloud assets operate over time, Cloud Ops teams need to monitor their cloud resources and adjust them to improve operational efficiency and reduce costs. Changes in customer behavior can also result in reduced capacity requirements. Cloud cost management best practices have historically focused on finding every opportunity to cut costs. Today the focus is on optimizing cloud usage to minimize costs and maximize returns. Below are twelve cloud cost saving strategies that Cloud Ops and FinOps teams can implement to help reduce costs and optimize resources long-term:
1) Re-Evaluate Pricing Plan Options
Cloud pricing has become increasingly complicated, which can cause companies to inadvertently overspend on unnecessary resources. Review pricing and billing information for anomalies. Companies should continuously re-evaluate their pricing plan options to reduce their cloud spend on both a short-term and a long-term basis. Some of the more common practices include:
2) Orphaned Resources
When a virtual machine is deleted, secondary cloud resources that were attached to it, such as storage drives, network interfaces, and public IPs, are often left in place. These are called orphaned resources: the primary resource they were attached to is no longer active, so they can no longer be used. However, the company is still charged for them. A Cloud Ops team needs to audit, identify, and delete orphaned resources on a regular basis by looking for areas with very stable costs that use dated resources.
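As a rough illustration of the audit described above, the sketch below scans an exported resource inventory for items whose parent VM no longer exists. The `Resource` record, its field names, and the sample data are hypothetical; a real audit would pull the inventory from the provider's own tooling or billing export.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical inventory record for a secondary cloud resource."""
    resource_id: str
    kind: str                   # e.g. "disk", "network_interface", "public_ip"
    attached_to: Optional[str]  # ID of the parent VM, or None if detached
    monthly_cost: float


def find_orphans(resources, active_vm_ids):
    """Return resources that are detached or whose parent VM is not active."""
    return [
        r for r in resources
        if r.attached_to is None or r.attached_to not in active_vm_ids
    ]


# Illustrative data only: "vm-a" is the single VM still running.
inventory = [
    Resource("disk-1", "disk", "vm-a", 12.0),
    Resource("disk-2", "disk", "vm-deleted", 40.0),
    Resource("ip-1", "public_ip", None, 3.5),
    Resource("nic-1", "network_interface", "vm-a", 0.0),
]

orphans = find_orphans(inventory, active_vm_ids={"vm-a"})
wasted = sum(r.monthly_cost for r in orphans)

for r in orphans:
    print(f"{r.resource_id} ({r.kind}): {r.monthly_cost:.2f} per month")
print(f"Estimated monthly waste: {wasted:.2f}")
```

The output of a scan like this is best treated as a review list rather than a delete list, since a detached disk may still hold data someone needs.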
The big three cloud providers have native tools to help identify orphaned resources including the following (2):
3) Virtual Machine (VM) Version Upgrades
Every cloud hosting provider regularly upgrades the specifications of its data centers and cloud infrastructure to use the latest technology, including newer generations of CPU, storage, and RAM offerings. During an upgrade, firms usually keep older versions of cloud assets running on virtual machines to avoid disrupting customers. With virtual machines, a Cloud Ops team can upgrade to the latest generation if the VM specifications and the workloads can support the change. Newer generation virtual machines offer faster CPUs with more efficient chips that use less energy and offer more bandwidth. There are a few things to consider before upgrading versions (2):
4) Usage Reduction by Resizing (Rightsizing)
Adjust resource allocation by continuously evaluating whether allocated resources are fully utilized. Overprovisioning leads to unnecessary costs. Monitor resource utilization using cloud-native tools like AWS CloudWatch or Azure Monitor, or use third-party software packages such as CloudZero, IBM Turbonomic, CloudHealth, or Apptio Cloudability to monitor resource usage and identify underutilized instances (e.g., oversized VMs or idle databases). To resize a resource, you need a thorough understanding of how much of the resource is utilized. Cloud Ops teams need visibility into cloud resources, including CPU usage, memory utilization, network throughput, and storage utilization. For instance, virtual machines need to be sized according to what their workloads actually require. There are three general rightsizing strategies that Cloud Ops teams can utilize:
The more common rightsizing mistakes include the following (1):
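The utilization review described in this section can be sketched in a few lines. The function below is a minimal example, assuming CPU and memory utilization samples (in percent) have already been exported from a monitoring tool; the function name and the 20% and 80% thresholds are illustrative assumptions, not recommended values.

```python
from statistics import mean


def rightsizing_action(cpu_samples, mem_samples, low=20.0, high=80.0):
    """Suggest an action from CPU and memory utilization samples (percent).

    The thresholds are hypothetical defaults chosen for illustration.
    """
    if not cpu_samples or not mem_samples:
        return "insufficient-data"

    # Peak of either metric: a downsize is only suggested when even the
    # busiest observed moment stays below the low threshold.
    peak = max(max(cpu_samples), max(mem_samples))
    # Sustained load: the higher of the two averages.
    sustained = max(mean(cpu_samples), mean(mem_samples))

    if peak < low:
        return "downsize"
    if sustained > high:
        return "upsize"
    return "keep"


print(rightsizing_action([5, 8, 12], [10, 15, 18]))     # downsize
print(rightsizing_action([85, 90, 95], [40, 50, 60]))   # upsize
print(rightsizing_action([10, 30, 70], [20, 25, 30]))   # keep
```

Checking the peak as well as the average is a deliberate choice here: an instance that idles most of the day but spikes to 70% CPU is left alone rather than flagged for downsizing.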
5) Usage Reduction by Redesigning Underlying Software
The most complex method of usage reduction is to redesign the services themselves. Having company engineering teams modify the way software is deployed, rewrite applications, or even change the software altogether can help you take advantage of cloud native offerings.
6) Scaling
Scaling adjusts your computing capacity to match your workload's requirements, based on factors like CPU, memory, and network bandwidth. With autoscaling, cloud resources are added when your workload spikes and removed again when demand falls, which helps reduce costs during off-peak times. Because the adjustment is automated, the savings require no manual intervention. One of the biggest advantages of operating in the cloud is the ability to rapidly scale up computing resources as needed. However, this ability can be costly, so companies need to balance operational needs against what they can afford to spend.
There are two different methods for scaling (2):
1) Horizontal scaling, also called scaling in/out, is the process of adding more VMs to a pool that executes the same processes or runs the same applications or workloads, distributing the work among the nodes in the pool. Horizontal scaling is a great way to reduce waste, as additional resources are added only when they are needed. After the period of high demand is over, the additional VMs can be removed from the pool to reduce costs.
2) Vertical scaling, also called scaling up/down, consists of upgrading or downgrading the compute specifications of the virtual machine. Vertical scaling is often better for traditional or legacy solutions, where one server does the heavy lifting for web applications, databases, and similar workloads.
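To make the horizontal case concrete, here is a minimal sketch of a proportional scaling rule: the pool size is multiplied by the ratio of observed to target utilization and rounded up. This resembles the rule Kubernetes documents for its Horizontal Pod Autoscaler, but the function name, parameters, and limits below are assumptions made for illustration.

```python
import math


def desired_instances(current, observed_util, target_util, min_n=1, max_n=10):
    """Return the pool size that would bring utilization back to the target.

    current       -- number of VMs currently in the pool
    observed_util -- average utilization across the pool (percent)
    target_util   -- utilization the pool should run at (percent)
    min_n, max_n  -- hypothetical floor and ceiling that cap the result
    """
    if current < 1 or target_util <= 0:
        raise ValueError("current must be >= 1 and target_util must be > 0")
    raw = math.ceil(current * observed_util / target_util)
    return max(min_n, min(max_n, raw))


print(desired_instances(4, 90, 60))    # scale out: 4 -> 6
print(desired_instances(4, 20, 60))    # scale in:  4 -> 2
print(desired_instances(4, 300, 60))   # capped at max_n: 10
```

The `max_n` ceiling is where the balance between operational needs and affordable spend shows up in practice: it bounds the cost of a demand spike.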
7) Family Standardization
As a firm's cloud footprint grows, it sometimes accumulates a large collection of diverse CPU and VM types in its environments, which can lead to operational and cost inefficiencies. VM and CPU family standardization is important because having common VM families allows for greater savings when using Reserved Instances and Savings Plans. Companies should establish VM family standards and enforce them across their cloud platform providers to ensure operational consistency.
8) Shut Down Idle Resources & Power Scheduling
Non-production environments (such as development and testing) are often only required during working hours. Cloud-native tools like AWS Instance Scheduler, or custom scripts, can be used to automatically shut down resources during off-hours. Cloud costs for CPUs and virtual machines depend on how long they run and the rate charged while they run. Cloud Ops teams should use automated scheduling software that turns off virtual machines in the evenings and on weekends when they are not being used, which can result in substantial savings.
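The arithmetic behind power scheduling is simple enough to sketch. The example below compares an always-on VM with one that runs 12 hours a day on weekdays; the hourly rate and the schedule are hypothetical, and the function covers compute charges only.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def scheduled_savings(hourly_rate, hours_per_day, days_per_week):
    """Compare weekly compute cost for an always-on VM and a scheduled one.

    Returns (always_on_cost, scheduled_cost, fraction_saved).
    """
    on_hours = hours_per_day * days_per_week
    always_on = hourly_rate * HOURS_PER_WEEK
    scheduled = hourly_rate * on_hours
    return always_on, scheduled, 1 - on_hours / HOURS_PER_WEEK


# Hypothetical dev VM billed at 0.10 per hour, needed 12 hours on weekdays.
always_on, scheduled, saved = scheduled_savings(0.10, 12, 5)
print(f"Always on: {always_on:.2f} per week")
print(f"Scheduled: {scheduled:.2f} per week")
print(f"Saved:     {saved:.0%}")
```

Running 60 of the 168 hours in a week cuts the compute bill by roughly 64%. Storage attached to a stopped VM is typically still billed, so the total saving on the environment will be somewhat lower.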
9) Data Retention Policies
Cloud storage is theoretically limitless, which makes it easy to store data forever. Before the cloud, companies running their own data centers usually treated available disk space as a constant constraint and had rigorous data retention policies in place. Companies operating in the cloud need active data retention policies that are routinely updated to reflect changes in customer and internal data storage needs.
10) Utilize Storage Tier That Matches Data Needs
By default, most cloud providers store data in the standard class, also called hot storage. Standard storage is usually the most expensive storage class because data stored in it can be rapidly retrieved on demand. If a company's data is not needed as quickly, moving it to a lower tier with slower retrieval can result in significant savings. Tiered storage solutions use cost-effective storage classes based on data access patterns (e.g., AWS S3 offers Standard, Infrequent Access, and Glacier tiers). One must be careful, though, not to move critical data to a tier where slow retrieval would have an operational impact. Cloud hosting providers also charge for moving data from one tier to the next and for data that spends only a brief period in a tier. Companies should utilize specialized software that moves and stores data based on pre-defined rules.
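A pre-defined tiering rule of the kind mentioned above can be as simple as a lookup on how long an object has gone unaccessed. The sketch below is illustrative only: the tier names and the 30-day and 180-day thresholds are assumptions, and it does not model transition fees or minimum storage durations.

```python
from datetime import date

# Hypothetical rules: (minimum idle days, tier), checked coldest first.
TIER_RULES = [
    (180, "archive"),
    (30, "infrequent_access"),
    (0, "standard"),
]


def choose_tier(last_accessed, today, rules=TIER_RULES):
    """Pick the coldest tier whose idle-time threshold the object meets."""
    idle_days = (today - last_accessed).days
    for min_idle_days, tier in rules:
        if idle_days >= min_idle_days:
            return tier
    # Fallback for odd inputs such as a last-access date in the future.
    return rules[-1][1]


today = date(2025, 1, 1)
print(choose_tier(date(2024, 12, 25), today))  # standard
print(choose_tier(date(2024, 11, 1), today))   # infrequent_access
print(choose_tier(date(2024, 1, 1), today))    # archive
```

Because providers charge for transitions and for short stays in a tier, thresholds like these are usually set conservatively so that data does not bounce between tiers.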
11) Optimize Network Routes
Data transfers between regions, cloud providers, or out of the cloud (egress) can be expensive. Cloud Ops teams should consider using the same cloud region for interconnected services or optimizing the architecture to reduce data movement (1):
12) Use Containers and Kubernetes
Containerization with services like Amazon ECS, Azure Kubernetes Service, or Google Kubernetes Engine allows for denser resource utilization compared to VMs, resulting in cost savings. Containers decouple applications from the underlying host infrastructure. This makes deployment in different cloud or OS environments easier and cheaper over the long term. Each node in a Kubernetes cluster runs the containers that form the pods assigned to that node. Containers in a pod are co-located and co-scheduled to run on the same node. The benefits of containers include:
Kubernetes automates the operational tasks of container management and includes built-in commands for deploying applications, rolling out changes, scaling applications up and down to fit changing needs, and monitoring them, which makes applications easier to manage. Kubernetes services let you grow without needing to rearchitect your infrastructure. Kubernetes saves time and money for Cloud Ops teams.
Conclusion
Cloud Ops teams need to constantly monitor their cloud resources and make adjustments to improve operational efficiency and reduce costs. A variety of cloud cost saving strategies can be used in conjunction with each other to help companies rein in ever-increasing cloud spend. The best companies actively monitor their cloud resources, proactively look for opportunities to adjust their primary and secondary cloud resources, and utilize a variety of the strategies mentioned above to reduce their overall cloud expenditure.