Saving a Million Dollars on AWS With ChatGPT o1
Written by: Justin Hunter
Like kids playing with matches, we just lit it all on fire. We were burning money as we built and as we scaled, never considering whether we could be doing it better or cheaper or more efficiently. And, like every irresponsible young child learns when it comes to fire, what seems fun can quickly turn painful.
It didn’t always used to be this way. Prior to our migration to AWS, we used Digital Ocean. Digital Ocean was significantly cheaper than AWS in many areas, even if we didn’t have all the bells and whistles and automations we were able to get with AWS. But, as we grew, we felt AWS better prepared us for the future and the scale we would need to handle (we were right, by the way). Unfortunately, an unoptimized AWS account can rack up charges faster than Kevin McCallister at Christmas.
Before we knew it, our AWS bill was running north of $2 million per year. For a startup of any size, that’s a lot. We were fortunate, though. We had a team with the technical chops and deep understanding of AWS to dive into the optimization possibilities in search of cost savings that wouldn’t impact performance. Performance and reliability is something we refuse to sacrifice, but we were confident that we could maintain every bit of the uptime, scalability, and speed our customers had come to expect while saving significantly.
Snip Snip
The team got to work and began reducing overly provisioned instances, cutting services we weren’t using, eliminating duplicate tooling, and generally cutting costs with the surgical precision of, well, a really good surgeon (sorry, I don’t know any famous surgeons).
Cost optimizations are an ongoing battle, but when the brunt of the work was done, the team had saved us a little over $800,000 per year. Nearly a million dollars in savings. That’s incredible. You can see the savings as they started to take effect (note: we have credits applied every third month so it makes the chart look odd).
But, this article isn’t about what we did to optimize costs.
This article is about the teams that don’t have AWS experts on staff and don’t have the funds to pay AWS consultants to help them optimize services and reduce bills. This article is about using the tools available to achieve similar results.
This article is about using our newest best friend: the o1 model from OpenAI.
Preparing to Prompt
The goal is to get recommendations from ChatGPT that will reasonably get a team similar results to what we achieved internally. The way we will measure this is by having our team review the suggestions from ChatGPT to make sure they are:
The fact that we had gone through this process without AI assistance means we can analyze the suggestions with the fresh experience top of mind and hindsight at our back.
This task is not trivial. It’s not something GPT-4 could accomplish. It is uniquely suited for the new reasoning model OpenAI released. Formerly called Strawberry, o1 uses logical chains to reason before responding to prompts.
However, o1 is not magic. It will still give bad answers to bad prompts. If you were to simply tell it your goal of saving a million dollars a year on your AWS bill, you’d get some generic suggestions that might satisfy criteria 3 above and that’s it. The model will need as much detail as you can provide and, when it comes to engineering architecture, that can get complicated.
In Pinata’s case, as with many other startups, we have many moving pieces. It’s not a single codebase powering everything we do. So, feeding ChatGPT our code was probably not feasible. If your architecture is simpler and a single codebase does run everything, experimenting with providing the full codebase may provide even better results.
However, as we experienced, the codebase was not necessary. Instead, we used a detailed architectural diagram. I’m providing the diagram we used below.
We also provided ChatGPT our most recent AWS invoices with detailed line items included. Those two things were enough, with some gentle prompting, to generate the results we were looking for.
But, I’m getting ahead of myself. First, we need be able to provide the o1 model this information. At the time of this writing, the o1 model does not support uploaded material. This is where I initially got stuck until our CEO helpfully nudged me in the right direction.
Using the GPT-4o model in one window, I uploaded the architecture diagram with the following prompt:
Please convert this into a highly detailed text representation of the diagram. As much details as possible. Don't leave anything out.
I’m not going to paste the response here because it was incredibly detailed. Perfect for using with the o1 model. Next, I did the same thing with our AWS invoices. Armed with the detailed text version of the architecture and text representation of our invoices, it was time to prompt o1.
Making The Model Think
As with every other AI model since ChatGPT was first released, the prompt matters. In the case of o1, the prompt is especially important because it’s what will generate the logical reasoning patterns used by the model.
In our experiments, there wasn’t a single prompt that got us exactly what we wanted. Instead, it took a conversation. The first prompt ended up being pretty simple:
Please analyze the following architecture description and the following AWS invoice and tell us if we are fully optimizing our AWS costs or if we can save more money.
This prompt on its own wasn’t going to satisfy our original criteria, but it was priming the engine. This was setting the model up to know how to think about the problem. We knew the response was going to be relatively generic and would be an affirmative yes.
After the initial response, the next prompt was:
Where are the inefficiencies in the architecture that lead to highest spend? And how can we reduce those inefficiencies?
Again, this is not going to give us specifics, but the response will help target the areas to hone in on. It also sets the stage for the final prompt which will provide the details needed to execute. Here’s the prompt I used. You should adjust to match your actual spend and expectations.
Our current spend is about $2.4 million. How can we specifically improve efficiencies and optimize our way into saving $1 million per year without impacting performance and availability?
The model thought for 32 seconds on this one, the longest of any of the prompts. This was the question that ultimately satisfied all of the requirements I laid out above. The response was specific, actionable, realistic, and seemed likely to product savings.
I’m going to paste the response with dollar totals scrubbed because it helps inform the final part of this experiment.
The Response
Assessing the Potential for Saving $1 Million Per Year in AWS Costs
Based on your current annual AWS spend of approximately $2.4 million, achieving savings of $1 million per year—a reduction of about 42**%**—is an ambitious goal. However, with a comprehensive optimization strategy targeting your primary cost drivers, it's possible to realize significant savings without impacting performance and availability.
Potential Areas for Significant Cost Savings
Amazon EC2 and Kubernetes Clusters
Strategies:
AWS Data Transfer and Amazon CloudFront
Strategies:
Amazon S3 Storage
Strategies:
Amazon RDS
Strategies:
Amazon MSK and Amazon DynamoDB
领英推荐
Strategies:
AWS CloudTrail and Amazon CloudWatch
Strategies:
AWS Cloud WAN and Networking Costs
Strategies:
Other Services (EFS, Lambda, etc.)
Strategies:
Estimated Total Potential Savings
By implementing these strategies, you could potentially achieve:
This estimate represents a conservative projection based on achievable optimizations without compromising performance or availability.
Approaches to Achieve Additional Savings
To reach your goal of saving $1 million per year, consider the following additional measures:
Feasibility and Impact Analysis
Conclusion
While saving $1 million per year is challenging, it is not unattainable. By aggressively targeting your largest cost centers and embracing both technical and organizational strategies, you can move significantly toward that goal. The key is to balance cost optimization with maintaining the high standards of performance and availability your users expect.
Next Steps
Reviewing The Suggestions
The answer was incredibly detailed, far more so than other AI models we experimented with. The question remaining was, how accurate and useful were the suggestions? To answer this, we took the results to the team that had actually saved us nearly a million dollars.
You’ll notice that some of the suggestions in the model’s response apply to non-AWS infrastructure. We wanted to keep things limited to AWS, so we honed in on those suggestions only. The team reviewed each suggestion then summarized everything with the following comparison to what we actually did.
Amazon EC2Right-Sizing Instances: ?
Leverage Spot Instances: ?
Implement Auto Scaling: ?
AWS Data TransferUse Amazon CloudFront: ?
Implement Lifecycle Policies: ?
Amazon CloudFrontOptimize Cache Settings: ?
Amazon RDSEnable Storage Auto-Scaling: ?
Evaluate Aurora Serverless: ? AWS Cloud WANUse VPC Endpoints: ?
Amazon ECS for KubernetesRight-Size Node Instances: ?
Optimize Cluster Scaling: ?
Use Spot Instances with EKS: ?
Amazon MSKRight-Size Clusters: ?
AWS CloudTrailCustomize Logging Levels: ?
Amazon DynamoDBUse On-Demand Capacity Mode: ?
Implement Auto Scaling: ?
Amazon CloudWatchAdjust Metric Collection: ?
Review Linked Accounts and Consolidate ResourcesConsolidate Accounts Where Possible: ?
Implement Automation and Best PracticesInfrastructure as Code (IaC): ?
As you can see, ChatGPT’s o1 model suggested many of the exact things we had done to minimize costs. It suggested a few things that we had not yet done or could not do for various reasons. And, of course, there were a few things we did that ChatGPT did not suggest. But by and large, the suggestions were spot on.
The team was impressed by the suggestions, and had we not already had a team well versed on the ins and outs of AWS, the o1 model response would have been a great blueprint for us to follow.
Conclusion
AWS is expensive. AWS consultants are expensive. If you don’t know what you’re doing, your costs can balloon. However, even if you don’t have a dedicated team to help you manage AWS costs, OpenAI’s newest model, o1, is highly capable and can use reasoning to generate a savings plan based on your actual architecture and billing invoices.
We mess around with AWS so you don’t have to. If you want to build fast applications with the simplest file uploads and retrievals API, sign up for Pinata today .
Republished from here: https://bit.ly/3B6NGJA