Pre-aggregating report data with OpenOps
Erik Norman
CEO | FinOps Lead | Certified FinOps practitioner | TPM | ITAM Forum member | Cloud migration specialist
Check out this workflow. Can you guess what it does?
This is a sample workflow and doesn’t look like much, but the closer you look, the more you realise how powerful it is.
Why pre-aggregate data?
It’s quite simple. Let’s say you have a dozen teams or cost centers, and you want to see three months’ worth of daily trends by team and service category (Compute, Storage, etc.). Let’s assume you only use one cloud provider and get about 200K Cost and Usage Report (CUR) lines per day.
The resulting dataset for your report contains:
90 (days) × 12 (teams) × ~10-15 (service categories) ≈ between 11K and 16K rows
Now, the data you need to aggregate for this report amounts to the following:
90 (days) × 200,000 (CUR lines per day) = 18M rows
That is a reduction of well over 1,000:1. Monthly trends? Multiply by ~30. Yearly? Multiply again by 12!
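If you want to double-check the arithmetic, here it is as a few lines of Python:

```python
# Back-of-the-envelope check of the row counts above.
days = 90
teams = 12
categories_low, categories_high = 10, 15
cur_lines_per_day = 200_000

aggregated_low = days * teams * categories_low    # 10,800 rows
aggregated_high = days * teams * categories_high  # 16,200 rows
raw_rows = days * cur_lines_per_day               # 18,000,000 rows

print(f"aggregated: {aggregated_low:,} to {aggregated_high:,} rows")
print(f"raw:        {raw_rows:,} rows")
print(f"reduction:  ~{raw_rows // aggregated_high}:1 to ~{raw_rows // aggregated_low}:1")
```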
CUR data does not change
Cost and Usage Report data - like most other transactional data - does not change once created. Yet if you open a CUDOS dashboard or similar on QuickSight, you aggregate the same data over and over and over. That might work well for small data samples and for development purposes, but for any significant cloud spend the data volume grows quickly, and your dashboards and reports get slower and slower - and more expensive.
What if you analyzed which data your dashboards and reports actually need, and pre-aggregated all the relevant metrics? Then your reports and dashboards only have to read pre-digested stats and visualize them, going from “grab a coffee while the report builds” to “almost instant”.
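To make that concrete, here is a minimal sketch of the idea using pandas. The column names and values are invented for illustration; real CUR lines carry far more columns:

```python
import pandas as pd

# Hypothetical raw CUR-style lines: one row per resource per day.
raw = pd.DataFrame({
    "usage_date":       ["2024-05-01", "2024-05-01", "2024-05-01", "2024-05-02"],
    "team":             ["platform", "platform", "data", "data"],
    "service_category": ["Compute", "Storage", "Compute", "Compute"],
    "cost":             [12.40, 3.10, 8.75, 9.20],
})

# Pre-aggregate once: one row per day/team/category instead of one per CUR line.
daily = (
    raw.groupby(["usage_date", "team", "service_category"], as_index=False)["cost"]
       .sum()
)

# Dashboards now read `daily` (thousands of rows) instead of the raw
# lines (millions of rows), which is what makes them near-instant.
print(daily)
```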
Workflow breakdown
Let’s look at the image above again and walk through what this sample workflow does.
When was yesterday?
The first couple of steps get yesterday’s date and extract its parts: year, month, and day. This matters because, with most cloud providers, the data you receive today covers yesterday’s usage and cost.
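In plain Python, those first steps boil down to something like this:

```python
from datetime import date, timedelta

# Yesterday's date, split into the parts the rest of the workflow needs.
yesterday = date.today() - timedelta(days=1)
year, month, day = yesterday.year, yesterday.month, yesterday.day

print(yesterday.isoformat(), year, month, day)
```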
Cost data
Next, the workflow creates sample cost data for that date - in a production workflow it would, of course, pull the data from the cloud provider.
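A rough sketch of what that sample-data step might look like; the teams, categories, and cost range below are all made up for illustration:

```python
import random
from datetime import date

def make_sample_cost_data(day: date, n_rows: int = 1000) -> list[dict]:
    """Fabricate CUR-like line items for one day (illustration only;
    a production workflow would pull real data from the provider)."""
    teams = ["platform", "data", "web", "ml"]
    categories = ["Compute", "Storage", "Network", "Database"]
    return [
        {
            "usage_date": day.isoformat(),
            "team": random.choice(teams),
            "service_category": random.choice(categories),
            "cost": round(random.uniform(0.01, 50.0), 4),
        }
        for _ in range(n_rows)
    ]

sample = make_sample_cost_data(date(2024, 5, 1))
print(len(sample), sample[0])
```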
Retrieve aggregation settings
Now comes the magic: I’ve created a table with all the aggregation settings I want:
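The actual table lives in OpenOps and is only shown in the screenshot, so the shape below is an assumption: each row names an output dataset and the dimensions to group by.

```python
# Assumed shape of the aggregation-settings table (illustrative names).
AGGREGATIONS = [
    {"name": "daily_by_team",          "group_by": ["usage_date", "team"]},
    {"name": "daily_by_team_category", "group_by": ["usage_date", "team", "service_category"]},
    {"name": "daily_by_category",      "group_by": ["usage_date", "service_category"]},
]
```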
Aggregate and upload
For each aggregation setting, we aggregate the cost data accordingly and upload it to an S3 bucket.
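A minimal sketch of that step using pandas and boto3; the bucket name and key layout are assumptions, not what the workflow actually uses:

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")
BUCKET = "my-finops-aggregates"  # assumed bucket name

def aggregate_and_upload(raw: pd.DataFrame, setting: dict, day: str) -> None:
    """Apply one aggregation setting to the day's cost data and
    upload the result to S3 as CSV (key layout is assumed)."""
    out = raw.groupby(setting["group_by"], as_index=False)["cost"].sum()
    buf = io.StringIO()
    out.to_csv(buf, index=False)
    key = f"{setting['name']}/date={day}/part-0.csv"
    s3.put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue().encode("utf-8"))

# Usage, with `raw` and AGGREGATIONS from the sketches above:
# for setting in AGGREGATIONS:
#     aggregate_and_upload(raw, setting, day="2024-05-01")
```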
Why is this magic?
Need a new metric or report that requires new aggregation settings? Just add them to the aggregations table. The sample workflow doesn’t do this yet, but you could extend it to detect new aggregation settings and pre-aggregate historical data for them, while known settings only process yesterday’s data (see the sketch below).
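Purely as a sketch of that extension idea (again, not part of the sample workflow), the planning logic could look something like this:

```python
def plan_work(settings: list[dict], known_names: set[str],
              history_days: list[str], yesterday: str) -> list[tuple[str, str]]:
    """Decide which (setting, day) pairs to aggregate: new settings get a
    one-off backfill over history, known settings only process yesterday."""
    work: list[tuple[str, str]] = []
    for s in settings:
        if s["name"] in known_names:
            work.append((s["name"], yesterday))                 # incremental
        else:
            work.extend((s["name"], d) for d in history_days)   # backfill
    return work
```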
Then there is the obvious performance and cost benefit. No more sluggish dashboards, and with the cost reduction, the tool would pay for itself even if you went for the Enterprise version.
How long did it take me to build this workflow?
Less than two days, including the time required to create a full-fledged open-source repository with functions to aggregate data and generate sample data, plus documentation, a contribution guide, 100% test coverage, and so on.
Building the workflow itself took less than an hour, including the time spent figuring out how. You could, of course, go old-school and build your own custom ETL pipeline to do the same, but I have an inkling it would take you slightly longer.
Conclusion
What are you waiting for? Head over to OpenOps and get creative! It’s open source - still in beta - and you can self-host it for free! A SaaS version is coming soon, along with Premium and Enterprise features.