Pre-aggregating report data with OpenOps

Check out this workflow. Can you guess what it does?

OpenOps sample workflow

This is a sample workflow and doesn’t look like much, but the closer you look, the more you realise how powerful it is.

Why pre-aggregate data?

It’s quite simple. Let’s say you have a dozen teams or cost centers, and you want to see three months’ worth of daily trends by team and service category (Compute, Storage, etc.). Let’s assume you only use one cloud provider and get about 200K Cost and Usage Report (CUR) lines per day.

The resulting data for your report should have:

90 (days) × 12 (teams) × 10-15 (service categories) = roughly 11K to 16K rows

Now, the data you need to aggregate for this report amounts to the following:

90 (days) × 200,000 (CUR lines per day) = 18M rows

That is a reduction factor of well over 1,000:1. Monthly trends? Multiply by ~30. Yearly? Multiply again by 12!

CUR data does not change

Cost and Usage Report data - like most other transactional data - does not change once created. If you open a CUDOS dashboard (or similar) in QuickSight, you aggregate the same data over and over and over. This might work well for small data samples and for development purposes, but for any significant cloud spend the amount of data grows quickly. That means your dashboards and reports get slower and slower, and also more expensive.

What if you analyzed what data you need for your dashboards and reports, and pre-aggregated all the relevant metrics? Your reports and dashboards would then only need to read pre-digested stats and visualize them, going from “go for a coffee while the report is being created” to “almost instant”.
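To make the idea concrete, here is a minimal pandas sketch of pre-aggregation (my illustration, not the workflow’s actual code; the column names are assumptions, not real CUR field names):

```python
import pandas as pd

# One day of CUR-like line items. In reality this would be ~200K rows;
# the column names here are assumptions, not actual CUR field names.
raw = pd.DataFrame({
    "usage_date": ["2025-05-01"] * 4,
    "team": ["platform", "platform", "data", "data"],
    "service_category": ["Compute", "Storage", "Compute", "Storage"],
    "cost": [12.50, 3.20, 8.75, 1.10],
})

# Pre-aggregate: one row per (day, team, service category).
daily = (
    raw.groupby(["usage_date", "team", "service_category"], as_index=False)
       .agg(total_cost=("cost", "sum"))
)

# Dashboards read this small, pre-digested table instead of
# re-scanning every raw line item on each refresh.
print(daily)
```

Millions of raw line items collapse into one row per day, team, and service category - the 1,000:1-plus reduction calculated above.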

Workflow breakdown

Let’s look at the image above again and walk through what this sample workflow does.

When was yesterday?

The first couple of steps get yesterday’s date and extract its components: year, month, day. This is needed because, with most cloud providers, the usage and cost data you retrieve today is yesterday’s.
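In code terms, this step boils down to something like the following sketch (the workflow uses OpenOps’ built-in steps rather than Python, so this is just an illustration):

```python
from datetime import date, timedelta

# Today you typically receive yesterday's usage and cost data,
# so compute yesterday and split it into year, month, and day.
yesterday = date.today() - timedelta(days=1)
year, month, day = yesterday.year, yesterday.month, yesterday.day

print(f"Processing cost data for {year:04d}-{month:02d}-{day:02d}")
```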

Cost data

The workflow then creates sample cost data for the given date - in a production workflow it would obviously pull the data from the service provider.
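As a rough illustration of what that step might produce (the field names and row shape are my assumptions, not the actual sample-data format):

```python
import random
import pandas as pd

def make_sample_cost_data(usage_date: str, n_rows: int = 1000) -> pd.DataFrame:
    """Generate CUR-like sample rows for a single day (illustration only)."""
    teams = [f"team-{i:02d}" for i in range(12)]
    categories = ["Compute", "Storage", "Database", "Network"]
    return pd.DataFrame({
        "usage_date": [usage_date] * n_rows,
        "team": random.choices(teams, k=n_rows),
        "service_category": random.choices(categories, k=n_rows),
        "cost": [round(random.uniform(0.01, 50.0), 2) for _ in range(n_rows)],
    })

sample = make_sample_cost_data("2025-05-01")
```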

Retrieve aggregation settings

Now comes the magic: I’ve created a table with all the aggregation settings I want:

  • Finance: monthly and daily costs grouped by ‘CostCenter’
  • Operations: several combinations of hourly and daily costs with service category, environment, application, resource name, and so on
  • CUDOS: plenty of aggregations required for the CUDOS dashboard
  • Billing: monthly costs grouped by billing account

OpenOps Table with sample aggregation settings
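In code terms, each row of that table might look something like this (an illustrative structure of my own; the real OpenOps table columns may differ):

```python
# Each aggregation setting names a consumer, a time granularity,
# and the dimensions to group by (illustrative structure).
AGGREGATION_SETTINGS = [
    {"consumer": "Finance",    "granularity": "monthly", "group_by": ["CostCenter"]},
    {"consumer": "Finance",    "granularity": "daily",   "group_by": ["CostCenter"]},
    {"consumer": "Operations", "granularity": "hourly",
     "group_by": ["service_category", "environment", "application"]},
    {"consumer": "Billing",    "granularity": "monthly", "group_by": ["billing_account"]},
]
```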

Aggregate and upload

For each aggregation setting, we aggregate the cost data accordingly and upload it to an S3 bucket.
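A minimal sketch of that loop, assuming pandas for the aggregation and boto3 for the S3 upload (the bucket name and key layout are invented for illustration):

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")
BUCKET = "my-cost-aggregates"  # hypothetical bucket name

def aggregate_and_upload(df: pd.DataFrame, setting: dict, date_str: str) -> None:
    """Aggregate one day's cost data for one setting and upload it as CSV."""
    agg = (
        df.groupby(setting["group_by"], as_index=False)
          .agg(total_cost=("cost", "sum"))
    )
    key = f"aggregates/{setting['consumer']}/{setting['granularity']}/{date_str}.csv"
    buf = io.StringIO()
    agg.to_csv(buf, index=False)
    s3.put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue())

# One day of CUR-like rows and two settings, as in the table above.
day = pd.DataFrame({
    "CostCenter": ["cc-1", "cc-1", "cc-2"],
    "service_category": ["Compute", "Storage", "Compute"],
    "cost": [12.50, 3.20, 8.75],
})
settings = [
    {"consumer": "Finance",    "granularity": "daily", "group_by": ["CostCenter"]},
    {"consumer": "Operations", "granularity": "daily", "group_by": ["service_category"]},
]
for setting in settings:
    aggregate_and_upload(day, setting, "2025-05-01")
```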

Why is this magic?

Is there a new metric or report that needs new aggregation settings? Just add them to the aggregations table. The sample workflow doesn’t do this yet, but you can extend it to identify newly added aggregation settings and pre-aggregate old data for them, while known aggregation settings only consider yesterday’s data - see the sketch below.
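As a sketch of that extension (entirely hypothetical; `already_seen` would come from wherever you track which settings have been processed):

```python
from datetime import date, timedelta

def dates_to_process(setting_id: str, already_seen: set, history_days: int = 90):
    """New settings get a full historical backfill; known ones only yesterday."""
    yesterday = date.today() - timedelta(days=1)
    if setting_id in already_seen:
        return [yesterday]
    # Newly added setting: pre-aggregate the whole retention window once.
    return [yesterday - timedelta(days=d) for d in range(history_days)]
```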

Then there is the obvious performance and cost benefit. No more sluggish dashboards, and with the cost reduction, the tool would pay for itself even if you went for the Enterprise version.

How long did it take me to build this workflow?

Less than two days, including the time required to create a full-fledged open-source repository with functions to aggregate data and to create sample data, plus documentation, a contribution guide, 100% test coverage, etc.

The workflow itself took less than an hour to build, including the time spent figuring out how. You could, obviously, go old-school and create your own custom ETL pipeline to do the same, but I have an inkling it would take you slightly longer.

Conclusion

What are you waiting for? Head over to OpenOps and get creative! It’s open source - still in beta - and you can self-host it for free! There will be a SaaS version soon, as well as Premium and Enterprise features.

