How Not To Spend Half a Million Dollars on Logs
A month ago, we started playing with a fun data set to push the limits of our product some more. We decided to dial it up to 11 and indexed a data set of 100 billion synthetic AWS CloudTrail log events with a cumulative size of 250TB.
Why 250TB? This is the data set size at which standard log tools start to cost more than half a million dollars per year. Ouch. We wanted to see by how much Scanner could beat that price, how easy onboarding really was, and whether Scanner’s log search stayed fast at that scale.
To get more detailed pricing estimates for the standard log tools, let’s assume we ingest the 250TB data set over one year, which works out to roughly 700GB/day. Given this volume, here are some estimates of how much the standard tools would cost:
Azure Sentinel
IBM QRadar
Splunk Cloud
Datadog
Across the board, you’re looking at an annual log bill above half a million dollars.
How does Scanner’s cost compare? By indexing 250TB of logs into Scanner, we were able to answer this core question and others:
Cost: 80-90% less than standard tools.
AWS CloudTrail logs are a useful data source for security teams because they capture basically all of your organization’s AWS API calls. The challenge teams frequently face, though, is the extremely high volume. At enterprise scale, it’s fairly common to see CloudTrail log volume approach 1TB/day, and as mentioned above, that will likely cost at least half a million dollars per year to support in standard log tools.
During our indexing experiment, we validated that our infrastructure usage costs stayed extremely low. Here are some data points:
Throughput
Storage
Database
Compute
What is the upshot of this usage data? Scanner’s infra remained cost effective at a meaningfully large scale, especially the indexing compute that we’re being a little secretive about.
We determined that, depending on the features you need to enable and the scale of your detection rule set, Scanner’s customer-facing price can be comfortably in the range of $50k to $100k for this 250TB data set. In other words, 80-90% less than standard tools.
To me, this feels like the price that logs should always have had. A log tool should augment your team, not cost more than the team itself. It also seems right that a log tool should charge a handful of dimes per gigabyte, not a handful of dollars.
Onboarding ease: Trivial if your logs are in S3. A little bit of work if they aren’t.
For this 250TB synthetic CloudTrail data set, we created the same sort of file structure that AWS CloudTrail does natively. In particular, we created about 60 million S3 objects in an S3 bucket, with the key structure looking like this:
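We’re not reproducing the exact keys from the experiment here, but CloudTrail’s standard delivery layout, which we mimicked, looks roughly like this; the account ID, date, and suffix below are placeholders:

```python
# Placeholder account ID, region, and timestamp; the layout mirrors
# CloudTrail's documented delivery format rather than our exact keys.
account_id = "111122223333"
region = "us-east-1"

key = (
    f"AWSLogs/{account_id}/CloudTrail/{region}/2024/03/15/"
    f"{account_id}_CloudTrail_{region}_20240315T1200Z_a1b2c3d4.json.gz"
)
```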
We also tried onboarding this data set into a few other tools that have native S3 integrations, and we experienced varying degrees of success and pain:
Amazon Athena.
Snowflake
Onboarding logs into Scanner was about as easy as onboarding them into Athena, and far easier than getting them into Snowflake. Here are the steps we took in Scanner:
All together, setting up the integration and kicking it off took maybe 10 minutes, so for logs like CloudTrail that are already in S3, onboarding into Scanner is pretty much trivial.
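Most of those 10 minutes really come down to granting read-only access to the bucket, which is the heart of any S3-native integration. As a rough sketch of what that access amounts to in IAM terms (the bucket name is a placeholder, and this isn’t necessarily the exact policy the product generates):

```python
# Hypothetical bucket name; real setups may also scope access by prefix
# or grant it through a cross-account role rather than an inline policy.
READ_ONLY_S3_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-cloudtrail-logs/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::example-cloudtrail-logs",
        },
    ],
}
```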
If you don’t have logs in S3 already, you will need to do more work. Thankfully, many security-related data sources have built-in integrations that push logs to S3, so for those sources, onboarding onto Scanner is still easy. Here are some examples of log sources with built-in S3 export:
For logs that aren’t in S3 already, we recommend trying out Cribl and Vector.dev. Over time, we want to make it easier for Scanner to automatically pull from the most frequently used security log sources, which often expose export APIs. But for now, you will still need to do some work to get those logs into S3.
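If you want to bridge the gap yourself while evaluating those tools, the minimum viable shipper just batches events, compresses them, and writes date-partitioned objects. Here’s a minimal sketch with boto3; the bucket and prefix are hypothetical:

```python
import gzip
import json
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")


def ship_batch(events, bucket="example-security-logs", prefix="myapp"):
    """Write a batch of JSON events to S3 as gzipped JSON lines, partitioned by date."""
    now = datetime.now(timezone.utc)
    key = f"{prefix}/{now:%Y/%m/%d}/{now:%Y%m%dT%H%M%S}Z-events.json.gz"
    body = gzip.compress("\n".join(json.dumps(e) for e in events).encode("utf-8"))
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return key
```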
Speed: We started playing with our logs in a whole new way
When it comes to speed, it’s probably best to let the product speak for itself. In this video, we used Scanner to query the 250TB synthetic CloudTrail data set, and for the sake of comparison, we also ran queries in Amazon Athena and Snowflake.
Why only test the data set in Scanner, Athena, and Snowflake, and not also in those standard tools mentioned earlier, i.e. Azure Sentinel, QRadar, Splunk, and Datadog? It’s pretty simple – do you have a few million dollars lying around to burn?
Here are some of the highlights from the video:
We ran a query in Athena to look for indicators of compromise on 250TB of logs (a rough sketch of this kind of query appears just after this list).
We ran a query on Snowflake to find the same indicators of compromise in a much smaller data set: only 15TB.
Then in Scanner, we ran several queries, like:
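For the curious, here’s roughly the shape of that Athena query from the first highlight, driven through boto3. The database, table, results location, and the IOC IP are all placeholders; the Scanner queries themselves are easiest to appreciate in the video.

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Placeholder database, table, results bucket, and IOC value.
IOC_QUERY = """
SELECT eventtime, eventname, useridentity.arn, sourceipaddress
FROM cloudtrail_logs
WHERE sourceipaddress = '198.51.100.7'
LIMIT 100
"""

run = athena.start_query_execution(
    QueryString=IOC_QUERY,
    QueryExecutionContext={"Database": "security_logs"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(run["QueryExecutionId"])  # poll get_query_execution() until it completes
```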
Since Scanner is so fast, we found ourselves playing with this data set in new ways.
Given that it’s actually feasible to look for indicators of compromise across 1 year of logs in Scanner, not just the last 90 days, we could explore the full data set without heavy up-front planning.
For example, doing a big historical log scan in Amazon Athena to find indicators of compromise would have probably required a few days of planning and maybe one week of execution, job babysitting, and fixing bugs.
In Scanner, scanning the full data set can be done in one sitting in an unplanned, ad-hoc way. While you’re in there, if something suspicious catches your eye, you can keep playing around with the data, investigate, and test hypotheses rapidly.
This is how 250TB should feel. Light as a feather.
Typically, once you start to generate close to 1TB of logs per day, managing logs starts to feel… well, heavy. The financial burden becomes serious, with annual costs reaching $500k to $1M. You have to twist your log pipelines into knots, filtering out data to keep costs under control, and you likely lose visibility into time ranges older than 90 days.
This is actually a pretty silly state of affairs, and it’s caused by the fact that standard log search tools still use an architecture designed for the on-premise era, running indexing clusters that couple storage and compute together on each machine. This is an order of magnitude more expensive to run and scale.
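The alternative is to decouple the two: leave the data in cheap object storage and bring compute to it only when a query actually runs. As a toy illustration of that shape (emphatically not Scanner’s engine, which builds an index rather than brute-force scanning):

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")


def scan_prefix(bucket, prefix, needle):
    """Toy decoupled scan: stateless, ephemeral workers read straight from S3."""
    pages = s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix)
    keys = [obj["Key"] for page in pages for obj in page.get("Contents", [])]

    def scan_one(key):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        if key.endswith(".gz"):
            body = gzip.decompress(body)
        return key if needle.encode() in body else None

    # Compute scales with the query, not with how much data sits in storage.
    with ThreadPoolExecutor(max_workers=32) as pool:
        return [key for key in pool.map(scan_one, keys) if key]
```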
By leveraging a better architecture with storage and compute decoupled, Scanner:
If you’re reading this and think that Scanner could help out with your logging projects, we’d love to chat and see if it works for your use cases. You can visit us at our website and reach out to book a demo. Thanks!