Announcing Scanner for Splunk: Lightning-fast threat hunting through your S3 logs directly from Splunk


We're excited to announce the release of our custom Splunk app, Scanner for Splunk, which makes it easy to leverage logs in S3 for advanced threat hunting and detection, all while staying entirely within the Splunk UI.

This app helps teams expand their visibility into historical logs and high-volume log sources that are stored only in S3 and not indexed by Splunk.

One interesting cost-reduction approach we've seen our users take is to move high-volume logs out of expensive Splunk ingestion and instead store them directly in S3. Then, they use Scanner to index the log files in-place in S3, and they use our custom Splunk app to perform fast threat hunting and detection on those logs directly from the Splunk UI. This can help reduce costs for these log sources by 80-90%.

Reduce blind spots with lightning-fast threat hunting

In our experience, a typical security team keeps somewhere between 2-5x as much log volume in S3 as in Splunk. For these teams, the Scanner app can increase detection coverage by 2-5x, sometimes dramatically reducing blind spots in high-volume log sources like AWS CloudTrail, Cloudflare, VPC flow logs, and others.

Retention in S3 with Scanner can be much longer than the typical 30-90 days in Splunk thanks to the low cost of object storage. It is common for our users to search through a year or more of logs in their S3 archives, helping them perform more exhaustive investigations and surface potential advanced persistent threats.

By leveraging serverless compute and a novel indexing system, Scanner executes large queries rapidly. For example, it takes only 1 second to return all hits from 10TB of log data.

How does it work?

Scanner custom commands support queries and visualizations


The Scanner for Splunk plugin provides new custom search commands. When one of these commands runs, the Splunk search head issues a request to the Scanner API to kick off a search query against the S3 data, and the results are returned to Splunk.

Since the Scanner custom search commands can be used as part of the Splunk query pipeline, users can transform Scanner results using Splunk commands, or join results from Scanner together with results from Splunk indexes.
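As a rough sketch of what this looks like in practice (the command name `scanner` and all field and lookup names below are illustrative, not the app's documented interface), a pipeline might pull matches from S3 and then post-process them with standard SPL:

```spl
| scanner query="eventName:ConsoleLogin AND errorCode:failed" earliest=-24h
| stats count AS failures BY userIdentity.arn, sourceIPAddress
| where failures > 5
| lookup known_vpn_ranges ip AS sourceIPAddress OUTPUT is_vpn
| where isnull(is_vpn)
```

Because the Scanner command emits ordinary Splunk results, everything after the first line is plain SPL: `stats`, `lookup`, `join`, and the rest of the pipeline behave just as they would on an indexed search.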


Queries can join Scanner and Splunk data together


Scanner's indexes are built from the log data in a customer's S3 buckets. The index files that Scanner generates are also stored in the customer's S3 buckets, so there is no vendor lock-in: our users control all of their data.

Since Scanner can analyze logs in their raw format, there is no need to kick off a new data engineering project (e.g., creating AWS Glue tables) whenever a new log source appears. Just point Scanner at the bucket, indicate which S3 key prefixes to index, and you are off to the races.


High performance

When a Scanner query executes, it launches serverless Lambda functions to traverse its index files at high speed, rapidly narrowing the search space. This gives teams rapid threat hunting capabilities even on petabyte-scale log sets in S3. Scanner queries complete in a few seconds on 100TB of data, for example, whereas queries in other S3 scanning tools like AWS Athena might take a few hours.

Here is some performance data comparing Scanner with another S3 scanning tool, AWS Athena. The data set is 250TB of CloudTrail logs in JSON format. In this example, Scanner and Athena are both querying for all activity from a specific AWS Access Key over varying time ranges. Scanner is hundreds of times faster on large time ranges.



Re-use existing Splunk content

Teams with extensive security content in Splunk, such as alerts and dashboards, can apply that content to new log sources in S3 after a few reasonably small tweaks.

Scanner's query language is roughly a subset of SPL (Splunk Search Processing Language), so it is fairly easy to adapt existing content to the log sources stored in S3.

The conversion process typically involves changing only the first part of the query, which performs the filtering and basic aggregations, and then piping the results to further Splunk commands for additional processing, reshaping, and joining with other data sources, such as indexes and lookup tables.
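For illustration (the field names, lookup name, and Scanner command syntax here are hypothetical), a Splunk-only detection such as:

```spl
index=cloudtrail eventName=DeleteTrail
| stats count BY userIdentity.arn
| lookup admin_users arn AS userIdentity.arn OUTPUT is_admin
| where isnull(is_admin)
```

might keep its back half unchanged and swap only the leading filter for a Scanner call:

```spl
| scanner query="eventName:DeleteTrail" earliest=-90d
| stats count BY userIdentity.arn
| lookup admin_users arn AS userIdentity.arn OUTPUT is_admin
| where isnull(is_admin)
```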

Scanner also provides out-of-the-box content for log sources that are commonly stored in S3, including AWS CloudTrail, Cloudflare, CrowdStrike FDR, and more.

Additionally, our team provides a concierge service to help our Splunk users update their existing content, making the transition as easy as possible.


Out-of-the-box security content available from Scanner


How Scanner compares to Splunk Federated Search for S3

Splunk's new federated S3 search also allows querying S3 logs, but Scanner for Splunk differs from it in a few ways:

  • Easier to write queries. No need to write SQL. Users can write ad-hoc queries naturally in Scanner's query language, which is roughly a subset of SPL.
  • Easier onboarding of new log sources. No need to create AWS Glue tables. Scanner analyzes log files in S3 in their raw format with minimal configuration needed.
  • All fields are indexed for fast search. Scanner uses a novel indexing technique that covers all fields in the log data, so search is fast no matter what fields a customer is using in their query.
  • Lower querying costs. We've heard from our users that long-running queries in Splunk's federated S3 search can use a large number of data scan units, which can become expensive. Scanner provides free compute units every month based on a customer's log volume, which typically covers all querying for our users. Any additional querying beyond that is charged at $2/TB, which is less than half of what other S3 scanning tools (like AWS Athena) charge.
  • Available in Splunk Enterprise, not just Splunk Cloud. Teams can install our custom app in their Splunk Enterprise instance and start querying their S3 logs.
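To make the $2/TB figure concrete, here is a back-of-the-envelope calculation, written as SPL for familiarity (the 50TB volume is an arbitrary example, not a quote):

```spl
| makeresults
| eval scanned_tb=50, rate_per_tb=2.00
| eval query_cost_usd=scanned_tb*rate_per_tb
| table scanned_tb rate_per_tb query_cost_usd
```

Scanning 50TB beyond the free monthly compute units would cost $100, versus roughly $250 at Athena's $5-per-TB-scanned list pricing.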


Cost savings

Some of our customers move high-volume log data out of costly Splunk ingestion and into cheap S3 storage. They then use Scanner to make those logs visible in Splunk at much lower cost.

Here are some pricing examples showing the cost savings from moving high-volume log sources of varying sizes out of Splunk ingestion and into S3, where Scanner indexes them.


Getting started

Here are the steps to take to get started:

  • Visit the Splunkbase page for Scanner for Splunk to install our custom Splunk app.
  • Sign up for a Scanner demo. Meet with an engineer from our team to learn more about the product and discuss how Scanner can meet your use cases and requirements.
  • Set up indexing for your logs in S3. Follow the integration steps to grant Scanner IAM permission to read your logs and to emit notifications when new files are ready to index.
  • Route log sources to S3. Our documentation includes examples for pushing various common log sources to S3.


After installing Scanner for Splunk, you can continue to use Splunk as a single pane of glass while extending your threat hunting and detection coverage to all of the log sources you care about, not just those ingested by Splunk.

Take your threat hunting and detection to the next level with Scanner.


Visit Scanner.dev to Learn More

