Splunk vs ELK: Security, Scale and High Availability Perspectives
Today developers primarily look at Splunk and the ELK Stack as the two strongest options for solving the same set of problems in log analytics, visualization and monitoring. The consensus across various reports is that Splunk lets users search through their data to extract what they need, while ELK requires more work and planning at the beginning but makes value extraction easier in the end. The confusion remains as to which of the two offers the best ROI and productivity gains without sacrificing overall performance. In this article I will look at performance in a limited manner, from only three angles: security, scale and high availability.
Let us start with the basics and understand the attributes, ingredients and differentiators of Splunk's core. Splunk is made up of three components: the forwarder, which pushes data to remote indexers; the indexer, which stores and indexes data and responds to search requests; and the search head, which is the front end of the web interface. These three components can be combined on one server or distributed over several. Splunk also supports integrating its functionality into applications via SDKs. It has been very effective in common use cases, which include operational monitoring, security and user behavior analytics. My personal experience is that Splunk has a slight edge over ELK in cybersecurity and other intricate security applications. I must mention that Splunk is a paid product, with billing based on indexing volume.
The ELK Stack is a set of three open-source products (Elasticsearch, Logstash and Kibana), all developed and maintained by Elastic. Elasticsearch is a NoSQL database that uses the Lucene search engine. Logstash is a data processing and transportation pipeline used to populate Elasticsearch with data (though it also supports other destinations, including Graphite, Kafka, Nagios and RabbitMQ). Kibana is a dashboard layer that works on top of Elasticsearch and facilitates data analysis through visualizations. The query language is also very powerful: the Elasticsearch Query DSL (domain-specific language) is a flexible, expressive search language that exposes a rich set of query capabilities across any kind of data. From simple Boolean operators to custom relevance functions, users can articulate exactly what they are looking for and bring their own definition of relevance. The query language also includes a composable aggregation framework that enables users to summarize, slice and analyze structured or semi-structured datasets across multiple dimensions.
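To make the Query DSL description a little more concrete, here is a minimal sketch of what a combined Boolean query and aggregation might look like. It only builds the JSON request body in Python and prints it; it does not talk to a cluster. The field names (message, @timestamp, log.level, host.name) and the helper name build_log_query are my own assumptions for illustration, so adjust them to your own index mapping.

```python
import json


def build_log_query(text, since="now-1h", top_n=5):
    """Build an Elasticsearch search body (hypothetical field names).

    - bool/must:     full-text match on the log message
    - bool/filter:   restrict to a recent time window (no scoring)
    - bool/must_not: exclude debug-level events
    - aggs/terms:    count matching events per host
    """
    return {
        "size": 0,  # return only the aggregation, not individual hits
        "query": {
            "bool": {
                "must": [{"match": {"message": text}}],
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
                "must_not": [{"term": {"log.level": "debug"}}],
            }
        },
        "aggs": {
            "events_per_host": {
                "terms": {"field": "host.name", "size": top_n}
            }
        },
    }


if __name__ == "__main__":
    # Print the body that would be sent to a _search endpoint.
    print(json.dumps(build_log_query("connection timeout"), indent=2))
```

The same body can be pasted into the Kibana console or sent with any HTTP client. The point is simply that the Boolean clauses and the aggregation compose inside one request.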
Both Splunk and the ELK Stack are used to monitor and analyze infrastructure in IT operations, as well as for application monitoring, security and business intelligence. But in the day-to-day live migration of complex workflows to clouds that host high-end enterprise and mission-critical (HA) applications, data security and security compliance are of paramount importance. It is here that CIOs and decision makers are willing to take a closer look at the following key needs, even at some incremental cost.
1. Machine learning and alerting: ML capabilities such as anomaly detection, forecasting and categorization are tightly integrated with the Elastic Stack to automatically model the behavior of data, such as trends and periodicity, in real time in order to identify issues faster, streamline root cause analysis and reduce false positives. Without these capabilities, it can be very difficult to identify issues such as infrastructure problems or intruders in real time across complex, high-volume, fast-moving datasets. Some people think that Splunk does an equally good job here, but that it costs slightly more to achieve the same results.
2. Elasticsearch serves as the central authentication hub for the entire Elastic Stack. Security features include encrypted communications and encryption at rest; role-based access control; single sign-on and authentication; field-level, attribute-level and document-level security; and audit logging. Kibana is the user interface for the Elastic Stack. Compared to Splunk we get similar performance, but developers have to be trained on the Kibana UI and related plug-ins. This adds to the training budget, lengthens the learning curve and may push out time-to-market goals.
3. Kibana is gaining a competitive edge: dozens of Kibana plug-ins have been shared by the community via the Elastic documentation and code-sharing platforms such as GitHub. Beats and Logstash are data ingestion tools that enable users to collect and enrich any kind of data from any source for storage in Elasticsearch.
4. Beats and Logstash have an extensible modular architecture. Beats are lightweight agents purpose-built for collecting data on devices, on servers and inside containers. Splunk, by contrast, offers everything built in as a single, fully loaded platform, which brings better convenience and user experience, but at a slightly higher price point when comparing apples to apples for a given use case with a fixed platform configuration. This is why Amazon, Microsoft Azure and others use many of these components in their public cloud offerings.
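As a rough illustration of the anomaly detection mentioned in point 1, here is a minimal sketch of the general idea: flag a data point when it deviates strongly from its recent history. This is a plain rolling z-score, which is my own simplification and not the model that Elastic or Splunk actually use. The function name, window size and threshold are assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indices of points that deviate from the mean of the
    preceding `window` points by more than `threshold` standard deviations.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical requests-per-minute series with one obvious spike.
    series = [10, 11, 10, 12, 11, 10, 11, 95, 11, 10]
    print(find_anomalies(series))
```

A production system would also model trends and periodicity, as described above; this sketch only captures the "compare against recent behavior" step.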
So developers save time and become more productive with Splunk at a slight increase in TCO; once the productivity gain is factored in, the TCO is most probably neck and neck for both vendors.
High Availability: The Lifeblood of Business Continuity in Mission-Critical Use Cases
Cloud-native applications are finding big uses today in mission-critical systems run by banks, high-frequency trading firms, defense and telecom. These customers are mostly looking for rapid on-premises-to-cloud transitions done in a fault-tolerant manner, with strong DR and speedy online replication. They do not want to write new code, because new tools and platforms (despite being 100% open source) pose new challenges of dependencies and interoperability. My recent work on such use cases suggests the following, after critically comparing ELK vs Splunk:
a) No need to write code when using a data transformation engine. Logstash is a centralized data transformation engine that can receive and pull data from multiple sources, transform and filter that data, and send it to multiple outputs. Logstash has a powerful and flexible configuration language that allows users to create data stream acquisition and transformation logic without having to write code. This makes building data management pipelines faster and opens it up to a wide variety of organizations and individuals. This is where Splunk appears to be weak. Customers handling mission-critical and mainframe applications have found here a powerful engine for digital transformation in an agile and secure manner.
b) The ELK Stack, made up of Elasticsearch (search), Logstash (ingestion and processing) and Kibana (reporting and visualization), has a lower barrier to entry because it is an open-source platform. In the ELK Stack, Logstash plays the role of the log workhorse, creating a centralized pipeline for storing, searching and analyzing log files. It uses built-in filters, inputs and outputs, along with a range of plugins, to get more out of your logs. Because it is open source, it is easy to extend it to custom log formats or to add plugins for custom data sources such as SMF.
c) Elasticsearch processes and analyzes mainframe machine data, enabling teams to better understand mainframe environments. Today the Elasticsearch integration with Abend-AID allows mainframe system data, including information on abends and other issues collected by Abend-AID, to be automatically sent to Elasticsearch for analysis and actionable insights into the mainframe environment. Elastic turns mainframe data (Compuware or IBM mainframe data) into an open-source-based digital platform. Dozens of battle-tested data models for Elasticsearch operation using Syncsort and CorreLog for IBM and Unisys mainframes are available in the public domain. All these mainframe vendors have fully embraced Kibana, which is paying them rich dividends as an industry-leading intelligent dashboard and orchestrator.
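To picture the input, filter and output flow described in point a), here is a small sketch of the same idea. Logstash itself expresses this declaratively in its configuration language, with no programming needed; the Python below is only my own illustration of the pipeline model, and every name in it (parse_line, drop_debug, add_source, run_pipeline, the "app-server" label and the log format) is hypothetical.

```python
import json


def parse_line(line):
    """Input stage: split a 'timestamp level message' line into an event."""
    timestamp, level, message = line.split(" ", 2)
    return {"timestamp": timestamp, "level": level, "message": message}


def drop_debug(event):
    """Filter stage: discard DEBUG events by returning None."""
    return None if event["level"] == "DEBUG" else event


def add_source(event):
    """Filter stage: enrich the event with a (hypothetical) source label."""
    return {**event, "source": "app-server"}


def run_pipeline(lines, filters):
    """Run each line through the filters; emit survivors as JSON strings."""
    results = []
    for line in lines:
        event = parse_line(line)
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:  # a filter dropped the event
                break
        if event is not None:
            # Output stage: serialize, as if shipping to a search index.
            results.append(json.dumps(event))
    return results


if __name__ == "__main__":
    logs = [
        "2024-01-01T00:00:00Z ERROR disk full",
        "2024-01-01T00:00:01Z DEBUG heartbeat",
    ]
    for doc in run_pipeline(logs, [drop_debug, add_source]):
        print(doc)
```

The design point is that each stage is independent, so filters can be added, removed or reordered without touching the input or output.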
Achieving Scale in Live Migrations
Comparing Splunk vs ELK here, I must say the jury is still out as to which provides agile, secure scaling for most use cases. Not much competitive benchmark data currently exists for any given well-defined use case, so I personally think one has to work with both vendors to arrive at an optimum migration flow plan and an overall architecture that ensures scaling is attainable. What I do suggest is starting with a solid baseline by extracting the key components from both vendors. For example, in the case of ELK, the solutions for metrics, logging, business analytics and security analytics all come with pre-built configurations that make it easy to use Beats and Logstash to ingest the respective type of data, and they include default Kibana searches, dashboards and visualizations to deliver instant insights. The vendor claims that, between Elastic Cloud, Elastic Cloud Enterprise and the Elastic Stack itself, its solutions can be deployed on premises, in public and private clouds, and in hybrid environments.
I also found that shipping data to Splunk is fairly easy. After installation, the forwarders come pre-configured for a wide selection of data sources, such as files and directories, network events, Windows sources and application logs, and they are used to import data into Splunk. It is important to note that the Splunk web UI includes flexible controls that allow you to edit and add new components to your dashboard. Management and user controls are great and can be configured differently for multiple users, with each having a customized dashboard. Splunk also supports visualizations on mobile devices, with application and visualization components that are easy to customize using XML.
I hope this helps a wide range of readers, and I am sure many unanswered questions still remain. I will come up with a follow-up to this with more competitive data and greater granularity, illustrating one or two use cases in depth.