Fluency Platform: Beyond Pipeline Observability (Part 1)
Should I read this?
If your responsibilities include moving large volumes of data between systems, then this is an important read. This article series focuses on how to implement a large-scale infrastructure without bottlenecks in the collection or ingestion of data. Bad message management results in a lack of visibility during collection, delays in getting data into a system, and even delays in getting analysis out of a system. Proper routing of data reduces cost by optimizing where data is stored. When you are done reading, you should understand what piping is and how it impacts infrastructures.
Challenge
Security audit sources like web services, cloud infrastructures, and endpoint detection and response (EDR) have created an increasingly large amount of data. The result is more alerts, a lack of trained personnel, delays in analysis, gaps in security reviews, and the high cost of indexed stored data. The challenge is that existing solutions did not scale to the workload.
Cybersecurity has been modifying existing solutions to address analyzing security data. Elastic was a big data database. Splunk was an application performance monitor (APM). Both were modified into Security Information and Event Management (SIEM) solutions. The result was a focus on database searches, not on the collection and analysis of streaming data.
Innovation sometimes can be found in removing a component instead of improving it. One thing all commercial SIEM tools have in common is a reliance on databases, and the constant storing and searching of data has not scaled. Solutions like Cribl and Edge Delta aim to reduce cost by not sending data to the SIEM, but this creates a greater security gap. Fluency Security has a database for threat hunting and investigation, but Fluency does not use a database for analyzing, correlating, or alerting. Instead, Fluency leverages streaming analytics, meaning that analysis is done in the pipe prior to the use of a database. The results are immediate alerting, better detection, and reduced cost.
Introduction
As infrastructures scale, the moving of data becomes complex and can become a bottleneck for the systems it integrates. Fluency has been moving large amounts of data since our inception in 2013. The infrastructure to move data is referred to as pipes. Pipes are used for the collection, processing, and routing of data in an infrastructure. They are observable when the status of processing and metrics of flow are reported from the pipe to the operators. Gartner's term is telemetry pipelines. Regardless of the term, pipes are the backbone of distributed information systems, such as security information and event management (SIEM) and social networks like X (Twitter, Inc.).
While routing data can save a company money by storing infrequently searched data in slower, less expensive storage, the operational benefits of visibility, metrics, and alerting are themselves a reason to implement observable pipes.
When Fluency SIEM demonstrates its ability to detect and notify in real time, it is demonstrating the processing element in our pipes. That processing element performs stateful functions, such as parsing, metrics, entity association, and micro-analytics (https://www.dhirubhai.net/pulse/fluency-platform-solution-data-message-handling-chris-jordan/).
At the heart of pipes is the collection -> parsing -> routing sequence. This sequence represents the critical path, meaning the minimum amount of time any message spends in message management. When the critical path does not scale, the system becomes congested and will eventually fail.
Benefits of Pipes
Before going into the details of understanding and comparing piping systems, we first need to understand what pipes provide. If we oversimplify its requirements, then we will compare solutions that do not meet the need. It is like defining a truck as a four-wheel vehicle and then trying to use a car to move your home.
A complete pipe allows for stateful processing of properties inside the data. The pipe accounts for the data lifecycle from the inbound data source until it is stored as part of the pipe’s outbound process.
In general, data moves from collection to storage to searching. This process of moving is complex. It requires streaming data through the pipe while, at the same time, reporting the health and insight of the pipes moving that data, a capability referred to as observability.
Fluency's Platform extends the piping capability by adding stateful processing to the "Analysis" phase, while also enhancing the inbound collection and enrichment functions. Just as the first law of thermodynamics restates the conservation of energy, data piping has a guiding principle: it is more efficient and scalable to process data while it is in motion than to store the data and process it later. Using a database as a hub in big data means that data must be retrieved from a cold state and transferred into a hot one to be processed. All this storing and retrieval is expensive and grows exponentially worse as the data scales.
To increase timeliness of notification and reduce the computational expense of fragmented searching, Fluency leverages the stream of data by implementing “Micro-Analytics.”
Streaming Analytics
Streaming analytics refers to answering questions from data as it arrives, while the process runs. The analysis maintains a state of the data related to the question. We need to be careful here: we are not querying the data.
Security Orchestration and Automated Response (SOAR) tools talk about the importance of timeliness in response. The worst offender in slow response is actually the use of a database for detection, because most detections search a database. As the number of "signatures/searches" increases, each search runs less frequently. Many searches occur hourly or even daily, meaning the window for an automated response to have any impact has already passed.
A database query refers to a complete set of data, normally structured in a relational database. The query selects data from the database. Data is in a table format, and tables can be joined to create new tables. The SELECT statement articulates a subset of rows from a resulting table; selection works by evaluating the fields in a row against a Boolean statement. The data is often aggregated (GROUP BY), with a function associated with the aggregation.
Streaming analytics have an alternative to a query language: a programming language can express the question. Consider the SQL SUM query shown below. What if the question were more complex than wanting the SUM of a worker's hours?
What if the question were to examine the deviation of previous work hours to determine when there is an anomaly in the worker's last submittal? To do this with SQL, an analyst would write a program to download all the previously submitted hours, organize them by time, and compute the answer. To do this for all employees requires every record to be retrieved; at that point, SQL is doing nothing more than fetching data. In a streaming system, the state of the seasonal position is updated with each new value as it is entered, so the result is instant. The streaming system does need to maintain the state of the entity, the worker in this case, but it is not recomputing the seasonal values.
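To make this concrete, here is a minimal Python sketch of the streaming approach (an illustration, not Fluency's implementation): a running per-worker state, maintained with Welford's online algorithm, flags an anomalous submittal without re-reading any history. The class name, warm-up count, and z-score threshold are assumptions for the example.

import math
from collections import defaultdict

class WorkerState:
    """Running mean/variance via Welford's online algorithm; no history is kept."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, hours: float) -> None:
        self.n += 1
        delta = hours - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (hours - self.mean)

    def is_anomaly(self, hours: float, z: float = 3.0) -> bool:
        if self.n < 5:                      # not enough history to judge yet
            return False
        std = math.sqrt(self.m2 / (self.n - 1))
        return std > 0 and abs(hours - self.mean) > z * std

states = defaultdict(WorkerState)           # one state per entity (worker)

def on_submittal(worker: str, hours: float) -> None:
    state = states[worker]
    if state.is_anomaly(hours):
        print(f"ALERT: {worker} submitted {hours}h, outside the normal range")
    state.update(hours)                     # fold the new value into the state

Each event updates the state in constant time, which is why the result is instant.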
A special type of streaming analytic is a metric. What makes a metric different is that the calculation is not triggered by an event but by a consistent time period. In streaming analytics, we can maintain a state and, for each time period, send the result to a metric database such as Prometheus (https://prometheus.io/).
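A minimal sketch of a time-triggered metric (the period length, label, and print-based flush are assumptions; a real deployment would push the values to a metric store such as Prometheus through an exporter):

import time
from collections import Counter

PERIOD_SECONDS = 60
counters = Counter()
next_flush = time.monotonic() + PERIOD_SECONDS

def on_event(event: dict) -> None:
    """Called for every message; the trigger is the clock, not the event."""
    global next_flush
    counters[event.get("status", "unknown")] += 1
    now = time.monotonic()
    if now >= next_flush:                   # time-period trigger
        for key, value in counters.items():
            print(f'events_total{{status="{key}"}} {value}')
        counters.clear()
        next_flush = now + PERIOD_SECONDS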
In summary, the database process is inefficient. Big deal? Inefficient in the cloud means two things: it costs more, and it is slower. Slower to security means late. With all the effort put into automating responses, if the alert is late, the automated response is also late.
Streaming analytics can perform big data analytics not only faster and more efficiently, but can also perform analysis that database analytics will never be able to do. Security is an industry in which analyzing data is critical. So, why is streaming the way to go for security?
1. Not all questions can be articulated in a query language.
2. We want the answers to multiple questions at the same time.
3. We care how quickly we get an answer.
Let's separate query from question. A query is a formatted structure that a program runs to retrieve data, while a question is what you want answered. That is a good segue to our first point.
Not all questions can be articulated in a query language.
Query languages are not programming languages. Programming languages are statements (instructions) that have structure. A query language is actually a configuration statement handed to a program; the statement is executed by the database program. For example, a simple SUM query:
SELECT worker, SUM(hours)
FROM agents GROUP BY worker;
This tells the program to read records from the agents table and create a row for each worker. As the program reads each record in the agents table, it creates a new worker row in a temporary table and stores the hours value if the row does not exist; if the row does exist, it adds the record's hours to the temporary table's row. When complete, the program exports the new table.
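Put another way, the implied program behind that query is a hash-map aggregation. A rough Python sketch of what the database engine does on your behalf:

# Rough equivalent of: SELECT worker, SUM(hours) FROM agents GROUP BY worker;
def sum_hours(agents: list) -> dict:
    totals = {}
    for record in agents:                   # read each record in the agents table
        worker = record["worker"]
        totals[worker] = totals.get(worker, 0) + record["hours"]
    return totals                           # the exported "new table"

rows = [{"worker": "ana", "hours": 8},
        {"worker": "bo", "hours": 6},
        {"worker": "ana", "hours": 7}]
print(sum_hours(rows))                      # {'ana': 15, 'bo': 6}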
There is implied logic in an SQL statement, and that is its major weakness. This logic structure was created in the early 1970s, and the advances in data science and computer science since then are not considered in its design. The result is that fifty years later, data analysts are using an antiquated solution, one that never considered the issues of big data or cloud computing.
To overcome this limitation, analysts use programs. However, the program does not interact with the data directly at first; it calls this antiquated solution, SQL, to retrieve the data, and then analyzes the limited data retrieved. The limitation of SQL is not overcome, but hidden.
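A sketch of that retrieve-then-analyze pattern, using Python's built-in sqlite3 module against a hypothetical agents table. Every relevant row must leave the database before any real analysis begins:

import sqlite3
import statistics

conn = sqlite3.connect("timesheets.db")     # hypothetical database file
rows = conn.execute(
    "SELECT worker, hours FROM agents"
).fetchall()                                # every record crosses the wire

by_worker = {}
for worker, hours in rows:
    by_worker.setdefault(worker, []).append(hours)

for worker, hours in by_worker.items():
    if len(hours) >= 2 and statistics.stdev(hours) > 10:
        print(f"{worker}: unusually variable hours")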
Streaming Micro-Analytics
Micro-analytics is when a custom function is passed into the dataset. Fluency can perform this on a static datastore by using our execute-in-place approach. However, micro-analytics are even more powerful and less expensive when evaluating a stream of data, which is essentially all the data that enters a SIEM.
Before going too far, SQL programmers will tell you there are functions in SQL. That is true; these functions are the basic aggregates, such as COUNT, SUM, AVG, MIN, and MAX.
The limited breadth of functions in SQL makes it ill-suited for analytics, hence why we wrap code around database calls.
How do micro-analytics differ from a database query? Consider the case where you want to know when a metric falls outside the normal range for a system being monitored. To perform this, you would need to make a list of all systems and then group the data for each system. Then you would need to compute the curve of the historic deviation and compare it against the line of actual occurrences.
To do this for the entire system, you need all the data. The code doesn't need a database; it just needs all the data.
Combining streaming with micro-analytics, the pipeline calculates the seasonal curves as data arrives in the system and can plot the expected and seasonal threshold curves at the same time. The state of the system in the pipe can then compare the actual result to the min-max thresholds and alert when an initial fault occurs, or it can wait and validate that a series of faults is occurring.
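A simplified sketch of that idea (not Fluency's algorithm): a per-hour-of-day running average stands in for the seasonal curve, min/max thresholds are derived from it, and an alert fires only after a run of consecutive faults. The bucket scheme, band width, and fault count are assumptions.

from collections import defaultdict

class SeasonalBaseline:
    def __init__(self, band: float = 0.5, faults_required: int = 3):
        self.sums = defaultdict(float)      # per hour-of-day totals
        self.counts = defaultdict(int)
        self.band = band                    # +/- 50% around the expected value
        self.faults = 0
        self.faults_required = faults_required

    def observe(self, hour: int, value: float) -> bool:
        """Update the curve; return True when a sustained fault is detected."""
        alert = False
        if self.counts[hour] >= 24:         # enough history to trust the curve
            expected = self.sums[hour] / self.counts[hour]
            lo, hi = expected * (1 - self.band), expected * (1 + self.band)
            if not (lo <= value <= hi):
                self.faults += 1            # wait and validate a series of faults
                alert = self.faults >= self.faults_required
            else:
                self.faults = 0             # series broken; reset
        self.sums[hour] += value            # the curve keeps learning
        self.counts[hour] += 1
        return alert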
Using micro-analytics on streaming data is highly efficient, and alerts occur on occurrence, unlike a database technique that fires at long intervals. The database technique is not only inefficient; its critical alerts also arrive significantly late. As for cost, cloud infrastructures like AWS charge for the processing power and storage used. Comparing streaming analysis to database analysis, streaming is dramatically less costly, because in a database approach each system and each overlapping time window multiplies the computational and storage needs.
The Structure of a Streaming Query
Not all streaming approaches can perform analytics. The three most popular streaming alerting systems are simple filters. A filtering system performs a grep-like analysis of the incoming record and triggers when there is a key-value expression match.
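Such a filter fits in one stateless function, which is exactly the contrast with the stateful sketches above (field names are illustrative):

# A grep-like filter: stateless, fires on a single matching record.
def filter_trigger(record: dict) -> bool:
    return record.get("event") == "login" and record.get("result") == "failure"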
To perform micro-analysis, the process must meet four requirements (a sketch tying them together follows the list):
1. Determine a message's entity hierarchy (its peers and parents). This can be thought of as a GROUP BY in SQL: which messages belong to the scope of the function.
2. The entity's state must be maintained. It is not necessary to keep a complete history of an entity's messages; however, there needs to be enough data to support the analytical function.
3. There needs to be the function itself. This can be complicated, like a seasonal curve, or simple, like a count or average. Functions operate on a set of data with a common attribute over a given range of time. These three aspects (set, common attribute, and range) define the minimal baseline for a function.
4. There needs to be a trigger. A trigger is a comparison that determines when a value is produced. For a metric, the trigger is a time period. For a signature, the trigger is a key-value condition.
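The sketch below ties these four requirements together in a windowed-count analytic (field names and thresholds are hypothetical): grouping by host gives the entity scope, a deque of timestamps is the maintained state, a count over the window is the function, and the threshold comparison is the trigger.

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 300                        # (3) the function's range of time
THRESHOLD = 20                              # (4) the trigger condition

windows = defaultdict(deque)                # (2) per-entity state

def on_message(msg: dict) -> None:
    entity = msg["host"]                    # (1) entity scope, like GROUP BY
    now = msg.get("ts", time.time())
    win = windows[entity]
    win.append(now)
    while win and now - win[0] > WINDOW_SECONDS:
        win.popleft()                       # members leave the timeframe
    if len(win) > THRESHOLD:                # (3) the function is a windowed count
        print(f"ALERT: {entity} produced {len(win)} events in {WINDOW_SECONDS}s")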
Micro-analytics allow us to know when a stateful condition is met. This is done in traditional SIEMs by searching a database and then analyzing the resulting table.
Marketing people promote the idea that all "searching" is the same. Clearly, how one analyzes data in a stream differs from how one analyzes a relational or document database.
The characteristics of streaming data mean:
1. An analytic considers the set of data as time progresses, whereas a database search runs over a set of data that is already complete.
2. Members of the set, which the aggregation and function act on, are constantly changing as members enter and leave the timeframe.
3. There are no joins in the analytic itself. Instead, the adding and modification of data in the record occurs in the prior "enrichment" phase (see the sketch after this list).
4. The value of the "trigger" establishes a match. It is on the trigger that a notification (data) is sent to the user. This notification can go to a metrics database, like Prometheus, and produce a view of the data stream.
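For point 3, a short sketch of enrichment standing in for a join: context is attached to each record as it passes through the pipe, so the downstream analytic never joins tables. The asset inventory and field names are made up for the example.

# Hypothetical asset inventory a database design would JOIN against.
ASSETS = {"10.0.0.5": {"owner": "finance", "criticality": "high"}}

def enrich(record: dict) -> dict:
    """Attach asset context in the enrichment phase, before analysis."""
    info = ASSETS.get(record.get("src_ip"), {})
    record["owner"] = info.get("owner", "unknown")
    record["criticality"] = info.get("criticality", "low")
    return record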
As noted by the cyclical marker in the process map, the "Analysis" phase can be cyclical, allowing results to flow from one analytic process to another. In Fluency, these connections occur as channels. Channels in streaming differ from messaging systems, like Redis, in that streaming channels must handle a large volume of data, so the coupling is tighter, with a parent-to-child relationship instead of an announcement technique.
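One way to picture a tightly coupled parent-to-child channel, as opposed to a pub/sub announcement (a sketch; the bounded queue and names are illustrative, not Fluency's channel implementation):

import queue
import threading

channel = queue.Queue(maxsize=10_000)       # bounded: parent and child are coupled

def parent_analytic(event: dict) -> None:
    """Runs in the pipe; hands qualifying results straight to its child."""
    if event.get("severity") == "high":
        channel.put(event)                  # blocks if the child falls behind

def child_analytic() -> None:
    """Consumes the parent's output as its own input stream."""
    while True:
        event = channel.get()
        print("correlating:", event.get("id"))
        channel.task_done()

threading.Thread(target=child_analytic, daemon=True).start()
parent_analytic({"id": "e1", "severity": "high"})
channel.join()                              # wait for the child to drain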
These points show how streaming analytics differ from a database search, but what about a regular-expression technique over the data? As noted above, simple filters can match patterns in a single record, but they cannot maintain the state that analytics require.
More than Observability
When scaling solutions, there can be a significant number of processes. The ability to determine the status of processes means the system can detect processes that have failed, are about to fail, or are deteriorating toward a failed state. The system should be able to present past, current, and scheduled processes, including their associated results and the resources they used. The objective of observability is to enable decisions and actions that avoid loss, deterioration, or error.
While that definition of observability is fine, the implementation of observability is critical to operational success. Being able to display pipe health and provide transparency into the process is a good start, but observability must be paired with a means to debug, correct, test, and distribute corrective actions back into operations without impacting the system.
Next
In the next newsletter, we will finish the discussion of pipes. This newsletter focused on the ability to perform analytics in the pipe, but that is an advanced feature; the basic needs of parsing and routing still need to be covered. We also have not dug deeply into the deployment, scalability, and operations of pipes. Finally, we will look at the role of streaming analytics in modern SIEMs. While there are many SIEMs, only five perform streaming analytics, Fluency being one of them.
About Fluency
Fluency Security (www.fluencysecurity.com) is operated by its two co-founders: Christopher Jordan (CEO) and Kun Luo (Product Development). Chris Jordan founded Endeavor Security, a cutting-edge threat detection company focused on enterprises and the US Government. Endeavor Security was acquired by McAfee in 2009. Kun Luo was the CTO of Endeavor Security. They both left McAfee in 2012 and started Fluency. Fluency is focused on scaling solutions to address the big data analytic problems of cybersecurity.