Why your detections may be failing (and what to do about it)
Breaches regularly go unnoticed for long periods of time, and not only in organisations with immature security or without a SOC. They occur in organisations with a SOC as well. If you are lucky, the unnoticed breach was part of a red team exercise, and you get all the insights from the red team to improve your detections. But eventually, everyone's luck runs out. False positives may not be your biggest problem; false negatives may be what you should really be worried about. Your detections may be failing without you being aware of it.
So why do detections fail? There are a number of reasons. A breach may go undetected because detections for certain TTPs are not implemented. Or they are implemented, but the proper data source is missing. Or the data source is connected, but coverage is insufficient. And there are many more things that can go wrong. Managing a detection ruleset is not a fire-and-forget activity. Detection engineering is hard.
To support a successful detection engineering effort, detections must cover your relevant risks and must be actively validated to ensure that they are functioning as expected. This requires proper input and design, as well as active validation of your detection analytics. This is shown in the following figure:
Input and design
The first step to ensuring that you are detecting the threat that you want to detect is to have the right detections in place to start with. The detection process starts with determining what to detect and turning that into an operationalized detection. This requires the right input and proper detection design.
Input
Properly designed new analytics address the biggest risks in the IT environment. Having the right analytics in place requires the right input:
Now that you have the right input to go on, the next step is to design your analytics properly.
Design / update
In the analytics design (or updates to that design due to changes in input or findings from validations), the detection engineer should account for the following:
Between design and deployment, the detection engineer will also test that the analytic works correctly. While this is part of validation, it takes place in a controlled environment with parameters that are understood. The validation strategy is aimed at validation after deployment, during the run phase, where there is no controlled environment.
Validation strategy
Once operational, the accuracy and functionality of the detection ruleset should be continuously evaluated. This validation strategy has three major elements:
Analytics validation
The core part of your validation strategy is active validation of analytics. There are a number of activities that can be used for this purpose, supported by tooling and processes. The following four activities are part of the analytics validation process:
Adversary emulation. Automated adversary emulation is an activity where detections are automatically validated by specialised tooling. Examples of tools for MITRE ATT&CK®-focused validation include MITRE Caldera, Red Canary Atomic Red Team and Mordor. Other frameworks include MAAD-AF. Breach & attack simulation tools have similar capabilities. Note that the term adversary emulation is sometimes also used for activities involving tools that are used for red teaming purposes (such as Cobalt Strike).
Red & purple teaming. Both activities can provide valuable insights into the performance of your detections. While red teaming has a broad focus that also covers preventative measures and incident response capabilities, purple teaming is narrowly focused on determining whether certain attack techniques are detected by the SOC.
Threat hunting. While not focused on validating detection, threat hunting investigations may uncover threats previously unnoticed in security monitoring and therefore provide essential input for improving your detections. Threat hunting investigations can serve as a safety net for those attacks that have gone unnoticed.
Incident response. Lessons learned in incident response can also be used to improve your detections. An analysis of a previous incident may uncover that the attack was detected at a later stage and that certain aspects of the incident or attack went unnoticed. This analysis provides valuable insight into areas for improvement.
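As a minimal sketch of how emulation or purple team results can be turned into a list of false negatives: the example below compares the techniques that were executed against the alerts that were raised. The `EmulatedTest` record, the `(technique_id, host)` alert format and the host names are assumptions made for illustration; they are not the output format of Caldera, Atomic Red Team or any specific SIEM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmulatedTest:
    """One executed emulation test (hypothetical record format)."""
    technique_id: str  # MITRE ATT&CK technique ID, e.g. "T1059.001"
    host: str          # host on which the test was executed


def find_missed_detections(tests, alerts):
    """Return the emulated tests for which no matching alert was raised.

    `alerts` is an iterable of (technique_id, host) tuples, assumed to be
    exported from the security monitoring system for the test window.
    """
    seen = set(alerts)
    return [t for t in tests if (t.technique_id, t.host) not in seen]


# Illustrative data only.
tests = [
    EmulatedTest("T1059.001", "ws-01"),
    EmulatedTest("T1003.001", "ws-01"),
    EmulatedTest("T1021.002", "srv-02"),
]
alerts = [("T1059.001", "ws-01"), ("T1021.002", "srv-02")]

missed = find_missed_detections(tests, alerts)
for test in missed:
    print(f"No alert for {test.technique_id} on {test.host}")
```

In practice the match would also need a time window around each test, and alerts would need to be mapped to technique IDs first. The point is only that every executed technique without a corresponding alert is a candidate false negative to investigate.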
Data source validation
The second element in your detection validation strategy is validation of data sources. Data source validation focuses on ensuring that the right logging is continuously ingested by the security monitoring system. Data source validation has three main activities:
Coverage. Detections will fail if you have failed to connect relevant data sources to the security monitoring infrastructure. Achieving a higher level of coverage is an essential activity within the SOC. This requires active measurement of your current coverage and a structured approach to improving coverage.
Log configuration. Log configuration aims to ensure that systems are providing the right logging and the right level of logging detail. While the SOC is not responsible for configuring log settings in the IT infrastructure, the SOC should be actively involved in defining the logging policy and is an important stakeholder in having the right logging settings as part of organisational configuration management. The SOC can also measure the logging quality.
Log availability. Even when systems are connected to the central security monitoring system, and logging is configured properly, systems may still fail to send over their logging, for example due to a connectivity issue. So actively identifying systems that are not providing logging and working together with IT to resolve these issues is vital.
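Log availability in particular lends itself to a simple automated check. The sketch below assumes you can export, per connected system, the timestamp of the most recent event received; the `last_seen` mapping, the system names and the 24-hour threshold are hypothetical and should be tuned per data source.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, now, max_silence=timedelta(hours=24)):
    """Return the sorted names of sources that have been silent too long.

    `last_seen` maps a source name to the timestamp of its most recent
    event, or to None if the source has never sent any logging.
    """
    silent = []
    for source, timestamp in last_seen.items():
        if timestamp is None or now - timestamp > max_silence:
            silent.append(source)
    return sorted(silent)


# Illustrative data only.
now = datetime(2023, 9, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "dc-01": now - timedelta(minutes=5),    # healthy
    "fw-edge": now - timedelta(hours=30),   # silent for more than a day
    "web-03": None,                         # connected but never reported
}

silent = find_silent_sources(last_seen, now)
print(silent)  # ['fw-edge', 'web-03']
```

A single threshold is a simplification: a domain controller that is silent for an hour is suspicious, while a rarely used system may legitimately be quiet for days.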
Tools like DeTT&CT can help to gain insight into data source availability and log source quality, and how these impact the detection of attacker techniques in the MITRE ATT&CK® framework. Note that parsing errors may also lead to failed detections, so monitor for parsing errors.
Use case life cycle management
The last element in your detection validation strategy is life cycle management of your use cases and associated detection analytics. Part of life cycle management is measuring your use cases and analytics. In the context of detection validation, two measurements are relevant:
Last time fired. Some detections will go off daily, while others may only fire occasionally. Some analytics are so critical that you hope they never fire. Such rules are likely highly specific and not covered by standard adversary emulation tooling. Measuring last time fired will give an idea of which analytics rarely, if ever, go off. In some cases this is expected, while in other cases something may be wrong with your analytics rule. Note that even when this is expected, such rules should still be regularly validated (manually if there is no other way to do it).
Last time reviewed. Analytics should be regularly reviewed and updated. Measuring when rules were last reviewed and actively reviewing those that have not been reviewed for a long time is part of keeping your analytics up to date and accurate.
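Both measurements can be combined into a simple report. The following is a minimal sketch, assuming rule metadata can be exported with a last-fired and a last-reviewed date; the `Rule` record, the rule names and the 90-day and 365-day thresholds are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Rule:
    """Metadata for one detection rule (hypothetical record format)."""
    name: str
    last_fired: Optional[date]  # None if the rule has never fired
    last_reviewed: date


def rules_needing_attention(rules, today,
                            fire_age=timedelta(days=90),
                            review_age=timedelta(days=365)):
    """Map rule name to the reasons it should be validated or reviewed."""
    report = {}
    for rule in rules:
        reasons = []
        if rule.last_fired is None or today - rule.last_fired > fire_age:
            reasons.append("not fired recently")
        if today - rule.last_reviewed > review_age:
            reasons.append("review overdue")
        if reasons:
            report[rule.name] = reasons
    return report


# Illustrative data only.
today = date(2023, 9, 1)
rules = [
    Rule("brute_force_login", date(2023, 8, 30), date(2023, 5, 1)),
    Rule("golden_ticket", None, date(2023, 6, 1)),
    Rule("legacy_dns_tunnel", date(2022, 1, 10), date(2021, 11, 1)),
]

report = rules_needing_attention(rules, today)
for name, reasons in report.items():
    print(f"{name}: {', '.join(reasons)}")
```

A rule flagged as "not fired recently" is not necessarily broken. As noted above, for highly critical rules silence may be expected, so the flag is a prompt for validation rather than a verdict.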
Conclusion
In this article, I have explored the different elements that should be addressed to properly implement your detection engineering program, get the right detections in place, and continuously verify that these are working as expected. To summarize:
Here is a brief overview of potential reasons for failure of your detection logic.
Further reading and resources:
Capability Abstraction:
DeTT&CT:
Adversary emulation: