Abnormal Process Execution: Is an LSTM Deep Learning Model a Good Fit?

These days, I am experimenting with different ML/AI models to explore practical use cases of AI in cybersecurity, as there is still limited literature available on the topic. Having worked extensively on building detection technologies, I always feel the urge to develop better systems that can handle threat detection in complex scenarios.

Almost all systems that use process behavior-based detection, such as EDR (Endpoint Detection and Response) platforms and sandboxes, rely heavily on analyzing abnormal process execution, in particular parent-child process relationships. The concept is simple: processes should behave according to their intended use. For example, a PDF reader process should read or edit PDF documents; it should not execute code or invoke other system utilities. More sophisticated examples, like the Log4j vulnerability and the SolarWinds supply chain attack, illustrate cases where a process performed actions it was never supposed to.

Exploiting an application is one of the most fascinating, and terrifying, chapters of computer science history. Exploits that manipulate process memory (e.g., buffer overflows, use-after-free, format string vulnerabilities) hijack the execution flow of a process at runtime based on attacker input. Traditional detection methods for these are complex and require extensive in-memory instrumentation. If you're interested, you can read about some fantastic research we conducted in the past that helped detect exploits in a generic way (https://patents.justia.com/patent/11244044).

Traditional methods can detect and block attacks, but they require significant expertise and constant attention, and any mistake can wreak havoc on the system (remember CrowdStrike?). The easiest approach is to collect telemetry and use rules/policies to define abnormal process execution, and this method has been working quite effectively. However, there is a major issue: rules and policies cannot cover every scenario or application. For example, if /bin/ls turns out to be vulnerable to a local privilege escalation exploit and spawns a shell, how would we detect it? With rules and policies, we can cover the critical applications we are aware of, but not every possible case. Another problem is that rule-based detection is static, so any variation can bypass it.
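To make the coverage gap concrete, here is a minimal sketch of how such a rule-based parent-child check might look. The process names and the allowlist are purely illustrative, not taken from any real product:

```python
# Minimal sketch of rule-based parent-child detection.
# The allowlist below is illustrative; real products ship far larger policy sets.
ALLOWED_CHILDREN = {
    "httpd": {"httpd"},   # a web server should only fork its own workers
    "evince": set(),      # a PDF reader should spawn nothing
    "sshd": {"bash", "sshd"},
}

def is_suspicious(parent: str, child: str) -> bool:
    # Unknown parents fall through undetected. This is the coverage gap:
    # if /bin/ls is exploited and spawns a shell, no rule exists for it.
    if parent not in ALLOWED_CHILDREN:
        return False
    return child not in ALLOWED_CHILDREN[parent]

print(is_suspicious("evince", "bash"))  # True:  covered by an explicit rule
print(is_suspicious("ls", "bash"))      # False: missed, no rule for ls
```

Every parent that is not explicitly listed is a blind spot, which is exactly the gap a learned model could help close.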

The latest advancements in AI and the release of several powerful models offer a promising way to address these problems. In my experiments, I evaluated an LSTM (Long Short-Term Memory) model for detecting abnormal process sequences. LSTMs are designed for sequential data and can capture long-range dependencies within a sequence. This raises the natural question: what type of data should be used to train the model?

Data, Data, Data…

In AI, data is the key to everything. If you have high-quality data for your use case, even current models can provide effective solutions. I experimented with approximately 12,000 records: around 6,000 normal execution sequences and 6,000 abnormal ones. I collected the normal sequences from several virtual machines by running and installing various software and utilities. Abnormal data is harder to acquire, so I created a script to generate sequences resembling real attack patterns (a sketch of this approach follows the list below). I generated this data in two variations:

  1. Completely random sequences where the execution order of processes is unrealistic (e.g., /bin/ls spawning /bin/chown).
  2. More realistic sequences that could indicate potential attacks (e.g., systemd -> httpd -> bash -> bash -> python -> xmrig).
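
My generation script isn't reproduced here, but the following hypothetical sketch shows roughly how the two variations can be produced; the process vocabulary is made up for illustration:

```python
import random

# Made-up process vocabulary for illustration.
PROCS = ["systemd", "sshd", "bash", "python", "ls", "chown", "httpd", "xmrig", "curl"]

def random_sequence(length: int = 10) -> list:
    # Variation 1: fully random order, so unrealistic parent-child
    # pairs (e.g., ls spawning chown) appear naturally.
    return random.choices(PROCS, k=length)

def attack_like_sequence(length: int = 10) -> list:
    # Variation 2: a plausible intrusion chain (web server exploited,
    # shell spawned, cryptominer launched), padded to a fixed length.
    chain = ["systemd", "httpd", "bash", "bash", "python", "xmrig"]
    return chain + random.choices(PROCS, k=length - len(chain))

print(random_sequence())
print(attack_like_sequence())
```

Sequences collected from the virtual machines serve as the normal class, and generated sequences as the abnormal class.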

Training

I trained the model with a sequence length of 10, using one bidirectional LSTM layer followed by one unidirectional LSTM layer. Since my training dataset was relatively small, training completed quickly.
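For reference, here is a minimal sketch of this architecture in Keras. The framework choice, vocabulary size, and layer widths are placeholders, not the exact configuration I used:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN = 10      # sequence length used in training
VOCAB_SIZE = 500  # placeholder: number of distinct executable names observed

# Process names are mapped to integer ids (0 reserved for padding)
# before being fed to the network.
model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),                              # one learned vector per executable
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),  # the bidirectional layer
    layers.LSTM(32),                                               # the unidirectional layer
    layers.Dense(1, activation="sigmoid"),                         # normal (0) vs. abnormal (1)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.build(input_shape=(None, SEQ_LEN))
model.summary()
```

The bidirectional first layer lets the model see each process in the context of both earlier and later executions in the window, while the final unidirectional layer condenses the window into a single verdict.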

Results

Based on the available data, the results are promising. Here's a snapshot of the outcomes:

[Image: snapshot of the model's evaluation results]

Since I don’t have a large dataset, I couldn’t fully evaluate the model’s efficacy. I am releasing the trained model on GitHub (https://github.com/amitsec-ai/lstm-abnormal-process/tree/main) for further experimentation. If anyone tries it, I’d love to hear about your results—or even better, share more data if you have it!
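If you experiment with it, inference would look roughly like the sketch below; the model file name and the vocabulary are placeholders, so adjust them to whatever the repository actually ships:

```python
import numpy as np
import tensorflow as tf

# Placeholder file name and vocabulary; adjust to the repository's artifacts.
model = tf.keras.models.load_model("lstm_abnormal_process.h5")
vocab = {"systemd": 1, "httpd": 2, "bash": 3, "python": 4, "xmrig": 5}

def encode(seq, seq_len=10):
    # Map process names to integer ids; 0 covers padding and unknown names.
    ids = [vocab.get(name, 0) for name in seq]
    ids = (ids + [0] * seq_len)[:seq_len]
    return np.array([ids])

seq = ["systemd", "httpd", "bash", "bash", "python", "xmrig"]
score = model.predict(encode(seq))[0][0]
print(f"abnormality score: {score:.3f}")  # closer to 1.0 suggests abnormal
```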

Conclusion

Deep learning models, especially LSTMs, seem to be a strong fit for this problem. Even with minimal data (just executable names and their order of execution), the detection results on my test data were quite impressive.
