Deep Learning for protection from Ransomware attacks
Raghuveeran Sowmyanarayanan
Passionate about adding value to customers with actionable business insights driven through AI & Analytics
Ransomware attacks are on the rise, and many companies have fallen victim to them. While there are different types of ransomware, an attack typically involves breaching the company’s network, encrypting a large number of the company’s files, which usually contain sensitive information, exfiltrating the encrypted files, and demanding a ransom. A sudden spurt of encrypted data movement in corporate network traffic can therefore be a strong indication of a ransomware infection. To detect such behavior patterns effectively, a company needs the capability to identify encrypted files using machine learning (ML) and to generate encrypted-data-movement alerts as part of user behavior analytics. This helps companies identify ransomware attacks as they unfold in their network.
2. Leveraging Analytics to detect signs of file corruption
Data backups can be used to observe how data changes over time, and analytics can then be applied to detect signs of file corruption that are indicative of a ransomware attack. With content analytics that look for signs of corruption in file metadata, automated alerts can be raised whenever suspicious behavior is detected. Examples of such signs include mass deletions, encryption, and other suspicious changes to core infrastructure or user files. Beyond signs of corruption, the system needs to recognize activity patterns and keep learning from them. In the event of a ransomware attack, post-attack analytics reports help in understanding the depth and breadth of the attack and list the last good backup sets taken before the corruption, which facilitates the recovery process.
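The backup comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `FileMeta` fields, the thresholds, and the function names are hypothetical, and a real backup product would expose its own metadata and tuning.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata captured at backup time."""
    size: int
    content_hash: str
    entropy: float  # bits per byte, assumed to be precomputed by the backup job


Snapshot = Dict[str, FileMeta]  # path -> metadata
HIGH_ENTROPY = 7.5              # assumed cutoff for "looks encrypted"


def change_ratios(prev: Snapshot, curr: Snapshot) -> Dict[str, float]:
    """Share of files from the previous snapshot that were deleted,
    modified, or newly turned high-entropy in the current snapshot."""
    total = len(prev)
    if total == 0:
        return {"deleted": 0.0, "modified": 0.0, "newly_high_entropy": 0.0}
    deleted = sum(1 for path in prev if path not in curr)
    modified = sum(1 for path, meta in prev.items()
                   if path in curr and curr[path].content_hash != meta.content_hash)
    newly_high = sum(1 for path, meta in prev.items()
                     if path in curr and meta.entropy < HIGH_ENTROPY <= curr[path].entropy)
    return {"deleted": deleted / total,
            "modified": modified / total,
            "newly_high_entropy": newly_high / total}


def is_suspicious(prev: Snapshot, curr: Snapshot,
                  max_deleted: float = 0.3, max_high_entropy: float = 0.3) -> bool:
    """Flag mass deletion or mass encryption between two snapshots.
    The 30% thresholds are illustrative assumptions, not recommendations."""
    ratios = change_ratios(prev, curr)
    return (ratios["deleted"] > max_deleted
            or ratios["newly_high_entropy"] > max_high_entropy)


def last_good_backup(snapshots: List[Snapshot]) -> Optional[int]:
    """Index of the last snapshot taken before the first suspicious change.
    Returns the latest index if nothing looks suspicious, None if empty."""
    if not snapshots:
        return None
    for i in range(1, len(snapshots)):
        if is_suspicious(snapshots[i - 1], snapshots[i]):
            return i - 1
    return len(snapshots) - 1
```

Given a chronological list of snapshots, `last_good_backup(snapshots)` returns the index a recovery process could start from, and `change_ratios` gives a rough measure of how widely an attack spread.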
3. Machine Learning for diagnosis of cyber attacks
Traditional signature-based detection, anomaly-based detection, and data immutability are not adequate on their own to identify ransomware attacks, so the next technique being explored is behavior-based detection. An AI/ML-driven solution can analyze large data sets with a high degree of accuracy to identify subtle Indicators of Behavior (IoBs) at a scale that manual analysis cannot match. The goal of behavior analytics is to detect, through machine learning and statistical analysis, anomalous user behavior that indicates potential threats such as malicious insiders, compromised accounts, data exfiltration, and ransomware. ML algorithms would continuously monitor for malicious activity, detect never-before-seen ransomware strains, immediately block questionable behavior, automatically recover damaged files, and keep learning from each incident so the models can be retrained.
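As a minimal sketch of the statistical side of behavior analytics, the snippet below compares each user's activity today against that user's own history with a z-score. It is a simple baseline and not an ML model; the metric (for example, files modified per day), the user names, and the threshold of 3 are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def z_score(history: Sequence[float], current: float) -> float:
    """How many standard deviations `current` lies from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Flat history: any increase is treated as infinitely unusual.
        if current == mu:
            return 0.0
        return float("inf") if current > mu else float("-inf")
    return (current - mu) / sigma


def flag_users(baselines: Dict[str, List[float]],
               today: Dict[str, float],
               threshold: float = 3.0) -> List[str]:
    """Return users whose activity today is far above their own baseline.
    Users with fewer than two historical observations are skipped."""
    return sorted(
        user for user, value in today.items()
        if len(baselines.get(user, [])) >= 2
        and z_score(baselines[user], value) > threshold
    )


if __name__ == "__main__":
    # Hypothetical daily counts of modified files per user.
    baselines = {"alice": [10, 12, 11, 9, 10], "bob": [100, 110, 90, 105, 95]}
    today = {"alice": 400, "bob": 102}
    print(flag_users(baselines, today))
```

Comparing each user against their own history, rather than a global threshold, keeps a naturally busy account from masking a quiet account that suddenly starts touching hundreds of files.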
Stack trace analysis is another foundational technique: it looks at the record of what happens in a process at different points in time. By analyzing what happens at each stage, normal activity becomes clear and a reference model can be created. In a ransomware attack, new code is injected into the process, which is readily noticeable against that reference model. The strongest software solutions use ML that considers only the most common reference points and excludes aberrations. This approach further refines the model’s knowledge of good versus malicious code, increasing accuracy.
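The reference-model idea can be illustrated with a small sketch. It assumes a stack trace is available as a tuple of frame names, keeps only frames that appear in a minimum share of observed traces (dropping rare aberrations), and reports frames in a new trace that the reference has never seen. The frame names and the `min_share` value are hypothetical, and a production system would use a learned model rather than a frequency cutoff.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Trace = Tuple[str, ...]  # one stack trace as an ordered tuple of frame names


def build_reference(traces: Iterable[Trace], min_share: float = 0.05) -> Set[str]:
    """Keep frames that occur in at least `min_share` of the observed traces,
    so that rare one-off frames do not pollute the reference model."""
    traces = list(traces)
    counts = Counter(frame for trace in traces for frame in set(trace))
    cutoff = min_share * len(traces)
    return {frame for frame, seen in counts.items() if seen >= cutoff}


def unknown_frames(trace: Trace, reference: Set[str]) -> List[str]:
    """Frames in `trace` that are absent from the reference model."""
    return [frame for frame in trace if frame not in reference]


if __name__ == "__main__":
    # Hypothetical traces collected during normal operation.
    normal = [("main", "open_doc", "read_file")] * 20 + [("main", "plugin_x")]
    reference = build_reference(normal, min_share=0.1)
    suspect = ("main", "inject_stub", "encrypt_loop")
    print(unknown_frames(suspect, reference))
```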
The sequence of bytes in an encrypted file tends to be more random than in an unencrypted file, and this shows up in statistical measures of randomness and information density. These statistical tests can therefore help determine whether a file is encrypted. Candidate tests include the chi-square test, entropy, arithmetic mean, and the Monte Carlo value for pi. However, our analysis shows that no single test is sufficient to identify encrypted files, and each can generate excessive false positives. For example, some compressed files also look random according to some of these tests. To reduce the false positives from individual tests, we need ML classification models that decide whether a file is encrypted. Such a model takes all of the statistical tests and other characteristics of the file as input features and is trained on millions of real and synthetic files of different types. A decision-tree-like ML model can automatically learn the difference between encrypted and unencrypted files. In our tests, the ML model achieved good accuracy with low false positives. AI/ML is also critical to automating correlations: it can analyze data at a rate of millions of events per second, so instead of manually querying data, analysts can spend more time acting on the insights it produces across disparate assets on the network.
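To make the four statistical tests concrete, here is a minimal Python sketch that computes them over a byte string. It is one possible formulation, not the feature pipeline used in the tests mentioned above: the function names are illustrative, and the Monte Carlo estimate assumes successive 6-byte groups are read as two 24-bit coordinates. The resulting dictionary is the kind of feature vector a decision-tree classifier could be trained on.

```python
import math
import os
from collections import Counter
from typing import Dict


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte; 8.0 is the maximum for byte data."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def chi_square(data: bytes) -> float:
    """Chi-square statistic against a uniform distribution over 256 byte values."""
    if not data:
        return 0.0
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def arithmetic_mean(data: bytes) -> float:
    """Mean byte value; uniformly random bytes average about 127.5."""
    return sum(data) / len(data) if data else 0.0


def monte_carlo_pi(data: bytes) -> float:
    """Estimate pi from the share of (x, y) points inside the unit quarter circle.
    Each 6-byte group supplies a 24-bit x and a 24-bit y (an assumed convention)."""
    scale = float(2 ** 24 - 1)
    inside = total = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big") / scale
        y = int.from_bytes(data[i + 3:i + 6], "big") / scale
        total += 1
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / total if total else 0.0


def randomness_features(data: bytes) -> Dict[str, float]:
    """Feature vector that a downstream classifier could consume."""
    return {
        "entropy": shannon_entropy(data),
        "chi_square": chi_square(data),
        "mean": arithmetic_mean(data),
        "monte_carlo_pi": monte_carlo_pi(data),
    }


if __name__ == "__main__":
    print(randomness_features(os.urandom(60000)))       # stands in for encrypted data
    print(randomness_features(b"plain text " * 5000))   # ordinary, structured data
```

Random-looking input pushes entropy toward 8 bits per byte, the mean toward 127.5, and the Monte Carlo estimate toward pi, while structured input drifts away on all of them. Because compressed data can score similarly to encrypted data, these values are better used together as classifier features than as individual thresholds.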
4. Way Forward
Ransomware attacks have to be prevented on the endpoint, across the enterprise, and everywhere else the battle is taking place. This requires a hybrid (static and dynamic) and multi-level approach (feature, behavior, stack trace, encryption, and so on) to detect ransomware attacks with reasonable accuracy. Periodically evaluating each component of this detection chain to incorporate new variants will make the capability more robust, and automating that task will be very helpful. Leveraging cloud computing with parallel processing capabilities will further enhance adoption. Isolating data, files, or databases from production environments makes the environment more resistant to ransomware attacks. But not all attacks target just data. Key IT infrastructure code is also vulnerable: there have been reports of attacks on operating systems, firmware, network and communication switches, and applications as well. Cybercriminals can now use ransomware-as-a-service kits, which allow inexperienced attackers to deploy complex, hard-to-detect attacks with ease, sharply increasing the threat. It therefore becomes all the more important to automate ransomware prevention through machine learning, so that analysts can spend their time validating the insights provided by AI/ML and acting quickly to protect their environments rather than manually correlating data.
About the Author
Raghuveeran Sowmyanarayanan is an Artificial Intelligence & Analytics Leader heading the Delivery Transformation Office at Cognizant. He previously headed the AI&A Healthcare practice and has personally led very large and complex Enterprise Data Lake and AI/ML implementations. He can be reached at [email protected]