Fuzzy Hash Malware Matching
Mohammad AlAqeel
Subject Matter Expert in Digital Forensic Incident Response|SecOps| Cyber Threat Intelligence| Threat Detection and Response | Cyber Defense solutions| CFCE| GCFA | GCFE| GCTI| eCMAP|OSCP|SOC CMM/OSINT/SOCmint|IT and OT.
First of all, let us understand what fuzzy hash malware matching is. It is a technique often used in malware analysis and threat detection to compare files or data for similarities. It is a form of approximate matching: instead of seeking an exact match, as traditional hash comparisons do, the goal is to identify files or data that are similar to each other.
Key Aspects of Fuzzy Hash Malware Matching:
Now let's walk through how threat actors abuse fuzzy hashing:
Because fuzzy hashing is based on similarity, as explained above, an attacker can exploit it to bypass security controls by making malicious files appear similar to legitimate files. This can allow attackers to evade detection mechanisms in an enterprise.
Below are some real-world scenarios where fuzzy hashing might be abused in this way:
Now let's see how to detect this using something that is very important during an investigation and that can lead us to create a watchlist or rule in an EDR solution to detect and counter this defense evasion technique: ImpHash.
The goal of this article is to give members of the defense team a better understanding of ImpHash by highlighting the following:
What is an ImpHash?
ImpHash (import hash) is a hashing method for PE executable files, such as EXEs and DLLs, that allows you to make fuzzy matches. It lets you easily identify executables that are similar but not necessarily exactly the same.
ImpHash considers only the executable's import table, which contains information about the external functions and libraries the executable uses. Ignoring the rest of the file is also its main limitation. Executables will often have the same import table if they are variations of each other or were compiled using the same basic build infrastructure.
As a result, binaries do not need to be exact matches to match based on ImpHash. This allows investigators to find malware samples that are likely to be the same at their core but have been altered slightly over time.
The ImpHash computation process involves the following steps:
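As a rough illustration of that process, here is a minimal Python sketch. It assumes the commonly described algorithm (lowercase each "library.function" pair, drop a .dll, .ocx, or .sys extension from the library name, join the pairs with commas in import-table order, then take the MD5) and assumes the import table has already been parsed into (library, function) pairs by some other tool. The function name imphash_from_imports is hypothetical, and the sketch ignores imports by ordinal, which real implementations handle separately.

```python
import hashlib

# Library-name extensions that are dropped before hashing (assumption: this
# mirrors the commonly described ImpHash algorithm).
_STRIPPED_EXTENSIONS = ("dll", "ocx", "sys")


def imphash_from_imports(imports):
    """Return an ImpHash-style MD5 for an ordered list of (library, function) pairs.

    The list must keep import-table order, because a different order
    produces a different hash. Ordinal-only imports are not handled here.
    """
    entries = []
    for library, function in imports:
        lib = library.lower()
        name, dot, ext = lib.rpartition(".")
        if dot and ext in _STRIPPED_EXTENSIONS:
            lib = name  # "kernel32.dll" becomes "kernel32"
        entries.append(f"{lib}.{function.lower()}")
    joined = ",".join(entries)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Hypothetical import list, for illustration only.
sample_imports = [
    ("KERNEL32.dll", "CreateFileW"),
    ("KERNEL32.dll", "WriteFile"),
    ("WS2_32.dll", "connect"),
]
print(imphash_from_imports(sample_imports))
```

In practice the import table is usually read with a PE parsing library rather than by hand. For example, the pefile package exposes a get_imphash() method on a parsed PE object.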
Now let us turn to why ImpHash is useful for defense teams, especially DFIR:
The primary problem that ImpHash solves is that it is easy for an attacker to change a bit in a file, which will result in a different content-based cryptographic hash. Yet, the file still has the same malicious behavior. To a security analyst or tool, it is hard to know if this file with a never-before-seen hash value is good or bad.
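The following toy Python sketch illustrates the point. It models a file as a blob of content bytes plus a separate ordered import list, which is a simplification: in a real PE file the import table is part of the content, so the sketch assumes the flipped bit lies outside it. The helper names and the sample bytes are hypothetical.

```python
import hashlib


def fingerprints(body, imports):
    """Return (SHA-256 of the content bytes, MD5 over the ordered import list)."""
    content_hash = hashlib.sha256(body).hexdigest()
    joined = ",".join(entry.lower() for entry in imports)
    import_hash = hashlib.md5(joined.encode("utf-8")).hexdigest()
    return content_hash, import_hash


def flip_lowest_bit_of_last_byte(data):
    """Return a copy of data with a single bit changed."""
    return data[:-1] + bytes([data[-1] ^ 0x01])


# Hypothetical sample: placeholder content bytes and declared imports.
body = b"MZ" + b"\x00" * 62 + b"placeholder payload"
imports = ["kernel32.CreateFileW", "kernel32.WriteFile", "ws2_32.connect"]

original = fingerprints(body, imports)
modified = fingerprints(flip_lowest_bit_of_last_byte(body), imports)

print("content hash changed:", original[0] != modified[0])
print("import hash changed: ", original[1] != modified[1])
```

The content hash differs after the one-bit change, while the hash over the declared imports stays the same, which is why the second value remains useful as a pivot.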
ImpHash allows you to find similar files, which may or may not be derivatives of the one you are focusing on. It's important to note that this fuzzy hash is not based on logic, behavior, or code sequences in the executable. It's based entirely on what the executable declares that it depends on (which could be a lie) and the order in which those dependencies are declared.
When To Use ImpHash
ImpHash can be used in many ways. However, if one does not understand what ImpHash is and where its pitfalls lie, it may not be as effective as one hopes. The following are scenarios where I believe ImpHash can best be leveraged:
Malware Scanning and File Upload Restrictions
The main use case for ImpHash for the defense team, or DFIR, is malware analysis when the investigator can't upload files to external analysis platforms such as ReversingLabs or VirusTotal.
An example of this is as follows:
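One way this could look in practice, as a hedged sketch: the investigator computes ImpHash values locally and compares them against a watchlist of ImpHash values taken from threat intelligence, so no file ever leaves the environment. The function name match_watchlist, the record layout, the file paths, and the label are all hypothetical, and the hash values are placeholders generated in the code, not real indicators.

```python
import hashlib


def match_watchlist(observed, watchlist):
    """Return the observed files whose ImpHash appears on the watchlist.

    observed:  iterable of dicts with "path" and "imphash" keys
    watchlist: dict mapping an ImpHash string to a descriptive label
    """
    # Compare case-insensitively, since tools may print hex in either case.
    normalized = {key.lower(): label for key, label in watchlist.items()}
    hits = []
    for item in observed:
        imphash = item["imphash"].lower()
        if imphash in normalized:
            hits.append(
                {"path": item["path"], "imphash": imphash, "label": normalized[imphash]}
            )
    return hits


# Placeholder values standing in for real ImpHash strings.
known_bad = hashlib.md5(b"placeholder-known-bad").hexdigest()
unrelated = hashlib.md5(b"placeholder-unrelated").hexdigest()

watchlist = {known_bad: "example-family (hypothetical label)"}
observed = [
    {"path": r"C:\Users\Public\update.exe", "imphash": known_bad.upper()},
    {"path": r"C:\Windows\System32\example.dll", "imphash": unrelated},
]

for hit in match_watchlist(observed, watchlist):
    print(hit["path"], hit["label"])
```

A match is only a lead for further analysis. Unrelated programs built with the same toolchain can share an ImpHash, so a hit should not be treated as a verdict on its own.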
Hunting for Malicious Binaries (EXE, DLL)
In this scenario, ImpHash can provide significant value for hunting malware. The primary advantage of ImpHash is that it can find matches across a network even when each host has a unique variation of the file. This can be illustrated in the following two scenarios.
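A hunt of this kind can be sketched in Python as follows. The sketch assumes the file inventory has already been collected (for example from EDR telemetry) as records containing a host, a path, a SHA-256, and an ImpHash. The function name cluster_by_imphash, the record layout, and all sample values are hypothetical placeholders.

```python
from collections import defaultdict


def cluster_by_imphash(records):
    """Group records by ImpHash and keep only the groups that contain
    more than one distinct SHA-256, i.e. different files with the same imports."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["imphash"]].append(record)
    return {
        imphash: group
        for imphash, group in clusters.items()
        if len({record["sha256"] for record in group}) > 1
    }


# Placeholder inventory: short strings stand in for real hash values.
inventory = [
    {"host": "host-a", "path": "svc1.exe", "sha256": "sha-1", "imphash": "imp-x"},
    {"host": "host-b", "path": "svc2.exe", "sha256": "sha-2", "imphash": "imp-x"},
    {"host": "host-c", "path": "tool.exe", "sha256": "sha-3", "imphash": "imp-y"},
]

for imphash, group in cluster_by_imphash(inventory).items():
    hosts = sorted(record["host"] for record in group)
    print(imphash, "seen with different content hashes on:", hosts)
```

Each surviving group is a set of files that differ byte for byte but declare the same imports in the same order, which makes it a reasonable starting point for a closer look.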
Conclusion
In summary, ImpHash is a method of fingerprinting a binary based on its import table. Its advantage over traditional hashing is that it can match binaries that are functionally or nearly functionally identical but are not exact copies of each other. ImpHash can be useful for threat intelligence sharing, tracking threat actor tooling, and hunting down ever-changing malware. Using ImpHash is a smart and targeted approach to detecting the evasion technique described above, especially for .exe and .dll files. While it has limitations, combining it with behavioral analysis, sandboxing, and other hash-based techniques (e.g., fuzzy hashing or cryptographic hashes) can significantly improve detection accuracy.
For more detail on the ImpHash process, check out Mandiant's post.
Happy Detection and Hunting