How Computer Vision Can Detect Hidden Cyber Threats
Hornetsecurity
Leading cloud security and compliance SaaS provider, protecting 125,000 organizations globally.
Introduction
Cybercriminals continuously refine their techniques to evade security measures, using advanced web technologies (HTML, CSS, JavaScript) to disguise phishing emails and malicious links. Traditional filters rely on analyzing raw content, but attackers manipulate how information is visually rendered, making detection more challenging. This is where Computer Vision plays a crucial role in cybersecurity.
Why Computer Vision?
When an email or webpage is opened, the browser or email client renders it into a graphical representation that can differ substantially from its raw HTML source. Attackers exploit this gap by building messages that look legitimate to users once rendered, while the actual threat stays hard for traditional security scans to find in the raw source.
Techniques used by attackers to evade traditional filters
QR Code Phishing (Quishing)
The following email screenshot presents a QR code-based phishing attempt impersonating DHL. The QR code has been partially redacted for security reasons.
QR codes are widely used, and attackers take advantage of this familiarity. Instead of inserting static images, they dynamically generate QR codes using HTML tables, arranging characters to form a scannable pattern. This technique makes it harder for security systems to extract and analyze embedded malicious links. It has also been used to impersonate well-known brands like Microsoft and Chase.
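To illustrate why a table-based QR code is hard to handle from the raw source alone, here is a minimal sketch that rebuilds the module grid from an HTML table. It assumes, purely for illustration, that dark modules are table cells with a black background (set through bgcolor or an inline style); real campaigns may use characters or other styling instead. The names TableGridParser, table_to_grid and grid_to_text are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: dark modules are cells whose background is one of these values.
DARK_COLORS = {"#000", "#000000", "black"}


def style_background(style: str) -> str:
    """Return the background value declared in an inline style, or ''."""
    for declaration in style.split(";"):
        name, _, value = declaration.partition(":")
        if name.strip() in ("background", "background-color"):
            return value.strip()
    return ""


class TableGridParser(HTMLParser):
    """Collects one row of booleans per <tr>; True marks a dark cell."""

    def __init__(self):
        super().__init__()
        self.grid = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.grid.append([])
        elif tag == "td" and self.grid:
            attr_map = {name: (value or "").lower() for name, value in attrs}
            dark = (
                attr_map.get("bgcolor", "") in DARK_COLORS
                or style_background(attr_map.get("style", "")) in DARK_COLORS
            )
            self.grid[-1].append(dark)


def table_to_grid(html: str) -> list[list[bool]]:
    parser = TableGridParser()
    parser.feed(html)
    parser.close()
    return parser.grid


def grid_to_text(grid: list[list[bool]]) -> str:
    """Render the grid as text: '#' for dark modules, '.' for light ones."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in grid)
```

The recovered grid would still have to be rasterized and passed to a QR decoder to obtain the embedded link, and any change in how the attacker styles the cells breaks this kind of heuristic. Rendering the email and analyzing the resulting image avoids that dependency.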
ZeroFont Phishing
The ZeroFont method hides text by setting its font size to zero, making it invisible to users but still present in the HTML code. Attackers split fraudulent information, such as phone numbers for tech support scams, across these hidden elements, preventing security filters from recognizing the full context. Here is an example of the ZeroFont technique used in an email scam impersonating Geek Squad, a well-known company providing technical support and assistance to Best Buy customers. In this scam, the attacker's goal is to get the targeted end user to call a fraudulent toll-free phone number.
An analysis of the HTML source code reveals that the fraudulent phone number is obfuscated in a convoluted way, as illustrated below. HTML span elements of size zero (see 'style="font-size:0vw"') are inserted between the sequences of digits and hyphens ('88', '8-', '72', '9-', '12' and '52'), making the extraction of the phone number extremely difficult for a traditional email filter.
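The effect of the zero-size spans can be reproduced with a small sketch. The SAMPLE markup below is hypothetical: it reuses the digit sequences quoted above, but the filler text and the exact tag layout are invented for illustration. The VisibleTextExtractor class is likewise an assumption, a lightweight heuristic that drops text inside elements whose inline style sets a zero font size.

```python
import re
from html.parser import HTMLParser

# Matches inline declarations such as "font-size:0vw", "font-size: 0" or "font-size:0.0px".
ZERO_SIZE = re.compile(r"font-size\s*:\s*0+(?:\.0+)?[a-z%]*\s*(?:;|!|$)", re.IGNORECASE)
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}

# Hypothetical markup: the digit groups come from the article, the filler does not.
SAMPLE = (
    '<p>Call <span>88</span><span style="font-size:0vw">xq</span><span>8-</span>'
    '<span style="font-size:0vw">zz</span>72<span style="font-size:0vw">k</span>9-'
    '<span style="font-size:0vw">w</span>12<span style="font-size:0vw">p</span>52</p>'
)


class VisibleTextExtractor(HTMLParser):
    """Keeps only text that is not nested inside a zero-font-size element."""

    def __init__(self):
        super().__init__()
        self._hidden_stack = []  # one flag per open element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements have no closing tag and hold no text
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self._hidden_stack and self._hidden_stack[-1])
        # Simplification: a child of a hidden element is treated as hidden too,
        # even though CSS would let it set its own non-zero size.
        self._hidden_stack.append(parent_hidden or bool(ZERO_SIZE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._hidden_stack:
            self._hidden_stack.pop()

    def handle_data(self, data):
        if not (self._hidden_stack and self._hidden_stack[-1]):
            self.parts.append(data)


def visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def naive_text(html: str) -> str:
    """What a filter sees if it only strips tags."""
    return re.sub(r"<[^>]+>", "", html)
```

On SAMPLE, naive_text leaves the filler interleaved with the digits, while visible_text returns the number as the user would read it. This only covers inline styles; text can also be hidden through stylesheets, classes or other CSS properties, which is part of why analyzing the rendered output is more robust than source-level heuristics.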
How Computer Vision Enhances Security
These techniques show the need for Computer Vision to analyze emails and web pages the way a human sees them. However, this analysis is resource-intensive and requires rendering beforehand, which can take seconds if external resources (images, CSS, JS) must be loaded for accurate visualization.
Email rendering to analyze suspicious emails
In practice, for email, a rendering engine followed by a Computer Vision analysis can only be applied to a sample of the email traffic: only the most suspicious emails should be candidates. An example pipeline is shown in the following diagram.
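The sampling idea can be sketched as follows. This is not a description of Hornetsecurity's pipeline: the Email type, the suspicion score, the threshold and the budget are all hypothetical, and the render and classify callables stand in for a rendering engine and a Computer Vision model that are supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Email:
    message_id: str
    html: str
    suspicion: float  # hypothetical score in [0, 1] from cheaper upstream filters


def select_for_rendering(
    emails: Iterable[Email], threshold: float = 0.8, budget: int = 100
) -> list[Email]:
    """Keep the most suspicious emails, capped by a rendering budget."""
    candidates = [email for email in emails if email.suspicion >= threshold]
    candidates.sort(key=lambda email: email.suspicion, reverse=True)
    return candidates[:budget]


def analyze(
    emails: Iterable[Email],
    render: Callable[[str], bytes],     # hypothetical: HTML -> screenshot bytes
    classify: Callable[[bytes], bool],  # hypothetical: screenshot -> is_threat
    threshold: float = 0.8,
    budget: int = 100,
) -> dict[str, bool]:
    """Render and classify only the selected emails; return verdicts by message id."""
    verdicts = {}
    for email in select_for_rendering(emails, threshold, budget):
        screenshot = render(email.html)  # the expensive step
        verdicts[email.message_id] = classify(screenshot)
    return verdicts
```

Capping the number of rendered emails keeps the cost of the expensive step bounded, at the price of never inspecting visually any email that the cheaper filters scored below the threshold.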
Reverse Image Search for Phishing Detection
Another approach is to replace the Computer Vision engine and decision engine with a Reverse Image Search component coupled with a database of email screenshots.
This method is particularly useful for detecting repeated spam campaigns impersonating major brands (e.g., Costco, Walmart, Lowe’s). One of the main challenges is that the raw content of these emails varies greatly from one message to the next, because spammers know all the tricks to add noise and hide the relevant content. The visual rendering of the email, by contrast, changes only slightly over time, although it does change.
To detect recurring spam, analysts maintain a database of known spam screenshots. When a suspicious email is analyzed, Reverse Image Search checks whether its rendering matches a known spam image, and the email is blocked if the match is confirmed. The search must be fuzzy so that it tolerates slight variations, which makes it effective for identifying phishing campaigns with minor changes, such as different QR codes.
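One common way to get a fuzzy match is a perceptual hash compared by Hamming distance. The sketch below uses a simple average hash as a stand-in; the article does not say which algorithm is used, so this is only an illustration. Images are plain 2D lists of grayscale values to keep the example dependency-free, and the function names and the max_distance threshold are hypothetical.

```python
def average_hash(pixels: list[list[int]], size: int = 8) -> int:
    """Hash a grayscale image (values 0-255, at least size x size pixels).

    The image is reduced to size x size block averages; each bit records
    whether a block is brighter than the mean of all blocks.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_match(screenshot, database: dict[str, int], max_distance: int = 5):
    """Return (label, distance) of the closest known hash within max_distance, else None."""
    query = average_hash(screenshot)
    best = None
    for label, known_hash in database.items():
        distance = hamming(query, known_hash)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (label, distance)
    return best
```

A small localized change, such as a different QR code in an otherwise identical layout, flips only a few bits, so the variant still falls within max_distance of the stored hash. A production system would more likely decode real screenshots with an image library and use an index suited to nearest-neighbor search rather than a linear scan.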
Hornetsecurity leverages Computer Vision in its ATP Secure Links technology to detect phishing attempts based on visual patterns - going beyond traditional security measures. Stay ahead of evolving threats with cutting-edge protection!
Want to know more? Schedule a demo today!
This content was inspired by the first in a series of in-depth technical articles detailing how Computer Vision technology is used at Hornetsecurity. Read the original article here: Detection of Cyberthreats with Computer Vision