Analyzing malicious PDF using Pdfid, Pdf-parser tools

In this article, I'll show you how to analyze a malicious PDF using the strings, xorsearch, exiftool, PDFiD, and pdf-parser tools.

This is our malicious PDF. We'll extract indicators of compromise (IOCs) from it.


(i) Using strings tool:

Open a command prompt and type strings -a sample.pdf (sample.pdf is the malicious PDF).

-a --> Scans the whole file for strings instead of only certain sections (this is what -a means in GNU strings).

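The output contains the JavaScript and OpenAction keywords. We treat these as IOCs, because together they suggest the PDF runs script automatically when it is opened.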
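The same idea can be sketched in a few lines of Python. This is only a rough approximation and not how the strings tool itself is implemented: it pulls runs of printable ASCII out of the raw bytes and flags the keywords we care about. The SUSPICIOUS list, the function names, and the sample bytes are assumptions made for illustration.

```python
import re

# Keywords treated as indicators in this article (the list is an assumption).
SUSPICIOUS = (b"/JavaScript", b"/JS", b"/OpenAction")


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def flag_keywords(data):
    """Return the suspicious keywords that appear in any extracted string."""
    hits = []
    for text in printable_strings(data):
        for keyword in SUSPICIOUS:
            if keyword in text and keyword not in hits:
                hits.append(keyword)
    return hits


# Hypothetical bytes standing in for the contents of sample.pdf.
sample = b"%PDF-1.3\n1 0 obj\n<< /OpenAction 7 0 R >>\nendobj\n\x00\x01"
print(flag_keywords(sample))
```

To run it against a real file, read the bytes with open("sample.pdf", "rb").read() and pass them to flag_keywords.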


(ii) Using xorsearch:

xorsearch looks for strings hidden with simple encodings such as XOR, ROL, ROT, or SHIFT, so we can use it to find obfuscated strings within the PDF file.

Type xorsearch sample.pdf http to look for an encoded URL.

Type xorsearch -p sample.pdf to look for an embedded executable (PE file).

Unfortunately, neither command gave us any helpful information for this sample.
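To make the idea behind this search concrete, here is a minimal sketch of a single-byte XOR brute force in Python. It is a simplification and not xorsearch's actual implementation: it covers only XOR with a one-byte key and is case-sensitive. The function name and the test data are hypothetical.

```python
def xor_search(data, needle):
    """Return (key, offset) pairs where needle is found XOR-encoded in data.

    Instead of decoding the whole file 256 times, encode the short needle
    with each key and search for the encoded form. Key 0 is plain text.
    """
    hits = []
    for key in range(256):
        encoded = bytes(b ^ key for b in needle)
        offset = data.find(encoded)
        while offset != -1:
            hits.append((key, offset))
            offset = data.find(encoded, offset + 1)
    return hits


# Hypothetical blob: a URL hidden with XOR key 0x5A after four filler bytes.
blob = b"AAAA" + bytes(c ^ 0x5A for c in b"http://example.test")
for key, offset in xor_search(blob, b"http"):
    print(f"key=0x{key:02x} offset={offset}")
```

An empty result, as in our case, only means the string is not hidden with this simple scheme. It does not prove the file is clean.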


(iii) Using exiftool:

exiftool extracts metadata from files, including PDFs.

Type exiftool sample.pdf

The output shows that the PDF version is 1.3 and that the file has read and write permissions only.
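The PDF version is stored in the file header (a line such as %PDF-1.3 at the start of the file), so it can be checked without any special tool. The snippet below is a small sketch of that single check and does not reproduce the rest of exiftool's metadata. The function name and the sample bytes are hypothetical.

```python
import re


def pdf_version(data):
    """Return the version from the %PDF-x.y header, or None if it is missing."""
    # Only the beginning of the file is searched, where the header is expected.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode() if match else None


# Hypothetical header bytes standing in for the start of sample.pdf.
print(pdf_version(b"%PDF-1.3\n1 0 obj\n"))
```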


(iv) Using PDFiD tool:

PDFiD is a Python tool that scans a PDF file for specific keywords and counts them. It can also disarm JavaScript and auto-launch actions.

Type pdfid.py sample.pdf

The output shows that OpenAction is used once and the JavaScript keyword is used three times. Most probably, the OpenAction entry is there to execute the JavaScript code inside the PDF file as soon as it is opened.
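A keyword count of this kind can be sketched as follows. This is a simplified illustration rather than PDFiD's real logic: it decodes #xx hex escapes across the whole file (PDF allows them inside names, so /J#61vaScript means /JavaScript) and then counts each name. The KEYWORDS tuple, the function names, and the sample bytes are assumptions.

```python
import re

# A few names worth counting; the selection is an assumption for this sketch.
KEYWORDS = ("/JS", "/JavaScript", "/OpenAction", "/AA", "/Launch")


def normalize_names(data):
    """Decode #xx hex escapes so obfuscated names match their plain form."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def count_keywords(data):
    """Count each keyword, ignoring longer names that merely start with it."""
    text = normalize_names(data)
    counts = {}
    for keyword in KEYWORDS:
        pattern = re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])"
        counts[keyword] = len(re.findall(pattern, text))
    return counts


# Hypothetical dictionary fragments, not the real contents of sample.pdf.
sample = b"<< /OpenAction 7 0 R >> << /S /JavaScript /JS 10 0 R >>"
print(count_keywords(sample))
```

Applying the escape decoding to the whole file can also alter binary stream data, which is acceptable for a quick triage count but not for anything more precise.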


(v) Using pdf-parser tool:

pdf-parser breaks a PDF document into its basic elements (objects, streams, and so on) so that we can inspect them one by one.

Type pdf-parser.py --search openaction sample.pdf

When sample.pdf is opened, the PDF reader triggers the OpenAction entry, which executes the JavaScript.

So we need to find out how many objects use the JavaScript keyword.

Type pdf-parser.py --search javascript sample.pdf

Three objects use the JavaScript keyword: object 1, object 7, and object 12.

object 1 - this object uses OpenAction to execute the JavaScript code, and it references object 7.

object 7 - this object references object 10

object 12 - this object references object 13

So we need to find out what object 10 and object 13 do.

Type pdf-parser.py --object 10 sample.pdf

object 10 - this object references object 12

object 12 - this object references object 13

So the full chain is object 1 -> 7 -> 10 -> 12 -> 13.
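Following the references by hand works for a small file, but the walk can also be sketched in Python. Indirect references in a PDF have the form "N G R" (object number, generation number, the letter R). In the sketch below only the object numbers and who references whom come from the analysis above. The object bodies are placeholders, and the function names are hypothetical.

```python
import re

# Matches an indirect reference such as "7 0 R".
REF = re.compile(r"(\d+)\s+(\d+)\s+R\b")

# Placeholder bodies: only the references mirror what pdf-parser showed us.
objects = {
    1: "<< /OpenAction 7 0 R >>",
    7: "<< /JS 10 0 R >>",
    10: "<< 12 0 R >>",
    12: "<< /JS 13 0 R >>",
    13: "(stream holding the JavaScript)",
}


def references(body):
    """Return the object numbers referenced in an object body."""
    return [int(number) for number, _generation in REF.findall(body)]


def follow_chain(objs, start):
    """Follow the first reference of each object until the chain ends."""
    chain, seen, current = [], set(), start
    while current in objs and current not in seen:
        chain.append(current)
        seen.add(current)
        refs = references(objs[current])
        if not refs:
            break
        current = refs[0]
    return chain


print(follow_chain(objects, 1))
```

The seen set stops the walk if objects reference each other in a loop, which a crafted file could do.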

Type pdf-parser.py --object 13 sample.pdf

Object 13 contains the JavaScript code in a stream that is compressed with zlib (in PDF terms, this is normally the /FlateDecode filter).

When the PDF is opened, the PDF reader applies the filter to decompress the JavaScript and then executes it.
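The decompression step can be reproduced with Python's zlib module. The sketch below assumes a single zlib-compressed stream and a simplified object layout. The object bytes and the script are made up for the example and are not the real contents of object 13.

```python
import zlib


def extract_stream(obj):
    """Return the raw bytes between the stream and endstream keywords."""
    start = obj.index(b"stream") + len(b"stream")
    # The stream keyword is followed by a single end-of-line marker.
    if obj[start:start + 2] == b"\r\n":
        start += 2
    elif obj[start:start + 1] == b"\n":
        start += 1
    end = obj.rindex(b"endstream")
    return obj[start:end]


def inflate(raw):
    """Decompress zlib data, ignoring bytes after the end of the stream."""
    return zlib.decompressobj().decompress(raw)


# Hypothetical object built for the example.
script = b"app.alert('hello');"
obj = (
    b"13 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(script)
    + b"\nendstream\nendobj"
)
print(inflate(extract_stream(obj)).decode())
```

Using zlib.decompressobj() instead of zlib.decompress() means the line break that precedes endstream does not cause an error, because data after the compressed stream is left unread.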

We can view the decompressed JavaScript code using this command:

pdf-parser.py --object 13 -f -w sample.pdf

-f - Passes the stream through its filters (here, it decompresses the data)

-w - Displays the data as raw output

The output shows the decompressed JavaScript code. We need to dump this code into a separate file, so:

Type pdf-parser.py --object 13 -f -w -d dumped.js sample.pdf

-d - Dumps the stream content into the given file

I've saved the decoded JavaScript as dumped.js for further analysis.

Open the dumped.js file in Notepad++ or any other text editor to view the code laid out in a readable format.

This JavaScript code uses functions, for and while loops, and more. I'll show you how to analyze malicious JavaScript code in my upcoming posts.


Conclusion:

In an attack like this, the first stage executes the malicious code embedded in the PDF file, and in the second stage that code downloads additional payloads or malware from the internet.

In this case, we found the malicious JavaScript code in object 13.
