Developing Ransomware Malware. 0x10

Writing our own ransomware malware to understand how cybercriminals write it.


Forensic Investigation - Password cracking Word, Excel & PDF. 0x05


In the previous forensics chapters we:

  • Learned the file structure of PDF, Microsoft Word and Excel documents
  • Created a malicious MS Word document with the macro editor in MS Word
  • Inserted a basic VBS script that gets executed when a user opens our malicious Word document.

In this chapter we will learn:

  • Where to find the password hashes in MS Word, Excel and PDF documents
  • How to crack the password hashes for Word, Excel and PDF
  • How to use the "John the Ripper" tool to crack them
  • The basics of MD5 and why it is considered vulnerable.


Let's rock!

Let's understand what file hashes are.

File hashes are fixed-length strings computed from the contents of a file. For practical purposes they are unique to that content, so they are used as digital fingerprints. Even a small change in the file's content produces a completely different hash.

For example:

We have a file called "qrc.txt" that contains the text "Hello World!". Let's calculate its hashes.

Below are the MD5 hashes of this text and of four slightly altered versions.

"Hello World!"   : ed076287532e86365e841e92bfc50d8c  - Actual data

"Hello  World!"  : 1bd31a4d042ddc15a94cbdda33ed61ab  - Space is added
 
"Hello WorlD!"   : 127675805fd1218c4656ce2f5d22cdfc  - D is capital letter
 
"Hello worlD!"   : 7eb5c6e6e327eda38f610ba366b4b7ea  - w in small letter

"hello worlD!"   : a15df29e88139d9790491ac0ce98ed69  - h in small and D in capital         

Each text above differs only slightly, yet every hash is completely different. Hashing is case-sensitive, and even one added space changes the result. This is what lets us verify file integrity. Here are a few common hashing algorithms.

MD5 (Message Digest Algorithm 5): Produces a 128-bit hash value. It's fast, but it is cryptographically broken: practical collision attacks exist.

SHA-1 (Secure Hash Algorithm 1): Produces a 160-bit hash value. It's stronger than MD5, but practical collision attacks have been demonstrated against it as well.

SHA-256 (Secure Hash Algorithm 256-bit): Part of the SHA-2 family, it produces a 256-bit hash value and is widely used.

bcrypt: Specifically designed for hashing passwords. It incorporates a salt to defeat rainbow-table attacks and an adjustable cost factor that slows down brute-force attacks.
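The digest sizes listed above are easy to check from a Linux shell. The sketch below is only an illustration: the helper name digest_hex is made up for this example, and it assumes the GNU coreutils tools md5sum, sha1sum and sha256sum are installed. It hashes the text itself (printf adds no trailing newline), so the MD5 of "Hello World!" comes out as ed076287532e86365e841e92bfc50d8c, matching the first example earlier.

```shell
# digest_hex TOOL TEXT : hash TEXT exactly as given (printf adds no trailing newline)
digest_hex() {
    printf '%s' "$2" | "$1" | cut -d' ' -f1
}

# Same text, three algorithms: the digest length is what differs.
for tool in md5sum sha1sum sha256sum; do
    hex=$(digest_hex "$tool" 'Hello World!')
    echo "$tool: $(( ${#hex} * 4 )) bits  $hex"
done

# One changed character gives an unrelated MD5 digest.
digest_hex md5sum 'Hello World!'
digest_hex md5sum 'Hello WorlD!'
```

Note that a text file saved by an editor usually ends with a newline, so hashing the file can give a different value than hashing the bare string.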

These algorithms are built into many tools and operating systems.

We can use these tools to compute file hashes: we pass the file name as a parameter, and the tool generates a hash of the complete file.

To compute a hash for a file, we can use the following commands:

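Windows :
------------
c:\> certutil -hashfile Aliens.txt MD5
                 MD5 hash of Aliens.txt:
                 cc2c5a19cab7237594ae52c4de0d2c1fc29b7166
                 CertUtil: -hashfile command completed successfully.

Linux :
-------
# md5sum Aliens.txt
                 cc2c5a19cab7237594ae52c4de0d2c1fc29b7166  Aliens.txt

Note: an MD5 digest is always 32 hexadecimal characters (128 bits). The 40-character value shown here has the length of a SHA-1 digest, so treat it as illustrative output only.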

File hashes are used everywhere, from TLS certificate verification in browsers to blockchains. For example, when you visit a Linux distribution's website to download an ISO image, the site publishes the hash of the image file.

After downloading the image file, you run the hash command on the downloaded file and compare the result with the hash published on the website. If the hashes match, the file's integrity is intact. If they don't match, either the download was corrupted or the file was tampered with, for example by a "man-in-the-middle" attack.
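The comparison described above can be scripted. This is a minimal sketch, assuming sha256sum is available; the function name verify_download and the demo file are invented for illustration, and the "published" hash is computed locally here only to stand in for the value a website would list.

```shell
# verify_download FILE EXPECTED_SHA256 : succeeds (exit 0) only when the hashes match
verify_download() {
    local actual expected
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    # Websites sometimes print hashes in upper case, so normalise before comparing.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}

# Demo with a throwaway file standing in for the downloaded image.
demo_file=$(mktemp)
printf 'example image contents\n' > "$demo_file"
published=$(sha256sum "$demo_file" | cut -d' ' -f1)   # stand-in for the website's hash

if verify_download "$demo_file" "$published"; then
    echo "OK: hashes match"
else
    echo "MISMATCH: do not use this file"
fi
rm -f "$demo_file"
```

GNU coreutils also offers "sha256sum -c", which reads a checksum file published next to the download and does the same comparison.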

Hashes are also used to find duplicate files in the cloud, on servers, and on desktops. If two files have the same hash, they are almost certainly identical. That means we can hash all the files on a hard drive and sort the results: any repeated hash points to an exact copy of a file, which can be deleted if it is not needed.

This is a good way to reclaim disk space. Hashes are also a building block of digital signatures and of password-based file encryption.
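The hash-and-sort idea for finding duplicates can be sketched in one pipeline. The function name list_duplicates is hypothetical, and the sketch assumes GNU find, md5sum, sort and uniq (the -w and --all-repeated options are GNU extensions).

```shell
# list_duplicates DIR : print groups of files under DIR whose MD5 hashes are equal
list_duplicates() {
    # md5sum prints "<32 hex chars>  <path>"; sorting puts equal hashes next to
    # each other, and uniq compares only the first 32 characters of every line.
    find "$1" -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
}

# Demo in a throwaway directory: two identical files and one different file.
demo_dir=$(mktemp -d)
printf 'same content\n' > "$demo_dir/a.txt"
printf 'same content\n' > "$demo_dir/copy_of_a.txt"
printf 'different\n'    > "$demo_dir/b.txt"
list_duplicates "$demo_dir"
rm -rf "$demo_dir"
```

Because MD5 collisions can be constructed on purpose, it is worth confirming a match with "cmp" (or a stronger hash such as SHA-256) before deleting anything.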

In this chapter we are going to see how MS Word, Excel and PDF files use hashes for password protection, so that we can crack them.


Let's understand how the MD5 algorithm works.

When we run this command on a Linux machine, this is what happens:

# echo -n "Aliens have changed Humans DNA !" | md5sum

  • Step 1: Read and pad the message. MD5 processes data in 512-bit blocks, so the input message is first padded as follows.
  • Calculate original length: The message "Aliens have changed Humans DNA !" is 32 bytes long (each character is 1 byte), i.e. 256 bits.
  • Append a single '1' bit: In binary, this means adding a 1 after the message.
  • Append '0' bits: Add enough '0' bits to bring the total length to 448 bits (in general, to 448 modulo 512). Since 32 bytes = 256 bits, you need to add 191 zero bits (448 - 257).
  • Append length: Append the original length of the message as a 64-bit little-endian integer. For our message, this length is 256 bits. The padded message is now exactly 512 bits, i.e. one block.
  • Padded message (in bits): "A"..."!" + 1 + 000...000 (191 zeros) + 256 (as 64-bit little-endian integer).
  • Initialize MD buffer: MD5 uses four 32-bit variables (A, B, C, D), initialized as follows: A = 0x67452301, B = 0xefcdab89, C = 0x98badcfe, D = 0x10325476.
  • Process each block: Every 512-bit block is mixed into A, B, C and D in four rounds of 16 operations each.
  • Convert to hex: Concatenate A, B, C and D and convert the 128-bit result to a hexadecimal string, which is the MD5 hash.

This explanation is at a very high level, as I don't want to go into the details, which require a lot of mathematics. Let our tools handle that.
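The padding arithmetic can be double-checked with a few lines of shell. This sketch covers only the length calculation, not MD5 itself; the function name md5_zero_pad_bits is made up for this example. It measures the message directly: the sample sentence with single spaces is 32 bytes (256 bits), which needs 191 zero bits and gives one 512-bit block.

```shell
# md5_zero_pad_bits MESSAGE : number of '0' bits MD5 appends after the single '1' bit
md5_zero_pad_bits() {
    local bytes bits
    bytes=$(printf '%s' "$1" | wc -c)
    bits=$(( bytes * 8 ))
    # Pad until (bits + 1 + zeros) is congruent to 448 modulo 512.
    echo $(( (448 - (bits + 1) % 512 + 512) % 512 ))
}

msg='Aliens have changed Humans DNA !'
bytes=$(printf '%s' "$msg" | wc -c)
zeros=$(md5_zero_pad_bits "$msg")
echo "message bits : $(( bytes * 8 ))"
echo "zero bits    : $zeros"
# message + '1' bit + zero bits + 64-bit length field
echo "padded bits  : $(( bytes * 8 + 1 + zeros + 64 ))"
```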


Now we know how MD5 works and what it is used for. Let's start our actual work: cracking the passwords of PDF, Word and Excel files.

Let's understand this better. When we password-protect a PDF, DOCX, or XLS file, the password hash is stored inside the file itself, along with a salt (a random value added to the password before hashing).

You might think this is a glitch, but it's actually common practice. The password hash and salt are stored within the file so that when you try to open it, the software can verify the password you enter by hashing it the same way and comparing the result with the stored hash. If they match, access is granted.

Because a hash cannot be reversed, storing it does not reveal the password directly. The protection is only as strong as the password, though: anyone who has the file can test guesses against the stored hash.

For example: We have a PDF file called "qrc.pdf" that is password protected. When I set the password "CafeLate", the application turns this password into a unique code that can't be reversed (the computation also mixes in a random "salt").

Here's what happens step by step:

  1. CafeLate: The password you enter.
  2. Salt + Hash: The application adds a random value (salt) to the password and then creates a unique code (hash) using an algorithm such as SHA-1 or MD5.
  3. Unique value: This combination creates a unique code.
  4. Encrypt the file: The file is locked using this unique code.
  5. Save the hash: A verification value derived from the unique code is saved, together with the salt, inside the same file.

So, "CafeLate" + salt -> unique code -> file is locked with this code -> verification value is saved in the SAME file.
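The salt-then-hash-then-store idea can be sketched as follows. This is a deliberately simplified model, not the real PDF or Office algorithm: it uses a single SHA-256 over salt plus password, and the names make_record and check_password are invented for the example. Real document formats use their own key-derivation schemes with many more steps.

```shell
# make_record PASSWORD : print "salt:hash", the pair a file would store
make_record() {
    local salt
    # 16 random bytes, written as 32 hex characters
    salt=$(head -c 16 /dev/urandom | od -An -v -tx1 | tr -d ' \n')
    printf '%s:%s\n' "$salt" "$(printf '%s%s' "$salt" "$1" | sha256sum | cut -d' ' -f1)"
}

# check_password RECORD CANDIDATE : succeeds when CANDIDATE reproduces the stored hash
check_password() {
    local salt stored
    salt="${1%%:*}"
    stored="${1#*:}"
    [ "$(printf '%s%s' "$salt" "$2" | sha256sum | cut -d' ' -f1)" = "$stored" ]
}

record=$(make_record 'CafeLate')
echo "stored in file: $record"
check_password "$record" 'CafeLate' && echo "CafeLate -> access granted"
check_password "$record" 'cafelate' || echo "cafelate -> access denied"
```

Running make_record twice with the same password gives two different records, because the salt is random each time. That is what makes precomputed rainbow tables useless.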


Now, if you know where the hashes are stored in the file, you can try candidate passwords until one of them reproduces the stored hash. This can be done with one of my favorite tools, "John the Ripper". The name sounds scary, but it has been one of the best tools for brute-forcing or cracking file passwords since 1996.

This tool can crack most of the well-known file formats in use today.

We already learned the file structure of PDF, MS Word and MS Excel files in the previous chapter, so this will be very easy for us.

The images below show a password-protected PDF file: once you provide the valid password, it opens in the web browser.

When a valid password is provided, access is granted.

To get information about the password-protected PDF file, you still need a valid password; otherwise, it won't show anything.

In the example below, we use the Linux "pdfinfo" command to get information about the PDF file. Until a valid password is provided, the application shows no information and returns the error: "Command Line Error: Incorrect password."

The correct command to view information from a password-protected PDF file is:

# pdfinfo -opw helloworld -rawdates passPDF01.pdf        

In this example, "helloworld" is the password for the PDF file. Once you enter the valid password, you can see the file's information.


Once the valid password is provided, information about the PDF is shown.


The above image also shows us that the file is encrypted.

Now, how do we crack the password? To do that, we need to know exactly where the hashes are saved in the PDF.

The password hashes are stored in the PDF's encryption dictionary, which is easy to find because it contains a "/Filter" entry:

<<
/Filter /Standard
/V 4
/Length 128
/R 4
/O (Binary data representing owner password hash)
/U (Binary data representing user password hash)
/P -4
>>        

In short, the password hashes are stored in the "Encrypt dictionary" of the PDF document. The "/U" entry contains the user password hash, while the "/O" entry contains the owner password hash.

  • Filter: The security handler, usually /Standard.
  • V: Version of the encryption algorithm.
  • Length: Key length in bits.
  • R: Revision number of the security handler.
  • O: Owner password entry.
  • U: User password entry.
  • P: Permissions.

These entries provide the data needed to understand how the password-protected PDF is encrypted.
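As a small illustration, the numeric entries of such a dictionary can be pulled out with grep. The helper name pdf_dict_int is hypothetical, and the sketch assumes the simple "one /Key value pair per line, separated by a single space" layout of the example dictionary above. Real PDFs may format the dictionary differently.

```shell
# pdf_dict_int KEY TEXT : print the integer that follows "/KEY " in TEXT
pdf_dict_int() {
    printf '%s\n' "$2" | grep -oE -e "/$1 -?[0-9]+" | head -n 1 | cut -d' ' -f2
}

# Sample dictionary in the same shape as the example above (O and U omitted).
encrypt_dict='<<
/Filter /Standard
/V 4
/Length 128
/R 4
/P -4
>>'

for key in V R Length P; do
    echo "$key = $(pdf_dict_int "$key" "$encrypt_dict")"
done
```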

Let's extract the information required to crack the password by using the strings command on Linux.

To get the required information, execute the following command:

# strings passPDF01.pdf| grep "/Filter"        

The image below shows the command in action, displaying the required information.

Hashes are displayed from the PDF with the "strings" command.

From the strings output we extract the following encryption values:

  • Filter: Standard
  • Length: 128
  • O: 1F4CFD5BB3E1C02A7A5EC04BA11F26671E63A7CC16ECE2A6E1C8803BCD43CC1B
  • P: -1060
  • R: 4
  • StmF: StdCF
  • StrF: StdCF
  • U: 4ED58C8076C7023D16384732A2239CA300000000000000000000000000000000
  • V: 4

Now let's create a file called "pdf_hash.txt" in John the Ripper's format, so that John can read the hash and recover the plaintext password.

# echo '$pdf$4*4*128*1F4CFD5BB3E1C02A7A5EC04BA11F26671E63A7CC16ECE2A6E1C8803BCD43CC1B*4ED58C8076C7023D16384732A2239CA300000000000000000000000000000000*-1060' > pdf_hash.txt
With the help of the 'echo' command we can create a file in John the Ripper's format. John's jumbo builds also include a "pdf2john" helper script that generates this line directly from the PDF.

Once the pdf_hash.txt file is generated, let's give it to "john" for cracking.

# john --format=pdf pdf_hash.txt        

By default, john uses the "/usr/share/john/password.lst" file as its wordlist. Feel free to supply another dictionary file with the --wordlist option, such as "rockyou.txt", which has 14,344,392 passwords, or any other wordlist. Our default john password list contains 3,560 passwords, each of which is hashed and compared with the provided hash.


Let's understand how john works.

John takes the first password from the wordlist, converts it into a hash and compares it with the hash we provided. If the hashes match, it shows us the plaintext password. E.g.:

  • Take the first word from the file "/usr/share/john/password.lst", e.g. "aaaaa"
  • Convert "aaaaa" to a hash -> "E1C02A7A5EC04BA11F26671E6" (an illustrative value)
  • Compare "E1C02A7A5EC04BA11F26671E6" with the hash in pdf_hash.txt.
  • If they match, display the password as "aaaaa".
  • If they don't match, move to the next password, and so on.
  • If the password is not found in the wordlist, john falls back to trying permutations and combinations of all the characters on the keyboard.
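The loop in the list above can be shown with a toy example. This sketch works on a plain, unsalted MD5 of a string, which is much simpler than what John does for the PDF format (there it runs the PDF key-derivation for every candidate). The function name crack_md5 and the three-word list are made up for the demonstration.

```shell
# crack_md5 TARGET_HASH WORDLIST_FILE : print the word whose MD5 equals TARGET_HASH
crack_md5() {
    local word
    while IFS= read -r word; do
        # Hash the candidate and compare it with the target.
        if [ "$(printf '%s' "$word" | md5sum | cut -d' ' -f1)" = "$1" ]; then
            printf '%s\n' "$word"
            return 0
        fi
    done < "$2"
    return 1   # no word in the list matched
}

# Demo: a tiny wordlist and a target hash computed from one of its words.
wordlist=$(mktemp)
printf '%s\n' aaaaa password helloworld > "$wordlist"
target=$(printf '%s' 'helloworld' | md5sum | cut -d' ' -f1)
crack_md5 "$target" "$wordlist" || echo "not in wordlist"
rm -f "$wordlist"
```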

Passwords can be cracked because they are built from a finite set of keyboard characters: letters (a to z, A to Z), digits (0 to 9), and symbols (~, +, etc.). Since this set is known in advance, trying every combination will eventually find any password; the only question is how long it takes.
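How long "eventually" is depends on the number of possible combinations, which is the alphabet size raised to the password length. A quick sketch of that arithmetic (the function name keyspace is made up for the example; 95 is the number of printable ASCII characters, including the space):

```shell
# keyspace ALPHABET_SIZE LENGTH : number of possible passwords of exactly LENGTH characters
keyspace() {
    local total=1 i=0
    while [ "$i" -lt "$2" ]; do
        total=$(( total * $1 ))
        i=$(( i + 1 ))
    done
    echo "$total"
}

echo "4 digits            : $(keyspace 10 4)"
echo "8 lowercase letters : $(keyspace 26 8)"
echo "8 printable ASCII   : $(keyspace 95 8)"
```

An 8-character password drawn from all 95 printable characters has 6,634,204,312,890,625 possibilities, which is why long passwords with mixed character classes take so much longer to brute-force than short ones.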

Modern hardware, especially GPUs such as NVIDIA's, can test candidate passwords extremely fast, so short or common passwords fall quickly. We don't have to wait for years.

The image below shows that the password has been cracked by john.


The password is cracked for the PDF

John did it again! This is how you can crack a PDF file's password.

The most important thing here is to know where the PDF file saves the hashes; the same is true for MS Word and MS Excel.


Great!

Let's move to the next phase: cracking Microsoft Word and Excel passwords.

The process for cracking the password is the same for most file types, and for Microsoft Word and Microsoft Excel it is exactly the same.

Again, let's first understand where MS Word and Excel files save the password.

As you know, normal MS Word and MS Excel files are ZIP files. A password-protected document is stored a little differently, as an OLE compound file wrapping the encrypted content, but 7-Zip can open that too, so we can extract it and get the required information. We have a "passPDF01.docx" file which is password protected. When we open the file in MS Word, it asks for a password.

The image below shows that when we open the Word document, it asks for a password. Once the proper password is entered, it opens the document.

Once the proper password is provided, we get access to the Word document

Let's quickly extract the document and see where it stores the password.

The image below shows the file extracted with 7-Zip. There we find our main file, called "EncryptionInfo", where all the juicy information is stored.


.docx file is unzipped by 7zip

Let's look at the main "./EncryptionInfo" file. This file saves all the hashes we require for cracking.

Use the following command to display the contents of the file:

# cat EncryptionInfo        

From this file we require 3 important attributes:

  • saltValue
  • encryptedVerifierHashInput
  • encryptedVerifierHashValue

The image below shows the "cat" command in action; it displays all the required information from the EncryptionInfo file.

We can see the hashes displayed from the EncryptionInfo file.


After inspecting the file with "cat", we can use the following commands to get the base64 values of these attributes: "saltValue", "encryptedVerifierHashInput" and "encryptedVerifierHashValue". The file contains two saltValue attributes; we need the second one, which is why the first command uses head and tail.

Use the following commands:

grep -aoP 'saltValue="\K[^"]+' ./EncryptionInfo | head -n 2 | tail -n 1
grep -aoP 'encryptedVerifierHashInput="\K[^"]+' ./EncryptionInfo
grep -aoP 'encryptedVerifierHashValue="\K[^"]+' ./EncryptionInfo        

These commands are an extra step compared with the PDF cracking above. Here we first have to get the base64 strings and then convert them into hex. In the PDF case we already had hex data, so we gave it to john directly.

In short, we take each base64 string, convert it into hex, and then feed it to john.

The images below show the above commands in action, extracting the base64 values from the "EncryptionInfo" file.

We can extract base64 values to crack the password.

Now that we have the base64 values for the respective fields, we need to convert them to hex so that john can understand them. Let's convert the base64 values to hex with the following commands.

# echo "gsHOyH4Nx67Gn7/GdC1zwQ==" | base64 -d | xxd -p -c 256
# echo "QjuVUfH/nB/keVTFXWa02A==" | base64 -d | xxd -p -c 256
# echo "QHMWbkzRJWTD7hoJxFwfCio0wFQdnUI9kuVTBcKPBxzDYvlUMc2bGp04hQzuaTyZmjdb7MiB1svGYQekiZmsyQ==" | base64 -d | xxd -p -c 256        
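If xxd is not installed, the same conversion can be done with od, which is part of coreutils. A small sketch (the function name b64_to_hex is made up for this example):

```shell
# b64_to_hex BASE64_STRING : decode the string and print the bytes as lowercase hex
b64_to_hex() {
    # od prints space-separated hex bytes; tr removes the spaces and line breaks.
    printf '%s' "$1" | base64 -d | od -An -v -tx1 | tr -d ' \n'
    echo
}

b64_to_hex 'QjuVUfH/nB/keVTFXWa02A=='
```

For the encryptedVerifierHashInput value used in this chapter, this prints 423b9551f1ff9c1fe47954c55d66b4d8.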

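When we execute the above commands we get hex values, which we need to save in John's format. The third value decodes to 64 bytes, of which only the first 32 bytes (64 hex characters) are used. The following command writes the hex values to a file in John's format.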

# echo '$office$*2013*100000*256*16*649a66a94473948a77b0f4695ec2e561*423b9551f1ff9c1fe47954c55d66b4d8*4073166e4cd12564c3ee1a09c45c1f0a2a34c0541d9d423d92e55305c28f071c' > word_hash.txt        

The above command creates the word_hash.txt file in the format John understands. Once word_hash.txt is created, we can start cracking the hash.

The image below shows the file format that John the Ripper understands. The prefix holds the Office version (2013), the spin count (100000), the key size in bits (256) and the salt size in bytes (16). After that we give the "saltValue", then the "encryptedVerifierHashInput" value and then the "encryptedVerifierHashValue", all in hex.

JohnTheRipper File format

To automate the extraction and the creation of the John format, let's write a script. I am sharing it as "extract_and_convert.sh" so we can understand the process better.

#!/bin/bash
# extract_and_convert.sh: build a John the Ripper hash line from ./EncryptionInfo

# Extract the base64 values from EncryptionInfo.
# -a makes grep treat the file as text (it starts with a short binary header).
# The file has two saltValue attributes; we need the second one.
saltValue=$(grep -aoP 'saltValue="\K[^"]+' ./EncryptionInfo | head -n 2 | tail -n 1)
encryptedVerifierHashInput=$(grep -aoP 'encryptedVerifierHashInput="\K[^"]+' ./EncryptionInfo)
encryptedVerifierHashValue=$(grep -aoP 'encryptedVerifierHashValue="\K[^"]+' ./EncryptionInfo)

# Debugging: Print the extracted base64 values
echo "Extracted Salt Value (Base64): $saltValue"
echo "Extracted Encrypted Verifier Hash Input (Base64): $encryptedVerifierHashInput"
echo "Extracted Encrypted Verifier Hash Value (Base64): $encryptedVerifierHashValue"

# Convert base64 to hex (variables are quoted; printf adds no extra newline)
saltHex=$(printf '%s' "$saltValue" | base64 -d | xxd -p -c 256)
encryptedVerifierHashInputHex=$(printf '%s' "$encryptedVerifierHashInput" | base64 -d | xxd -p -c 256)
encryptedVerifierHashValueHex=$(printf '%s' "$encryptedVerifierHashValue" | base64 -d | xxd -p -c 256)

# Trim the encryptedVerifierHashValueHex to 64 characters (the first 32 bytes)
encryptedVerifierHashValueHex=${encryptedVerifierHashValueHex:0:64}

# Debugging: Print the converted hex values
echo "Salt (Hex): $saltHex"
echo "Encrypted Verifier Hash Input (Hex): $encryptedVerifierHashInputHex"
echo "Encrypted Verifier Hash Value (Hex): $encryptedVerifierHashValueHex"

# Construct the hash string for John the Ripper.
# The values 2013, 100000, 256 and 16 are hard-coded to match this document;
# check spinCount, keyBits and saltSize in EncryptionInfo if your file differs.
hashString="\$office\$*2013*100000*256*16*$saltHex*$encryptedVerifierHashInputHex*$encryptedVerifierHashValueHex"
# Quote the variable so the shell does not try to expand the '*' characters.
printf '%s\n' "$hashString" > word_hash.txt

# Verify the contents of word_hash.txt
cat word_hash.txt        
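Before handing the file to john, it can be useful to sanity-check the shape of the generated line, since a wrong field length usually means the extraction went wrong. The check below is only a sketch: the function name is_office_hash_line is invented, and the pattern assumes the 2013-style layout used in this chapter (a 32-character hex verifier input and a 64-character hex verifier hash) without the "filename:" prefix that office2john adds.

```shell
# is_office_hash_line LINE : succeeds when LINE has the $office$ layout used above
is_office_hash_line() {
    printf '%s\n' "$1" | grep -Eq \
        '^\$office\$\*[0-9]{4}\*[0-9]+\*[0-9]+\*[0-9]+\*[0-9a-f]+\*[0-9a-f]{32}\*[0-9a-f]{64}$'
}

line='$office$*2013*100000*256*16*649a66a94473948a77b0f4695ec2e561*423b9551f1ff9c1fe47954c55d66b4d8*4073166e4cd12564c3ee1a09c45c1f0a2a34c0541d9d423d92e55305c28f071c'
if is_office_hash_line "$line"; then
    echo "hash line looks well-formed"
else
    echo "hash line is malformed"
fi
```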

Now that the word_hash.txt file is created, let's run John the Ripper.

Command to crack the password of a protected MS Word document:

# john --format=office word_hash.txt         

Once you run the command, it will take some time to crack the password.

The image below shows that "john the ripper" has successfully cracked the password.

The password was successfully cracked for the MS Word document

Great! Now let's crack the password of a Microsoft Excel file. The process is the same, but this time we will use a ready-made script from the John the Ripper tool set called "office2john". It does the same job as the script I shared above, but it is more complete and handles many more Office versions.

When you open the MS Excel file it asks for the password, as the image below shows.

MS Excel asking password.

How many times have we seen this screen? Anyway, let's continue.

This time we don't need to extract anything by hand: office2john, a ready-made script from the John the Ripper tool set, reads the Excel file directly and prints the hash in John's format.

# office2john myExcel.xlsx > 01.txt        

The above command creates a "01.txt" file containing everything john needs from the MS Excel file, already in John's format. Once 01.txt is created, we run this command to crack the password. Very easy!

# john --format=office 01.txt        

After a few minutes you will get the password. The image below shows this in action.

MS Excel Password is cracked.



Phew! That was a long series.

With this, I'm finishing the "Developing Ransomware Malware" series.

All the chapters I've written and shared will be published as a book.

I'm very grateful to my friends who gave suggestions, liked, and shared this on LinkedIn. You all motivated me to write this wonderful book. There will be a few more chapters in the published version.

A special thanks to my friend "Mr. Deepu Sebastian", who suggested and encouraged me to create videos, which I will do once the book is published.

If you have suggestions for the chapters, know a good publisher, or have any useful information to share, please feel free to email me at "sanjay.d.bhalerao[@]gmail.com" with the subject "DRM".

Thanks a lot for your time, and see you in the next series, which will be on "PCI SSS Compliance", after a break!

