Uncovering AFF4: File Format Essentials And Imaging
The growing number of drives per case, along with their growing size, has made existing forensic file formats and storage techniques less effective.
1. Why AFF4?
The E01 file format's heavy CPU use, missing or incomplete metadata storage, linear bitstream copy approach, and poor compression prompted developers to seek alternative solutions.
As a result, a new file format was created to solve these problems. This is how the AFF4 file format was born. It has introduced a number of advantages:
- Open-source format
- Supports non-linear multi-pass imaging
- Fast compression methods: Snappy and LZ4
- Block hashes
- Stores binary zeroes as spans, similar to sparse files
- Forensically reproducible partial non-linear images
- Vendor-neutral
2. Getting Deeper Into AFF4
Advanced Forensic Format version 4 (AFF4) is a highly optimized open-source forensic file format used for the storage of digital evidence.
The format was created in 2009 and described in the paper “Extending the advanced forensic format to accommodate multiple data sources, logical evidence, arbitrary information and forensic workflow” by Michael Cohen, Simson Garfinkel, and Bradley Schatz.
Before going deeper, let's cover the basic terminology.
The AFF4 developers took the core idea of storing data in compressed block streams and added a virtualization layer on top of it: the Virtual Block Stream, or Map. This layer represents both the data and the discontinuities (areas of the drive that contain sparse data or zeros).
This approach completely changed the concept of a forensic image from a linear block stream to a non-linear block stream.
The green arrows in Figure 1 indicate that blocks can be read from different parts of the drive: the beginning, the middle, or the end.
Once the drive’s block has been read, it is compressed, hashed and stored in the Compressed Block Stream. Then, a reference to that block is stored in the Virtual Block Stream Map, so that we know where it actually belongs in the source drive.
Bradley Schatz, one of the developers of this file format, described it this way in his presentation at the Magnet Virtual Summit:
“But we could then choose to go to the very end of the hard drive and read the end of the hard drive, where there might be some other partition information being stored, for example a UEFI partition scheme could store some data there. So again, we can read that block, compress it, store it down in the compressed block stream. So you can see now that the blocks have been split out of order, and then we would store the Map, ensuring the virtual block stream, which shows us where that particular block belongs in the Virtual Address Space.”
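The read, compress, store, and map steps described above can be sketched in a few lines of Python. This is a toy model only, not the AFF4 on-disk layout: the class name `ToyNonLinearImage`, the 4-byte block size, and the use of `zlib` (a stand-in, since LZ4 and Snappy are not in the Python standard library) are all assumptions made for illustration. Hashing is left out here to keep the focus on the map.

```python
import zlib

BLOCK_SIZE = 4  # unrealistically small, chosen so the example is easy to trace


class ToyNonLinearImage:
    """Hypothetical toy model: a compressed block stream plus a map."""

    def __init__(self, source_size):
        self.source_size = source_size
        self.block_stream = []  # compressed blocks, in the order they were read
        self.block_map = {}     # source offset -> index in block_stream, or None

    def acquire(self, source, offset):
        """Read one block at any offset and record where it belongs."""
        block = source[offset:offset + BLOCK_SIZE]
        if not any(block):
            # All-zero block: record a span in the map, store no data at all.
            self.block_map[offset] = None
            return
        self.block_stream.append(zlib.compress(block))
        self.block_map[offset] = len(self.block_stream) - 1

    def read(self, offset):
        """Return the block that lives at this offset in the source drive."""
        index = self.block_map[offset]
        if index is None:
            return bytes(min(BLOCK_SIZE, self.source_size - offset))
        return zlib.decompress(self.block_stream[index])

    def reconstruct(self):
        """Rebuild the source in linear order by walking the map."""
        offsets = range(0, self.source_size, BLOCK_SIZE)
        return b"".join(self.read(offset) for offset in offsets)


# A 16-byte "drive": data, a zero-filled area, then more data.
source = b"AAAA" + bytes(4) + b"CCCC" + b"DDDD"
image = ToyNonLinearImage(len(source))

# Read the end of the drive first, then the start, then the rest.
for offset in (12, 0, 8, 4):
    image.acquire(source, offset)

print(image.reconstruct() == source)  # the map restores the linear order
print(len(image.block_stream))        # the zero block was never stored
```

The blocks sit in the stream in acquisition order, while the map alone decides where each one belongs in the virtual address space. That separation is what makes out-of-order and multi-pass acquisition possible in this sketch.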
3. AFF4 compression methods
Compression reduces the number of symbols needed to represent the source information, which saves the space and time required to store and transmit data.
AFF4 supports two algorithms that are faster than the Deflate compression used in E01: Snappy and LZ4.
- LZ4 is renowned for its exceptionally high compression speeds, significantly outperforming most other compression algorithms. It is well-suited for scenarios where speed is more critical than achieving the highest compression ratio.
- Snappy, developed by Google, also focuses on speed rather than maximum compression, offering fast compression times but generally not as fast as LZ4.
For a more detailed benchmark comparison, see the LZ4 repository on GitHub: https://github.com/lz4/lz4
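How much space compression saves depends heavily on the data itself. The minimal sketch below illustrates this with Python's standard `zlib` module, which implements Deflate. It is used here only because LZ4 and Snappy require third-party packages; the helper name `compression_ratio` and the 256 KiB sample size are assumptions made for the example. It measures size only, not speed.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


SIZE = 256 * 1024  # 256 KiB per sample

zero_data = bytes(SIZE)         # highly compressible: nothing but zeros
random_data = os.urandom(SIZE)  # effectively incompressible

zero_ratio = compression_ratio(zero_data)
random_ratio = compression_ratio(random_data)

print(f"zeros:  {zero_ratio:.4f}")
print(f"random: {random_ratio:.4f}")
```

Zero-filled data shrinks to a small fraction of its size, while random data does not shrink at all, so the compression step only costs time there. This is why the content of a source drive matters when imaging speeds are compared.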
4. AFF4 hashing method
As mentioned above, each block read from the drive is compressed and hashed. The resulting value is called a Block Hash. The hashes of several such blocks are in turn hashed into the Block Hashes Hash, and the final, top-level value is the Block Map Hash.
This Block Map Hash represents a single SHA-512 hash value for all individual block hashes based on the Merkle tree model.
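The idea of rolling many block hashes up into a single value can be sketched as follows. This is a simplified, two-level illustration and not the exact computation defined by AFF4: the function names, the 512-byte block size, the choice of SHA-1 for the per-block hashes, and hashing the uncompressed block content are all assumptions. Only the SHA-512 top-level hash comes from the description above.

```python
import hashlib


def block_hashes(data: bytes, block_size: int) -> list:
    """Hash every block separately (SHA-1 is an assumption for this sketch)."""
    return [
        hashlib.sha1(data[start:start + block_size]).digest()
        for start in range(0, len(data), block_size)
    ]


def top_level_hash(hashes: list) -> str:
    """Roll all block hashes up into one SHA-512 value."""
    return hashlib.sha512(b"".join(hashes)).hexdigest()


BLOCK_SIZE = 512  # hypothetical block size

data = b"evidence" * 1000   # 8000 bytes of sample data
tampered = bytearray(data)
tampered[5000] ^= 0x01      # flip a single bit
tampered = bytes(tampered)

original_hashes = block_hashes(data, BLOCK_SIZE)
tampered_hashes = block_hashes(tampered, BLOCK_SIZE)

# The top-level values differ, and the block hashes show where the change is.
changed_blocks = [
    index
    for index, (before, after) in enumerate(zip(original_hashes, tampered_hashes))
    if before != after
]

print(top_level_hash(original_hashes) == top_level_hash(tampered_hashes))
print(changed_blocks)
```

A single top-level value is enough to verify the whole image, while the per-block hashes narrow any mismatch down to the affected block instead of invalidating the entire image.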
Before we move on to image acquisition, let's take a moment to compare the most popular formats to highlight the advantages of AFF4.
5. AFF4 vs. other forensic file formats
To support the comparison, we conducted a speed test on two prominent forensic file formats, E01 and AFF4, using the imaging capabilities of Atola TaskForce 2.
Here are the test results (the recorded average imaging speeds shown in megabytes per second for each combination):
As we can see, AFF4's LZ4 and Snappy compressions provide higher imaging speeds compared to compressed E01.
The setup for the test was as follows:
- The source device was a 250GB Samsung SSD 970 EVO Plus drive.
- First, an image with the Windows 10 operating system was written to the drive to create the first set of source data. The imaging target was either an E01 or an AFF4 file on a storage device, a 1TB Samsung SSD 990 PRO, connected directly to the TaskForce 2 unit.
- For each combination of source dataset, forensic file format (E01, AFF4) and compression method (LZ4 and Snappy for AFF4), we conducted three imaging sessions with identical settings and then calculated the average imaging speed for that combination. For each session, a SHA1 hash was calculated during the imaging process. After each session, the imaging report with all the parameters was saved, and the target storage drive was wiped using the Format NVM method.
- Then, the source drive was wiped using the Format NVM method, the Linux operating system was written to it, and all the described procedures were repeated.
- Last, the sectors of the source drive were overwritten with random values and all the procedures were repeated again.
Now that we are familiar with the AFF4 file format, it is time to learn how to create images in this format.
6. Imaging to an AFF4 file
That's it! Dive into the capabilities of AFF4 with confidence, equipped with an understanding of the basics and ready to take advantage of it every day. Happy investigating!
Previous episodes:
[9] RAID With Parity
Thank you for joining us for this edition of Plug, Image, Repeat! Make sure you never miss an issue by clicking the "Subscribe" button in the upper right corner of the page. For more articles and insights, visit our website. If you have any questions, please ask us or send them using the comments section below.