Tar Compression: tarfile library in python

Tar Compression: tarfile library in python

Tar stands for Tape Archive is an archive file [record file] which is created by making group of some files.

In the world of file handling and compression, Python's tarfile library offers powerful capabilities to archive, compress, and extract files. Whether you're bundling files for distribution or extracting a tarball, the tarfile library makes it seamless.


What is Tar Compression?

Archiving: Combines multiple files into a single .tar archive, with no compression applied.

Compression: Further reduces the size of an archive using algorithms like gzip, bzip2, or lzma.

Key Features of tarfile

  1. Supported Compression Algorithms:

  • gzip (.gz)
  • bzip2 (.bz)
  • lzma (.xz)


2. Two Primary Functions:

  • Archiving: Create tarballs, optionally with compression.
  • Extracting: Retrieve files from archives.

Files used:

1. Two images

2. Two text Files

3. A file which is symbolic link with some system file (to demonstrate extraction filters)



Archiving: How to Create Tar Files


1. Without Compression: Use mode w (overwrite) or x (safe creation) to create tar archives.

tar = tarfile.open("archive.tar", "w")  # Use 'x' for safe creation
tar.add("file1.txt")
tar.add("file2.txt")
tar.close()
        

2. With Compression: Specify a compression algorithm such as gzip, bz2, or xz using w:gz, w:bz2, or w:xz.

tar = tarfile.open("archive.tar.gz", "w:gz")
tar.add("file1.txt")
tar.close()

        

3. Appending Without Compression: Add files to an existing tarball using mode a. Note: Compression cannot be used in append mode.


Extraction: Retrieve Files from an Archive

Functions for Extraction

  • tar.extract: Extracts files, allowing filters for added security.
  • tar.extractfile: Extracts a single file but raises an error if the file doesn’t exist.


with tarfile.open("archive.tar.gz", "r:gz") as tar:
    for member in tar.getmembers():
        tar.extract(member, path="./output_folder")
        


Security with Extraction Filters

When extracting, malicious tar files could overwrite critical system files (e.g., /root or C:/Windows/System32). Filters help mitigate this risk.


Examples of Using Filters

  1. Fully Trusted Filter


2. Tar Filter or Data Filter



要查看或添加评论,请登录

Mohitkumar Talreja的更多文章

社区洞察

其他会员也浏览了