Tips to compress changing data without halting the live system that is changing the data
Deepak Kumar
Propelling AI To Reinvent The Future ||Author|| 150+ Mentorship|| Leader || Innovator || Machine learning Specialist || Distributed architecture | IoT | Cloud Computing
Introduction
Layman's explanation
Suppose you want to compress a log file. The file is written by running software, so it can be modified at any moment, including while it is being compressed. As a result, the compressed file may sometimes be corrupt. In the slightly better case, a compression tool such as zip reports an error and refuses to produce the compressed file.
Technical explanation
To ensure that the file is not modified during compression, take a snapshot first and then compress the snapshot instead of the live data. This requires an operating system or storage layer that supports snapshots. For example, Linux LVM provides snapshot functionality.
Linux LVM Snapshot approach
A snapshot volume is a special type of volume that presents all the data that was in the volume at the time the snapshot was created.
A wonderful facility provided by LVM is 'snapshots'. This allows the administrator to create a new block device which presents an exact copy of a logical volume, frozen at some point in time. Typically this would be used when some batch processing, a backup for instance, needs to be performed on the logical volume, but you don't want to halt a live system that is changing the data. When the snapshot device is no longer needed, the system administrator can simply remove it. This facility does require that the snapshot be made at a time when the data on the logical volume is in a consistent state.
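The snapshot workflow described above (create the snapshot, read from it, remove it) can be sketched roughly as follows. This is a minimal illustration, not a production script: the helper names `snapshot_backup_commands` and `run_snapshot_backup`, the volume group `vg0`, the logical volume `data`, the snapshot size and all paths are hypothetical. Running it for real needs root privileges and enough free space in the volume group, and some filesystems may need extra mount options for a snapshot.

```python
import subprocess


def snapshot_backup_commands(vg, lv, snap_name, snap_size, mount_point, archive):
    """Build the command sequence for a snapshot-based backup.

    Nothing is executed here; the commands are only assembled.
    All argument values are supplied by the caller (hypothetical names).
    """
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap_name}"
    return [
        # 1. Freeze a point-in-time view of the logical volume.
        ["lvcreate", "--size", snap_size, "--snapshot", "--name", snap_name, origin],
        # 2. Mount the snapshot read-only so nothing can modify it.
        ["mount", "-o", "ro", snap_dev, mount_point],
        # 3. Compress the frozen data instead of the live volume.
        ["tar", "-czf", archive, "-C", mount_point, "."],
        # 4. Clean up: unmount and remove the snapshot device.
        ["umount", mount_point],
        ["lvremove", "--force", snap_dev],
    ]


def run_snapshot_backup(commands, runner=subprocess.run):
    """Run the five commands, always attempting cleanup after a failure."""
    create, mount, archive, unmount, remove = commands
    runner(create, check=True)
    try:
        runner(mount, check=True)
        try:
            runner(archive, check=True)
        finally:
            runner(unmount, check=True)
    finally:
        runner(remove, check=True)


if __name__ == "__main__":
    # Only print the planned commands; executing them requires root and LVM.
    plan = snapshot_backup_commands(
        "vg0", "data", "data_snap", "1G", "/mnt/data_snap", "/backup/data.tar.gz"
    )
    for command in plan:
        print(" ".join(command))
```

The nested `try`/`finally` blocks are a deliberate choice: a snapshot that is left behind keeps consuming space as the origin volume changes, so the sketch tries to unmount and remove it even when the archive step fails.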
Care while using various compression tools
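On Linux, tar (including when it is used with gzip compression) detects when a file changes while it is being read. In that case it prints a warning such as "file changed as we read it" and returns a non-zero exit status to the shell. It is important to act on this warning; otherwise the problem may only surface later, when the archive turns out to be inconsistent.
Moreover, this warning appears only occasionally, because it depends on the file being written at the moment it is archived. Test cases should therefore be diverse enough to catch such issues.
Ideally, this should be taken care of in the design phase.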
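One way to act on the warning is to check tar's exit status explicitly instead of ignoring it. The sketch below assumes GNU tar, whose documentation defines exit status 0 as success, 1 as "some files differ" (for archive creation: files changed while being archived) and 2 as a fatal error; other tar implementations may behave differently. The function names `classify_tar_exit` and `create_archive` are hypothetical.

```python
import subprocess


def classify_tar_exit(returncode):
    """Map a GNU tar exit status to a coarse result label."""
    if returncode == 0:
        return "ok"
    if returncode == 1:
        # For archive creation: some files changed while being archived,
        # so the archive is not an exact copy of the file set.
        return "changed"
    return "failed"


def create_archive(archive, source_dir):
    """Create a gzip-compressed tar archive and refuse to ignore warnings."""
    result = subprocess.run(
        ["tar", "-czf", archive, "-C", source_dir, "."],
        capture_output=True,
        text=True,
    )
    status = classify_tar_exit(result.returncode)
    if status != "ok":
        # Treat "changed" as an error too; the caller can retry,
        # ideally against a snapshot rather than the live data.
        raise RuntimeError(
            f"tar finished with status {result.returncode} ({status}): "
            f"{result.stderr.strip()}"
        )
    return archive
```

Treating status 1 as a failure is a design choice made for this sketch. A caller that can tolerate a slightly inconsistent archive might log the warning and continue instead.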