Python for Azure: Enable Blob Versioning on Azure Data Lake Storage


For complete information on enabling "Blob Versioning on Azure Data Lake Storage" via Python, please click on the Full Article & Solution Implementation.

Click on the Image to follow the 'Python for Azure' Medium Publication


Introduction: Enabling Blob storage versioning makes Azure Storage keep previous versions of an object automatically. If a blob is modified or deleted, you can access its prior versions and restore the data. Blob versioning is one part of a comprehensive data protection strategy for blob data.

Blob Versioning working mechanism

Blob Versioning: Configuration/Setting at the scope of storage-account level

Blob versioning is configured at the storage-account level. Once it is enabled, Azure Storage automatically maintains previous versions of each blob, and you can access an earlier version to recover your data if the blob is modified or deleted.
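As a minimal sketch of how versioning might be switched on for an account from Python, the function below uses the Azure management SDK. It assumes the `azure-identity` and `azure-mgmt-storage` packages are installed, that `DefaultAzureCredential` can find a login with permission to update the account, and that the account type supports versioning. The function name and its parameters are illustrative and are not taken from the original article.

```python
def enable_blob_versioning(subscription_id: str, resource_group: str, account_name: str) -> bool:
    """Turn on blob versioning for one storage account and return the resulting flag.

    Illustrative helper (hypothetical name). The Azure imports are done inside
    the function so the file can still be loaded where the SDK is not installed.
    """
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient
    from azure.mgmt.storage.models import BlobServiceProperties

    # Assumption: the ambient credential (CLI login, managed identity, ...) is
    # allowed to modify the blob service properties of this account.
    client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

    # Versioning is a property of the account's blob service, so it is set at
    # storage-account scope rather than per container or per blob.
    result = client.blob_services.set_service_properties(
        resource_group,
        account_name,
        BlobServiceProperties(is_versioning_enabled=True),
    )
    return bool(result.is_versioning_enabled)
```

A call might look like `enable_blob_versioning("<subscription-id>", "<resource-group>", "<storage-account>")`, with the placeholders replaced by real values.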

  • When this feature is enabled, Azure Storage automatically creates a new version with a unique version ID each time a blob is written (the version ID is the timestamp at which the blob was last modified)
  • A version captures the state of a blob at a given point in time. Each version is identified by its unique version ID
  • A version ID can identify the current version or a previous version. A blob can have only one current version at a time
  • If the write operation creates a new blob, the resulting blob is the current version of the blob
  • If the write operation modifies an existing blob, the current version becomes a previous version and the updated blob becomes the new current version
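The version rules above can be illustrated with a toy in-memory model. This is not the Azure SDK and does not talk to any service; the `Version` and `VersionedBlob` classes are hypothetical names invented for this sketch, and the timestamps are passed in by hand so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a stored version cannot be changed afterwards
class Version:
    version_id: str  # timestamp string, mimicking how the service names versions
    content: bytes


@dataclass
class VersionedBlob:
    versions: List[Version] = field(default_factory=list)  # oldest first

    def write(self, content: bytes, now: datetime) -> str:
        """Every write appends a new version, which becomes the current one."""
        version = Version(version_id=now.isoformat(), content=content)
        self.versions.append(version)
        return version.version_id

    @property
    def current(self) -> Optional[Version]:
        """The single current version, or None if nothing was written yet."""
        return self.versions[-1] if self.versions else None

    def read(self, version_id: Optional[str] = None) -> bytes:
        """Read the current version, or a specific one when an ID is given."""
        if version_id is None:
            if self.current is None:
                raise KeyError("blob has no versions")
            return self.current.content
        for version in self.versions:
            if version.version_id == version_id:
                return version.content
        raise KeyError(version_id)


# Two writes five seconds apart: the first version becomes a previous version.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
blob = VersionedBlob()
first_id = blob.write(b"draft", t0)
second_id = blob.write(b"final", t0 + timedelta(seconds=5))
```

After the second write, `blob.read()` returns the newest content, while `blob.read(first_id)` still returns the content captured by the first version.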

Points to Remember:

  • Blob versions are immutable. You cannot modify the content or metadata of an existing blob version
  • Microsoft recommends maintaining fewer than 1000 versions per blob; beyond that, latency for blob listing operations can increase
  • Blob versioning cannot help you to recover from the accidental deletion of a storage account or container
  • You can perform read or delete operations on a specific version of a blob by providing its version ID; otherwise, the operation acts on the current version
  • The version ID remains the same for the lifetime of the version
  • When blob versioning is turned on, each write operation to a blob creates a new version
  • A blob that was created prior to versioning being enabled for the storage account does not have a version ID
  • Disabling blob versioning does not delete existing blobs, versions, or snapshots. Blobs created or modified after versioning is disabled do not receive a version ID
  • If versioning and soft delete are both enabled for a storage account, deleting a blob turns the current version of the blob into a previous version
  • Blob versioning is available for standard general-purpose v2, premium block blob, and legacy Blob storage accounts
  • Storage accounts with a hierarchical namespace enabled for use with Azure Data Lake Storage Gen2 are not currently supported
  • Enabling blob versioning can result in additional data storage charges to your account
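As a rough sketch of working with version IDs from Python, the helpers below list the versions of one blob and read either the current or a specific version. They assume the `container_client` argument is a `ContainerClient` from the `azure-storage-blob` package, connected to an account where versioning is supported and enabled. The helper names `list_versions` and `read_blob` are hypothetical and not part of any SDK.

```python
from typing import Any, List, Optional, Tuple


def list_versions(container_client: Any, blob_name: str) -> List[Tuple[str, bool]]:
    """Return (version_id, is_current) pairs for a single blob.

    Assumption: container_client is an azure.storage.blob.ContainerClient.
    """
    versions = []
    # include=["versions"] asks the listing to return previous versions too.
    for props in container_client.list_blobs(name_starts_with=blob_name, include=["versions"]):
        if props.name != blob_name:
            continue  # the prefix filter can also match other blob names
        versions.append((props.version_id, bool(props.is_current_version)))
    return versions


def read_blob(container_client: Any, blob_name: str, version_id: Optional[str] = None) -> bytes:
    """Download the current version, or a specific one when version_id is given."""
    blob_client = container_client.get_blob_client(blob_name)
    # Only pass version_id when the caller asked for a specific version;
    # without it the operation acts on the current version.
    kwargs = {"version_id": version_id} if version_id else {}
    return blob_client.download_blob(**kwargs).readall()
```

Because both helpers take the client as an argument, they can be exercised with a stand-in object before being pointed at a real container.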
