Backup Concepts for GCP Cloud Storage

Google Cloud Storage (GCS) is Google’s object storage service for storing and retrieving data in a highly scalable and durable manner. It is a general-purpose cloud service handling a wide range of data types, including documents, images, videos, and other files. Durability is a key aspect of such services, i.e., the assurance to customers that no data is lost, even if hardware devices fail. Eleven 9s is GCP’s marketing promise to customers for durability (per year), i.e., the chance that your data is still there after a year is 99.999999999%.
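To get a feeling for what eleven 9s mean in practice, here is a quick back-of-the-envelope calculation in Python; the number of stored objects is an arbitrary illustration, not a GCP figure:

```python
# Back-of-the-envelope calculation for the "eleven 9s" durability promise.
# The number of stored objects is a made-up example, not a GCP figure.
annual_durability = 0.99999999999           # 11 nines, per object and year
loss_probability = 1 - annual_durability    # roughly 1e-11
stored_objects = 10_000_000                 # hypothetical bucket with 10 million objects

expected_losses_per_year = stored_objects * loss_probability
print(f"Expected objects lost per year: {expected_losses_per_year:.6f}")
# Prints roughly 0.0001, i.e., statistically one lost object every ~10,000 years.
```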

The fundamental structuring and grouping elements are GCS buckets, which contain the actual objects stored by the service. During creation, customers specify the location type, which defines the geographic placement and redundancy for the new GCS bucket. The three options, as Figure 1 (1) illustrates, are:

  • Region: GCP writes objects synchronously to different availability zones (AZs) within a region. So, if one AZ crashes, applications continue without disruption and with no data loss or inconsistency. GCP transparently redirects applications to the data in an alternative AZ.
  • Multi-region: GCP replicates all data to AZs in a different region. Due to latency, this replication is asynchronous. GCP states that replication normally takes a few minutes, but there is no guarantee. So, the loss of a region can mean losing data that was not yet replicated. However, there is no impact on the availability of the service (RTO=0). Applications accessing the data are not impacted (besides potentially missing or corrupt data).
  • Dual-region: Quite similar to multi-region, with one significant difference: Google guarantees an RPO of 15 minutes, even for large data volumes. In other words, at most 15 minutes of changes and new data can be lost. This option works only for selected pairs of regions in larger economic zones (EUR4 with Finland and the Netherlands for the EU zone, ASIA1 for Japan, and NAM4 for the US).

Figure 1: GCP Cloud Storage bucket creation mask - availability-related configuration options
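For readers who prefer code over the console mask shown in Figure 1, the following minimal sketch creates such a bucket with the google-cloud-storage Python client. The project ID and bucket name are placeholders, authentication via Application Default Credentials is assumed, and the location names should be double-checked against the current GCP documentation:

```python
# Minimal sketch: create a bucket and choose its location type at creation time.
# Requires: pip install google-cloud-storage, plus Application Default Credentials.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")   # hypothetical project ID

# "EUR4" is the predefined dual-region (Finland + the Netherlands).
# A single region such as "europe-west6" or the multi-region "EU" work the same way.
bucket = client.create_bucket("my-critical-data", location="EUR4")

print(f"Created bucket {bucket.name} in {bucket.location}")
```

Note that the location cannot be changed after creation, so this decision has to be made consciously up front.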

One significant distinction between file and object storage lies in their approach to data changes. Unlike file storage, object storage does not support modifying data written to disk. Instead, altering data in object storage means creating a new version of the object and discarding the previous one. Thus, rolling back undesired changes requires reinstating a prior version of the object – if it is still available. Within GCP, two viable options keep overwritten versions so they can be reinstated if needed (Figure 1, 2):

  • Object versioning retains the previous version of an object when applications or users replace it with a new version. Engineers can define the maximum number of old versions to preserve and after how many days GCP removes old versions to reduce storage consumption.
  • A retention policy prevents the deletion of an object (the data, not the metadata) for a defined period, be it seconds or years.

These two features arise from the need to cater to different usage scenarios. GCP customers can activate one or none of them; activating both simultaneously is not possible. Object versioning is beneficial for addressing operational errors such as unintended object modifications because it allows engineers to reinstate a previous version. A notable advantage is that the storage cost impact is straightforward to calculate. The level of protection is, however, limited. Deleting a bucket removes all objects within it, including all old versions. Also, an attacker or a faulty process can write many new versions of an object and thereby crowd out the last usable old version once the maximum number of retained versions is reached. Thus, complementary measures are a necessity to address the risk of massive data loss due to large-scale operational mistakes and ransomware attacks.
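As a sketch of what this could look like in code (again with the Python client and the placeholder names from above; the concrete limits are arbitrary example values):

```python
# Minimal sketch: enable object versioning and cap the number and age of old
# (noncurrent) versions via lifecycle rules to keep storage costs predictable.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")   # hypothetical project ID
bucket = client.get_bucket("my-critical-data")      # hypothetical bucket name

bucket.versioning_enabled = True                     # keep the old version on overwrite/delete

# Delete noncurrent versions once more than three newer versions exist,
# or once a noncurrent version is older than 30 days (arbitrary example values).
bucket.add_lifecycle_delete_rule(number_of_newer_versions=3)
bucket.add_lifecycle_delete_rule(days_since_noncurrent_time=30)

bucket.patch()                                       # persist the new configuration
```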

Retention policies offer stronger protection against intentional harm to an organization through the deletion of mission-critical data. However, cloud architects have to be aware that the cost implications can be difficult to calculate in advance. Retention policies block any attempt to delete the protected objects and the buckets containing them; deletion is impossible without deleting the entire GCP project in which the data resides.
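The corresponding sketch for a retention policy looks as follows (same placeholder names; remember that versioning and a retention policy cannot be active on the same bucket, so the two sketches are alternatives):

```python
# Minimal sketch: protect objects with a retention policy so they cannot be
# deleted or replaced before the retention period expires.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")   # hypothetical project ID
bucket = client.get_bucket("my-critical-data")      # hypothetical bucket name

bucket.retention_period = 90 * 24 * 3600             # 90 days, expressed in seconds
bucket.patch()

# Optional and irreversible: locking the policy prevents anyone from shortening
# or removing it later - then only deleting the entire GCP project remains.
# bucket.lock_retention_policy()
```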

Certain highly privileged accounts can initiate a GCP project shut-down after specific preparations (i.e., explicitly removing a "lien"). However, shutting down a GCP project triggers considerable "noise", including emails sent to specific admin accounts and the stopping of all workloads – and a large-scale application outage is difficult to overlook. This visibility allows organizations to promptly detect the shut-down and restore the project.

GCP waits 30 days after a shut-down before deleting all project resources, although Cloud Storage objects are physically deleted "much earlier" according to the GCP documentation, which does not provide concrete details.

To conclude, the implications for organizations storing business-critical data in GCP Cloud Storage are clear:

  • Implement a retention policy on relevant buckets, or explore additional variants such as third-party solutions. If cost is a concern, consider redesigning the application and the object storage structure.
  • Print out the GCP project restore procedure (make sure to understand the IAM-related topics as well) and put the print-outs in a safe place to be prepared for a ransomware attack.

In essence, while hoping for the best, preparing for the worst is essential - particularly in the face of the ever-growing ransomware threat. So, shape your BCM strategy around leveraging Google's backup features for the GCP Cloud Storage service.
