Schedule Snapshot creation on AWS using Python
Antoine CHOULA
AWS Community Builder | Solution Architect | DevOps | AWS Certified | CI/CD | Jenkins | GitLab | Docker | Kubernetes | Terraform | Ansible | Python | Linux
Create snapshots to back up EC2 volumes using Python
Best practices suggest data backup should be scheduled to occur at least once a week, often during weekends or off-business hours. To supplement weekly full backups, enterprises typically schedule a series of differential or incremental data backup jobs that back up only the data that has changed since the last full backup took place.
In this article, we will use Python to create snapshots automatically for data backup, and then go a step further by deleting old snapshots once newer ones have been created.
To do this, once our environment is set up with Python, Boto3, and the AWS CLI installed, we will explore the following.
What is Boto3?
Boto3 is the Python SDK for AWS: a library (or, if you prefer, a module/API) for working with AWS services from Python scripts. Using Boto3 we can create, delete, and update AWS resources. Boto3 code can run from a local server or from the AWS Lambda service.
To work with AWS services from Python, we first have to install Boto3.
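As a minimal sketch (assuming credentials and a default region are already configured through the AWS CLI, and using a read-only describe_regions call purely as an illustration), creating a client and making a first call looks like this:

import boto3

# Assumes credentials are already configured via the AWS CLI (aws configure)
# or environment variables.
ec2_client = boto3.client('ec2', region_name="us-east-1")

# A simple read-only call to confirm the client works:
# list the EC2 regions visible to this account.
regions = ec2_client.describe_regions()
for region in regions['Regions']:
    print(region['RegionName'])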
EBS volume Snapshots
Amazon’s EBS (Elastic Block Store) is a block storage service designed to be used with EC2. It lets us back up the data on our EBS volumes to Amazon S3 by taking point-in-time snapshots. In short, important data in our databases is backed up by creating snapshots at appropriate intervals, depending on requirements.
To create Amazon EBS snapshots programmatically, we will use the AWS Python library Boto3.
The code looks as follows:
import boto3
import schedule
import time

ec2_client = boto3.client('ec2', region_name="us-east-1")

def create_volume_snapshots():
    # Find all volumes tagged Name=prod
    volumes = ec2_client.describe_volumes(
        Filters=[
            {
                'Name': 'tag:Name',
                'Values': ['prod']
            }
        ]
    )
    # Create a snapshot for each matching volume
    for volume in volumes['Volumes']:
        new_snapshot = ec2_client.create_snapshot(
            VolumeId=volume['VolumeId']
        )
        print(new_snapshot)

# Schedule the job every 7 days and run it once immediately
schedule.every(7).days.do(create_volume_snapshots)
create_volume_snapshots()

while True:
    schedule.run_pending()
    time.sleep(60)
Executing the script above prints the response of each create_snapshot call, which includes the SnapshotId of the newly created snapshot.
You can then go to the AWS console and verify the snapshot creation, using the SnapshotId as a reference.
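As an alternative to the console (this is an illustrative sketch, not part of the original script, and the snapshot ID shown is a placeholder), the same verification can be done with Boto3 by looking the snapshot up by its ID:

import boto3

ec2_client = boto3.client('ec2', region_name="us-east-1")

# Placeholder: replace with a SnapshotId printed by the script above
snapshot_id = 'snap-0123456789abcdef0'

# describe_snapshots accepts a list of snapshot IDs to look up
response = ec2_client.describe_snapshots(SnapshotIds=[snapshot_id])
for snap in response['Snapshots']:
    print(snap['SnapshotId'], snap['State'], snap['StartTime'])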
As we can see, because the scheduled task repeats every 7 days, snapshots will keep accumulating over time. Since new snapshots are created incrementally, the previous ones need to be deleted in order to reduce our costs.
Deleting Snapshots
The frequency at which snapshots are created may differ from the frequency at which they are deleted, so at deletion time several snapshots may exist for a given volume ID. We can delete them according to our needs.
Since we used Boto3 to create the snapshots, we will also use it to filter and clean up the unwanted ones:
import boto3
from operator import itemgetter

ec2_client = boto3.client('ec2', region_name="us-east-1")

# Find all volumes tagged Name=prod
volumes = ec2_client.describe_volumes(
    Filters=[
        {
            'Name': 'tag:Name',
            'Values': ['prod']
        }
    ]
)

for volume in volumes['Volumes']:
    # List the snapshots owned by this account for the current volume
    snapshots = ec2_client.describe_snapshots(
        OwnerIds=['self'],
        Filters=[
            {
                'Name': 'volume-id',
                'Values': [volume['VolumeId']]
            }
        ]
    )
    # Sort newest first and delete everything except the two most recent
    sorted_by_date = sorted(snapshots['Snapshots'], key=itemgetter('StartTime'), reverse=True)
    for snap in sorted_by_date[2:]:
        response = ec2_client.delete_snapshot(
            SnapshotId=snap['SnapshotId']
        )
        print(response)
On execution, only the two most recent snapshots in sorted_by_date are kept for each volume; all older ones are deleted, which saves on our infrastructure cost.
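As a further sketch (my own variation, not part of the original scripts; SNAPSHOTS_TO_KEEP is a hypothetical setting), the cleanup logic can be wrapped in a function and scheduled with the same schedule library, so deletion runs on the same 7-day cycle as creation:

import boto3
import schedule
import time
from operator import itemgetter

ec2_client = boto3.client('ec2', region_name="us-east-1")

# Hypothetical retention setting: how many recent snapshots to keep per volume
SNAPSHOTS_TO_KEEP = 2

def cleanup_old_snapshots():
    # Find all volumes tagged Name=prod
    volumes = ec2_client.describe_volumes(
        Filters=[{'Name': 'tag:Name', 'Values': ['prod']}]
    )
    for volume in volumes['Volumes']:
        snapshots = ec2_client.describe_snapshots(
            OwnerIds=['self'],
            Filters=[{'Name': 'volume-id', 'Values': [volume['VolumeId']]}]
        )
        # Sort newest first and delete everything beyond the newest N snapshots
        sorted_by_date = sorted(snapshots['Snapshots'], key=itemgetter('StartTime'), reverse=True)
        for snap in sorted_by_date[SNAPSHOTS_TO_KEEP:]:
            ec2_client.delete_snapshot(SnapshotId=snap['SnapshotId'])

schedule.every(7).days.do(cleanup_old_snapshots)

while True:
    schedule.run_pending()
    time.sleep(60)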
Summary
This article shows just a basic use of Python scripting. In a production environment we will encounter similar cases where automating the process in this way saves time and improves efficiency.
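For example, since Boto3 can also run inside AWS Lambda (as mentioned earlier), the snapshot-creation logic could be packaged as a Lambda handler and triggered on a schedule, e.g. by an Amazon EventBridge rule. The sketch below is only an illustration under that assumption:

import boto3

ec2_client = boto3.client('ec2', region_name="us-east-1")

def lambda_handler(event, context):
    # Same logic as the local script: snapshot every volume tagged Name=prod
    volumes = ec2_client.describe_volumes(
        Filters=[{'Name': 'tag:Name', 'Values': ['prod']}]
    )
    created = []
    for volume in volumes['Volumes']:
        snapshot = ec2_client.create_snapshot(VolumeId=volume['VolumeId'])
        created.append(snapshot['SnapshotId'])
    return {'created_snapshots': created}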