How AWS S3 ETags Work
Many use cases run more efficiently if files are downloaded only when they have changed. In AWS S3, one of the important object properties is its "ETag", a checksum-like value that AWS uses to verify that a file arrived complete on upload or download.
Comparing a local file's ETag with the one S3 reports can be tricky because of how AWS calculates it. Here is a summary of my findings from going through their documentation, code base and assorted blogs.
Small Files
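Small files are uploaded in a single request, and the ETag is the hexadecimal MD5 digest of the file contents. This holds for unencrypted objects and objects encrypted with SSE-S3; with SSE-KMS or SSE-C the ETag is not an MD5 digest.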
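As a minimal sketch of the single-request case, the local value can be computed with Python's standard `hashlib`. The function name `single_part_etag` and the `read_size` parameter are my own illustrative choices, not anything defined by AWS.

```python
import hashlib


def single_part_etag(path, read_size=1024 * 1024):
    """Return the hex MD5 of a file, reading it in blocks to limit memory use."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for block in iter(lambda: f.read(read_size), b""):
            md5.update(block)
    return md5.hexdigest()
```

Note that S3 returns the ETag wrapped in double quotes in response headers, so strip them before comparing.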
Large Files
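For larger files AWS uses multipart upload, and here is where the ETag calculation gets tricky. The ETag of a multipart object is calculated as follows: take the MD5 digest of each part, concatenate those binary digests, take the MD5 of the concatenation, and append a hyphen followed by the number of parts.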
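The multipart rule (MD5 each part, MD5 the concatenation of those binary digests, then append a hyphen and the part count) can be sketched as below. This is an illustration that assumes every part except the last has the same size; `multipart_etag` and `chunk_size` are names I picked for the example.

```python
import hashlib


def multipart_etag(path, chunk_size):
    """Return the multipart-style ETag of a file for a given part size in bytes."""
    part_digests = []
    with open(path, "rb") as f:
        # Each read returns one part; the last part may be shorter.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            part_digests.append(hashlib.md5(chunk).digest())
    # Hash the concatenated binary digests, not their hex strings.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

For example, `multipart_etag("backup.tar", 8 * 1024 * 1024)` would give the value to compare when the file was uploaded in 8 MiB parts, which is the default part size in the AWS CLI and boto3 (`backup.tar` is a placeholder file name).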
So far so good, but if you don't know the chunk size that was used, there is trouble ahead, since many chunk sizes correspond to the same number of parts. Finding a small set of candidate chunk sizes from which to calculate possible ETag values is crucial to making this comparison feasible. After a decent amount of reading, debugging and monitoring browser network tabs, here are the values used most commonly
Finally, I summarized all of this in a GitHub gist. I hope it makes someone's day easier.
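One way to use a short candidate list is to read the part count from the remote ETag, skip any chunk size that would not produce that many parts, and compute the ETag only for the remaining candidates. The sketch below is an assumption-laden illustration, not the gist itself: `CANDIDATE_CHUNK_SIZES` is a placeholder list (5 MiB is the S3 minimum part size, 8 MiB is the AWS CLI and boto3 default), and the function names are hypothetical.

```python
import hashlib
import math
import os

MIB = 1024 * 1024
# Placeholder candidates for illustration only; substitute the sizes your tools use.
CANDIDATE_CHUNK_SIZES = [5 * MIB, 8 * MIB]


def etag_for(path, chunk_size=None):
    """Plain hex MD5 when chunk_size is None, otherwise the multipart form."""
    with open(path, "rb") as f:
        if chunk_size is None:
            md5 = hashlib.md5()
            for block in iter(lambda: f.read(MIB), b""):
                md5.update(block)
            return md5.hexdigest()
        digests = [hashlib.md5(c).digest()
                   for c in iter(lambda: f.read(chunk_size), b"")]
    return f"{hashlib.md5(b''.join(digests)).hexdigest()}-{len(digests)}"


def matches_remote_etag(path, remote_etag, candidates=CANDIDATE_CHUNK_SIZES):
    """Return True if the local file reproduces remote_etag for any candidate size."""
    remote = remote_etag.strip('"')  # S3 wraps the ETag in double quotes
    if "-" not in remote:
        return etag_for(path) == remote  # single-request upload
    parts = int(remote.rsplit("-", 1)[1])
    size = os.path.getsize(path)
    for chunk_size in candidates:
        # Cheap filter: this chunk size must yield exactly `parts` parts.
        if math.ceil(size / chunk_size) != parts:
            continue
        if etag_for(path, chunk_size) == remote:
            return True
    return False
```

A `False` result only means none of the candidates reproduced the ETag; the file may still be identical if it was uploaded with a chunk size outside the list.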