I'm working on a desktop Java application. It needs to check for a specific file on my S3 server.
I don't want to download the entire file just to compare; I need to find out whether the one on the server is newer than the local one, and only then download and replace it.
I'm not sure how to do the "is a newer version available" check.
I've heard of hashing as a method, but I have little experience with how to actually do that on both sides (locally and via S3).
To get the hash of the remote file:
How to get the md5sum of a file on Amazon's S3
To get the hash of the local file:
Getting a File's MD5 Checksum in Java
Compare the ETag programmatically for files smaller than 5 GB.
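Compute the hash of the local file (DigestUtils comes from Apache Commons Codec):
String hash;
try (InputStream in = new FileInputStream(path)) { hash = DigestUtils.md5Hex(in); } // closes the stream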
Get the ETag of the S3 object: Get Etag of S3 Object (already mentioned by @dnault)
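If you compute the hash as described above, it should match the ETag whenever the object was uploaded in a single PUT, which is always possible for files under 5 GB. Objects uploaded via multipart upload, or encrypted with SSE-KMS or SSE-C, get an ETag that is not a plain MD5 of the content.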
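If you'd rather not pull in Commons Codec, here is a minimal sketch of the same comparison using only the JDK. The class name EtagCheck and its method names are made up for illustration; the ETag string is assumed to come from whatever S3 metadata call you use, and the sketch strips the surrounding quotes that some clients leave on it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library.
final class EtagCheck {

    /** MD5 of a file as lowercase hex, streamed so the file is never fully in memory. */
    static String md5Hex(Path file) throws IOException {
        MessageDigest md5 = newMd5();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md5.update(buf, 0, n);
            }
        }
        return toHex(md5.digest());
    }

    /** MD5 of an in-memory byte array as lowercase hex. */
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    /** True if the ETag carries the "-N" suffix seen on multipart uploads. */
    static boolean isMultipartEtag(String etag) {
        return stripQuotes(etag).contains("-");
    }

    /** Compares a local MD5 with an ETag, ignoring surrounding quotes and hex case. */
    static boolean matchesEtag(String localMd5Hex, String etag) {
        return stripQuotes(etag).equalsIgnoreCase(localMd5Hex);
    }

    private static String stripQuotes(String etag) {
        return etag.trim().replace("\"", "");
    }

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

With this, you would download only when isMultipartEtag(etag) is false and matchesEtag(md5Hex(localPath), etag) is also false. Note that a mismatch tells you the files differ, not which one is newer.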
If the file is larger than 5 GB, it must have been uploaded in parts, and the ETag follows a different scheme: Multi-part MD5
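For reference, the multipart scheme as it is commonly observed (not something I'd treat as a guaranteed contract) is: MD5 each part, concatenate the raw 16-byte digests, MD5 that, and append "-" plus the number of parts. A rough sketch follows. MultipartEtag is a made-up class name, and partSize is an assumption you have to supply: it must be the same part size the uploader used, otherwise the result won't match.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library.
final class MultipartEtag {

    /** Computes "md5-of-part-md5s" + "-" + partCount for the given stream. */
    static String of(InputStream in, long partSize) throws IOException {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        MessageDigest partMd5 = newMd5();   // digest of the current part
        MessageDigest allMd5 = newMd5();    // digest over the raw part digests
        byte[] buf = new byte[8192];
        long inPart = 0;                    // bytes consumed in the current part
        int parts = 0;
        int n;
        // Never read past the end of the current part.
        while ((n = in.read(buf, 0, (int) Math.min(buf.length, partSize - inPart))) != -1) {
            partMd5.update(buf, 0, n);
            inPart += n;
            if (inPart == partSize) {
                allMd5.update(partMd5.digest()); // digest() also resets partMd5
                parts++;
                inPart = 0;
            }
        }
        if (inPart > 0) {                   // trailing, shorter last part
            allMd5.update(partMd5.digest());
            parts++;
        }
        return toHex(allMd5.digest()) + "-" + parts;
    }

    /** Convenience overload for in-memory data. */
    static String of(byte[] data, long partSize) {
        try {
            return of(new ByteArrayInputStream(data), partSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Raw 16-byte MD5 digest. */
    static byte[] md5(byte[] data) {
        return newMd5().digest(data);
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

For a local file you would pass Files.newInputStream(path) and the uploader's part size (for example 8L * 1024 * 1024 if that is what your upload tool is configured with), then compare the result to the ETag with the quotes stripped.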
If you are also the one originally creating the file on S3, you can store custom ObjectMetadata with an MD5 (e.g. meta.setUserMetadata(mymap)) when you first putObject(), and then look this up with s3.getObjectMetadata().