So I have an S3 bucket of videos (several hundred), upon which I used ElasticTranscoder to transcode everything into a second, optimised bucket.
However, when I inspect my second bucket, there are 40-50 fewer objects, and I cannot figure out which ones are missing (the directory structure is deeply nested, etc.).
How can I get the file diff of two buckets using aws s3api list-objects?
Perhaps there are files in the bucket which are not videos, which I somehow didn't know about.
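To display only the filenames, list both buckets recursively, strip the date, time and size columns, then diff the sorted results: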
aws s3 ls s3://bucket-1 --recursive | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//' | sort > bucket_1_files
aws s3 ls s3://bucket-2 --recursive | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//' | sort > bucket_2_files
diff bucket_1_files bucket_2_files
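Since the question asks about s3api, here is a rough sketch of the same idea built on aws s3api list-objects-v2 and comm. The function names (list_keys, strip_ext, missing_keys) and the file names in the usage comments are made up for illustration. strip_ext is optional and rests on my assumption that the transcoder may have written outputs with a different extension than the source; drop it if your keys match exactly.

```shell
#!/bin/sh
# list_keys BUCKET: print one object key per line.
# With --output text the keys come back tab-separated, so split them on tabs.
# (Assumes no key contains a tab or newline.)
list_keys() {
    aws s3api list-objects-v2 --bucket "$1" \
        --query 'Contents[].Key' --output text | tr '\t' '\n'
}

# strip_ext: optional filter that removes the final extension from each key,
# e.g. a/b/video.mov -> a/b/video. Only useful if the transcoded objects
# use a different extension (an assumption, not something the question states).
strip_ext() {
    sed 's/\.[^./]*$//'
}

# missing_keys FILE1 FILE2: print the keys listed in FILE1 but not in FILE2.
# comm needs both inputs sorted the same way, so sort them here.
missing_keys() {
    left=$(mktemp) right=$(mktemp)
    LC_ALL=C sort "$1" > "$left"
    LC_ALL=C sort "$2" > "$right"
    LC_ALL=C comm -23 "$left" "$right"
    rm -f "$left" "$right"
}

# Usage (bucket names as in the answer above):
#   list_keys bucket-1 | strip_ext > bucket_1_keys
#   list_keys bucket-2 | strip_ext > bucket_2_keys
#   missing_keys bucket_1_keys bucket_2_keys
```

Unlike plain diff, comm -23 prints only the keys that exist in the first list and not in the second, which is usually what you want when hunting for the 40-50 missing objects.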
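You can use the sync command with the --dryrun option to compare instead of syncing. Keep in mind that sync decides what to copy by size and last-modified time, not only by key, so objects that exist in both buckets but differ in size will be listed as well.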
aws s3 sync s3://bucket s3://bucket2 --dryrun
You can, of course, also use it to compare a local directory with a bucket.
aws s3 sync . s3://bucket2 --dryrun
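If you want the dry-run result as a plain list of source paths, something like the sketch below may help. It assumes each dry-run line looks like "(dryrun) copy: s3://bucket/key to s3://bucket2/key" (or "(dryrun) upload: ./file to s3://bucket2/file" for a local directory); check that against the output of your CLI version first. The function name dryrun_sources is made up for illustration.

```shell
#!/bin/sh
# dryrun_sources: read `aws s3 sync --dryrun` output on stdin and print only
# the source path of each line. Lines that do not match the assumed
# "(dryrun) <action>: <source> to s3://<destination>" shape are skipped.
# (A key that itself contains " to s3://" would be split in the wrong place.)
dryrun_sources() {
    sed -n 's|^(dryrun) [a-z]*: \(.*\) to s3://.*$|\1|p'
}

# Usage (bucket names as in the answer above):
#   aws s3 sync s3://bucket s3://bucket2 --dryrun | dryrun_sources | sort
```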