I've got a whole heap of files on a server, and I want to upload them to S3. The files are stored with a .data extension, but really they're just a bunch of JPEGs, PNGs, ZIPs, or PDFs.
I've already written a short script that finds the MIME type and uploads each file to S3. It works, but it's slow. Is there any way to make the script below run using GNU parallel?
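for n in $(find . -name "*.data")
do
    # Detected file type, lowercased (e.g. "jpeg", "png", "zip", "pdf")
    extension=`file "$n" | cut -d ' ' -f2 | awk '{print tolower($0)}'`
    mimetype=`file --mime-type "$n" | cut -d ' ' -f2`
    fullpath=`readlink -f "$n"`
    # The lines that set $filePathWithExtensionChanged and $response are not shown here
    s3upload="s3cmd put -m $mimetype --acl-public $fullpath s3://tff-xenforo-data"$filePathWithExtensionChanged
    echo $response
done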
Also, I'm sure this code could be greatly improved in general :) Feedback and tips would be greatly appreciated.
Try s3-cli: a command-line utility frontend to node-s3-client. It is inspired by s3cmd and attempts to be a drop-in replacement.
Paraphrasing from https://erikzaadi.com/2015/04/27/s3cmd-is-dead-long-live-s3-cli/ :
Use the AWS CLI. It supports parallel transfers and is really fast for both uploads and downloads.
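A minimal sketch of what a single upload might look like with the AWS CLI, assuming it is installed and configured with credentials. The helper name put_with_aws and the DRY_RUN switch are hypothetical; the bucket name is taken from the question. Note that the CLI parallelises transfers inside one command, so separate per-file cp calls would still run one after another unless something else runs them concurrently.

```shell
#!/usr/bin/env bash
# Sketch only: put_with_aws and DRY_RUN are hypothetical names,
# and the bucket name is copied from the question.

# Upload one file with an explicit Content-Type and a public-read ACL.
# With DRY_RUN=1 the command is printed instead of executed.
put_with_aws() {
  local file=$1 mimetype=$2 key=$3
  local cmd=(aws s3 cp "$file" "s3://tff-xenforo-data/$key"
             --content-type "$mimetype" --acl public-read)
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Optional tuning: raise the number of concurrent transfer requests.
# aws configure set default.s3.max_concurrent_requests 20
```

For example, DRY_RUN=1 put_with_aws ./a.data image/png a.png prints the command without contacting S3, which is a cheap way to check the key and MIME type before a real run.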
You can just use s3cmd-modified, which allows you to put/get/sync with multiple workers in parallel:
$ git clone https://github.com/pcorliss/s3cmd-modification.git
$ cd s3cmd-modification
$ python setup.py install
$ s3cmd --parallel --workers=4 sync /source/path s3://target/path
You are clearly skilled in writing shell, and extremely close to a solution:
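A minimal sketch of one way to finish it with GNU parallel, assuming bash is the shell that parallel starts (exported functions rely on that) and that s3cmd is already configured. The function names ext_for_mime, s3_key_for, upload_one and upload_all, the DRY_RUN switch, and the key layout (the relative path with .data swapped for the detected extension) are assumptions for illustration, not the asker's exact scheme.

```shell
#!/usr/bin/env bash
# Sketch only: all function names, DRY_RUN, and the key layout are assumptions.

# Map a MIME type to a file extension (extend the list as needed).
ext_for_mime() {
  case "$1" in
    image/jpeg)      echo jpg ;;
    image/png)       echo png ;;
    application/zip) echo zip ;;
    application/pdf) echo pdf ;;
    *)               echo data ;;
  esac
}

# Build the S3 key: drop a leading "./" and swap ".data" for the real extension.
s3_key_for() {
  local key=${1#./}
  echo "${key%.data}.$(ext_for_mime "$2")"
}

# Upload a single file; with DRY_RUN=1 only print what would be run.
upload_one() {
  local file=$1 mimetype key
  mimetype=$(file --brief --mime-type "$file")
  key=$(s3_key_for "$file" "$mimetype")
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "s3cmd put -m $mimetype --acl-public $file s3://tff-xenforo-data/$key"
  else
    s3cmd put -m "$mimetype" --acl-public "$file" "s3://tff-xenforo-data/$key"
  fi
}

# Make the functions visible to the shells that parallel spawns.
export -f ext_for_mime s3_key_for upload_one

# Run 8 uploads at a time; NUL separators keep unusual filenames intact.
upload_all() {
  find . -name '*.data' -print0 | parallel -0 -j 8 upload_one {}
}
```

Calling DRY_RUN=1 upload_all from the directory that holds the files prints the s3cmd commands without uploading anything; dropping DRY_RUN performs the uploads. Piping find into parallel also avoids the word-splitting problems of for n in $(find ...) when filenames contain spaces.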