What's the fastest way to move tens of thousands of small image files from my local machine to a container within Azure Cloud Storage?
I am trying the highly recommended CloudBerry Explorer for Azure, and the estimated time to completion is roughly 4 hours right now (about 30K files in total, 5KB average file size). This is unacceptable for me - I want to cut that time down drastically.
Can you suggest any other options? I think non-GUI ones will be faster. I'll provide an example (below) of one Linux-based solution I tried, which didn't work for me. Perhaps an expert can point out something similar, but with a correct usage example. The solution below isn't particularly well-documented when it comes to exhaustive examples. Thanks in advance, and feel free to ask me for more information in case you need it.
The Linux-based solution I tried is called blobxfer, which is like AzCopy, but for Linux. The command I used was:
blobxfer mystorageaccount pictures /home/myuser/s3 --upload --storageaccountkey=<primary access key from portal.azure.com> --no-container
But I keep getting an arcane error: Unknown error (The value for one of the HTTP headers is not in the correct format.)
Full traceback:
<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidHeaderValue</Code><Message>The value for one of the HTTP headers is not in the correct format.
RequestId:61a1486c-0101-00d6-13b5-408578134000
Time:2015-12-27T12:56:03.5390180Z</Message><HeaderName>x-ms-blob-content-length</HeaderName><HeaderValue>0</HeaderValue></Error>
Exception in thread Thread-49 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
File "/home/myuser/.virtualenvs/redditpk/local/lib/python2.7/site-packages/blobxfer.py", line 506, in run
File "/home/myuser/.virtualenvs/redditpk/local/lib/python2.7/site-packages/blobxfer.py", line 597, in putblobdata
File "/home/myuser/.virtualenvs/redditpk/local/lib/python2.7/site-packages/blobxfer.py", line 652, in azure_request
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'Timeout'
Please try upgrading your blobxfer to 0.9.9.6. There were a few bugs with zero-byte files that were recently fixed.
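To check whether zero-byte files are actually present in your upload directory (the error response above reports x-ms-blob-content-length with a value of 0, which would be consistent with that), a minimal sketch along these lines lists them. The function name find_empty_files is just an illustration and is not part of blobxfer.

```python
import os


def find_empty_files(root):
    """Return the sorted paths of all zero-byte regular files under root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip anything that is not a regular file (e.g. broken symlinks).
            if os.path.isfile(full) and os.path.getsize(full) == 0:
                empty.append(full)
    return sorted(empty)
```

Calling find_empty_files("/home/myuser/s3") on the directory from your command should show whether any of the images are empty. If the list is non-empty, that would support the zero-byte explanation, though it does not prove it.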
Regarding your blobxfer question: problems with a specific tool are better reported as issues on its GitHub project page than on Stack Overflow, because the maintainers can look at, reply to, and fix them more easily there. If you still encounter the error after upgrading to 0.9.9.6, please open an issue there.
In general, as shellter has pointed out, for thousands of small files you should archive them first and then upload the single archive. That avoids per-file request overhead and achieves greater throughput.
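As a rough sketch of the archiving step, assuming Python is available on the machine, the standard tarfile module can pack the whole directory into one file before upload. The function name archive_directory and the output path used below are hypothetical, and the archive should be written outside the source directory so it does not try to include itself.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir, archive_path):
    """Pack every regular file under source_dir into one gzip-compressed tar.

    Returns the number of files added to the archive.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive
                # unpacks without the leading local directory structure.
                tar.add(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

For example, archive_directory("/home/myuser/s3", "/home/myuser/pictures.tar.gz") would produce a single file to upload in place of the ~30K individual ones. Note that the container then holds one archive blob, not the individual images, so this only helps if you can unpack it on the other side (for instance on a VM in the same region).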