I use the Python Requests library to download a big file, e.g.:
import requests

r = requests.get("http://bigfile.com/bigfile.bin")
content = r.content
The big file downloads at roughly 30 KB per second, which is quite slow. Every connection to the bigfile server is throttled, so I would like to make multiple connections.
Is there a way to make multiple connections at the same time to download one file?
You can use the HTTP `Range` header to fetch just part of the file (already covered for Python here). Just start several threads and fetch a different range with each, and you're done ;)
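For instance, a minimal sketch of that approach (the URL, part count, and output filename are placeholders, and it assumes the server sends `Content-Length` and honours `Range`):

```python
# Sketch: split the file into NUM_PARTS byte ranges, fetch them in
# parallel threads, then stitch the parts together in order.
import concurrent.futures
import requests

URL = "http://bigfile.com/bigfile.bin"   # placeholder
NUM_PARTS = 4

def fetch_range(start, end):
    # Ask for bytes start..end inclusive; a compliant server answers 206.
    r = requests.get(URL, headers={"Range": "bytes=%d-%d" % (start, end)})
    r.raise_for_status()
    return r.content

size = int(requests.head(URL).headers["Content-Length"])
bounds = [(i * size // NUM_PARTS, (i + 1) * size // NUM_PARTS - 1)
          for i in range(NUM_PARTS)]

with concurrent.futures.ThreadPoolExecutor(NUM_PARTS) as executor:
    parts = executor.map(lambda b: fetch_range(*b), bounds)
    with open("bigfile.bin", "wb") as f:
        for part in parts:
            f.write(part)
```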
Also note that not every server supports the `Range` header (servers where a PHP script is responsible for data fetching, in particular, often don't implement it). You can probe for support up front, as sketched below.
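A minimal probe, assuming the same placeholder URL (note that `Accept-Ranges` is advisory: some servers honour `Range` without advertising it):

```python
import requests

head = requests.head("http://bigfile.com/bigfile.bin")
if head.headers.get("Accept-Ranges") == "bytes":
    print("server advertises byte-range support")
else:
    print("no Accept-Ranges header; may need a single connection")
```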
This solution requires the Linux utility `aria2c`, but it has the advantage of easily resuming interrupted downloads.
It also assumes that all the files you want to download are listed in the HTTP directory index at the location `MY_HTTP_LOC`. I tested this script against an instance of the lighttpd/1.4.26 HTTP server, but you can easily modify it to work with other setups.
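A sketch of the idea, under those assumptions (the index URL is a placeholder and the scraping regex is naive):

```python
# Sketch: scrape filenames from the directory index at MY_HTTP_LOC and
# let aria2c download each one over several connections, resuming any
# partially downloaded file thanks to -c.
import re
import subprocess
import requests

MY_HTTP_LOC = "http://example.com/files/"   # placeholder

index = requests.get(MY_HTTP_LOC).text
# Naive href scrape; adjust the pattern to your server's index format.
filenames = re.findall(r'href="([^"?/][^"]*)"', index)

for name in filenames:
    subprocess.check_call([
        "aria2c",
        "-c",        # continue a partially downloaded file
        "-x", "4",   # up to 4 connections to the server
        "-s", "4",   # download the file in 4 pieces
        MY_HTTP_LOC + name,
    ])
```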
Here's a Python script that saves a given URL to a file and uses multiple threads to download it.
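A sketch of such a script, using `requests` and the `download_chunk()` and `chunksize` names from the description below (the URL, chunk size, and pool size are placeholders):

```python
# Sketch: fetch fixed-size chunks concurrently with Range requests and
# write them to the output file in order.
import itertools
from multiprocessing.pool import ThreadPool as Pool

import requests

chunksize = 1 << 20  # bytes requested in a single HTTP request
poolsize = 4         # number of concurrent connections

def download_chunk(args):
    url, start = args
    headers = {"Range": "bytes=%d-%d" % (start, start + chunksize - 1)}
    r = requests.get(url, headers=headers)
    if r.status_code == 416:  # range not satisfiable: we are past EOF
        return b""
    r.raise_for_status()
    # If the server ignores Range, r.content is the entire file; to support
    # large files, save to a temporary file here and return its name to be
    # read in the main thread instead of the content itself.
    return r.content

def save_url(url, filename):
    done = False
    with open(filename, "wb") as outfile, Pool(poolsize) as pool:
        # Fetch poolsize chunks per round so the pool stays busy without
        # queuing an unbounded number of speculative requests.
        for round_start in itertools.count(0, poolsize * chunksize):
            args = [(url, round_start + i * chunksize)
                    for i in range(poolsize)]
            for body in pool.imap(download_chunk, args):
                outfile.write(body)
                if len(body) != chunksize:  # empty, 416, or short: EOF
                    done = True
                    break
            if done:
                break

if __name__ == "__main__":
    save_url("http://bigfile.com/bigfile.bin", "bigfile.bin")
```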
The end of the file is detected when the server returns an empty body or a 416 HTTP status, or when the response size is not exactly `chunksize`.

It supports servers that don't understand the `Range` header (in that case everything is downloaded in a single request; to support large files, change `download_chunk()` to save to a temporary file and return the filename, to be read in the main thread, instead of the file content itself).

It allows you to change the number of concurrent connections (the pool size) and the number of bytes requested in a single HTTP request independently of each other.
To use multiple processes instead of threads, change the import:
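With the sketch above, that is a one-line change:

```python
# Threads (as in the sketch above):
from multiprocessing.pool import ThreadPool as Pool

# Processes instead, exposing the same Pool API:
from multiprocessing import Pool
```

With processes, the arguments and return values cross process boundaries and so must be picklable; the plain strings, ints, and bytes used here are fine.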