I have a small utility that downloads an MP3 file from a website on a schedule and then builds or updates a podcast XML file, which I've added to iTunes.
The text processing that creates and updates the XML file is written in Python. However, I use wget inside a Windows .bat file to download the actual MP3 file. I would prefer to have the entire utility written in Python.
I struggled to find a way to actually download the file in Python, which is why I resorted to using wget.
So, how do I download the file using Python?
Just for the sake of completeness, it is also possible to call any program for retrieving files using the subprocess package. Programs dedicated to retrieving files are more powerful than Python functions like urlretrieve. For example, wget can download directories recursively (-r), can deal with FTP, redirects, and HTTP proxies, and can avoid re-downloading existing files (-nc), and aria2 can do multi-connection downloads, which can potentially speed up your downloads.
In Jupyter Notebook, one can also call programs directly with the ! syntax.
Python 3
urllib.request.urlopen
urllib.request.urlretrieve
Python 2
urllib2.urlopen (thanks Corey)
urllib.urlretrieve (thanks PabloG)
You can get progress feedback with urlretrieve as well.
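As a rough sketch of the progress feedback idea on Python 3: urlretrieve accepts a reporthook callback that it calls with the block number, the block size, and the total size (or -1 if the server did not report one). The function names download, percent_done, and report_progress are just illustrative choices of mine.

```python
import sys
import urllib.request


def percent_done(block_num, block_size, total_size):
    """Return progress as a percentage, or None when the total size is unknown."""
    if total_size <= 0:
        return None
    # The last block is usually partial, so cap the value at 100.
    return min(100.0, block_num * block_size * 100.0 / total_size)


def report_progress(block_num, block_size, total_size):
    """reporthook callback: urlretrieve calls this once at the start and after each block."""
    percent = percent_done(block_num, block_size, total_size)
    if percent is None:
        sys.stdout.write("\r%d bytes read" % (block_num * block_size))
    else:
        sys.stdout.write("\r%5.1f%%" % percent)
    sys.stdout.flush()


def download(url, filename):
    """Save url to filename, printing progress on one line; returns the local path."""
    path, _headers = urllib.request.urlretrieve(url, filename, reporthook=report_progress)
    sys.stdout.write("\n")
    return path
```

Calling download(url, "episode.mp3") with the URL of your MP3 should write the file next to your script and print a running percentage while it does so.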
In 2012, use the Python requests library.
You can run pip install requests to get it.
Requests has many advantages over the alternatives because the API is much simpler. This is especially true if you have to do authentication. urllib and urllib2 are pretty unintuitive and painful in this case.
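For what it's worth, here is a minimal sketch of a download with requests, assuming the package is installed. It streams the body to disk in chunks so a large MP3 is never held in memory all at once. The download function and its parameters are my own naming; the optional auth argument is simply passed through to requests.get, which treats a (user, password) tuple as HTTP Basic credentials.

```python
import requests


def download(url, filename, auth=None, chunk_size=8192):
    """Stream url to filename; auth may be a (user, password) tuple for Basic auth."""
    with requests.get(url, auth=auth, stream=True, timeout=30) as response:
        response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        with open(filename, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return filename
```

For a small file you could skip streaming and just write response.content, but streaming is the safer default for media files.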
2015-12-30
People have expressed admiration for the progress bar. It's cool, sure. There are several off-the-shelf solutions now, including tqdm.
This is essentially the implementation @kvance described 30 months ago.
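One way to wire tqdm into a download, sketched here under the assumption that tqdm is installed and you are on Python 3, is to subclass it so that it can serve as the reporthook for urlretrieve. The class name DownloadProgressBar and the download helper are illustrative names, not something taken from the answers above.

```python
import urllib.request

from tqdm import tqdm


class DownloadProgressBar(tqdm):
    """tqdm subclass whose update_to method matches urlretrieve's reporthook signature."""

    def update_to(self, block_num=1, block_size=1, total_size=None):
        if total_size is not None and total_size > 0:
            self.total = total_size
        downloaded = block_num * block_size
        if self.total:
            # The final block is usually partial, so do not count past the total.
            downloaded = min(downloaded, self.total)
        # update() takes an increment, so pass the difference from the current count.
        self.update(downloaded - self.n)


def download(url, filename):
    """Save url to filename while showing a byte-based progress bar."""
    with DownloadProgressBar(unit="B", unit_scale=True, miniters=1, desc=filename) as bar:
        urllib.request.urlretrieve(url, filename, reporthook=bar.update_to)
    return filename
```

If the server sends no Content-Length, the bar just shows a running byte count without a percentage.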
You can use PycURL on Python 2 and 3.
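A minimal sketch of what that could look like, assuming pycurl (and the libcurl it wraps) is installed. The download function name is mine; the options used are the standard URL, FOLLOWLOCATION, and WRITEDATA settings.

```python
import pycurl


def download(url, filename):
    """Fetch url with libcurl and write the response body to filename."""
    with open(filename, "wb") as f:
        curl = pycurl.Curl()
        try:
            curl.setopt(pycurl.URL, url)
            curl.setopt(pycurl.FOLLOWLOCATION, True)  # follow HTTP redirects
            curl.setopt(pycurl.WRITEDATA, f)  # write the body straight into the open file
            curl.perform()
        finally:
            curl.close()
    return filename
```

The file has to be opened in binary mode, since libcurl hands over raw bytes.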
An improved version of the PabloG code for Python 2/3:
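A minimal sketch of such a 2/3-compatible download loop, not necessarily the code this answer referred to: it picks urlopen from whichever module exists, reads the response in fixed-size blocks, and prints a running status line. The function name download and the block_size default are my own assumptions.

```python
import sys

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def download(url, filename, block_size=8192):
    """Save url to filename block by block; returns the number of bytes written."""
    response = urlopen(url)
    try:
        length = response.info().get("Content-Length")
        total = int(length) if length else None  # None when the size is not reported
        done = 0
        with open(filename, "wb") as f:
            while True:
                block = response.read(block_size)
                if not block:
                    break
                f.write(block)
                done += len(block)
                if total:
                    status = "%10d bytes [%6.2f%%]" % (done, done * 100.0 / total)
                else:
                    status = "%10d bytes" % done
                sys.stdout.write("\r" + status)
                sys.stdout.flush()
        sys.stdout.write("\n")
    finally:
        response.close()
    return done
```

Counting len(block) rather than multiplying block numbers by the block size keeps the percentage exact even when the last block is short.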