I'm trying to download an image using this code:
from urllib import urlretrieve
urlretrieve('http://gdimitriou.eu/wp-content/uploads/2008/04/google-image-search.jpg',
'google-image-search.jpg')
It worked. The image was downloaded and can be open by any image viewer software.
However, the code below is not working. Downloaded image is only 2KB and can't be opened by any image viewer.
from urllib import urlretrieve
urlretrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg',
'Zindagi1976.jpg')
Here is the result in HTML format.
ERROR
The requested URL could not be retrieved
While trying to retrieve the URL: http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg
The following error was encountered:
Access Denied.
Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.
Your cache administrator is nobody.
Generated Mon, 05 Dec 2011 17:19:53 GMT by sq56.wikimedia.org (squid/2.7.STABLE9)
If you used the following, you can download the image:
But if you did the following:
You may not be able to download image. This may be the case because wikipedia may have rules (robot.txt) to deny robots or bots (unknown clients). Try emulating a browser.
To do that you have to add the following as a part of header:
You can do something like this:
This retrieves the file