How to use urllib to download an image from the web

I'm trying to download an image using this code:

from urllib import urlretrieve
urlretrieve('http://gdimitriou.eu/wp-content/uploads/2008/04/google-image-search.jpg', 
            'google-image-search.jpg')

It worked. The image was downloaded and can be opened by any image viewer.

However, the code below does not work. The downloaded file is only 2 KB and can't be opened by any image viewer.

from urllib import urlretrieve
urlretrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 
            'Zindagi1976.jpg')

The downloaded file is actually an HTML page with the following content:

    ERROR

The requested URL could not be retrieved

While trying to retrieve the URL: http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg

The following error was encountered:

Access Denied.
Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.

Your cache administrator is nobody. 
Generated Mon, 05 Dec 2011 17:19:53 GMT by sq56.wikimedia.org (squid/2.7.STABLE9)

Tags: python urllib

1 answer

一夜七次

#2 · 2019-01-11 22:04

If you use wget, the image downloads fine:

wget http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg

But if you do the following:

from urllib import urlretrieve
urlretrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 
            'Zindagi1976.jpg')

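You may not be able to download the image. This is likely because Wikimedia's servers refuse requests from unknown clients (robots or bots); by default urllib sends a User-agent of the form Python-urllib/x.y, which is how the server can tell it apart from a browser. Try emulating a browser.

To do that, send the following as part of the request headers: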

('User-agent',
 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) '
 'Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')
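
A header tuple in this form is what an opener's addheaders list expects. Here is a minimal sketch, assuming Python 3's urllib.request (the code in the question is Python 2); make_opener and download are hypothetical helper names used only for illustration:

```python
import urllib.request

# The header tuple from the answer; adjacent string literals are joined by Python.
USER_AGENT_HEADER = (
    'User-agent',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) '
    'Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1',
)


def make_opener(header=USER_AGENT_HEADER):
    """Build an opener that sends `header` instead of the default User-agent."""
    opener = urllib.request.build_opener()
    # Overwrite the default [('User-agent', 'Python-urllib/x.y')] entry.
    opener.addheaders = [header]
    return opener


def download(url, filename):
    """Install the custom opener globally, then let urlretrieve() use it."""
    urllib.request.install_opener(make_opener())
    return urllib.request.urlretrieve(url, filename)
```

Calling download('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 'Zindagi1976.jpg') should then send the browser-like User-agent. Note that install_opener() affects every later urlopen() and urlretrieve() call in the process, so it may be too broad for larger programs.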

You can do something like this:

>>> from urllib import FancyURLopener
>>> class MyOpener(FancyURLopener):
...     version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'
... 
>>> myopener = MyOpener()
>>> myopener.retrieve('http://upload.wikimedia.org/wikipedia/en/4/44/Zindagi1976.jpg', 'Zindagi1976.jpg')
('Zindagi1976.jpg', <httplib.HTTPMessage instance at 0x1007bfe18>)

This retrieves the file.
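
To confirm that what was saved is really a JPEG and not an HTML error page like the one in the question, one option is to look at the first bytes of the file. This is a rough sketch: looks_like_jpeg and file_looks_like_jpeg are hypothetical helper names, and the check only inspects the JPEG start-of-image signature, so it does not validate the rest of the file:

```python
# JPEG data starts with the start-of-image marker FF D8, followed by FF.
JPEG_MAGIC = b'\xff\xd8\xff'


def looks_like_jpeg(data):
    """Return True if the given bytes begin with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


def file_looks_like_jpeg(path):
    """Read only the first three bytes of the file and test them."""
    with open(path, 'rb') as f:
        return looks_like_jpeg(f.read(3))
```

For the 2 KB file from the question, file_looks_like_jpeg('Zindagi1976.jpg') would be expected to return False, since the file holds HTML text rather than image data.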
