How can I use wget to download files with Scrapy?

Posted 2019-06-21 04:41


叼着烟拽天下


Question:

Scrapy is not downloading files properly. I have the URLs of my items, so I figured I could use wget to download the files.

How can I use wget inside the Scrapy process_item function? Alternatively, is there another way to download files?

class MyImagesPipeline(ImagesPipeline):
    # Name the stored file after the last segment of the URL path
    def image_key(self, url):
        image_guid = url.split('/')[-1]
        return 'full/%s' % (image_guid)

    def get_media_requests(self, item, info):
        if item['image_urls']:
            for image_url in item['image_urls']:
                # wget -nH image_url -P images/
                yield Request(image_url)

Answer 1:

This code will execute wget; you can replace your comment with the following lines:

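import subprocess

    ...

# Pass -P and its directory as separate list items; a single '-P  images/'
# string would make wget use a directory name that starts with spaces.
subprocess.call(['wget', '-nH', image_url, '-P', 'images/'])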

You can read about subprocess.call here: http://docs.python.org/2/library/subprocess.html
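
Since the question asks about process_item specifically, here is a minimal sketch of how the call above might sit inside an item pipeline. The class name WgetDownloadPipeline and the helper build_wget_command are hypothetical names made up for illustration, and the sketch assumes the item exposes an image_urls field as in the question and that wget is installed and on the PATH.

```python
import subprocess


def build_wget_command(url, target_dir="images/"):
    """Build the argument list for wget (hypothetical helper).

    Each flag and each value is its own list item, so no shell quoting is needed.
    """
    return ["wget", "-nH", url, "-P", target_dir]


class WgetDownloadPipeline:
    """Hypothetical item pipeline that shells out to wget for every image URL."""

    def process_item(self, item, spider):
        # Assumes the item has an 'image_urls' field, as in the question.
        for image_url in item.get("image_urls", []):
            # subprocess.call blocks until wget exits and returns its exit code.
            exit_code = subprocess.call(build_wget_command(image_url))
            if exit_code != 0:
                spider.logger.warning(
                    "wget exited with code %s for %s", exit_code, image_url
                )
        # Pipelines return the item so later pipelines can keep processing it.
        return item
```

Note that subprocess.call waits for wget to finish, so the crawl pauses while each file downloads. That may be acceptable for a handful of files, but for larger crawls it is probably worth getting Scrapy's own media pipeline working instead.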

Tags: python wget scrapy
