I am new to Python programming and to Scrapy. I had set up my crawler and it was working until I tried to figure out how to download images. The error I am getting is: cannot import name NsiscrapePipeline. I don't know what I am doing wrong, and I don't understand some of the documentation because I am new to this. Please help.
Items File
from scrapy.item import Item, Field

class NsiscrapeItem(Item):
    # define the fields for your item here like:
    # name = Field()
    location = Field()
    stock_number = Field()
    year = Field()
    manufacturer = Field()
    model = Field()
    length = Field()
    price = Field()
    status = Field()
    url = Field()
    pass
Spider
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from NSIscrape.items import NsiscrapeItem
from scrapy.http import Request
from scrapy.contrib.pipeline.images import NsiscrapePipeline
import Image

class NsiscrapeSpider(BaseSpider):
    name = "Nsiscrape"
    allowed_domain = ["yachtauctions.com"]
    start_urls = [
        "http://www.yachtauctions.com/inventory/"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//tr')
        items = []
        for site in sites:
            item = NsiscrapeItem()
            item['location'] = site.select('td[2]/text()').extract()
            item['stock_number'] = site.select('td[3]/a/text()').extract()
            item['year'] = site.select('td[4]/text()').extract()
            item['manufacturer'] = site.select('td[5]/text()').extract()
            item['model'] = site.select('td[6]/text()').extract()
            item['length'] = site.select('td[7]/text()').extract()
            item['price'] = site.select('td[8]/text()').extract()
            item['status'] = site.select('td[10]/img/@src').extract()
            item['url'] = site.select('td[1]/a/@href').extract()
            item['image_urls'] = site.select('td/a[3]/img/@data-original').extract()
            item['images'] = item['image_urls']
            yield Request(item['url'][0], meta={'item': item}, callback=self.product_detail_page)

    def product_detail_page(self, response):
        hxs = HtmlXPathSelector(response)
        item = response.request.meta['item']
        # add all image urls to item['image_urls']
        yield item
Settings
ITEM_PIPELINES = ['scrapy.contrib.pipeline.image.NsiscrapePipeline']
IMAGES_STORE = 'c:\Python27\NSIscrape\IMG'
IMAGES_EXPIRES = 90
Pipelines
This is where I am unsure if I am missing something.
from scrapy.item import Item

class NsiscrapePipeline(Item):
    image_urls = Field()
    images = Field()

    def process_item(self, item, spider):
        return item
Error
File "NSIscrape\spiders\NSI_Spider.py", line 9, in <module>
from scrapy.contrib.pipeline.images import NsiscrapePipeline
ImportError: cannot import name NsiscrapePipeline
That isn't part of the library :) - at least not judging by their current master branch.
I think you're looking for ImagesPipeline.
Their example may help! example
p.s. I don't think you get to custom-name that class - at least not the way Scrapy is designed; I'm reasonably sure you use their class ;)
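To make that concrete, here is a minimal sketch (not the asker's final code) of wiring up the stock ImagesPipeline: the item declares image_urls and images, and ITEM_PIPELINES points at Scrapy's built-in class instead of a custom one. The import paths follow the old scrapy.contrib layout used in the question; newer Scrapy releases expose it as scrapy.pipelines.images.ImagesPipeline.

# items.py - the images pipeline expects these two fields on the item
from scrapy.item import Item, Field

class NsiscrapeItem(Item):
    image_urls = Field()   # URLs for the pipeline to download
    images = Field()       # populated by the pipeline with download results

# settings.py - use the built-in pipeline class; no custom pipeline class needed
ITEM_PIPELINES = ['scrapy.contrib.pipeline.images.ImagesPipeline']
IMAGES_STORE = 'c:/Python27/NSIscrape/IMG'   # forward slashes avoid backslash-escape problems
IMAGES_EXPIRES = 90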
Here's my final code that's working. There were two issues:
1: I was missing the second slash that needed to be at the start of the XPath --> //td[1]/a[3]/img/@data-original
2: I had to check the full URL at which the image is actually served and join the pieces together: the main (allowed) URL and the relative image URL.
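Fix 2 is just URL joining: @data-original holds a relative path, so it has to be combined with the site's base URL before it goes into image_urls. A small sketch, with a made-up relative path:

import urlparse   # Python 2 stdlib; on Python 3 this is urllib.parse

base = 'http://www.yachtauctions.com'
relative_urls = ['/images/boats/12345.jpg']   # hypothetical value extracted from @data-original
image_urls = [urlparse.urljoin(base, u) for u in relative_urls]
# image_urls -> ['http://www.yachtauctions.com/images/boats/12345.jpg']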
You tried to pass a list, but this function accepts only a string. Pass only one element from the list (for example, list[0]).
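For instance, extract() in the spider above always returns a list, while Request() wants a single URL string, so index into the list first (the URL here is made up):

from scrapy.http import Request

urls = ['http://www.yachtauctions.com/inventory/123']   # what extract() gives you: a list
request = Request(urls[0])                              # pass one string, not the whole list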