Filling Items in scrapy in different functions acr

Posted 2019-04-02 14:42

I want to fill item fields defined in item.py from different functions within the spider.py file. For example, in the start_requests function, where all the requests are made, I want to fill a field called 'item_id'.

def start_requests(self):
    forms = []
    for item in self.yhd_items:
        self.item["item_id"] = item.ItemCode
        forms.append(FormRequest(self.base_url + item.ItemCode, method='GET',
                                 callback=self.parse_search_result))

    return forms

Note that I create an instance of the item in the spider's __init__ function. This way only the item_id field is filled and passed to the next parser method (parse_search_result). The other fields in item.py will be filled in that next function and passed on again to another parser method. Would this be a legitimate approach?

1 answer
Viruses.
#2 · 2019-04-02 15:24

This is exactly what the meta argument is for. For example:

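def parse_page1(self, response):
    # Create the item here and fill the fields known at this stage.
    item = MyItem()
    item['main_url'] = response.url
    request = scrapy.Request("http://www.example.com/some_page.html",
                             callback=self.parse_page2)
    # Attach the partially filled item so the next callback can read it.
    request.meta['item'] = item
    return request

def parse_page2(self, response):
    # Retrieve the item passed along by parse_page1 and finish filling it.
    item = response.meta['item']
    item['other_url'] = response.url
    return item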

Here we create an item instance in parse_page1, fill its main_url field, and pass the item to parse_page2 through the request's meta dictionary. In parse_page2, the other_url field is set and the item is returned.
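To see why one item per request matters for the start_requests loop in the question, here is a rough plain-Python sketch that runs without Scrapy. FakeRequest is a hypothetical stand-in for scrapy.Request (it only stores a URL and a meta dict), plain dicts stand in for items, and BASE_URL and ITEM_CODES are made-up values. It is only an illustration of the idea, not how Scrapy itself is implemented.

```python
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request: stores a URL and a meta dict."""

    def __init__(self, url, meta=None):
        self.url = url
        # Shallow copy: the dict is new, but the objects inside it are shared.
        self.meta = dict(meta or {})


BASE_URL = "http://www.example.com/search/"  # assumed, for illustration only
ITEM_CODES = ["A1", "B2", "C3"]              # made-up item codes


def build_with_shared_item(codes):
    """Mimics keeping a single item on the spider: every request sees the same object."""
    shared_item = {}
    requests = []
    for code in codes:
        shared_item["item_id"] = code  # overwrites the value set in the previous iteration
        requests.append(FakeRequest(BASE_URL + code, meta={"item": shared_item}))
    return requests


def build_with_item_per_request(codes):
    """Creates a fresh item per request and carries it in that request's meta."""
    requests = []
    for code in codes:
        item = {"item_id": code}  # new item for each request
        requests.append(FakeRequest(BASE_URL + code, meta={"item": item}))
    return requests


shared_ids = [r.meta["item"]["item_id"] for r in build_with_shared_item(ITEM_CODES)]
separate_ids = [r.meta["item"]["item_id"] for r in build_with_item_per_request(ITEM_CODES)]

print(shared_ids)    # ['C3', 'C3', 'C3']
print(separate_ids)  # ['A1', 'B2', 'C3']
```

With a single shared item, every callback would end up seeing the last item_id, so creating the item inside the loop and attaching it to each request's meta is likely the safer pattern.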

Hope this is what you were asking about.
