I am passing an item through callback functions, as in the example below from the official Scrapy documentation. I was wondering whether the item passed to parse_page2, once modified inside that function, can be retrieved in its modified form back in parse_page1.

For example, parse_page2 below stores response.url in the 'other_url' field. Is there a way to read 'other_url' inside parse_page1 after parse_page2 has finished executing?

    def parse_page1(self, response):
        item = MyItem()
        item['main_url'] = response.url
        request = scrapy.Request("http://www.example.com/some_page.html",
                                 callback=self.parse_page2)
        request.meta['item'] = item
        return request

    def parse_page2(self, response):
        item = response.meta['item']
        item['other_url'] = response.url
        return item

Instead of creating your item in parse_page1, you can simply pass response.url in the meta dict and create the item in parse_page2, where both URLs are available.
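Or, if you really want to get the info from parse_page2 back into parse_page1, you can have parse_page2 issue a request whose callback is parse_page1, and add a conditional in parse_page1 (for example, checking whether the item is already in response.meta).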