How to get title from class attribute in XPath(Pyt

Posted 2019-05-29 14:27

I'm scraping review dates from TripAdvisor. Most of the first reviews show a relative date and the rest use the normal MM/DD/YYYY format, but on closer inspection I see that each relative-date span carries the full date in its title attribute:

<span class="ratingDate relativeDate" title="20 June 2015">Reviewed 4 weeks ago
</span>

I am using this XPath to get the data:

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate" or @class="ratingDate"]/text()').extract()

My question is: how do I add @title so that I get the title attribute, which holds the date in the normal format?

I tried

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate"/@title or @class="ratingDate"]/text()').extract()

response.xpath('//div[@class="col2of2"]//span[@class="ratingDate relativeDate" or @class="ratingDate"]/@title/text()').extract()

1 answer
唯我独甜
#2 · 2019-05-29 15:03

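Figured it out. In the spider you have to handle the two cases separately, depending on whether each XPath actually matches anything: one query reads the title attribute of the relative-date spans, and a second reads the text of the plain ratingDate spans. A query that matches nothing returns an empty list, so the two results can simply be concatenated.

Here's my version: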

item['date'] = sel.xpath('//*[@class="ratingDate relativeDate"]/@title').extract()
item['date'] += sel.xpath('//div[@class="col2of2"]//span[@class="ratingDate"]/text()').extract()
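
Concatenating the two lists puts all the title dates first and all the text dates second, which only matches the page order when the relative-date reviews come first. Below is a minimal sketch of a per-span fallback that keeps document order. It is not Scrapy code: it uses the standard library's xml.etree.ElementTree so that it runs on its own, and the sample markup, the second span's text, and the name extract_dates are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample modelled on the snippet in the question.
SAMPLE = """
<div class="col2of2">
  <span class="ratingDate relativeDate" title="20 June 2015">Reviewed 4 weeks ago
  </span>
  <span class="ratingDate">Reviewed 12 May 2015</span>
</div>
"""


def extract_dates(markup):
    """Return one date string per ratingDate span, in document order."""
    root = ET.fromstring(markup)
    dates = []
    for span in root.iter("span"):
        # Match on the class token so both class variants are covered.
        classes = (span.get("class") or "").split()
        if "ratingDate" not in classes:
            continue
        # Prefer the title attribute; fall back to the visible text.
        title = span.get("title")
        dates.append(title if title else (span.text or "").strip())
    return dates


dates = extract_dates(SAMPLE)
print(dates)  # ['20 June 2015', 'Reviewed 12 May 2015']
```

In a Scrapy spider the same idea would likely be a loop over the matched spans, taking span.xpath('@title').extract_first() and falling back to span.xpath('text()').extract_first() when it returns None. Note that an attribute has no text() child, which is why appending /text() after /@title returns nothing.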