How to get the text from child nodes if it is pare

Posted 2019-05-12 02:55

I need to get the text from a child node that may or may not itself be the parent of another node, using XPath in Scrapy. Consider these two cases:

<h1 class="main">
 <span class="child">data</span>
</h1>

or

<h1 class="main">
<span class="child">
 <span class="child2">data</span>
</span>
</h1>

My solution was response.xpath(".//h1[@class='main']/span/text()").extract()

2 answers
贪生不怕死
#2 · 2019-05-12 03:32

Use //text(). It returns a list of all text nodes inside your span, both its own and those of any nested children:

response.xpath(".//h1[@class='main']/span//text()").extract()
Deceive 欺骗
#3 · 2019-05-12 03:56

You can use:

  • response.xpath("string(.//h1[@class='main']/span)").extract()
  • or even response.xpath("string(.//h1[@class='main'])").extract() if you're after the whole header text