XPath to select between two HTML comments is not w

Posted 2019-03-04 05:36

I'm trying to select the content between two HTML comments, but I'm having trouble getting it right (see "XPath to select between two HTML comments?"). The problem seems to appear when an end comment is immediately followed by a new begin comment on the same line.

My HTML:

<html>
 ........
 <!-- begin content -->
 <div>some text</div>
 <div>
   <p>Some more elements</p>
 </div>
 <!-- end content --><!-- begin content -->
 <div>more text</div>
 <!-- end content -->
 .......
</html>

I use:

doc.xpath("//node()[preceding-sibling::comment()[. = ' begin content ']]
          [following-sibling::comment()[. = ' end content ']]")

Result:

<div>some text</div>
<div>
  <p>Some more elements</p>
</div>
<!-- end content --><!-- begin content -->
<div>more text</div>

What I'm trying to get:

<div>some text</div>
<div>
  <p>Some more elements</p>
</div>

1 answer
Summer. ? 凉城
Reply #2 · 2019-03-04 06:09

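If you are only interested in the first pair of comments, you can start from the "begin content" comment and keep the elements that follow it, as long as no "end content" comment precedes them:

//comment()[.=' begin content ']/following::*[not(preceding::comment()[.=' end content '])]

I.e.:

//comment()[1][.=' begin content ']           <-- take the first comment among its siblings, if it is a "begin content" comment
    /following::*                             <-- take all elements that follow it in document order
         [not(preceding::comment()[.=' end content '])] <-- keep only those with no "end content" comment before them

The [1] is optional: without it, the last predicate still drops everything after the first "end content" comment, which is why the one-line form above omits it.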
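
To see why the expression in the question selects too much, and what "stop at the first end comment" means in practice, here is a rough Python sketch. It does not run XPath at all: it walks the sibling nodes with xml.dom.minidom from the standard library, which is my own choice for illustration. The helper names (is_comment, sibling_predicate_matches, first_block, elements) are made up, and the "........" placeholders from the question are left out so the sample parses as XML.

```python
from xml.dom import minidom

# Sample markup from the question, without the "........" placeholders.
HTML = """<html>
 <!-- begin content -->
 <div>some text</div>
 <div>
   <p>Some more elements</p>
 </div>
 <!-- end content --><!-- begin content -->
 <div>more text</div>
 <!-- end content -->
</html>"""

BEGIN, END = " begin content ", " end content "


def is_comment(node, text):
    """True if node is a comment whose text equals `text` exactly."""
    return node.nodeType == node.COMMENT_NODE and node.data == text


def sibling_predicate_matches(parent):
    """Emulate the question's two predicates: a node matches when SOME earlier
    sibling is a begin comment and SOME later sibling is an end comment."""
    kids = list(parent.childNodes)
    return [
        node for i, node in enumerate(kids)
        if any(is_comment(k, BEGIN) for k in kids[:i])
        and any(is_comment(k, END) for k in kids[i + 1:])
    ]


def first_block(parent):
    """Collect the siblings after the first begin comment, stopping at the
    first end comment that follows it."""
    block, inside = [], False
    for node in parent.childNodes:
        if not inside:
            inside = is_comment(node, BEGIN)
        elif is_comment(node, END):
            break
        else:
            block.append(node)
    return block


def elements(nodes):
    """Serialize only the element nodes, skipping whitespace text and comments."""
    return [n.toxml() for n in nodes if n.nodeType == n.ELEMENT_NODE]


root = minidom.parseString(HTML).documentElement
loose = elements(sibling_predicate_matches(root))  # question's behaviour
tight = elements(first_block(root))                # first block only
print(len(loose), len(tight))
```

On this sample, loose contains all three divs, because the last "end content" comment satisfies the second predicate for every node before it, including the two comments in the middle. tight contains only the first two divs. Unlike following::*, this walk only looks at siblings, so the nested <p> is not returned as a separate result.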