what xpath to select CDATA content when some child

Posted 2020-02-14 07:00

Question:

Let's say I have an XML that looks like this:

<a>
  <b>
     <![CDATA[some text]]>
     <c>xxx</c>
     <d>yyy</d>
  </b>
</a>

I can't find a way to get "some text". Any idea?

If I use "a/b", it also returns xxx and yyy. If I use "a/b/text()", it returns nothing.

Answer 1:

You can't actually select a CDATA section: CDATA is just a way of telling the parser to avoid unescaping special characters, and your input document looks to XPath exactly the same as:

<a>
  <b>
     some text
     <c>xxx</c>
     <d>yyy</d>
  </b>
</a>

(Having said that, if you're using DOM, some DOM-based XPath engines fail to implement the spec correctly and treat the CDATA content as a separate text node from the text outside the CDATA section.)

The XPath expression a/b/text() should select three text nodes, of which the first contains "some text" along with surrounding whitespace.
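As a rough illustration of the point above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. This is an assumption about tooling, not what the asker used. ElementTree's limited XPath subset does not support text(), so the sketch reads the .text and .tail attributes instead, which hold the same character data that the three text nodes would contain.

```python
import xml.etree.ElementTree as ET

# Same document as in the question.
XML = (
    "<a>\n"
    "  <b>\n"
    "     <![CDATA[some text]]>\n"
    "     <c>xxx</c>\n"
    "     <d>yyy</d>\n"
    "  </b>\n"
    "</a>"
)

root = ET.fromstring(XML)   # root is the <a> element
b = root.find("b")          # corresponds to the path a/b

# Text before the first child element; plays the role of a/b/text()[1].
# The CDATA content is merged with the surrounding whitespace.
first_text = b.text

# Text following each child element; plays the role of the other two text nodes.
tails = [child.tail for child in b]

print(repr(first_text))      # includes the line breaks and indentation
print(first_text.strip())    # the CDATA content on its own
```

Stripping the whitespace (or using normalize-space() in an engine that supports full XPath) leaves just the CDATA content.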



Answer 2:

In the XPath data model, the path /a/b/text()[1] should select a text node whose string value is

     some text

that is, a line break, some spaces, the text "some text", and then another line break and some more spaces.



Tags: xpath cdata