I would like to construct an XPath query that returns a "div" or "table" element, so long as it has a descendant containing the text "abc". The one caveat is that the returned element itself cannot have any div or table descendants.
<div>
<table>
<form>
<div>
<span>
<p>abcdefg</p>
</span>
</div>
<table>
<span>
<p>123456</p>
</span>
</table>
</form>
</table>
</div>
So the only correct result of this query would be:
/div/table/form/div
My best attempt looks something like this:
//div[contains(//text(), "abc") and not(descendant::div or descendant::table)] | //table[contains(//text(), "abc") and not(descendant::div or descendant::table)]
but does not return the correct result.
Thanks for your help.
Something different: :)
//text()[contains(.,'abc')]/ancestor::*[self::div or self::table][1]
Seems a lot shorter than the other solutions, doesn't it? :)
Translated to simple English: for any text node in the document that contains the string "abc", select its first ancestor that is either a div or a table.
This is more efficient, as it requires only one full scan of the document tree, and the ancestor::* traversal is very cheap compared to a descendant:: (subtree) scan.
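To make the "walk up from the matching text" idea concrete, here is a minimal Python sketch of the same logic using only the standard library. It is not an XPath engine: `nearest_containers` is a hypothetical helper name, and ElementTree's `.text`/`.tail` strings stand in for XPath text nodes.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
XML = """<div>
<table>
<form>
<div>
<span>
<p>abcdefg</p>
</span>
</div>
<table>
<span>
<p>123456</p>
</span>
</table>
</form>
</table>
</div>"""


def nearest_containers(root, needle, names=("div", "table")):
    """For each text chunk containing `needle`, collect its closest
    ancestor element whose tag is in `names` (hypothetical helper)."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    found = []
    for elem in root.iter():
        # elem.text sits inside elem; elem.tail sits inside elem's parent.
        for text, owner in ((elem.text, elem), (elem.tail, parent.get(elem))):
            if not text or needle not in text:
                continue
            # Walk upward (the ancestor:: axis) to the first div/table.
            node = owner
            while node is not None and node.tag not in names:
                node = parent.get(node)
            if node is not None and not any(node is f for f in found):
                found.append(node)
    return found


root = ET.fromstring(XML)
hits = nearest_containers(root, "abc")
print([h.tag for h in hits])  # ['div']
```

One thing worth keeping in mind: this picks the closest div/table above the matching text. For an input such as `<div><p>abc</p><table/></div>` it would still return the outer div, although that div has a table descendant, so whether it matches the caveat in the question exactly depends on the shape of the document.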
To verify that this solution "really works":
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"//text()[contains(.,'abc')]/ancestor::*[self::div or self::table][1] "/>
</xsl:template>
</xsl:stylesheet>
when this transformation is performed on the provided XML document:
<div>
<table>
<form>
<div>
<span>
<p>abcdefg</p>
</span>
</div>
<table>
<span>
<p>123456</p>
</span>
</table>
</form>
</table>
</div>
the wanted, correct result is produced:
<div>
<span>
<p>abcdefg</p>
</span>
</div>
Note: it isn't necessary to use XSLT. Any XPath 1.0 host, such as the DOM, must obtain the same result.
//*[self::div|self::table]
[descendant::text()[contains(.,"abc")]]
[not(descendant::div|descendant::table)]
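The problem with contains(//text(), "abc")
is that a function expecting a string converts a node-set by taking the string-value of only its first node (in document order). In addition, //text() is an absolute path, so it selects text nodes from the whole document rather than from the div or table being tested.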
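A rough Python sketch of that conversion rule, using only the standard library. The names `text_nodes` and `xpath_contains` are made up for illustration, and the code only mimics XPath 1.0 behaviour rather than running a real XPath engine.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, line breaks included.
XML = """<div>
<table>
<form>
<div>
<span>
<p>abcdefg</p>
</span>
</div>
<table>
<span>
<p>123456</p>
</span>
</table>
</form>
</table>
</div>"""


def text_nodes(elem):
    """Yield text chunks in document order, roughly what //text() selects."""
    if elem.text:
        yield elem.text
    for child in elem:
        yield from text_nodes(child)
        if child.tail:
            yield child.tail


def xpath_contains(node_set, needle):
    """Mimic contains(node-set, string): only the FIRST node is used."""
    first = node_set[0] if node_set else ""
    return needle in first


all_text = list(text_nodes(ET.fromstring(XML)))

# The first text node is just the line break between <div> and <table>,
# so the whole-document test fails even though "abc" occurs later.
print(repr(all_text[0]))
print(xpath_contains(all_text, "abc"))
print(any("abc" in t for t in all_text))
```

Testing each text node individually, as `descendant::text()[contains(., "abc")]` does, corresponds to the last line rather than to the first-node-only check.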
You could try:
//div[
descendant::text()[contains(., "abc")]
and not(descendant::div or descendant::table)
] |
//table[
descendant::text()[contains(., "abc")]
and not(descendant::div or descendant::table)
]
Does that help?