Find position of a node within a nodeset using XPath

Posted 2019-02-17 13:36

Question:

After playing around with position() in vain, I googled for a solution and arrived at this older Stack Overflow question, which almost describes my problem.

The difference is that the nodeset I want the position within is dynamic, rather than a contiguous section of the document.

To illustrate I'll modify the example from the linked question to match my requirements. Note that each <b> element is within a different <a> element. This is the critical bit.

<root>
    <a>
        <b>zyx</b>
    </a>
    <a>
        <b>wvu</b>
    </a>
    <a>
        <b>tsr</b>
    </a>
    <a>
        <b>qpo</b>
    </a>
</root>

Now if I queried for a/b using XPath, I'd get a nodeset of the four <b> nodes. I then want to find the position, within that nodeset, of the node that contains the string 'tsr'. The solution in the other post breaks down here: count(a/b[.='tsr']/preceding-sibling::*)+1 returns 1, because preceding-sibling navigates the document rather than the context nodeset.

Is it possible to work within the context nodeset?

Answer 1:

Here is a general solution that works for any node belonging to any node-set of nodes from the same document.

I use XSLT to implement the solution, but the end result is a single XPath expression that can be used with any other hosting language.

Let $vNodeSet be the node-set and $vNode be the node in this node-set whose position we want to find.

Then, let $vPrecNodes contain all nodes in the XML document that precede $vNode.

Then, let $vAncNodes contain all nodes in the XML document that are ancestors of $vNode.

The set of nodes in $vNodeSet that precede $vNode in document order consists of all nodes in the node-set that also belong to $vPrecNodes, plus all nodes in the node-set that also belong to $vAncNodes. Both sets are needed because ancestors come before $vNode in document order but are not on its preceding axis.

I will use the well-known Kaysian formula for intersection of two nodesets:

$ns1[count(.|$ns2) = count($ns2)]

contains exactly the nodes in the intersection of $ns1 with $ns2.

Based on all this, let $vPrecInNodeSet be the set of nodes in $vNodeSet that precede $vNode in document order. The following XPath expression defines $vPrecInNodeSet:

$vNodeSet
      [count(.|$vPrecNodes) = count($vPrecNodes)
      or
       count(.|$vAncNodes) = count($vAncNodes)
      ]

Finally, the wanted position is: count($vPrecInNodeSet) +1

Here's how this all works together:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:variable name="vNodeSet" select="/*/a/b"/>

 <xsl:variable name="vNode" select="$vNodeSet[. = 'tsr'][1]"/>

 <xsl:variable name="vPrecNodes" select="$vNode/preceding::node()"/>

 <xsl:variable name="vAncNodes" select="$vNode/ancestor::node()"/>

 <xsl:variable name="vPrecInNodeSet" select=
  "$vNodeSet
      [count(.|$vPrecNodes) = count($vPrecNodes)
      or
       count(.|$vAncNodes) = count($vAncNodes)
      ]
  "/>

 <xsl:template match="/">
   <xsl:value-of select="count($vPrecInNodeSet) +1"/>
 </xsl:template>
</xsl:stylesheet>

When the above transformation is applied on the provided XML document:

<root>
    <a>
        <b>zyx</b>
    </a>
    <a>
        <b>wvu</b>
    </a>
    <a>
        <b>tsr</b>
    </a>
    <a>
        <b>qpo</b>
    </a>
</root>

the correct result is produced:

3

Do note: This solution does not depend on XSLT (used only for illustrative purposes). You may assemble a single XPath expression, substituting the variables with their definition, until there are no more variables to substitute.
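
For hosts without a full XPath engine, the same reasoning can be sketched directly. This is only a minimal sketch, assuming Python's standard-library xml.etree.ElementTree and the example document from the question; the function name position_by_document_order is hypothetical. It relies on the fact that the preceding and ancestor axes together are exactly the nodes that come earlier in document order (elements only here, since ElementTree does not expose text nodes as separate nodes).

```python
import xml.etree.ElementTree as ET

XML = """<root>
    <a><b>zyx</b></a>
    <a><b>wvu</b></a>
    <a><b>tsr</b></a>
    <a><b>qpo</b></a>
</root>"""


def position_by_document_order(root, nodeset, node):
    """1-based position of node among the members of nodeset, in document order."""
    # root.iter() yields elements in document order, so an element has a smaller
    # index than node exactly when it is an ancestor of node or precedes it.
    order = {id(el): i for i, el in enumerate(root.iter())}
    target = order[id(node)]
    preceding_in_set = [n for n in nodeset if order[id(n)] < target]
    return len(preceding_in_set) + 1


root = ET.fromstring(XML)
node = next(b for b in root.iter("b") if b.text == "tsr")

print(position_by_document_order(root, root.findall("a/b"), node))  # prints 3
# With every element as the node-set, the ancestors <root> and <a> count too.
print(position_by_document_order(root, list(root.iter()), node))    # prints 7
```

The second call shows why the ancestor term matters: dropping it would miss <root> and the enclosing <a>.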



Answer 2:

I think I have a working solution

The idea is to count how many nodes precede the target element in the document, and then count how many nodes in the nodeset have at most that many preceding nodes. In XPath this is:

count(//a/b[count(./preceding::node()) <= count(//a/b[.='tsr']/preceding::node())])

(Inside an XSLT attribute, as in the stylesheet below, the < must be escaped as &lt;.)

You can also use variables in this expression to select a different nodeset or to match different text content. The important part is that the variables have the correct type: a variable holding the path as a string is not a nodeset. Below are an XSLT example and its output, using the example document from the question as the input file.

XSLT document

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output encoding="utf-8" method="text"/>

    <xsl:variable name="nodeset" select="//a/b"/>
    <xsl:variable name="path-string">//a/b</xsl:variable>
    <xsl:variable name="text">tsr</xsl:variable>

    <xsl:template match="/">
        <xsl:text>Find and print position of a node within a nodeset&#10;&#10;</xsl:text>

        <xsl:text>Position of "tsr" node in the nodeset = "</xsl:text>
        <xsl:value-of select="count(//a/b[count(./preceding::node()) &lt;= count(//a/b[.='tsr']/preceding::node()) ])"/>
        <xsl:text>"&#10;&#10;</xsl:text>

        <xsl:text>( Try the same using variables "$nodeset" and "$text" )&#10;</xsl:text>
        <xsl:text>Size of nodeset "$nodeset" = "</xsl:text>
        <xsl:value-of select="count($nodeset)"/>
        <xsl:text>"&#10;</xsl:text>
        <xsl:text>Variable "$text" = "</xsl:text>
        <xsl:value-of select="$text"/>
        <xsl:text>"&#10;</xsl:text>
        <xsl:text>Position of "</xsl:text>
        <xsl:value-of select="$text"/>
        <xsl:text>" node in the nodeset = "</xsl:text>
        <xsl:value-of select="count($nodeset[count(./preceding::node()) &lt;= count($nodeset[.=$text]/preceding::node()) ])"/>
        <xsl:text>"&#10;&#10;</xsl:text>

        <xsl:text>( Show that using a variable that has the path as a string does not work )&#10;</xsl:text>
        <xsl:text>Variable "$path-string" = "</xsl:text>
        <xsl:value-of select="$path-string"/>
        <xsl:text>"&#10;</xsl:text>
        <xsl:text>Result of "count($path-string)" = "</xsl:text>
        <xsl:value-of select="count($path-string)"/>
        <xsl:text>"&#10;&#10;</xsl:text>

        <xsl:text>End of tests&#10;</xsl:text>
    </xsl:template>

</xsl:stylesheet>

Output from the example document

Find and print position of a node within a nodeset

Position of "tsr" node in the nodeset = "3"

( Try the same using variables "$nodeset" and "$text" )
Size of nodeset "$nodeset" = "4"
Variable "$text" = "tsr"
Position of "tsr" node in the nodeset = "3"

( Show that using a variable that has the path as a string does not work )
Variable "$path-string" = "//a/b"
Result of "count($path-string)" = "1"

End of tests

I have not tested my solution extensively so please give feedback if you use it.
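
The counting idea can also be tried outside XSLT. What follows is a rough sketch, assuming Python's standard-library xml.etree.ElementTree; the helper names preceding_element_counts and position_by_preceding_count are hypothetical. One deliberate difference from the XPath above: it counts preceding elements only, because ElementTree does not model text nodes as separate nodes.

```python
import xml.etree.ElementTree as ET

XML = """<root>
    <a><b>zyx</b></a>
    <a><b>wvu</b></a>
    <a><b>tsr</b></a>
    <a><b>qpo</b></a>
</root>"""


def preceding_element_counts(root):
    """Map id(element) to the number of elements on its preceding axis."""
    counts = {}
    visited = 0  # elements seen so far, in document order

    def walk(element, depth):
        nonlocal visited
        # Every element seen earlier is either one of the `depth` ancestors
        # or lies on the preceding axis.
        counts[id(element)] = visited - depth
        visited += 1
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return counts


def position_by_preceding_count(root, nodeset, node):
    """Count nodeset members with at most as many preceding elements as node."""
    counts = preceding_element_counts(root)
    limit = counts[id(node)]
    return sum(1 for n in nodeset if counts[id(n)] <= limit)


root = ET.fromstring(XML)
nodeset = root.findall("a/b")
node = next(b for b in nodeset if b.text == "tsr")
print(position_by_preceding_count(root, nodeset, node))  # prints 3
```

Because this sketch counts elements only, a target and its first child element have the same count; the flat a/b nodeset from the question is not affected by that tie.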



Answer 3:

The earlier count-the-preceding(-sibling) answers work well in some cases; you're just re-specifying the context nodeset from the perspective of the item selected, and then applying count(preceding:: ) to it.

But in other cases, count-the-preceding is really hard to keep within the nodeset you want to work with, as you were hinting at. E.g. suppose your working nodeset was /html/body/div[3]//a (all the <a> anchors in the third <div> of the web page), and you wanted to find the position of a[@href="foo.html"] within that set. If you tried to use count(preceding::a), you'd accidentally be counting <a> anchors from other divs, i.e. outside your working nodeset. And if you tried count(preceding-sibling::a), you wouldn't get them all because the relevant <a> elements could be at any level.

You could try to restrict the count using preceding::a[ancestor::div[count(preceding-sibling::div) = 2]] but it gets really awkward fast, and still wouldn't be possible in all cases. Moreover you'd have to rework this expression if you ever updated the XPath expression for your working set, and keeping them equivalent would be non-trivial.

However if you're using XSLT, the following avoids these problems. If you can specify the working nodeset, you can find the position of a node within it matching supplied criteria. And you don't have to specify the nodeset twice:

    <xsl:for-each select="/root/a/b">
        <xsl:if test=". = 'tsr'"><xsl:value-of select="position()"/></xsl:if>
    </xsl:for-each>

This works because within the for-each, the context position "identifies the position of the context item in the sequence being processed."

If you aren't working in XSLT, what environment are you in? There is probably a similar construct there for iterating through the result of the outer XPath expression, and there you can maintain your own counter (if there's not a context position available), and test each item against your inner criteria.
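
As one illustration of such a construct, here is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module and the example document from the question; the function name position_in_nodeset is hypothetical. The outer path is evaluated once, and a 1-based counter is kept while each item is tested against the inner criterion.

```python
import xml.etree.ElementTree as ET

XML = """<root>
    <a><b>zyx</b></a>
    <a><b>wvu</b></a>
    <a><b>tsr</b></a>
    <a><b>qpo</b></a>
</root>"""


def position_in_nodeset(nodes, predicate):
    """Return the 1-based position of the first node satisfying predicate, or None."""
    for position, node in enumerate(nodes, start=1):
        if predicate(node):
            return position
    return None


root = ET.fromstring(XML)
# The working nodeset is specified once; findall returns matches in document order.
nodeset = root.findall("a/b")
print(position_in_nodeset(nodeset, lambda b: b.text == "tsr"))  # prints 3
```

As with the for-each above, changing the working nodeset only means changing the one path passed to findall.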

The reason the other guy's attempt on the older question, a/b[.='tsr']/position(), didn't work is that each slash pushes a new context onto the stack, so by the time position() is called the context position is always 1. (This syntax is only valid in XPath 2.0, by the way.)



Answer 4:

The reason you are getting 1 has nothing to do with context vs. document. It is because you are only counting b nodes within the one a node, so you will always get a count of 0: there are never any preceding 'b' siblings.

Rather, you need to find the count of 'a' nodes that precede the 'a' containing your 'b'.

Something like this (add 1 to turn the count into a 1-based position):

count(a[b[.='tsr']]/preceding-sibling::a)
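
To check the arithmetic, here is a small sketch in Python, assuming the standard-library xml.etree.ElementTree and the question's example document; the function name count_preceding_a is hypothetical. It counts the <a> siblings that come before the <a> containing the matching <b>, which mirrors the expression above and only works when each <a> holds a single <b>.

```python
import xml.etree.ElementTree as ET

XML = """<root>
    <a><b>zyx</b></a>
    <a><b>wvu</b></a>
    <a><b>tsr</b></a>
    <a><b>qpo</b></a>
</root>"""


def count_preceding_a(root, text):
    """Number of <a> siblings before the first <a> that has a <b> child with this text."""
    for preceding, a in enumerate(root.findall("a")):
        if any(b.text == text for b in a.findall("b")):
            return preceding
    return None


root = ET.fromstring(XML)
preceding = count_preceding_a(root, "tsr")
print(preceding)      # prints 2
print(preceding + 1)  # prints 3, the 1-based position
```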


Answer 5:

From (i.e. against) the root, counting the b elements that precede the target (add 1 for the 1-based position):

count(//a/b[.='tsr']/preceding::b)

If you had, say, another node, e.g.:

<c>
    <b>qqq</b>
</c>

and wanted to ignore all b elements that do not have an "a" parent, you could do something like

count(//a/b[.='tsr']/preceding::b[local-name(parent::node())='a'])

etc



Answer 6:

How about this:

count(a/b[.='tsr']/preceding-sibling::b) + count(a[b[.='tsr']]/preceding-sibling::a/b) + 1

Count the previous siblings of the b element within the current a element, and then count the b elements of all previous siblings of the a element. Or something like that.
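
The two counts can be spelled out in a short sketch. This assumes Python's standard-library xml.etree.ElementTree and a made-up variant of the example document in which some <a> elements hold more than one <b>; the function name position_via_siblings is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the question's document with several <b> per <a>.
XML = """<root>
    <a><b>zyx</b><b>wvu</b></a>
    <a><b>tsr</b></a>
    <a><b>qpo</b><b>nml</b></a>
</root>"""


def position_via_siblings(root, text):
    """1-based position of the first matching <b> among all a/b, or None."""
    b_in_earlier_a = 0  # <b> children of the <a> siblings already passed
    for a in root.findall("a"):
        b_children = a.findall("b")
        for earlier_b_siblings, b in enumerate(b_children):
            if b.text == text:
                # preceding <b> siblings + <b> in preceding <a> siblings + 1
                return earlier_b_siblings + b_in_earlier_a + 1
        b_in_earlier_a += len(b_children)
    return None


root = ET.fromstring(XML)
print(position_via_siblings(root, "tsr"))  # prints 3
print(position_via_siblings(root, "nml"))  # prints 5
```

Like the XPath expression, this depends on the nodeset being exactly a/b under one parent; it does not generalize to arbitrary nodesets.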