how to get the most deeply nested element nodes us

Posted 2019-01-25 21:18

Question:

I need to extract (with XSLT, XPath, XQuery... preferably XPath) the most deeply nested element nodes that have a method attribute (<DEST id="RUSSIA" method="delete"/>) together with their parent (<SOURCE id="AFRICA" method="modify">).

I don't want to get the top nodes with methods (<main method="modify"> or <MACHINE method="modify">).

The most deeply nested elements with a method correspond to real actions. The top elements with a method are actually dummy actions that must not be taken into account.

Here is my XML sample file:

<?xml version="1.0" encoding="UTF-8"?>
<main method="modify">
<MACHINE method="modify">  
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>

  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>

This is the XPath output I expect:

<SOURCE id="AFRICA" method="modify"><DEST id="RUSSIA" method="delete"/>

<SOURCE id="AFRICA" method="modify"><DEST id="USA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="AUSTRALIA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="CANADA" method="create"/>

My current XPath command does not give the expected result.

The command xpath("//*[@method]/ancestor::*") returns:

<main><MACHINE method="modify">                                        # NOT WANTED

<MACHINE method="modify"><SOURCE id="AFRICA" method="modify">          # NOT WANTED

<MACHINE method="modify"><SOURCE id="USA" method="modify">             # NOT WANTED

<SOURCE id="AFRICA" method="modify"><DEST id="RUSSIA" method="delete"/>

<SOURCE id="AFRICA" method="modify"><DEST id="USA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="AUSTRALIA" method="modify"/>

<SOURCE id="USA" method="modify"><DEST id="CANADA" method="create"/>

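My XML::Twig code, for additional context:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
use XML::XPath;

my $t = XML::Twig->new;
$t->parsefile('input.xml');

# Select every ancestor of an element that carries a method attribute.
# NOTE: get_xpath implements only a subset of XPath; XML::Twig::XPath's
# findnodes may be needed for axes such as ancestor::
my @abc = $t->get_xpath('//*[@method]/ancestor::*');

foreach my $parent (@abc) {                  # outer loop: each selected ancestor
    foreach my $child ($parent->children) {  # inner loop: each of its children
        print $parent->start_tag;
        print $child->start_tag;
    }
}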

Answer 1:

The nodes with maximum depth can be found with the XPath 2.0 expression

//*[count(ancestor::*) = max(//*/count(ancestor::*))]

but it might perform horribly, depending on how smart your optimizer is.

Having found those nodes, it is of course trivial to find their ancestors. But you are looking for output with more structure than XPath alone can provide.
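
The same selection can be sketched outside XPath. The following is a minimal Python illustration using only the standard library, not part of the original answer; the names SAMPLE, elements_with_depth and deepest_elements are made up for this example. It counts the ancestors of every element in one pass and keeps the elements whose count equals the maximum, which avoids recomputing the maximum for each candidate.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (whitespace simplified).
SAMPLE = """<main method="modify">
<MACHINE method="modify">
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>
  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>"""


def elements_with_depth(elem, depth=0):
    """Yield (element, number of ancestor elements) in document order."""
    yield elem, depth
    for child in elem:
        yield from elements_with_depth(child, depth + 1)


def deepest_elements(root):
    """Return the elements whose ancestor count equals the maximum."""
    pairs = list(elements_with_depth(root))
    max_depth = max(depth for _, depth in pairs)
    return [elem for elem, depth in pairs if depth == max_depth]


root = ET.fromstring(SAMPLE)
deepest_ids = [elem.get("id") for elem in deepest_elements(root)]
print(deepest_ids)
```

For the sample document this selects the four DEST elements, since each of them has three ancestors.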



Answer 2:

As I mentioned in my comment on the question, I don't think this is possible with pure XPath, because XPath has nothing like a current() function that would let a predicate refer to the context outside the [] restriction.

The closest solution I can offer is this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ZD="http://xyz.abc">
    <xsl:output method="text"/>

    <xsl:template match="//*">
        <xsl:choose>
            <xsl:when test="not(//*[count(ancestor::node()) > count(current()/ancestor::node())])"><xsl:value-of select="local-name(.)"/><xsl:text>
</xsl:text></xsl:when>
            <xsl:otherwise>
                <xsl:copy>
                    <xsl:apply-templates select="@*|node()"/>
                </xsl:copy>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

    <xsl:template match="text()|@*"/>
</xsl:stylesheet>

The <xsl:when> element finds the most deeply nested elements. As an example, I'm outputting the local names of the found elements, followed by a newline, but of course you can output anything you need there.

Update: Note that this is based on XPath 1.0 knowledge/tools. It seems that this is indeed possible to express in XPath 2.0.



Answer 3:

The stylesheet

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/">
  <xsl:apply-templates 
     select="//DEST[@method and not(node())]"/>
</xsl:template>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="DEST[@method and not(node())]">
  <xsl:apply-templates select="..">
    <xsl:with-param name="leaf" select="current()"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="*[DEST[@method and not(node())]]">
  <xsl:param name="leaf"/>
  <xsl:copy>
    <xsl:copy-of select="@* , $leaf"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

transforms

<?xml version="1.0" encoding="UTF-8"?>
<main method="modify">
<MACHINE method="modify">  
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>

  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>

into

<SOURCE id="AFRICA" method="modify">
   <DEST id="RUSSIA" method="delete"/>
</SOURCE>
<SOURCE id="AFRICA" method="modify">
   <DEST id="USA" method="modify"/>
</SOURCE>
<SOURCE id="USA" method="modify">
   <DEST id="AUSTRALIA" method="modify"/>
</SOURCE>
<SOURCE id="USA" method="modify">
   <DEST id="CANADA" method="create"/>
</SOURCE>
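
For readers without an XSLT 2.0 processor, the pairing done by this stylesheet can be approximated in Python. This is a rough sketch rather than a translation: it uses only the standard library, the helper names start_tag and parent_leaf_pairs are invented here, and it assumes that "leaf" means an element with a method attribute and no child elements, whatever its name (the stylesheet restricts itself to DEST and also excludes text children).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# The sample document from the question (whitespace simplified).
SAMPLE = """<main method="modify">
<MACHINE method="modify">
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>
  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>"""


def start_tag(elem, empty=False):
    """Serialize only the start tag of an element, attributes in source order."""
    attrs = "".join(f" {name}={quoteattr(value)}" for name, value in elem.attrib.items())
    return f"<{elem.tag}{attrs}{'/' if empty else ''}>"


def parent_leaf_pairs(root):
    """Pair each childless element that has a method attribute with its parent."""
    pairs = []
    for parent in root.iter():          # every element, in document order
        for child in parent:            # its direct children only
            if "method" in child.attrib and len(child) == 0:
                pairs.append(start_tag(parent) + start_tag(child, empty=True))
    return pairs


pairs = parent_leaf_pairs(ET.fromstring(SAMPLE))
for line in pairs:
    print(line)
```

Each printed line holds one parent start tag followed by one leaf, which is the shape of output the question asks for.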


Answer 4:

One such XPath 2.0 expression is:

//*[not(*)
  and
   count(ancestor::*)
  =
   max(//*[not(*)]/count(ancestor::*))
   ]
     /(self::node|..)

To illustrate this with a complete XSLT 2.0 example:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:variable name="vResult" select=
     "//*[not(*)
        and
          count(ancestor::*)
       =
        max(//*[not(*)]/count(ancestor::*))
        ]
          /(self::node|..)
     "/>

 <xsl:template match="/">
     <xsl:sequence select="$vResult"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<main method="modify">
    <MACHINE method="modify">
        <SOURCE id="AFRICA" method="modify">
            <DEST id="RUSSIA" method="delete"/>
            <DEST id="USA" method="modify"/>
        </SOURCE>
        <SOURCE id="USA" method="modify">
            <DEST id="AUSTRALIA" method="modify"/>
            <DEST id="CANADA" method="create"/>
        </SOURCE>
    </MACHINE>
</main>

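the XPath expression is evaluated and the selected elements are copied to the output. Note that self::node is a name test for an element named node, which matches nothing in this document, so the expression effectively selects only the parents (..) of the elements at maximum depth, and each parent is copied together with its children. With self::node() instead, the DEST elements would also be selected and emitted a second time on their own: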

<SOURCE id="AFRICA" method="modify">
            <DEST id="RUSSIA" method="delete"/>
            <DEST id="USA" method="modify"/>
        </SOURCE>
<SOURCE id="USA" method="modify">
            <DEST id="AUSTRALIA" method="modify"/>
            <DEST id="CANADA" method="create"/>
        </SOURCE>
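
The result contains each SOURCE only once, with both of its DEST children, because a node sequence produced by a path step never holds the same parent twice and copying an element copies its whole subtree. A small Python sketch of that two-step selection (deepest childless elements first, then their distinct parents) is given below. It is an illustration under the same assumptions as the expression above, uses only the standard library, and the name parents_of_deepest_leaves is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample document from the question (whitespace simplified).
SAMPLE = """<main method="modify">
<MACHINE method="modify">
  <SOURCE id="AFRICA" method="modify">
    <DEST id="RUSSIA" method="delete"/>
    <DEST id="USA" method="modify"/>
  </SOURCE>
  <SOURCE id="USA" method="modify">
    <DEST id="AUSTRALIA" method="modify"/>
    <DEST id="CANADA" method="create"/>
  </SOURCE>
</MACHINE>
</main>"""


def parents_of_deepest_leaves(root):
    """Return the distinct parents of the deepest childless elements, in document order."""
    leaves = []  # (ancestor count of the leaf, parent of the leaf)

    def walk(elem, parent, depth):
        if len(elem) == 0:              # no child elements: a leaf
            leaves.append((depth, parent))
        for child in elem:
            walk(child, elem, depth + 1)

    walk(root, None, 0)
    max_depth = max(depth for depth, _ in leaves)

    parents = []
    for depth, parent in leaves:
        if depth != max_depth or parent is None:
            continue
        # Compare by identity so that each parent is kept only once.
        if not any(seen is parent for seen in parents):
            parents.append(parent)
    return parents


parents = parents_of_deepest_leaves(ET.fromstring(SAMPLE))
for parent in parents:
    # Serializing a parent also serializes its children.
    print(ET.tostring(parent, encoding="unicode").strip())
```

Splitting such a result into one parent per leaf, as the question requires, still needs a second step over the children of each returned parent.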