Merge two or more nodes in XML using XSLT

Posted 2019-08-24 02:37

There are 3 scenarios in this problem:

First possibility: Input:

<root>
    <node id="N1">
        <fruit id="1" action="aaa">
            <orange id="x" action="create">
                <attribute>
                    <color>Orange</color>
                    <year>2012</year>
                </attribute>
            </orange>
            <orange id="x" action="change">
                <attribute>
                    <color>Red</color>
                </attribute>
            </orange>
            <orange id="x" action="change">
                <attribute>
                    <color>Blue</color>
                    <condition>good</condition>
                </attribute>
            </orange>
        </fruit>
    </node>
</root>

Expected output:

<root>
    <node id="N1">
        <fruit id="1" action="aaa">
            <orange id="x" action="create">
                <attribute>
                    <color>Blue</color>
                    <year>2012</year>
                    <condition>good</condition>
                </attribute>
            </orange>
        </fruit>
    </node>
</root>

Second Possibility: Input:

<root>
    <node id="N1">
        <car id="1">
            <bmw id="i" action="change">
                <attribute>
                    <color>Blue</color>
                    <owner>a</owner>
                </attribute>
            </bmw>
            <bmw id="i" action="change">
                <attribute>
                    <color>Yellow</color>
                    <status>available</status>
                </attribute>
            </bmw>
        </car>
    </node>
</root>

Expected Output:

<root>
    <node id="N1">
        <car id="1">
            <bmw id="i" action="change">
                <attribute>
                    <color>Yellow</color>
                    <owner>a</owner>
                    <status>available</status>
                </attribute>
            </bmw>
        </car>
    </node>
</root>

Third possibility: Input:

<root>
    <node id="N1">
        <car id="1">
            <bmw id="j" action="delete">
                <attribute>
                    <color>Blue</color>
                    <year>2000</year>
                </attribute>
            </bmw>
            <bmw id="j" action="delete">
                <attribute>
                    <color>Pink</color>
                    <status>available</status>
                </attribute>
            </bmw>
        </car>
    </node>
</root>

Expected Output:

<root>
    <node id="N1">
        <car id="1">
            <bmw id="j" action="delete">
                <attribute>
                    <color>Pink</color>
                    <year>2000</year>
                    <status>available</status>
                </attribute>
            </bmw>            
        </car>
    </node>
</root>

Explanation of the second and third scenarios:

  • Two or more nodes with 'action=change' are merged into one node with 'action=change'.
  • Two or more nodes with 'action=delete' are merged into one node with 'action=delete'.
  • While merging, we keep the initial node, update each value that reappears with the value from the last node, and add any new child elements from the later nodes.

I hope the explanation is clear.

Please advise me on XSLT solution for this problem. Thank you.

kind regards, John

Tags: xml xslt
1 answer
冷血范
#2 · 2019-08-24 03:22

Here's a solution of a different flavor compared to the one I gave you here.

I figured it would be worth going step by step. I assumed that @action values appear in a logical order: create first, change next, and remove last. The same @action can occur multiple times, but the order is never random. Now we're ready to look at the main logic:

<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

We declare the identity transformation and then intercept it in a few places. We only stop at the first occurrence of a node with a given @id, parent @id, and @action:

<xsl:template match="node/*/*[a:is-primary(.)]" priority="1">
    <xsl:copy>
        <xsl:apply-templates select="@*"/>
        <xsl:apply-templates select="attribute" mode="consolidate-most-recent"/>
    </xsl:copy>
</xsl:template>

We ignore the "duplicates":

<xsl:template match="node/*/*[not(a:is-primary(.))]"/>

We also ignore any change preceded by a create, as well as any create or change followed by a remove:

<xsl:template match="node/*/*[@action = 'change'][a:preceded-by(., 'create')]" priority="2"/>
<xsl:template match="node/*/*[@action = 'create' or @action = 'change'][a:followed-by(., 'remove')]" priority="2"/>

Once we have captured the unique @action that is not followed by another @action that would make us ignore it, we do a simple thing: collect the attribute children of all elements with the same @ids, ignoring @action, and use their most "recent" values (the ones appearing last in document order).

<xsl:template match="attribute" mode="consolidate-most-recent">
    <xsl:copy>
        <xsl:for-each-group 
                    select="/root/node/*/*[a:matches(current()/parent::*, ., 'any')]/attribute/*" 
                    group-by="local-name()">
            <!-- take the last in the document order -->
            <xsl:apply-templates select="current-group()[last()]"/>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

That's it. Now let's look at the functions that would make it work:

We have a key to simplify the lookup:

<xsl:key name="entity" match="/root/node/*/*" use="concat(parent::*/@id, '_', @id, '_', @action)"/>

A function to check whether it's that unique occurrence of the node (we could add this directly into the template match predicate but since we started with the functions anyway let's just keep it the same):

<xsl:function name="a:is-primary" as="xs:boolean">
    <xsl:param name="ctx"/>
    <!-- need to establish "focus"(context) for the key() function to work -->
    <xsl:for-each select="$ctx">
        <xsl:sequence select="generate-id($ctx) = generate-id(key('entity', concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action))[1])"/>
    </xsl:for-each>
</xsl:function> 
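
As a possible variation (an untested sketch, not what the transformation below uses): XSLT 2.0's key() takes an optional third argument naming the tree to search, so the xsl:for-each that only exists to set the focus could be dropped. The as="element()" type and the $first variable are my own additions for illustration, and the two templates are included only so the sketch runs on its own:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://a.com">

    <xsl:key name="entity" match="/root/node/*/*"
             use="concat(parent::*/@id, '_', @id, '_', @action)"/>

    <xsl:function name="a:is-primary" as="xs:boolean">
        <xsl:param name="ctx" as="element()"/>
        <!-- root($ctx) tells key() which document to search,
             so no context item is required inside the function -->
        <xsl:variable name="first"
            select="key('entity',
                        concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action),
                        root($ctx))[1]"/>
        <!-- node identity comparison; false when the key finds nothing -->
        <xsl:sequence select="exists($first) and $ctx is $first"/>
    </xsl:function>

    <!-- identity transformation -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop every later element with the same parent @id, @id and @action -->
    <xsl:template match="node/*/*[not(a:is-primary(.))]"/>
</xsl:stylesheet>
```

The `is` operator compares node identity directly, which is what the generate-id() comparison expresses in a more roundabout way.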

An a:matches function that does all kinds of comparisons for us (again, we could put it all in predicates, but this way we keep the real templates nice and clean):

<xsl:function name="a:matches" as="xs:boolean">
    <xsl:param name="src"/>
    <xsl:param name="target"/>
    <!-- can be one of the following:
        'any' - only match the @id(s) and ignore @action
        'same' - match by @id(s) and expect $src/@action to match $target/@action
         a certain value - match by @id(s) and expect @action to match this value
     -->
    <xsl:param name="action"/>

    <xsl:sequence select="
                  ($src/local-name() = $target/local-name()) and
                  ($src/parent::*/@id = $target/parent::*/@id) and 
                  ($src/@id = $target/@id) and 
                  (if ($action = 'any') 
                      then true()
                      else if ($action = 'same')
                          then ($target/@action = $src/@action)
                          else ($target/@action = $action))"/>  
</xsl:function>

And the preceded-by and followed-by syntactic sugar on top of the "raw" matches function:

<xsl:function name="a:preceded-by" as="xs:boolean">
    <xsl:param name="ctx"/>
    <xsl:param name="action"/>

    <xsl:sequence select="exists($ctx/preceding::*[a:matches($ctx, ., $action)])"/>
</xsl:function>

<xsl:function name="a:followed-by" as="xs:boolean">
    <xsl:param name="ctx"/>
    <xsl:param name="action"/>

    <xsl:sequence select="exists($ctx/following::*[a:matches($ctx, ., $action)])"/>
</xsl:function>
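
The preceding:: and following:: axes walk every element before or after $ctx in the whole document. If the elements to be merged always sit under the same parent (an assumption; the functions above only compare the parents' @id, so they would also match across different parents with the same @id), the sibling axes look at far fewer nodes. A minimal sketch of that variant, with parameter types that are my own additions:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://a.com">

    <!-- same comparison as a:matches above -->
    <xsl:function name="a:matches" as="xs:boolean">
        <xsl:param name="src" as="element()"/>
        <xsl:param name="target" as="element()"/>
        <xsl:param name="action" as="xs:string"/>
        <xsl:sequence select="
            (local-name($src) = local-name($target)) and
            ($src/parent::*/@id = $target/parent::*/@id) and
            ($src/@id = $target/@id) and
            (if ($action = 'any')
                then true()
                else if ($action = 'same')
                    then ($target/@action = $src/@action)
                    else ($target/@action = $action))"/>
    </xsl:function>

    <!-- assumption: candidates are siblings of $ctx -->
    <xsl:function name="a:preceded-by" as="xs:boolean">
        <xsl:param name="ctx" as="element()"/>
        <xsl:param name="action" as="xs:string"/>
        <xsl:sequence select="exists($ctx/preceding-sibling::*[a:matches($ctx, ., $action)])"/>
    </xsl:function>

    <xsl:function name="a:followed-by" as="xs:boolean">
        <xsl:param name="ctx" as="element()"/>
        <xsl:param name="action" as="xs:string"/>
        <xsl:sequence select="exists($ctx/following-sibling::*[a:matches($ctx, ., $action)])"/>
    </xsl:function>
</xsl:stylesheet>
```

On its own this module only declares the functions; it is meant to show the shape of the two helpers, not to replace the full transformation.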

SUMMARY

Here's the full transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:a="http://a.com">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="entity" match="/root/node/*/*" use="concat(parent::*/@id, '_', @id, '_', @action)"/>

    <xsl:function name="a:is-primary" as="xs:boolean">
        <xsl:param name="ctx"/>
        <!-- need to establish "focus"(context) for the key() function to work -->
        <xsl:for-each select="$ctx">
            <xsl:sequence select="generate-id($ctx) = generate-id(key('entity', concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action))[1])"/>
        </xsl:for-each>
    </xsl:function> 

    <xsl:function name="a:preceded-by" as="xs:boolean">
        <xsl:param name="ctx"/>
        <xsl:param name="action"/>

        <xsl:sequence select="exists($ctx/preceding::*[a:matches($ctx, ., $action)])"/>
    </xsl:function>

    <xsl:function name="a:followed-by" as="xs:boolean">
        <xsl:param name="ctx"/>
        <xsl:param name="action"/>

        <xsl:sequence select="exists($ctx/following::*[a:matches($ctx, ., $action)])"/>
    </xsl:function>

    <xsl:function name="a:matches" as="xs:boolean">
        <xsl:param name="src"/>
        <xsl:param name="target"/>
        <!-- can be one of the following:
            'any' - only match the @id(s) and ignore @action
            'same' - match by @id(s) and expect $src/@action to match $target/@action
             a certain value - match by @id(s) and expect @action to match this value
         -->
        <xsl:param name="action"/>

        <xsl:sequence select="
                      ($src/local-name() = $target/local-name()) and
                      ($src/parent::*/@id = $target/parent::*/@id) and 
                      ($src/@id = $target/@id) and 
                      (if ($action = 'any') 
                          then true()
                          else if ($action = 'same')
                              then ($target/@action = $src/@action)
                              else ($target/@action = $action))"/>  
    </xsl:function>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node/*/*[a:is-primary(.)]" priority="1">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="attribute" mode="consolidate-most-recent"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="attribute" mode="consolidate-most-recent">
        <xsl:copy>
            <xsl:for-each-group 
                        select="/root/node/*/*[a:matches(current()/parent::*, ., 'any')]/attribute/*" 
                        group-by="local-name()">
                <!-- take the last in the document order -->
                <xsl:apply-templates select="current-group()[last()]"/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node/*/*[not(a:is-primary(.))]"/>

    <!-- assume a remove is never followed by a change or create -->
    <xsl:template match="node/*/*[@action = 'change'][a:preceded-by(., 'create')]" priority="2"/>
    <xsl:template match="node/*/*[@action = 'create' or @action = 'change'][a:followed-by(., 'remove')]" priority="2"/>
</xsl:stylesheet>

When applied to this document:

<root>
    <node id="N1">
        <fruit id="1" action="aaa">
            <orange id="x" action="create">
                <attribute>
                    <color>Orange</color>
                    <year>2012</year>
                </attribute>
            </orange>
            <orange id="x" action="change">
                <attribute>
                    <color>Red</color>
                    <something>!!</something>
                </attribute>
            </orange>
            <orange id="x" action="change">
                <attribute>
                    <color>Blue</color>
                    <condition>good</condition>
                </attribute>
            </orange>
            <orange id="x" action="remove">
                <attribute>
                    <condition>awesome</condition>
                </attribute>
            </orange>
        </fruit>
    </node>
</root>

produces the following result:

<root>
   <node id="N1">
      <fruit id="1" action="aaa">
         <orange id="x" action="remove">
            <attribute>
               <color>Blue</color>
               <year>2012</year>
               <something>!!</something>
               <condition>awesome</condition>
            </attribute>
         </orange>
      </fruit>
   </node>
</root>

I hope it's clear. You can expand on this concept and build yourself a nice library of reusable functions to use as simple predicates when merging your nodes one way or another. It's unlikely to be the most efficient way to do the job, but at least it's a clean way to express the solution.
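
If all you need is the three scenarios from the question (merge every element with the same name and @id under one parent, keep the first element's attributes, and take the last value of each child of attribute), a much shorter sketch with nested xsl:for-each-group might be enough. It deliberately ignores the create/change/remove precedence handled above, and it assumes every merged element has an attribute child; the $members variable name is just for illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- identity transformation for everything not handled below -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- fruit, car, ...: merge children that share element name and @id -->
    <xsl:template match="node/*">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:for-each-group select="*" group-by="concat(name(), '|', @id)">
                <xsl:variable name="members" select="current-group()"/>
                <!-- the context item is the first element of the group,
                     so its attributes (including @action) are the ones kept -->
                <xsl:copy>
                    <xsl:copy-of select="@*"/>
                    <attribute>
                        <!-- one group per child name, in order of first appearance -->
                        <xsl:for-each-group select="$members/attribute/*" group-by="name()">
                            <!-- take the last in document order -->
                            <xsl:copy-of select="current-group()[last()]"/>
                        </xsl:for-each-group>
                    </attribute>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Tracing it by hand over the first input from the question: the three orange elements form one group, the copy keeps action="create", and the inner grouping yields color (Blue), year (2012) and condition (good) in that order.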
