There are 3 scenarios in this problem:
First scenario:
Input:
<root>
<node id="N1">
<fruit id="1" action="aaa">
<orange id="x" action="create">
<attribute>
<color>Orange</color>
<year>2012</year>
</attribute>
</orange>
<orange id="x" action="change">
<attribute>
<color>Red</color>
</attribute>
</orange>
<orange id="x" action="change">
<attribute>
<color>Blue</color>
<condition>good</condition>
</attribute>
</orange>
</fruit>
</node>
</root>
Expected output:
<root>
<node id="N1">
<fruit id="1" action="aaa">
<orange id="x" action="create">
<attribute>
<color>Blue</color>
<year>2012</year>
<condition>good</condition>
</attribute>
</orange>
</fruit>
</node>
</root>
Second scenario:
Input:
<root>
<node id="N1">
<car id="1">
<bmw id="i" action="change">
<attribute>
<color>Blue</color>
<owner>a</owner>
</attribute>
</bmw>
<bmw id="i" action="change">
<attribute>
<color>Yellow</color>
<status>available</status>
</attribute>
</bmw>
</car>
</node>
</root>
Expected Output:
<root>
<node id="N1">
<car id="1">
<bmw id="i" action="change">
<attribute>
<color>Yellow</color>
<owner>a</owner>
<status>available</status>
</attribute>
</bmw>
</car>
</node>
</root>
Third scenario:
Input:
<root>
<node id="N1">
<car id="1">
<bmw id="j" action="delete">
<attribute>
<color>Blue</color>
<year>2000</year>
</attribute>
</bmw>
<bmw id="j" action="delete">
<attribute>
<color>Pink</color>
<status>available</status>
</attribute>
</bmw>
</car>
</node>
</root>
Expected Output:
<root>
<node id="N1">
<car id="1">
<bmw id="j" action="delete">
<attribute>
<color>Pink</color>
<year>2000</year>
<status>available</status>
</attribute>
</bmw>
</car>
</node>
</root>
Explanation of the second and third scenarios:
- Two or more nodes with 'action=change' are merged into one node with 'action=change'.
- Two or more nodes with 'action=delete' are merged into one node with 'action=delete'.
- While merging, we keep the first node, update each existing value with the value from the last node that sets it, and add any new child elements from the later nodes.
I hope the explanation is clear.
Please advise me on an XSLT solution for this problem.
Thank you.
kind regards,
John
Here's a solution of a different flavor compared to the one I gave you here. I figured it would be worth going step by step. I made an assumption that the @action values appear in a logical order: create first, change next, and remove last. There can be multiple occurrences of the same @action, but their order wouldn't be random. Now we're ready to look at the main logic:
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
We declare the identity transformation and then intercept it in a few places. We only stop at the first occurrence of a node for each combination of @id, parent @id, and @action:
<xsl:template match="node/*/*[a:is-primary(.)]" priority="1">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="attribute" mode="consolidate-most-recent"/>
</xsl:copy>
</xsl:template>
We ignore the "duplicates":
<xsl:template match="node/*/*[not(a:is-primary(.))]"/>
We also ignore change nodes preceded by a create, as well as all create and change nodes followed by a remove:
<xsl:template match="node/*/*[@action = 'change'][a:preceded-by(., 'create')]" priority="2"/>
<xsl:template match="node/*/*[@action = 'create' or @action = 'change'][a:followed-by(., 'remove')]" priority="2"/>
Once we have captured a unique @action that is not followed by another @action that would make us ignore it, we do a simple thing: collect the children of the attribute elements from all nodes with the same @ids, ignoring the @action, and use their most "recent" values (the ones appearing last in document order).
<xsl:template match="attribute" mode="consolidate-most-recent">
<xsl:copy>
<xsl:for-each-group
select="/root/node/*/*[a:matches(current()/parent::*, ., 'any')]/attribute/*"
group-by="local-name()">
<!-- take the last in the document order -->
<xsl:apply-templates select="current-group()[last()]"/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
That's it. Now let's look at the functions that make it work.
We have a key to simplify the lookup:
<xsl:key name="entity" match="/root/node/*/*" use="concat(parent::*/@id, '_', @id, '_', @action)"/>
A function to check whether it's that unique occurrence of the node (we could put this directly into the template match predicate, but since we started with functions anyway, let's keep it consistent):
<xsl:function name="a:is-primary" as="xs:boolean">
<xsl:param name="ctx"/>
<!-- need to establish "focus"(context) for the key() function to work -->
<xsl:for-each select="$ctx">
<xsl:sequence select="generate-id($ctx) = generate-id(key('entity', concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action))[1])"/>
</xsl:for-each>
</xsl:function>
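As an aside, key() in XSLT 2.0 also accepts a third argument naming the tree to search, which removes the need for xsl:for-each to establish a context. Below is a minimal sketch of that variant. The name a:is-primary-alt is hypothetical and not part of the stylesheet in this answer, and the sketch assumes the input is a complete document (the key pattern starts at /root).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://a.com"
    exclude-result-prefixes="xs a">

  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Same key as in the answer: parent @id, own @id, and @action -->
  <xsl:key name="entity" match="/root/node/*/*"
           use="concat(parent::*/@id, '_', @id, '_', @action)"/>

  <!-- Hypothetical variant of a:is-primary. The third argument of key()
       names the tree to search, so no xsl:for-each is needed. -->
  <xsl:function name="a:is-primary-alt" as="xs:boolean">
    <xsl:param name="ctx" as="element()"/>
    <xsl:variable name="first"
        select="key('entity',
                    concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action),
                    root($ctx))[1]"/>
    <!-- "is" compares node identity; boolean() maps an empty result to false -->
    <xsl:sequence select="boolean($ctx is $first)"/>
  </xsl:function>

  <!-- Identity transformation -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop every element that is not the first one with its key value -->
  <xsl:template match="node/*/*[not(a:is-primary-alt(.))]"/>

</xsl:stylesheet>
```

Applied to the first scenario's input, this sketch keeps the create element and the first of the two change elements. It only removes elements that repeat the same key and does not merge their contents.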
A matches function that does all kinds of comparisons for us (again, we could put it all in predicates, but this way we keep the real templates nice and clean):
<xsl:function name="a:matches" as="xs:boolean">
<xsl:param name="src"/>
<xsl:param name="target"/>
<!-- can be one of the following:
'any' - only match the @id(s) and ignore @action
'same' - match by @id(s) and expect $src/@action to match $target/@action
a certain value - match by @id(s) and expect @action to match this value
-->
<xsl:param name="action"/>
<xsl:sequence select="
($src/local-name() = $target/local-name()) and
($src/parent::*/@id = $target/parent::*/@id) and
($src/@id = $target/@id) and
(if ($action = 'any')
then true()
else if ($action = 'same')
then ($target/@action = $src/@action)
else ($target/@action = $action))"/>
</xsl:function>
And the preceded-by and followed-by syntactic sugar on top of the "raw" matches function:
<xsl:function name="a:preceded-by" as="xs:boolean">
<xsl:param name="ctx"/>
<xsl:param name="action"/>
<xsl:sequence select="count($ctx/preceding::*[a:matches($ctx, ., $action)]) > 0"/>
</xsl:function>
<xsl:function name="a:followed-by" as="xs:boolean">
<xsl:param name="ctx"/>
<xsl:param name="action"/>
<xsl:sequence select="count($ctx/following::*[a:matches($ctx, ., $action)]) > 0"/>
</xsl:function>
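Because the action name is passed in as a plain string, the same helper also covers documents that spell the final action differently: the question's scenarios use delete where this answer's sample uses remove. Below is a minimal, self-contained sketch under that assumption. The predicates are inlined here rather than delegated to a:matches, and the rule that a delete supersedes earlier create and change entries is carried over from the remove handling in this answer; the question does not state it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://a.com"
    exclude-result-prefixes="xs a">

  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- True when a later element has the same name, the same parent @id,
       the same @id, and the requested @action -->
  <xsl:function name="a:followed-by" as="xs:boolean">
    <xsl:param name="ctx" as="element()"/>
    <xsl:param name="action" as="xs:string"/>
    <xsl:sequence select="exists($ctx/following::*
        [local-name() = local-name($ctx)]
        [parent::*/@id = $ctx/parent::*/@id]
        [@id = $ctx/@id]
        [@action = $action])"/>
  </xsl:function>

  <!-- Identity transformation -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: a later delete supersedes earlier create and change entries -->
  <xsl:template
      match="node/*/*[@action = ('create', 'change')][a:followed-by(., 'delete')]"/>

</xsl:stylesheet>
```

On its own this sketch only drops the superseded elements and copies everything else unchanged. To get the merged output, the same 'delete' argument would replace 'remove' in the corresponding template of the full transformation.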
SUMMARY
Here's a full transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:a="http://a.com">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="entity" match="/root/node/*/*" use="concat(parent::*/@id, '_', @id, '_', @action)"/>
<xsl:function name="a:is-primary" as="xs:boolean">
<xsl:param name="ctx"/>
<!-- need to establish "focus"(context) for the key() function to work -->
<xsl:for-each select="$ctx">
<xsl:sequence select="generate-id($ctx) = generate-id(key('entity', concat($ctx/parent::*/@id, '_', $ctx/@id, '_', $ctx/@action))[1])"/>
</xsl:for-each>
</xsl:function>
<xsl:function name="a:preceded-by" as="xs:boolean">
<xsl:param name="ctx"/>
<xsl:param name="action"/>
<xsl:sequence select="count($ctx/preceding::*[a:matches($ctx, ., $action)]) > 0"/>
</xsl:function>
<xsl:function name="a:followed-by" as="xs:boolean">
<xsl:param name="ctx"/>
<xsl:param name="action"/>
<xsl:sequence select="count($ctx/following::*[a:matches($ctx, ., $action)]) > 0"/>
</xsl:function>
<xsl:function name="a:matches" as="xs:boolean">
<xsl:param name="src"/>
<xsl:param name="target"/>
<!-- can be one of the following:
'any' - only match the @id(s) and ignore @action
'same' - match by @id(s) and expect $src/@action to match $target/@action
a certain value - match by @id(s) and expect @action to match this value
-->
<xsl:param name="action"/>
<xsl:sequence select="
($src/local-name() = $target/local-name()) and
($src/parent::*/@id = $target/parent::*/@id) and
($src/@id = $target/@id) and
(if ($action = 'any')
then true()
else if ($action = 'same')
then ($target/@action = $src/@action)
else ($target/@action = $action))"/>
</xsl:function>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node/*/*[a:is-primary(.)]" priority="1">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="attribute" mode="consolidate-most-recent"/>
</xsl:copy>
</xsl:template>
<xsl:template match="attribute" mode="consolidate-most-recent">
<xsl:copy>
<xsl:for-each-group
select="/root/node/*/*[a:matches(current()/parent::*, ., 'any')]/attribute/*"
group-by="local-name()">
<!-- take the last in the document order -->
<xsl:apply-templates select="current-group()[last()]"/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
<xsl:template match="node/*/*[not(a:is-primary(.))]"/>
<!-- assume a remove is never followed by a change or create -->
<xsl:template match="node/*/*[@action = 'change'][a:preceded-by(., 'create')]" priority="2"/>
<xsl:template match="node/*/*[@action = 'create' or @action = 'change'][a:followed-by(., 'remove')]" priority="2"/>
</xsl:stylesheet>
when applied to a document:
<root>
<node id="N1">
<fruit id="1" action="aaa">
<orange id="x" action="create">
<attribute>
<color>Orange</color>
<year>2012</year>
</attribute>
</orange>
<orange id="x" action="change">
<attribute>
<color>Red</color>
<something>!!</something>
</attribute>
</orange>
<orange id="x" action="change">
<attribute>
<color>Blue</color>
<condition>good</condition>
</attribute>
</orange>
<orange id="x" action="remove">
<attribute>
<condition>awesome</condition>
</attribute>
</orange>
</fruit>
</node>
</root>
produces the following result:
<root>
<node id="N1">
<fruit id="1" action="aaa">
<orange id="x" action="remove">
<attribute>
<color>Blue</color>
<year>2012</year>
<something>!!</something>
<condition>awesome</condition>
</attribute>
</orange>
</fruit>
</node>
</root>
I hope it's clear. You can expand on this concept and build yourself a nice library of reusable functions that you then use as simple predicates to merge your nodes one way or another. It's unlikely to be the most efficient way to do the job, but at least it's a clean way to express the solution.