How to eliminate duplicate nodes based on values o

Posted 2019-04-16 06:53

How can I eliminate duplicate nodes based on the values of multiple (more than one) attributes? The attribute names are passed as parameters to the stylesheet. I am aware of the Muenchian method of grouping, which uses an <xsl:key> element, but I have learned that XSLT 1.0 does not allow parameters or variables in <xsl:key>.

Is there another way to remove the duplicate nodes? It is fine if it is not as efficient as the Muenchian method.

Update from previous question:

XML:

<data id = "root">
  <record id="1" operator1='xxx' operator2='yyy' operator3='zzz'/>
  <record id="2" operator1='abc' operator2='yyy' operator3='zzz'/>
  <record id="3" operator1='abc' operator2='yyy' operator3='zzz'/>
  <record id="4" operator1='xxx' operator2='yyy' operator3='zzz'/>
  <record id="5" operator1='xxx' operator2='lkj' operator3='tyu'/>
  <record id="6" operator1='xxx' operator2='yyy' operator3='zzz'/>
  <record id="7" operator1='abc' operator2='yyy' operator3='zzz'/>
  <record id="8" operator1='abc' operator2='yyy' operator3='zzz'/>
  <record id="9" operator1='xxx' operator2='yyy' operator3='zzz'/>
  <record id="10" operator1='rrr' operator2='yyy' operator3='zzz'/>
</data>

Tags: xslt xslt-1.0
3 answers
看我几分像从前
#2 · 2019-04-16 07:04

If you want to pass in the attribute names as a parameter, one approach is a two-step transformation: the first step takes any XML input, together with the attribute names and the element names as parameters, and generates a second stylesheet that then eliminates the duplicates. Here is an example first stylesheet:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:exsl="http://exslt.org/common"
  xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias"
  exclude-result-prefixes="axsl exsl"  
  version="1.0">

  <xsl:param name="parent-name" select="'items'"/>
  <xsl:param name="element-name" select="'item'"/>
  <xsl:param name="att-names" select="'att1,att2'"/>
  <xsl:param name="sep" select="'|'"/>

  <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="key-value">
    <xsl:text>concat(</xsl:text>
    <xsl:call-template name="define-values">
      <xsl:with-param name="att-names" select="$att-names"/>
    </xsl:call-template>
    <xsl:text>)</xsl:text>
  </xsl:variable>

  <xsl:template name="define-values">
    <xsl:param name="att-names"/>
    <xsl:choose>
      <xsl:when test="contains($att-names, ',')">
        <xsl:value-of select="concat('@', substring-before($att-names, ','), ',&quot;', $sep, '&quot;,')"/>
        <xsl:call-template name="define-values">
          <xsl:with-param name="att-names" select="substring-after($att-names, ',')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="concat('@', $att-names)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <axsl:stylesheet version="1.0">
      <axsl:output indent="yes"/>
      <axsl:key name="k1" match="{$parent-name}/{$element-name}" use="{$key-value}"/>
      <axsl:template match="@* | node()">
        <axsl:copy>
          <axsl:apply-templates select="@* | node()"/>
        </axsl:copy>
      </axsl:template>
      <axsl:template match="{$parent-name}">
        <axsl:copy>
          <axsl:apply-templates select="@*"/>
          <axsl:apply-templates select="{$element-name}[generate-id() = generate-id(key('k1', {$key-value})[1])]"/>
        </axsl:copy>
      </axsl:template>
    </axsl:stylesheet>
  </xsl:template>

</xsl:stylesheet>

It takes four parameters:

  1. parent-name: the name of the element containing those elements of which you want to eliminate duplicates
  2. element-name: the name of those elements of which you want to eliminate duplicates
  3. att-names: a comma separated list of attribute names
  4. sep: a separator character that should not occur in attribute values in the input XML

The stylesheet then generates a second stylesheet that applies Muenchian grouping to eliminate duplicates. For instance, with the default parameters given in the stylesheet, Saxon 6.5.5 generates the following stylesheet:

<axsl:stylesheet xmlns:axsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <axsl:output indent="yes"/>
   <axsl:key name="k1" match="items/item" use="concat(@att1,&#34;|&#34;,@att2)"/>
   <axsl:template match="@* | node()">
      <axsl:copy>
         <axsl:apply-templates select="@* | node()"/>
      </axsl:copy>
   </axsl:template>
   <axsl:template match="items">
      <axsl:copy>
         <axsl:apply-templates select="@*"/>
         <axsl:apply-templates select="item[generate-id() = generate-id(key('k1', concat(@att1,&#34;|&#34;,@att2))[1])]"/>
      </axsl:copy>
   </axsl:template>
</axsl:stylesheet>

This can then be applied to an XML document like

<items>
  <item att1="a" att2="1" att3="A"/>
  <item att1="b" att2="1" att3="A"/>
  <item att1="a" att2="1" att3="B"/>
  <item att1="c" att2="2" att3="A"/>
  <item att1="d" att2="3" att3="C"/>
</items>

and the output is

<items>
   <item att1="a" att2="1" att3="A"/>
   <item att1="b" att2="1" att3="A"/>
   <item att1="c" att2="2" att3="A"/>
   <item att1="d" att2="3" att3="C"/>
</items>
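
For the input in the question, the generator would presumably be run with parent-name=data, element-name=record and att-names=operator1,operator2,operator3 (these parameter values are an assumption; the answer only shows the defaults). A hand-written sketch of what the generated stylesheet should then be equivalent to, using the ordinary xsl prefix and the default '|' separator:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Composite key: the three attribute values joined with the '|' separator -->
  <xsl:key name="k1" match="data/record"
           use="concat(@operator1, '|', @operator2, '|', @operator3)"/>

  <!-- Identity template: copies every node that no other template handles -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Muenchian step: keep only the first record of each key group -->
  <xsl:template match="data">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates
        select="record[generate-id() =
                generate-id(key('k1', concat(@operator1, '|', @operator2, '|', @operator3))[1])]"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the question's document, this keeps the records with id 1, 2, 5 and 10. One caveat with the generator: XPath 1.0 concat() requires at least two arguments, so passing a single attribute name in att-names would produce an invalid use expression. That case does not arise here, since the question asks about more than one attribute.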
叛逆
#3 · 2019-04-16 07:17

Use this transformation (simple and no need to generate a new stylesheet):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:param name="pAttribs">
 <name>operator1</name>
 <name>operator2</name>
 <name>operator3</name>
 </xsl:param>

 <xsl:variable name="vAttribs" select=
    "document('')/*/xsl:param[@name='pAttribs']"/>

 <xsl:key name="kRecByAtts" match="record"
   use="@___g_key"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/">
 <xsl:variable name="vrtdPass1">
   <xsl:apply-templates/>
 </xsl:variable>

 <xsl:variable name="vPass1" select=
  "ext:node-set($vrtdPass1)/*"/>

 <xsl:apply-templates select="$vPass1"/>
 </xsl:template>

 <xsl:template match="record[not(@___g_key)]">
 <xsl:copy>
   <xsl:copy-of select="@*"/>

   <xsl:attribute name="___g_key">
    <xsl:for-each select="@*[name()=$vAttribs/name]">
      <xsl:sort select="name()"/>

       <xsl:value-of select=
          "concat('___Attrib___',name(),'___Value___',.,'+++')"/>
    </xsl:for-each>
   </xsl:attribute>
 </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "record[@___g_key]
         [not(generate-id()
             =
               generate-id(key('kRecByAtts', @___g_key)[1])
              )
          ]
   "/>

  <xsl:template match="@___g_key"/>
</xsl:stylesheet>

When applied to the XML document of your previous question:

<data id = "root">
    <record id="1" operator1='xxx' operator2='yyy' operator3='zzz'/>
    <record id="2" operator1='abc' operator2='yyy' operator3='zzz'/>
    <record id="3" operator1='abc' operator2='yyy' operator3='zzz'/>
    <record id="4" operator1='xxx' operator2='yyy' operator3='zzz'/>
    <record id="5" operator1='xxx' operator2='lkj' operator3='tyu'/>
    <record id="6" operator1='xxx' operator2='yyy' operator3='zzz'/>
    <record id="7" operator1='abc' operator2='yyy' operator3='zzz'/>
    <record id="8" operator1='abc' operator2='yyy' operator3='zzz'/>
    <record id="9" operator1='xxx' operator2='yyy' operator3='zzz'/>
    <record id="10" operator1='rrr' operator2='yyy' operator3='zzz'/>
</data>

The wanted, correct result is produced:

<data id="root">
   <record id="1" operator1="xxx" operator2="yyy" operator3="zzz"/>
   <record id="2" operator1="abc" operator2="yyy" operator3="zzz"/>
   <record id="5" operator1="xxx" operator2="lkj" operator3="tyu"/>
   <record id="10" operator1="rrr" operator2="yyy" operator3="zzz"/>
</data>
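
To make the two passes easier to follow, here is a sketch of the intermediate tree that the first pass should build in $vrtdPass1, shown for three of the records only. It is derived by hand from the concat() expression in the stylesheet rather than captured from a processor, and the serialized attribute order is illustrative:

```xml
<!-- Pass-1 result: every record carries a temporary ___g_key attribute -->
<data id="root">
  <record id="1" operator1="xxx" operator2="yyy" operator3="zzz"
          ___g_key="___Attrib___operator1___Value___xxx+++___Attrib___operator2___Value___yyy+++___Attrib___operator3___Value___zzz+++"/>
  <record id="2" operator1="abc" operator2="yyy" operator3="zzz"
          ___g_key="___Attrib___operator1___Value___abc+++___Attrib___operator2___Value___yyy+++___Attrib___operator3___Value___zzz+++"/>
  <!-- Same key as record 1, so pass 2 drops this record -->
  <record id="4" operator1="xxx" operator2="yyy" operator3="zzz"
          ___g_key="___Attrib___operator1___Value___xxx+++___Attrib___operator2___Value___yyy+++___Attrib___operator3___Value___zzz+++"/>
</data>
```

In the second pass the xsl:key indexes these records by ___g_key, the empty template suppresses every record that is not the first in its group, and the template matching @___g_key removes the temporary attribute from the output.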
乱世女痞
#4 · 2019-04-16 07:26

Another approach, using a single transformation in two passes:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:msxsl="urn:schemas-microsoft-com:xslt"
 exclude-result-prefixes="msxsl">
    <xsl:key name="kItemByLocal" match="record[@local-key]" use="@local-key"/>
    <xsl:param name="pAttNames" select="'operator1 operator2 operator3'"/>
    <xsl:template match="/">
        <xsl:variable name="vFirstRTF">
            <xsl:apply-templates/>
        </xsl:variable>
        <xsl:apply-templates select="msxsl:node-set($vFirstRTF)/node()"/>
    </xsl:template>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="record[not(@local-key)]">
        <xsl:copy>
            <xsl:attribute name="local-key">
                <xsl:call-template name="local-key"/>
            </xsl:attribute>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="record[@local-key]
                               [count(.|key('kItemByLocal',@local-key)[1])
                                 != 1]|@local-key"/>
    <xsl:template name="local-key">
        <xsl:param name="pAttributes" select="concat($pAttNames,' ')"/>
        <xsl:if test="normalize-space($pAttributes)">
            <xsl:variable name="vName"
                          select="substring-before($pAttributes,' ')"/>
            <xsl:variable name="vAttribute" select="@*[name()=$vName]"/>
            <xsl:value-of select="concat($vName,'+',$vAttribute,'+')"/>
            <xsl:call-template name="local-key">
                <xsl:with-param name="pAttributes"
                                select="substring-after($pAttributes,' ')"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

Output:

<data id="root">
    <record id="1" operator1="xxx" operator2="yyy" operator3="zzz"></record>
    <record id="2" operator1="abc" operator2="yyy" operator3="zzz"></record>
    <record id="5" operator1="xxx" operator2="lkj" operator3="tyu"></record>
    <record id="10" operator1="rrr" operator2="yyy" operator3="zzz"></record>
</data>

Edit: The same approach without a named template for generating @local-key:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:msxsl="urn:schemas-microsoft-com:xslt"
 exclude-result-prefixes="msxsl">
    <xsl:key name="kItemByLocal" match="record[@local-key]" use="@local-key"/>
    <xsl:param name="pAttNames" select="'operator1 operator2 operator3'"/>
    <xsl:template match="/">
        <xsl:variable name="vFirstRTF">
            <xsl:apply-templates/>
        </xsl:variable>
        <xsl:apply-templates select="msxsl:node-set($vFirstRTF)/node()"/>
    </xsl:template>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="record[not(@local-key)]">
        <xsl:variable name="vAttNames"
                      select="concat(' ',$pAttNames,' ')"/>
        <xsl:copy>
            <xsl:attribute name="local-key">
                <xsl:for-each select="@*[contains(
                                             $vAttNames,
                                             concat(' ',name(),' ')
                                                 )]">
                    <xsl:sort select="substring-before(
                                             $vAttNames,
                                             concat(' ',name(),' ')
                                                      )"/>
                    <xsl:value-of select="concat(name(),'++',.,'++')"/>
                </xsl:for-each>
            </xsl:attribute>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="record[@local-key]
                               [count(.|key('kItemByLocal',@local-key)[1])
                                 != 1]|@local-key"/>
</xsl:stylesheet>

Note: If you are certain that the attribute order is the same for all elements, you can remove the sorting.
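
To illustrate why the sort matters: XML does not treat attribute order as significant, and in XSLT 1.0 the relative order of attribute nodes is implementation-dependent. The input below is hypothetical (it is not from the question) and shows the case the sort guards against:

```xml
<!-- Hypothetical input: the same attribute values, written in a different order -->
<data id="root">
  <record id="1" operator1="xxx" operator2="yyy" operator3="zzz"/>
  <record id="2" operator3="zzz" operator2="yyy" operator1="xxx"/>
</data>
<!-- With the xsl:sort, both records should get the same local-key:
       operator1++xxx++operator2++yyy++operator3++zzz++
     Without it, a processor that visits attributes in source order could give
     record 2 the key
       operator3++zzz++operator2++yyy++operator1++xxx++
     and the duplicate would not be detected. -->
```

The sort key is the part of the padded name list that precedes each attribute name, so the attributes are emitted in the order in which they appear in pAttNames.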
