XSLT: Ignore duplicate elements across multiple files

Posted 2019-01-15 21:22

I recently asked a question about how to ignore duplicate elements, and got some good responses about using 'preceding' and the Muenchian Method. However, I was wondering whether it is possible to do this across multiple files, driven by an index XML file.

Index.xml

<?xml-stylesheet type="text/xsl" href="merge2.xsl"?>
<list>
    <entry name="File1.xml" />
    <entry name="File2.xml" />
</list>

Example of XML file

<Main>
    <Records>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
    <Records>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
</Main>

Merge2.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

This works fine on one file, and gives me the desired result of producing one table, with unique items only being displayed and the count being added. However I have been unable to produce the desired result when going through the index.xml for multiple files.

I have tried using a separate template targeting the index.xml and applying the 'Main' template to the different XML files, and I have also tried using a for-each to cycle through the different files.

Before being introduced to the Muenchian Method I was using for-each with 'preceding' to check for duplicate nodes. However, 'preceding' only seems to search back through the current document, and I have been unable to find information on using it across multiple documents.

Is it possible with either of these methods to be able to search through multiple documents for duplicated element text?

Many thanks for any help.

Tags: xml xslt
2 answers
唯我独甜
Reply #2 · 2019-01-15 21:26

Basically, keys are built per document: key() only returns nodes from the document that contains the current context node. So a direct key-based Muenchian grouping will not allow you to identify and remove duplicates across more than one document.
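
To see the per-document behaviour for yourself, here is a minimal sketch (not part of the original answer) that assumes the index.xml from the question is the input and that File1.xml and File2.xml sit next to it. It prints how many Records with Description 'A' the key finds, first with index.xml as the context document and then once for each listed file:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <!-- Context document is index.xml, which contains no Record elements,
         so the key lookup returns an empty node-set here. -->
    <xsl:text>index.xml: </xsl:text>
    <xsl:value-of select="count(key('Record-by-Description', 'A'))"/>
    <xsl:text>&#10;</xsl:text>

    <!-- document() with a node-set argument loads every listed file.
         Inside the for-each the context node is the root of one loaded file,
         so key() searches that file only, never the others. -->
    <xsl:for-each select="document(list/entry/@name)">
      <xsl:text>listed file: </xsl:text>
      <xsl:value-of select="count(key('Record-by-Description', 'A'))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Each count covers a single file, which is why the "first node in the key" test of the Muenchian method cannot spot a duplicate that lives in another file.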

You could however first merge the two documents into one and then apply the Muenchian grouping to the merged document.

If you want to merge and group in one stylesheet you need to use exsl:node-set or similar:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl">

  <xsl:output method="xml" indent="yes" />
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <xsl:apply-templates select="exsl:node-set($merged-rtf)/Main"/>
  </xsl:template>

  <xsl:template match="@* | node()">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:apply-templates select="Records"/>
    </table>
  </xsl:template>

  <xsl:template match="Records">
    <xsl:apply-templates select="Record[generate-id() = generate-id(key('Record-by-Description', Description)[1])]" mode="group"/>
  </xsl:template>

  <xsl:template match="Record" mode="group">
    <tr>
      <td>
        <xsl:value-of select="Description"/>
      </td>
      <td>
        <xsl:value-of select="count(key('Record-by-Description', Description))"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

You would now pass your index.xml as the main input document to the stylesheet.

If you want to do this transformation in the IE browser then you need to replace the exsl:node-set with Microsoft's ms:node-set (with the proper namespace) or you need to use the approach in http://dpcarlisle.blogspot.de/2007/05/exslt-node-set-function.html to make sure the exsl:node-set function is implemented.
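
As a rough sketch of the first option (an adaptation, not the stylesheet from the answer above), the same merge-then-group idea could look like this for MSXML. The prefix `msxsl` is an arbitrary choice; what matters is the namespace URI `urn:schemas-microsoft-com:xslt`. The grouping templates are condensed here, and the stylesheet will only run on processors that implement the Microsoft extension function:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">

  <xsl:output method="xml" indent="yes"/>
  <xsl:key name="Record-by-Description" match="Record" use="Description"/>

  <xsl:template match="/">
    <!-- Step 1: copy the Records of every listed file into one temporary tree -->
    <xsl:variable name="merged-rtf">
      <Main>
        <xsl:copy-of select="document(list/entry/@name)/Main/Records"/>
      </Main>
    </xsl:variable>
    <!-- Step 2: convert the result tree fragment to a node-set and process it -->
    <xsl:apply-templates select="msxsl:node-set($merged-rtf)/Main"/>
  </xsl:template>

  <xsl:template match="Main">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <!-- Muenchian grouping: keep only the first Record of each Description -->
      <xsl:apply-templates
          select="Records/Record[generate-id() =
                  generate-id(key('Record-by-Description', Description)[1])]"/>
    </table>
  </xsl:template>

  <xsl:template match="Record">
    <tr>
      <td><xsl:value-of select="Description"/></td>
      <td><xsl:value-of select="count(key('Record-by-Description', Description))"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Because the key is evaluated against the merged temporary tree, it now sees the Records from all files at once.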

Lonely孤独者°
Reply #3 · 2019-01-15 21:48

If I may: although this has been answered with the Muenchian method, for over 12 years I have been promoting the variable-based grouping method for XSLT 1.0 on mailing lists (e.g. http://www.sourceware.org/ml/xsl-list/2001-10/msg00933.html) and in the classroom.

The variable-based grouping method allows you to group across multiple files in one pass. It also is quite straightforward to do subgroups using the variable-based method. Whatever population you can address can be put into a variable and then the grouping method works on that variable.
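
To illustrate the "any population you can address" point, here is a small hypothetical variation: it groups only the Records found in the first Records block of each listed file. The choice of `Records[1]` is purely an assumption for the example. Only the select expression of the variable decides what gets grouped; the grouping test itself stays the same:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="list">
    <!-- The population: Record elements from the first Records block
         of every file named in the index (illustrative restriction). -->
    <xsl:variable name="records"
                  select="document(entry/@name)/Main/Records[1]/Record"/>
    <table>
      <xsl:for-each select="$records">
        <!-- Act only on the first member of each group within $records -->
        <xsl:if test="generate-id(.) =
                      generate-id($records[Description = current()/Description][1])">
          <tr>
            <td><xsl:value-of select="Description"/></td>
            <td>
              <xsl:value-of
                  select="count($records[Description = current()/Description])"/>
            </td>
          </tr>
        </xsl:if>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

Since the variable holds nodes from several documents, both the uniqueness test and the count range over all of them without any extension function.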

I hope the illustrative transcript below helps ... you can see that the stylesheet is quite compact and you do not need two passes and you do not need to use any extension.

Data:

t:\ftemp>type multi.xml 
<?xml-stylesheet type="text/xsl" href="merge2.xsl"?>
<list>
    <entry name="File1.xml" />
    <entry name="File2.xml" />
</list>

t:\ftemp>type File1.xml 
<Main>
    <Records>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
    <Records>
        <Record>
            <Description>B</Description>
        </Record>
        <Record>
            <Description>A</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
        <Record>
            <Description>C</Description>
        </Record>
    </Records>
</Main>

Results:

t:\ftemp>call xslt multi.xml multi.xsl 
<?xml version="1.0" encoding="utf-8"?>
<table>
   <tr>
      <th>Type</th>
      <th>Count</th>
   </tr>
   <tr>
      <td>A</td>
      <td>6</td>
   </tr>
   <tr>
      <td>B</td>
      <td>4</td>
   </tr>
   <tr>
      <td>C</td>
      <td>6</td>
   </tr>
</table>

Stylesheet:

t:\ftemp>type multi.xsl 
  <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="list">
    <table>
      <tr>
        <th>Type</th>
        <th>Count</th>
      </tr>
      <xsl:variable name="records"
                          select="document(entry/@name)/Main/Records/Record"/>

      <xsl:for-each select="$records">
        <xsl:if test="
                  generate-id(.)=
                  generate-id($records[Description=current()/Description][1])">
          <tr>
            <td>
              <xsl:value-of select="Description"/>
            </td>
            <td>
              <xsl:value-of
                  select="count($records[Description=current()/Description])"/>
            </td>
          </tr>
        </xsl:if>
      </xsl:for-each>
    </table>
  </xsl:template>

  </xsl:stylesheet>