XSLT split output files - Muenchian grouping

Posted 2019-06-22 12:34

Question:

I have an XSLT file that transforms a large amount of data. I would like to add a "split" feature, either as a chained XSLT or within the current XSLT, that creates multiple output files so that each file stays under a certain size threshold. Assume the input XML is as follows:

<People>
<Person>             
<name>John</name>             
<date>June12</date>             
<workTime taskID="1">34</workTime>             
<workTime taskID="2">12</workTime>             
</Person>             
<Person>             
<name>John</name>             
<date>June13</date>             
<workTime taskID="1">21</workTime>             
<workTime taskID="2">11</workTime>             
</Person>
<Person>             
<name>Jack</name>             
<date>June19</date>             
<workTime taskID="1">20</workTime>             
<workTime taskID="2">30</workTime>             
</Person>    
</People>

The XSLT file, which uses Muenchian grouping, is as follows:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>
<xsl:template match="/">
    <People>
    <xsl:apply-templates select="//workTime[generate-id() = generate-id(key('PersonTasks',concat(@taskID, ../name))[1])]"/>
    </People>
</xsl:template>

<xsl:template match="workTime">
    <xsl:variable name="taskID">
        <xsl:value-of select="@taskID"/>
    </xsl:variable>
    <xsl:variable name="name">
        <xsl:value-of select="../name"/>
    </xsl:variable>
    <Person>
        <name>
            <xsl:value-of select="$name"/>
        </name>
        <taskID>
            <xsl:value-of select="$taskID"/>
        </taskID>
        <xsl:for-each select="//workTime[../name = $name][@taskID = $taskID]">
            <workTime>
                <date>
                    <xsl:value-of select="../date"/>
                </date>
                <time>
                    <xsl:value-of select="."/>
                </time>
            </workTime>
        </xsl:for-each>
    </Person>
</xsl:template>
</xsl:stylesheet>

However, instead of one large output file, I'd like several files as shown below. In this example I have put only one name in each file, but the number of names per file should be a parameter.

Output file for first person:

<People>
    <Person>
        <name>John</name>
        <taskID>1</taskID>
        <workTime>
        <date>June12</date>
        <time>34</time>
        </workTime>
        <workTime>
        <date>June13</date>
        <time>21</time>
        </workTime>
    </Person>
    <Person>
        <name>John</name>
        <taskID>2</taskID>
        <workTime>
        <date>June12</date>
        <time>12</time>
        </workTime>
        <workTime>
        <date>June13</date>
        <time>11</time>
        </workTime>
    </Person>
</People>

Output file for second person:

<People>
    <Person>
        <name>Jack</name>
        <taskID>1</taskID>
        <workTime>
        <date>June19</date>
        <time>20</time>
        </workTime>
    </Person>
    <Person>
        <name>Jack</name>
        <taskID>2</taskID>
        <workTime>
        <date>June19</date>
        <time>30</time>
        </workTime>
    </Person>
</People>

What would be the preferred and most elegant approach using XSLT 1.0? Is there a way to call a chained XSLT within the XSLT so as to split the output files?

Cheers.

Answer 1:

Is there a way to call a chained XSLT within the XSLT so as to split the output files?

A few ways:

  1. You could write an extension function to do this -- check the documentation of your XSLT processor.

  2. Use the <exsl:document> extension element of EXSLT, if your XSLT processor supports it.

  3. Use the <saxon:output> extension element if you have Saxon 6.x.

  4. From your programming language, invoke a separate transformation in a loop, passing it as a parameter the name of the person for whom to produce results.
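
For option 4, the stylesheet only needs a global parameter that restricts the grouping to one person. Below is a minimal sketch that reuses the key and the output layout from the question. The parameter name personName is an assumption chosen for illustration, and the calling program is expected to supply it on every run and to decide where the result is written.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- Hypothetical parameter: set by the calling program on each run -->
 <xsl:param name="personName"/>

 <!-- Same key as in the question: workTime grouped by task and person -->
 <xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>

 <xsl:template match="/">
  <People>
   <!-- Only the requested person; first workTime of each task group -->
   <xsl:apply-templates select="/People/Person[name = $personName]/workTime[generate-id() = generate-id(key('PersonTasks', concat(@taskID, ../name))[1])]"/>
  </People>
 </xsl:template>

 <xsl:template match="workTime">
  <Person>
   <name><xsl:value-of select="../name"/></name>
   <taskID><xsl:value-of select="@taskID"/></taskID>
   <!-- All entries of this task/person group, fetched through the key -->
   <xsl:for-each select="key('PersonTasks', concat(@taskID, ../name))">
    <workTime>
     <date><xsl:value-of select="../date"/></date>
     <time><xsl:value-of select="."/></time>
    </workTime>
   </xsl:for-each>
  </Person>
 </xsl:template>
</xsl:stylesheet>
```

With xsltproc, for instance, one run per person could look like xsltproc --stringparam personName John split.xsl people.xml > John.xml, where split.xsl and people.xml are placeholder file names. The list of names has to come from the calling side, for example from a first pass over the input.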

Here are code examples for 2. and 3. above:

Using <saxon:output> :

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:saxon="http://icl.com/saxon"
  extension-element-prefixes="saxon" >

 <xsl:template match="/">
  <xsl:for-each select="/*/*[not(. > 3)]">
   <saxon:output href="c:\xml\doc{.}">
    <xsl:copy-of select="."/>
   </saxon:output>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<nums>
  <num>01</num>
  <num>02</num>
  <num>03</num>
  <num>04</num>
  <num>05</num>
  <num>06</num>
  <num>07</num>
  <num>08</num>
  <num>09</num>
  <num>10</num>
</nums>

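three files, c:\xml\doc01, c:\xml\doc02 and c:\xml\doc03, are created with the wanted contents. (The file names carry the leading zero because {.} inserts the string value of each num element.)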

The same example using <exsl:document> (the EXSLT common namespace is bound to the prefix ext here):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
  extension-element-prefixes="ext" >

 <xsl:template match="/">
  <xsl:for-each select="/*/*[not(. > 3)]">
   <ext:document href="c:\xml\doc{.}">
    <xsl:copy-of select="."/>
   </ext:document>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
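
Applied to the People document from the question, the same technique might look like the sketch below. It assumes a processor that supports the EXSLT document element (for example xsltproc), adds a second key named PersonByName that is not in the original stylesheet, and uses a hypothetical file name pattern of the person's name plus .xml. It writes one file per distinct name; putting a configurable number of names in each file would need extra logic that is not shown here.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
 extension-element-prefixes="ext">

 <!-- Key from the question: workTime grouped by task and person -->
 <xsl:key name="PersonTasks" match="workTime" use="concat(@taskID, ../name)"/>
 <!-- Assumed extra key: Person grouped by name, one group per output file -->
 <xsl:key name="PersonByName" match="Person" use="name"/>

 <xsl:template match="/">
  <!-- Visit the first Person of each distinct name (Muenchian grouping) -->
  <xsl:for-each select="/People/Person[generate-id() = generate-id(key('PersonByName', name)[1])]">
   <!-- Hypothetical file name pattern, e.g. John.xml and Jack.xml -->
   <ext:document href="{name}.xml">
    <People>
     <!-- Within this name, process the first workTime of each task group -->
     <xsl:apply-templates select="key('PersonByName', name)/workTime[generate-id() = generate-id(key('PersonTasks', concat(@taskID, ../name))[1])]"/>
    </People>
   </ext:document>
  </xsl:for-each>
 </xsl:template>

 <xsl:template match="workTime">
  <Person>
   <name><xsl:value-of select="../name"/></name>
   <taskID><xsl:value-of select="@taskID"/></taskID>
   <!-- All entries of this task/person group, fetched through the key -->
   <xsl:for-each select="key('PersonTasks', concat(@taskID, ../name))">
    <workTime>
     <date><xsl:value-of select="../date"/></date>
     <time><xsl:value-of select="."/></time>
    </workTime>
   </xsl:for-each>
  </Person>
 </xsl:template>
</xsl:stylesheet>
```

The inner loop looks the group up with key() instead of scanning //workTime again, which tends to matter on large inputs. The principal output of this transformation is empty, since all content goes to the per-person files.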