How to do this in XSLT without incrementing variables

Posted 2019-08-08 14:01

I'm trying to think functionally, in XSLT terms, as much as possible, but in this case I really don't see how to do it without tweaking. I have roughly this data structure:

<transactions>
  <trx>
    <text>abc</text>
    <text>def</text>

    <detail>
      <text>xxx</text>
      <text>yyy</text>
      <text>zzz</text>
    </detail>
  </trx>
</transactions>

which I want to flatten into roughly this form:

<row>abc</row>
<row>def</row>
<row>xxx</row>
<row>yyy</row>
<row>zzz</row>

But the tricky thing is: I want to create chunks of 40 text-rows and transactions mustn't be split across chunks. I.e. if my current chunk already has 38 rows, the above transaction would have to go into the next chunk. The current chunk would need to be filled with two empty rows to complete the 40:

<row/>
<row/>

In imperative/procedural programming, this is very easy: keep a global counter of the rows emitted so far and, before a transaction that won't fit, insert empty rows up to the next multiple of 40 (I have provided an answer showing how to tweak XSLT/Xalan to allow for such variables). But how can this be done in pure XSLT? N.B.: I'm afraid recursion is not possible considering the size of the data I'm processing, but maybe I'm wrong about that.

3 answers
叛逆
#2 · 2019-08-08 14:51

I. Here is an XSLT 1.0 solution (the XSLT 2.0 solution is much easier):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">

 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:param name="pChunkSize" select="8"/>
 <xsl:param name="vChunkSize" select="$pChunkSize+1"/>

 <xsl:variable name="vSheet" select="document('')"/>

 <xsl:variable name="vrtfEmptyChunk">
  <xsl:for-each select=
   "($vSheet//node())[not(position() > $pChunkSize)]">
    <row/>
  </xsl:for-each>
 </xsl:variable>

 <xsl:variable name="vEmptyChunk" select=
  "ext:node-set($vrtfEmptyChunk)/*"/>

 <xsl:variable name="vrtfDummy">
  <delete/>
 </xsl:variable>

 <xsl:variable name="vDummy" select="ext:node-set($vrtfDummy)/*"/>

 <xsl:template match="/*">
  <chunks>
   <xsl:call-template name="fillChunks">
    <xsl:with-param name="pNodes" select="trx"/>
    <xsl:with-param name="pCurChunk" select="$vDummy"/>
   </xsl:call-template>
  </chunks>
 </xsl:template>

 <xsl:template name="fillChunks">
  <xsl:param name="pNodes"/>
  <xsl:param name="pCurChunk"/>

  <xsl:choose>
    <xsl:when test="not($pNodes)">
     <chunk>
      <xsl:apply-templates mode="rename" select="$pCurChunk[self::text]"/>
      <xsl:copy-of select=
        "$vEmptyChunk[not(position() > $vChunkSize - count($pCurChunk))]"/>
     </chunk>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="vAvailable" select=
          "$vChunkSize - count($pCurChunk)"/>

      <xsl:variable name="vcurNode" select="$pNodes[1]"/>

      <xsl:variable name="vTrans" select="$vcurNode//text"/>

      <xsl:variable name="vNumNewNodes" select="count($vTrans)"/>

      <xsl:choose>
        <xsl:when test="not($vNumNewNodes > $vAvailable)">
         <xsl:variable name="vNewChunk"
              select="$pCurChunk | $vTrans"/>

         <xsl:call-template name="fillChunks">
           <xsl:with-param name="pNodes" select="$pNodes[position() > 1]"/>
           <xsl:with-param name="pCurChunk" select="$vNewChunk"/>
         </xsl:call-template>
        </xsl:when>

        <xsl:otherwise>
         <chunk>
          <xsl:apply-templates mode="rename" select="$pCurChunk[self::text]"/>
          <xsl:copy-of select=
            "$vEmptyChunk[not(position() > $vAvailable)]"/>
         </chunk>

         <xsl:call-template name="fillChunks">
          <xsl:with-param name="pNodes" select="$pNodes"/>
          <xsl:with-param name="pCurChunk" select="$vDummy"/>
         </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:otherwise>
  </xsl:choose>
 </xsl:template>

 <xsl:template match="text" mode="rename">
  <row>
   <xsl:value-of select="."/>
  </row>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (based on the provided one, but with three trx elements):

<transactions>
  <trx>
    <text>abc</text>
    <text>def</text>

    <detail>
      <text>xxx</text>
      <text>yyy</text>
      <text>zzz</text>
    </detail>
  </trx>
  <trx>
    <text>abc2</text>
    <text>def2</text>
  </trx>
  <trx>
    <text>abc3</text>
    <text>def3</text>

    <detail>
      <text>xxx3</text>
      <text>yyy3</text>
      <text>zzz3</text>
    </detail>
  </trx>
</transactions>

the wanted, correct result (two chunks with size 8) is produced:

<chunks>
   <chunk>
      <row>abc</row>
      <row>def</row>
      <row>xxx</row>
      <row>yyy</row>
      <row>zzz</row>
      <row>abc2</row>
      <row>def2</row>
      <row/>
   </chunk>
   <chunk>
      <row>abc3</row>
      <row>def3</row>
      <row>xxx3</row>
      <row>yyy3</row>
      <row>zzz3</row>
      <row/>
      <row/>
      <row/>
   </chunk>
</chunks>

Do note:

  1. The first two transactions contain 7 text elements in total, so they fit in one 8-place chunk.

  2. The third transaction has 5 text elements and doesn't fit in the remaining space of the first chunk -- it is put in a new chunk.

II. XSLT 2.0 Solution (using FXSL)

<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:f="http://fxsl.sf.net/"
xmlns:dvc-foldl-func="dvc-foldl-func"
exclude-result-prefixes="f dvc-foldl-func"
>

   <xsl:import href="../f/func-dvc-foldl.xsl"/>
   <xsl:output omit-xml-declaration="yes" indent="yes"/>

   <xsl:param name="pChunkSize" select="8"/>

   <dvc-foldl-func:dvc-foldl-func/>

   <xsl:variable name="vPadding">
    <row/>
   </xsl:variable>

   <xsl:variable name="vFoldlFun" select="document('')/*/dvc-foldl-func:*[1]"/>

    <xsl:template match="/">

      <xsl:variable name="vpaddingChunk" select=
       "for $i in 1 to $pChunkSize
         return ' '
       "/>

      <xsl:variable name="vfoldlResult" select=
          "f:foldl($vFoldlFun, (), /*/trx),
           $vpaddingChunk
          "/>
      <xsl:variable name="vresultCount"
           select="count($vfoldlResult)"/>
      <xsl:variable name="vFinalResult"
       select="subsequence($vfoldlResult, 1,
                           $vresultCount - $vresultCount mod $pChunkSize
                           )"/>
      <result>
       <xsl:for-each select="$vFinalResult">
        <row>
          <xsl:value-of select="."/>
        </row>
       </xsl:for-each>
       <xsl:text>&#xA;</xsl:text>
      </result>
    </xsl:template>

    <xsl:template match="dvc-foldl-func:*" mode="f:FXSL">
         <xsl:param name="arg1"/>
         <xsl:param name="arg2"/>

         <xsl:variable name="vCurCount" select="count($arg1)"/>

         <xsl:variable name="vNewCount" select="count($arg2//text)"/>

         <xsl:variable name="vAvailable" select=
         "$pChunkSize - $vCurCount mod $pChunkSize"/>

         <xsl:choose>
           <xsl:when test="$vNewCount le $vAvailable">
             <xsl:sequence select="$arg1, $arg2//text"/>
           </xsl:when>
           <xsl:otherwise>
             <xsl:sequence select="$arg1"/>
             <xsl:for-each select="1 to $vAvailable">
              <xsl:sequence select="$vPadding/*"/>
              </xsl:for-each>
              <xsl:sequence select="$arg2//text"/>
           </xsl:otherwise>
         </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

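When this transformation is applied to the same XML document (above), the wanted rows are produced, this time as one flat sequence padded to a multiple of 8. The trailing padding rows contain a single space because they come from the string-valued $vpaddingChunk sequence: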

<result>
   <row>abc</row>
   <row>def</row>
   <row>xxx</row>
   <row>yyy</row>
   <row>zzz</row>
   <row>abc2</row>
   <row>def2</row>
   <row/>
   <row>abc3</row>
   <row>def3</row>
   <row>xxx3</row>
   <row>yyy3</row>
   <row>zzz3</row>
   <row> </row>
   <row> </row>
   <row> </row>
</result>

Do note:

  1. The use of the f:foldl() function.

  2. A special DVC (Divide and Conquer) variant of f:foldl() so that recursion stack overflow is avoided for all practical purposes -- for example, the maximum recursion stack depth for 1000000 (1M) trx elements is just 19.
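
To make the divide-and-conquer idea in note 2 concrete, here is a minimal Java sketch of a left fold that halves its input range instead of consuming one item per call. It is only an illustration and not FXSL's actual implementation: the class DvcFold, the maxDepth counter and the addTrx step function are hypothetical names, and the step works on row counts rather than on nodes.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical illustration of a divide-and-conquer left fold (not FXSL code).
class DvcFold {
    // Deepest recursion level reached so far; kept only to show the shallow stack.
    static int maxDepth = 0;

    /** Left fold over items[lo, hi) that splits the range in half at every level. */
    static <A, T> A foldl(BiFunction<A, T, A> f, A zero, List<T> items,
                          int lo, int hi, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        if (hi - lo == 0) return zero;
        if (hi - lo == 1) return f.apply(zero, items.get(lo));
        int mid = lo + (hi - lo) / 2;
        // Fold the left half first, then thread its result into the right half,
        // so the items are still combined strictly from left to right.
        A left = foldl(f, zero, items, lo, mid, depth + 1);
        return foldl(f, left, items, mid, hi, depth + 1);
    }

    /** Step function: rows used so far, plus padding if the transaction does not fit, plus its rows. */
    static int addTrx(int rowsSoFar, int trxRows, int chunkSize) {
        int free = chunkSize - rowsSoFar % chunkSize;
        int padding = trxRows > free ? free : 0;
        return rowsSoFar + padding + trxRows;
    }
}
```

Calling DvcFold.foldl((acc, n) -> DvcFold.addTrx(acc, n, 8), 0, sizes, 0, sizes.size(), 0) with sizes 5, 2 and 5 (the three sample transactions) gives 13 rows before the final chunk is padded. Because the range is halved at every level, the recursion depth grows with log2 of the number of transactions, on the order of 20 levels for a million items, rather than with the number of transactions itself.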

Anthone
#3 · 2019-08-08 14:52

Build the complete XML data structure you need in Java first. Then do a simple iteration in XSLT over the prepared XML.

You might save a lot of effort and provide a maintainable solution.
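
As a rough sketch of what that Java pre-processing step could look like, the following uses the JDK's built-in DOM parser to flatten the trx elements into a padded list of row texts. The class name TrxChunker and the method flatten are hypothetical, an empty string stands for an empty padding row, and the sketch assumes that no single transaction has more rows than one chunk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical pre-processing helper; not part of the original answer.
class TrxChunker {
    /** Returns the row texts in document order; "" marks an empty padding row. */
    static List<String> flatten(String xml, int chunkSize) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse input XML", e);
        }
        List<String> rows = new ArrayList<>();
        NodeList trxs = doc.getElementsByTagName("trx");
        for (int i = 0; i < trxs.getLength(); i++) {
            // All <text> descendants of this transaction, including those under <detail>
            NodeList texts = ((Element) trxs.item(i)).getElementsByTagName("text");
            int free = chunkSize - rows.size() % chunkSize;
            if (texts.getLength() > free) {
                // The transaction does not fit: fill up the current chunk first
                for (int p = 0; p < free; p++) rows.add("");
            }
            for (int t = 0; t < texts.getLength(); t++) {
                rows.add(texts.item(t).getTextContent());
            }
        }
        // Complete the last chunk
        while (rows.size() % chunkSize != 0) rows.add("");
        return rows;
    }
}
```

With the three-transaction sample document from the previous answer and a chunk size of 8, this yields 16 rows: one padding row after def2 and three after zzz3. The stylesheet then only has to turn each entry into a row element.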

淡お忘
#4 · 2019-08-08 14:57

As promised, here is a simplified example showing how Xalan can be tweaked to allow incrementing such global counters:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:f="xalan://com.example.Functions"
                exclude-result-prefixes="f">
  <!-- the global row counter variable -->
  <xsl:variable name="row" select="0"/>

  <xsl:template match="trx">
    <!-- wherever needed, the $row variable can be globally incremented -->
    <xsl:variable name="iteration" select="f:increment('row')"/>

    <!-- based upon this variable, calculations can be made -->
    <xsl:variable name="remaining-rows-in-chunk"
                  select="40 - (($iteration - 1) mod 40)"/>
    <xsl:if test="count(.//text) &gt; $remaining-rows-in-chunk">
      <xsl:call-template name="empty-row">
        <xsl:with-param name="rows" select="$remaining-rows-in-chunk"/>
      </xsl:call-template>
    </xsl:if>

    <!-- process transaction now, that previous chunk has been filled [...] -->
  </xsl:template>

  <xsl:template name="empty-row">
    <xsl:param name="rows"/>

    <xsl:if test="$rows &gt; 0">
      <row/>
      <xsl:variable name="dummy" select="f:increment('row')"/>

      <xsl:call-template name="empty-row">
        <xsl:with-param name="rows" select="$rows - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

And the contents of com.example.Functions:

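package com.example;

import java.lang.reflect.Field;
import java.util.logging.Level;
import java.util.logging.Logger;

// Package names below assume the standalone Apache Xalan-J distribution.
import org.apache.xalan.extensions.ExpressionContext;
import org.apache.xml.utils.QName;
import org.apache.xpath.objects.XNumber;

public class Functions {
  private static final Logger log = Logger.getLogger(Functions.class.getName());

  // Xalan passes the ExpressionContext automatically; the stylesheet only
  // supplies the variable name, e.g. f:increment('row').
  public static String increment(ExpressionContext context, String nodeName) {
    XNumber n = null;

    try {
      // Access the $row variable
      n = ((XNumber) context.getVariableOrParam(new QName(nodeName)));

      // Make it "mutable" using this tweak. I feel horrible about
      // doing this, though ;-)
      Field m_val = XNumber.class.getDeclaredField("m_val");
      m_val.setAccessible(true);

      // Increment it
      m_val.setDouble(n, m_val.getDouble(n) + 1.0);
    } catch (Exception e) {
      log.log(Level.SEVERE, "Error", e);
    }

    // Return the new counter value as a string (null if the lookup failed)
    return n == null ? null : n.str();
  }
}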