Efficient XSLT conditional increment

In a previous question I asked how to perform a conditional increment. The provided answer worked, but it does not scale well to huge data sets.

The Input:

<Users>
    <User>
        <id>1</id>
        <username>jack</username>
    </User>
    <User>
        <id>2</id>
        <username>bob</username>
    </User>
    <User>
        <id>3</id>
        <username>bob</username>
    </User>
    <User>
        <id>4</id>
        <username>jack</username>
    </User>
</Users>

The desired output (ideally produced with optimal time complexity):

<Users>
   <User>
      <id>1</id>
      <username>jack01</username>
   </User>
   <User>
      <id>2</id>
      <username>bob01</username>
   </User>
   <User>
      <id>3</id>
      <username>bob02</username>
   </User>
   <User>
      <id>4</id>
      <username>jack02</username>
   </User>
</Users>

For this purpose it would be nice to:

sort input by username
for each user
- when the previous username equals the current username
  - increment the counter
- otherwise
  - set the counter to 1
- set username to '$username$counter'
(sorting by id again afterwards is not a requirement)

Any thoughts?

Tags: xslt sorting increment memory-efficient

3 Answers

迷人小祖宗

#2 · 2019-07-27 14:34

This is kind of ugly and I'm not fond of using xsl:for-each, but it should be faster than using the preceding-sibling axis, and it doesn't need a two-pass approach. Note that the output comes out grouped by username rather than in document order:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:key name="count" match="User" use="username" />

  <xsl:template match="Users">
    <Users>
      <xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
        <xsl:for-each select="key('count',username)">
          <User>
            <xsl:copy-of select="id" />
            <username>
              <xsl:value-of select="username" />
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each>
    </Users>
  </xsl:template>
</xsl:stylesheet>

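If you really need it sorted by id afterwards, you can wrap it in a two-pass template. This version relies on msxsl:node-set, which is specific to Microsoft's processors; other processors typically provide the EXSLT node-set function instead: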

<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:msxsl="urn:schemas-microsoft-com:xslt"
   exclude-result-prefixes="msxsl">
  <xsl:key name="count" match="User" use="username" />

  <xsl:template match="Users">
    <xsl:variable name="pass1">
      <xsl:for-each select="User[generate-id()=generate-id(key('count',username)[1])]">
        <xsl:for-each select="key('count',username)">
          <User>
            <xsl:copy-of select="id" />
            <username>
              <xsl:value-of select="username" />
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <xsl:variable name="pass1Nodes" select="msxsl:node-set($pass1)" />

    <Users>
      <xsl:for-each select="$pass1Nodes/*">
        <xsl:sort select="id" data-type="number" />
        <xsl:copy-of select="." />
      </xsl:for-each>
    </Users>
  </xsl:template>
</xsl:stylesheet>
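
For comparison, here is a minimal sketch of the same two-pass idea in XSLT 2.0. It assumes a 2.0 processor is available, which nothing in the question confirms, and the variable name numbered is made up for illustration. In 2.0, xsl:for-each-group replaces the key-plus-generate-id grouping, and a temporary tree can be queried directly, so no node-set extension is needed:

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="Users">
    <!-- Pass 1: number each User within its username group.
         "numbered" is a hypothetical name chosen for this sketch. -->
    <xsl:variable name="numbered">
      <xsl:for-each-group select="User" group-by="username">
        <xsl:for-each select="current-group()">
          <User>
            <xsl:copy-of select="id"/>
            <username>
              <xsl:value-of select="username"/>
              <!-- position() is the 1-based index inside the current group -->
              <xsl:number value="position()" format="01"/>
            </username>
          </User>
        </xsl:for-each>
      </xsl:for-each-group>
    </xsl:variable>

    <!-- Pass 2: restore id order; the ids are compared as numbers -->
    <Users>
      <xsl:for-each select="$numbered/User">
        <xsl:sort select="id" data-type="number"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </Users>
  </xsl:template>
</xsl:stylesheet>
```

If the final order does not matter, the second pass can be dropped and the body of the variable emitted directly inside the Users element.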


Anthone

#3 · 2019-07-27 14:34

Here's a slight variation, though possibly not a great increase in efficiency:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
   <xsl:output method="xml" indent="yes"/>
   <xsl:key name="User" match="User" use="username" />

   <xsl:template match="username/text()">
      <xsl:value-of select="." />
      <xsl:variable name="id" select="generate-id(..)" />
      <xsl:for-each select="key('User', .)">
         <xsl:if test="generate-id(username) = $id">
            <xsl:number value="position()" format="01"/>
         </xsl:if>
      </xsl:for-each>
   </xsl:template>

   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

This defines a key that groups User elements by username. Then, for the text of each username element, it walks through the User elements stored in the key under that username and outputs the position of the one that matches the current element.

One slight advantage of this approach is that you only look at user records with the same name. Each lookup costs time proportional to the size of that group, so it should stay efficient as long as no single username occurs a huge number of times.


神经病院院长

#4 · 2019-07-27 14:35

This transformation produces exactly the wanted result and is efficient (O(N), assuming the processor implements keys with constant-time lookup):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kUserByName" match="User" use="username"/>
 <xsl:key name="kUByGid" match="u" use="@gid"/>

 <xsl:variable name="vOrderedByName">
  <xsl:for-each select=
  "/*/User[generate-id()=generate-id(key('kUserByName',username)[1])]">
     <xsl:for-each select="key('kUserByName',username)">
       <u gid="{generate-id()}" pos="{position()}"/>
     </xsl:for-each>
  </xsl:for-each>
 </xsl:variable>

  <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="username/text()">
     <xsl:value-of select="."/>
     <xsl:variable name="vGid" select="generate-id(../..)"/>

     <xsl:for-each select="ext:node-set($vOrderedByName)[1]">
      <xsl:value-of select="format-number(key('kUByGid', $vGid)/@pos, '00')"/>
     </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When applied to the provided XML document:

<Users>
    <User>
        <id>1</id>
        <username>jack</username>
    </User>
    <User>
        <id>2</id>
        <username>bob</username>
    </User>
    <User>
        <id>3</id>
        <username>bob</username>
    </User>
    <User>
        <id>4</id>
        <username>jack</username>
    </User>
</Users>

the wanted, correct result is produced:

<Users>
   <User>
      <id>1</id>
      <username>jack01</username>
   </User>
   <User>
      <id>2</id>
      <username>bob01</username>
   </User>
   <User>
      <id>3</id>
      <username>bob02</username>
   </User>
   <User>
      <id>4</id>
      <username>jack02</username>
   </User>
</Users>
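
To see why each lookup is cheap, it may help to picture the temporary tree held in $vOrderedByName for this input. The fragment below is only a schematic: the gid values are placeholders, since the strings returned by generate-id() are processor-specific.

```xml
<!-- Schematic content of $vOrderedByName for the sample input.
     gid values are placeholders, not real generate-id() output. -->
<u gid="ID-OF-USER-1" pos="1"/>  <!-- jack, id 1 -->
<u gid="ID-OF-USER-4" pos="2"/>  <!-- jack, id 4 -->
<u gid="ID-OF-USER-2" pos="1"/>  <!-- bob,  id 2 -->
<u gid="ID-OF-USER-3" pos="2"/>  <!-- bob,  id 3 -->
```

The kUByGid key indexes these u elements by their gid attribute, so the username/text() template needs a single key lookup per User to fetch its pos value, which format-number then pads to two digits.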
