XSLT select only last version element in rowset

Posted 2019-08-15 08:26

Question:

My XML is:

<RowSet>
  <Row>
     <msg_id>1</msg_id>
     <doc_id>1</doc_id>
     <doc_version>1</doc_version>
  </Row>
  <Row>
     <msg_id>2</msg_id>
     <doc_id>1</doc_id>
     <doc_version>2</doc_version>
  </Row>
  <Row>
     <msg_id>3</msg_id>
     <doc_id>1</doc_id>
     <doc_version>3</doc_version>
  </Row>
  <Row>
     <msg_id>4</msg_id>
     <doc_id>2</doc_id>
     <doc_version>1</doc_version>
  </Row>
</RowSet>

What I need to do:

If several Rows share the same doc_id, I need to select only the Row with the highest doc_version.

Expected output:

<RowSet>
  <Row>
     <msg_id>3</msg_id>
     <doc_id>1</doc_id>
     <doc_version>3</doc_version>
  </Row>
  <Row>
     <msg_id>4</msg_id>
     <doc_id>2</doc_id>
     <doc_version>1</doc_version>
  </Row>
</RowSet>

It may help to know that msg_id is unique, so for a given doc_id the Row with the largest msg_id holds the latest doc_version.

Answer 1:

This transformation works, unlike some other answers:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kRowByDocId" match="Row" use="doc_id"/>

 <xsl:template match="/*">
    <xsl:apply-templates select=
      "Row[generate-id()=generate-id(key('kRowByDocId', doc_id)[1])]"/>
 </xsl:template>

 <xsl:template match="Row">
     <xsl:for-each select="key('kRowByDocId',doc_id)">
      <xsl:sort select="doc_version" data-type="number" order="descending"/>

      <xsl:if test="position() = 1"><xsl:copy-of select="."/></xsl:if>
     </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When applied to the provided XML document:

<RowSet>
    <Row>
        <msg_id>1</msg_id>
        <doc_id>1</doc_id>
        <doc_version>1</doc_version>
    </Row>
    <Row>
        <msg_id>2</msg_id>
        <doc_id>1</doc_id>
        <doc_version>2</doc_version>
    </Row>
    <Row>
        <msg_id>3</msg_id>
        <doc_id>1</doc_id>
        <doc_version>3</doc_version>
    </Row>
    <Row>
        <msg_id>4</msg_id>
        <doc_id>2</doc_id>
        <doc_version>1</doc_version>
    </Row>
</RowSet>

the wanted, correct result is produced:

<Row>
   <msg_id>3</msg_id>
   <doc_id>1</doc_id>
   <doc_version>3</doc_version>
</Row>
<Row>
   <msg_id>4</msg_id>
   <doc_id>2</doc_id>
   <doc_version>1</doc_version>
</Row>

Explanation:

  1. Proper use of the Muenchian Grouping method for finding one item belonging to each different group.

  2. Proper use of sorting for finding a maximum item in a group.

  3. Proper use of the key() function -- for selecting all items in a given group.
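
The result above is a sequence of Row elements without the enclosing RowSet that the question's expected output shows. If the wrapper is wanted, one possible variant (a sketch, assuming the root element should simply be copied as is) wraps the apply-templates call in xsl:copy; everything else is unchanged:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:key name="kRowByDocId" match="Row" use="doc_id"/>

 <!-- Assumption: the root element (RowSet) should be kept as the output wrapper -->
 <xsl:template match="/*">
  <xsl:copy>
   <!-- Muenchian grouping: one Row per distinct doc_id -->
   <xsl:apply-templates select=
     "Row[generate-id()=generate-id(key('kRowByDocId', doc_id)[1])]"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="Row">
  <!-- Sort the whole group by version, highest first, and keep only the first -->
  <xsl:for-each select="key('kRowByDocId', doc_id)">
   <xsl:sort select="doc_version" data-type="number" order="descending"/>
   <xsl:if test="position() = 1"><xsl:copy-of select="."/></xsl:if>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

xsl:copy on an element copies only the element itself (plus its namespace nodes), not its children, so the RowSet ends up containing just the selected Rows.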



Answer 2:

XSLT 1.0 Solution

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Index every Row by its doc_id -->
    <xsl:key name="doc_id" match="RowSet/Row" use="doc_id"/>
    <xsl:template match="/">
        <!-- Muenchian grouping: visit the first Row of each distinct doc_id -->
        <xsl:for-each select="RowSet/Row[generate-id() = generate-id(key('doc_id', doc_id)[1])]">
            <xsl:sort select="doc_id" data-type="number" order="ascending"/>

            <!-- Within the group, sort by doc_version, highest first -->
            <xsl:for-each select="../Row[doc_id = current()/doc_id]">
                <xsl:sort select="doc_version" data-type="number" order="descending"/>
                <xsl:if test="position() = 1">
                    <!-- Process the winning Row here; copying it is one option -->
                    <xsl:copy-of select="."/>
                </xsl:if>
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

The logic is:

  • Get the first Row for each unique doc_id
  • Then jump up a level and go through every Row with that doc_id, sorted by doc_version in descending order
  • Take the first one, which has the highest doc_version
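
The same logic can also be written without sorting. The sketch below is an alternative, not part of the original answer: it keeps a Row only if no Row with the same doc_id has a greater doc_version. The key name kRowByDocId and the literal RowSet wrapper are choices made for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Index every Row by its doc_id -->
    <xsl:key name="kRowByDocId" match="Row" use="doc_id"/>

    <xsl:template match="/RowSet">
        <RowSet>
            <!-- Keep a Row only when no Row in its doc_id group has a
                 numerically greater doc_version -->
            <xsl:copy-of select="Row[not(key('kRowByDocId', doc_id)/doc_version > doc_version)]"/>
        </RowSet>
    </xsl:template>
</xsl:stylesheet>
```

Two differences from the sorting approach are worth noting: the surviving Rows come out in document order, and if two Rows in a group tie for the highest doc_version, both are copied.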


Answer 3:

Try this (xsl:for-each-group requires XSLT 2.0):

<xsl:for-each-group select="RowSet/Row" group-by="doc_id">
    <xsl:for-each select="current-group()">
        <xsl:sort select="doc_version" data-type="number" order="descending"/>
        <xsl:if test="position() = 1">
            <!-- Process the Row with the highest doc_version here -->
            <xsl:copy-of select="."/>
        </xsl:if>
    </xsl:for-each>
</xsl:for-each-group>
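
The grouping fragment above has to sit inside a template to run. A minimal complete stylesheet might look like the following sketch, which assumes an XSLT 2.0 processor (Saxon, for example) and replaces the sort with the XPath 2.0 max() function; the RowSet wrapper and the use of max() are illustrative choices, not part of the original answer.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/RowSet">
        <RowSet>
            <!-- One iteration per distinct doc_id -->
            <xsl:for-each-group select="Row" group-by="doc_id">
                <!-- From the current group, copy the first Row whose
                     doc_version equals the group's numeric maximum -->
                <xsl:copy-of select="current-group()
                    [number(doc_version) = max(current-group()/number(doc_version))][1]"/>
            </xsl:for-each-group>
        </RowSet>
    </xsl:template>
</xsl:stylesheet>
```

The trailing [1] keeps a single Row per group even when several Rows share the highest doc_version.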