XSLT: find duplicates within each child

Posted 2019-05-14 10:27

I'm new to XSLT/XML.

I have an XML file similar to this:

<event>
  <division name="Div1">
    <team name="Team1">
      <player firstname="A" lastname="F" />
      <player firstname="B" lastname="G" />
      <player firstname="C" lastname="H" />
      <player firstname="D" lastname="G" />
    </team>
    <team name="Team2">
      <player firstname="A" lastname="F" />
      <player firstname="B" lastname="G" />
      <player firstname="C" lastname="H" />
      <player firstname="D" lastname="I" />
    </team>
  </division>
</event>

I'm trying to write an XSL transformation (to use with xsltproc) that gives me the names of players who share a last name within the same team.

After searching around, I came up with this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:key name="lastnames" match="player" use="@lastname" />

  <xsl:template match="/">    
    <xsl:for-each select="event/division/team">

      <xsl:variable name="dups" select="player[generate-id() = generate-id(key('lastnames', @lastname)[2])]" />

      <xsl:if test="$dups">
        Team: <xsl:value-of select="@name" /> (<xsl:value-of select="../@name" />)
        Players: 
        <xsl:for-each select="$dups">
          <xsl:value-of select="@lastname" />, <xsl:value-of select="@firstname" />.
        </xsl:for-each>
      </xsl:if>

    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

The main problem with this is that it's giving me duplicates across all the players of all teams, not just within each team.

In the above example it should return only one occurrence (for player D G in Team1).

Minor problem: this prints only the 2nd occurrence of a duplicate. Ideally it should print the 2nd, 3rd, 4th... occurrences (the 1st can be skipped). I know this is because of the "[2]" after the key function. Most examples I found show how to remove duplicates; here I need the opposite, so this was the trick I found to get (close to) what I need. There are probably better ways of achieving this...

Any help is appreciated.

Thanks, Bruno

2 answers
Luminary・发光体
#2 · 2019-05-14 10:57

You can also achieve this with:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>
  <xsl:key name="lastnames" match="player" use="@lastname"/>

  <xsl:template match="event">
    <xsl:text>Team:&#10;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="team">
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="concat(@name,' (',parent::division/@name,')')"/>
    <xsl:text> Players: </xsl:text>
    <xsl:for-each select="player[@lastname = preceding-sibling::player/@lastname]">
      <xsl:value-of select="concat(@lastname,', ', @firstname,'.')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

output:

Team:

Team1 (Div1) Players: G, D.
Team2 (Div1) Players: 
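
Note that the output above still lists Team2 even though it has no duplicates. A possible refinement, sketched here under the assumption that teams without duplicates should produce no output at all, is to collect the matching players in a variable first and only print the team when that variable is non-empty. The variable name `dups` and the `; ` separator are illustrative choices, and the unused key declaration is left out:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="team">
    <!-- Players whose last name already appeared earlier in the same team. -->
    <xsl:variable name="dups"
      select="player[@lastname = preceding-sibling::player/@lastname]"/>
    <!-- Assumption: teams without duplicates should be skipped entirely. -->
    <xsl:if test="$dups">
      <xsl:value-of select="concat(@name, ' (', ../@name, ') Players: ')"/>
      <xsl:for-each select="$dups">
        <!-- Separate entries when a team has more than one duplicate. -->
        <xsl:if test="position() &gt; 1">; </xsl:if>
        <xsl:value-of select="concat(@lastname, ', ', @firstname, '.')"/>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because `preceding-sibling::player` only looks at players under the same `team` element, the comparison is scoped to the team without needing a key. For the sample input this should print a single line for Team1 and nothing for Team2.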
一夜七次
#3 · 2019-05-14 11:11

You're on the right track! The primary change needed is an adjustment to the key: in particular, it needs to take into account both the @lastname value and unique information about the parent <team> element:

<xsl:key
  name="kPlayerByLastnameAndTeam"
  match="player"
  use="concat(parent::team/@name, '+', @lastname)" />
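
The key above relies on team names being unique across the whole document. If that cannot be assumed (for example, two divisions each containing a "Team1"), one option is to build the key from `generate-id(..)` instead of the parent's `@name`. The sketch below uses a hypothetical key name, `kPlayerByTeamIdAndLastname`, and simply prints how many players in the same team share each player's last name, to show how the composite key scopes the lookup:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical variant of the key: generate-id(..) identifies the parent
       element itself, so two teams with the same @name do not share groups. -->
  <xsl:key name="kPlayerByTeamIdAndLastname" match="player"
           use="concat(generate-id(..), '+', @lastname)"/>

  <xsl:template match="/">
    <xsl:for-each select="event/division/team/player">
      <xsl:value-of select="concat(../@name, ' ', @lastname, ': ')"/>
      <!-- Number of players in the same team that share this last name. -->
      <xsl:value-of select="count(key('kPlayerByTeamIdAndLastname',
                                  concat(generate-id(..), '+', @lastname)))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, the two G players in Team1 should each report a count of 2, while every other player (including G in Team2) reports 1.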

The other change is one you've already noted: you need something other than a [2] predicate to get all duplicates. The trick is to look up the current player's group with the same key and then filter out the current node, so that all the other players in the group are selected:

key(
  'kPlayerByLastnameAndTeam',
  concat(parent::team/@name, '+', @lastname))
[not(generate-id() = generate-id(current()))]
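
If you would rather keep the `for-each` structure from your original stylesheet, a closely related sketch is to select every player that is not the first member of its (team, last name) group, which yields the 2nd, 3rd, 4th... occurrences. This is only one way to arrange it; the variable name `dups` and the `; ` separator are illustrative, and it assumes team names are unique in the document:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:key name="kPlayerByLastnameAndTeam" match="player"
           use="concat(parent::team/@name, '+', @lastname)"/>

  <xsl:template match="/">
    <xsl:for-each select="event/division/team">
      <!-- Players that are not the first member of their (team, lastname)
           group, i.e. every occurrence after the first one. -->
      <xsl:variable name="dups"
        select="player[generate-id() != generate-id(
                  key('kPlayerByLastnameAndTeam',
                      concat(parent::team/@name, '+', @lastname))[1])]"/>
      <xsl:if test="$dups">
        <xsl:value-of select="concat('Team: ', @name, ' (', ../@name, ')')"/>
        <xsl:text>&#10;Players: </xsl:text>
        <xsl:for-each select="$dups">
          <xsl:if test="position() &gt; 1">; </xsl:if>
          <xsl:value-of select="concat(@lastname, ', ', @firstname, '.')"/>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The `[1]` comparison is the usual Muenchian test for "first in group"; negating it keeps everything except the first occurrence, which replaces the `[2]` predicate that only ever matched the second one.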

To see all this in action, take a look at this complete solution.

When this XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" indent="yes" method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:key
    name="kPlayerByLastnameAndTeam"
    match="player"
    use="concat(../@name, '+', @lastname)"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="team">
    <xsl:apply-templates
      select="*[
                generate-id() =
                generate-id(key(
                  'kPlayerByLastnameAndTeam',
                  concat(../@name, '+', @lastname))[1])
               ]"/>
  </xsl:template>

  <xsl:template match="player">
    <xsl:variable
      name="vDups"
      select="key(
                'kPlayerByLastnameAndTeam',
                concat(../@name, '+', @lastname))
              [not(generate-id() = generate-id(current()))]"/>
    <xsl:if test="$vDups">
      <xsl:value-of
        select="concat('Team: ', ../@name, ' (', ../../@name, ')')"/>
      <xsl:text>&#10;Players: </xsl:text>
      <xsl:apply-templates select="$vDups" mode="copy"/>
      <xsl:text>&#10;&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <xsl:template match="player" mode="copy">
    <xsl:if test="position() &gt; 1">; </xsl:if>
    <xsl:value-of select="concat(@lastname, ', ', @firstname, '.')"/>
  </xsl:template>

</xsl:stylesheet>

...is applied against the XML provided:

<event>
  <division name="Div1">
    <team name="Team1">
      <player firstname="A" lastname="F"/>
      <player firstname="B" lastname="G"/>
      <player firstname="C" lastname="H"/>
      <player firstname="D" lastname="G"/>
    </team>
    <team name="Team2">
      <player firstname="A" lastname="F"/>
      <player firstname="B" lastname="G"/>
      <player firstname="C" lastname="H"/>
      <player firstname="D" lastname="I"/>
    </team>
  </division>
</event>

...the wanted result is produced:

Team: Team1 (Div1)
Players: G, D.