XSLT: How to generate unique id for node based on

Posted 2019-05-18 05:37

I have a source XML document containing Address elements that can have identical values (note that the contacts with id 1 and id 3 have the same address):

<?xml version="1.0" encoding="utf-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Contact>
        <id>1</id>
        <Address>
            <City City="Wien" />
            <Postcode Postcode="LSP-123" />
        </Address>
    </Contact>
    <Contact>
        <id>2</id>        
        <Address>
            <City City="Toronto" />
            <Postcode Postcode="LKT-947" />
        </Address>
    </Contact>
    <Contact>
        <id>3</id>        
        <Address>
            <City City="Wien" />
            <Postcode Postcode="LSP-123" />
        </Address>
    </Contact>
</Contacts> 

Desired output with XSLT 1.0:

<?xml version="1.0" encoding="utf-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Contact>
        <id>1</id>
        <Address>SomeId_1</Address>
    </Contact>
    <Contact>
        <id>2</id>        
        <Address>SomeId_2</Address>
    </Contact>
    <Contact>
        <id>3</id>        
        <Address>SomeId_1</Address>
    </Contact>
</Contacts>

When I used the function generate-id(Address), I got different ids for the addresses in Contact 1 and Contact 3. Is there another way to generate a unique id for a node based only on its value?

Thank you for the help.

Tags: xslt xslt-1.0
2 answers
手持菜刀,她持情操
Reply #2 · 2019-05-18 05:53

I would advise building a key over the address values to act as a lookup table, and then taking the number from the first entry that the key returns for each value:

t:\ftemp>type ivan.xml 
<?xml version="1.0" encoding="utf-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Contact>
        <id>1</id>
        <Address>
            <City City="Wien" />
            <Postcode Postcode="LSP-123" />
        </Address>
    </Contact>
    <Contact>
        <id>2</id>        
        <Address>
            <City City="Toronto" />
            <Postcode Postcode="LKT-947" />
        </Address>
    </Contact>
    <Contact>
        <id>3</id>        
        <Address>
            <City City="Wien" />
            <Postcode Postcode="LSP-123" />
        </Address>
    </Contact>
</Contacts> 
t:\ftemp>call xslt ivan.xml ivan.xsl 
<?xml version="1.0" encoding="utf-8"?><Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Contact>
        <id>1</id>
        <Address>SomeId_1</Address>
    </Contact>
    <Contact>
        <id>2</id>        
        <Address>SomeId_2</Address>
    </Contact>
    <Contact>
        <id>3</id>        
        <Address>SomeId_1</Address>
    </Contact>
</Contacts>
t:\ftemp>type ivan.xsl 
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

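<!-- index every Address by its city/postcode pair; the attribute is spelled
     Postcode in the source document and XML names are case-sensitive -->
<xsl:key name="city-pc-pair" match="Address"
         use="concat(City/@City,'&#xd;',Postcode/@Postcode)"/>

<xsl:template match="Address">
  <xsl:for-each select="key('city-pc-pair',
                            concat(City/@City,'&#xd;',Postcode/@Postcode))[1]">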
    <Address>SomeId_<xsl:number level="any"/></Address>
  </xsl:for-each>
</xsl:template>

<xsl:template match="@*|node()"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
t:\ftemp>rem Done! 

As for the concatenation that I'm using, I tell my students that a carriage return as the field delimiter makes an unintended value collision extremely unlikely, because hard carriage returns are very rare in XML content: the carriage returns that are part of end-of-line sequences are normalized to a line feed by the XML parser and so never appear in the data.
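
To see why the choice of delimiter matters, here is a minimal sketch that can be run against any input document. The city and postcode strings are made-up values, chosen so that a delimiter which can also occur in the data ('+' here) makes two different pairs produce the same key, while the carriage return keeps them apart.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- hypothetical data: city 'A+B' with postcode 'C'
       versus city 'A' with postcode 'B+C' -->
  <!-- '+' as delimiter: both pairs become 'A+B+C', so this prints true -->
  <xsl:value-of select="concat('A+B','+','C') = concat('A','+','B+C')"/>
  <xsl:text>&#xa;</xsl:text>
  <!-- carriage return as delimiter: the keys differ, so this prints false -->
  <xsl:value-of select="concat('A+B','&#xd;','C') = concat('A','&#xd;','B+C')"/>
  <xsl:text>&#xa;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

The first comparison is the collision to avoid: two different addresses would share one key and therefore one id.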

Edited to add the following entity technique, which may ease maintenance: it confines the lookup expression to a single declaration in the stylesheet, so that it cannot accidentally be written differently in two different parts of the stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet
[
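<!-- &#38;#xd; keeps the carriage return as a character reference until the
     entity is expanded inside an attribute value; a literal carriage return
     in the entity text would be normalized to a space there -->
<!ENTITY lookup "concat(City/@City,'&#38;#xd;',Postcode/@Postcode)">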
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:key name="city-pc-pair" match="Address" use="&lookup;"/>

<xsl:template match="Address">
  <xsl:for-each select="key('city-pc-pair',&lookup;)[1]">
    <Address>SomeId_<xsl:number level="any"/></Address>
  </xsl:for-each>
</xsl:template>

<xsl:template match="@*|node()"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
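
One thing to be aware of: `<xsl:number level="any"/>` counts every Address element up to and including the first one with a given value. The numbers are therefore stable per value, but gaps appear as soon as duplicates precede a new value (a fourth contact with a new address would get SomeId_4, not SomeId_3). If consecutive numbers are wanted, one possible sketch is to count only the earlier addresses that are themselves the first of their group. The key name `city-pc-pair` is reused from above, the variable name `first` is my own, and the attribute is written `@Postcode` to match the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<xsl:key name="city-pc-pair" match="Address"
         use="concat(City/@City,'&#xd;',Postcode/@Postcode)"/>

<xsl:template match="Address">
  <!-- first Address in document order with the same city/postcode pair -->
  <xsl:variable name="first"
    select="key('city-pc-pair',
                concat(City/@City,'&#xd;',Postcode/@Postcode))[1]"/>
  <!-- 1 + the number of earlier Address elements that are the first
       of their own group, which yields 1, 2, 3, ... without gaps -->
  <Address>SomeId_<xsl:value-of select="1 + count($first/preceding::Address
      [generate-id() = generate-id(key('city-pc-pair',
          concat(City/@City,'&#xd;',Postcode/@Postcode))[1])])"/></Address>
</xsl:template>

<xsl:template match="@*|node()"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

For the sample input this still gives SomeId_1, SomeId_2, SomeId_1. The predicate is evaluated once per preceding Address, so the cost grows quadratically with the number of addresses; for large documents the `xsl:number` version is cheaper.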
老娘就宠你
Reply #3 · 2019-05-18 06:13

XSLT's generate-id() returns a string that uniquely identifies a node within the document, which makes it suitable for values such as xml:id attributes. It identifies the node itself, not its content, so two distinct nodes always get different ids, even when their values are identical.

The identifiers you want to generate are just data, and have nothing to do with what generate-id() does.
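
To make the distinction concrete: generate-id(.) on each Address gives a different string per node, but generate-id() applied to the first node that a key returns for an address value gives the same string for every address with that value. If an opaque id is acceptable, a sketch along these lines would work; the key name `address-by-value` is my own choice, and the generated strings are processor-specific and may differ between runs, so they are not suitable where the id must be stable or readable.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<!-- hypothetical key name; groups Address elements by their values -->
<xsl:key name="address-by-value" match="Address"
         use="concat(City/@City,'&#xd;',Postcode/@Postcode)"/>

<xsl:template match="Address">
  <Address>
    <!-- id of the first node in the group, so equal values share one id -->
    <xsl:value-of select="generate-id(key('address-by-value',
        concat(City/@City,'&#xd;',Postcode/@Postcode))[1])"/>
  </Address>
</xsl:template>

<!-- identity for all other nodes -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```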

If you want an identifier whose value is based on the value of some other data, then you should just generate it from that data. Concatenate those values, for example:

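<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

<!-- identity: copy every other node and attribute unchanged -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<!-- replace each Address with an id built from its own values -->
<xsl:template match="Address">
    <Address>
        <xsl:value-of select="concat(City/@City, '+', Postcode/@Postcode)"/>
    </Address>
</xsl:template>

</xsl:stylesheet>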

This produces:

<?xml version="1.0" encoding="UTF-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Contact>
        <id>1</id>
        <Address>Wien+LSP-123</Address>
    </Contact>
    <Contact>
        <id>2</id>        
        <Address>Toronto+LKT-947</Address>
    </Contact>
    <Contact>
        <id>3</id>        
        <Address>Wien+LSP-123</Address>
    </Contact>
</Contacts>

If you have some other requirements for the identifier, then you can write a function or use a lookup table to map from those keys to some other identifiers.
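
As a rough sketch of the lookup-table option in XSLT 1.0, the table can be embedded in the stylesheet itself and read back with document(''). The `map` prefix, its namespace URI `urn:example:address-map`, the element names and the table entries are all assumptions made up for this illustration. It also assumes the processor can re-read the stylesheet from its own URI, which may not hold when the stylesheet is built in memory.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:map="urn:example:address-map"
  exclude-result-prefixes="map"
  version="1.0">

<!-- hypothetical lookup table: value-based key mapped to the wanted id -->
<map:entries>
  <map:entry key="Wien+LSP-123">SomeId_1</map:entry>
  <map:entry key="Toronto+LKT-947">SomeId_2</map:entry>
</map:entries>

<xsl:template match="Address">
  <!-- build the same value-based key as in the example above -->
  <xsl:variable name="k"
    select="concat(City/@City, '+', Postcode/@Postcode)"/>
  <Address>
    <!-- document('') is the stylesheet itself; pick the matching entry -->
    <xsl:value-of
      select="document('')/*/map:entries/map:entry[@key = $k]"/>
  </Address>
</xsl:template>

<!-- identity for all other nodes -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

With the sample input this gives SomeId_1, SomeId_2, SomeId_1. An address that has no entry in the table ends up as an empty Address element, so the table has to cover every value that can occur.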
