I have a source XML that contains address elements which could have the same values (please note that Contact/id=1 and Contact/id=3 have the same address):
<?xml version="1.0" encoding="utf-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Contact>
<id>1</id>
<Address>
<City City="Wien" />
<Postcode Postcode="LSP-123" />
</Address>
</Contact>
<Contact>
<id>2</id>
<Address>
<City City="Toronto" />
<Postcode Postcode="LKT-947" />
</Address>
</Contact>
<Contact>
<id>3</id>
<Address>
<City City="Wien" />
<Postcode Postcode="LSP-123" />
</Address>
</Contact>
</Contacts>
Desired output with XSLT 1.0:
<?xml version="1.0" encoding="utf-8"?>
<Contacts xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Contact>
<id>1</id>
<Address>SomeId_1</Address>
</Contact>
<Contact>
<id>2</id>
<Address>SomeId_2</Address>
</Contact>
<Contact>
<id>3</id>
<Address>SomeId_1</Address>
</Contact>
</Contacts>
When I used the function generate-id(Address), I got different ids for the addresses in Contact 1 and Contact 3. Is there another way to generate a unique id for a node based only on its value?
Thank you for the help.
I would advise building a key of the values as a lookup table, and then taking the unique identifier from the first entry that the lookup table returns for a given value.
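A minimal sketch of this key-based lookup, not necessarily the stylesheet originally posted: the key name `addresses` is an assumed name, and the key value is assumed to be the `City` and `Postcode` attribute values joined by a carriage return. Every `Address` with the same value maps to the same first node in the key, so `generate-id()` of that node is shared by all of them.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="utf-8"/>

  <!-- Lookup table: index every Address by its City and Postcode values,
       joined by a carriage return (&#xd;) as the field delimiter.
       The key name "addresses" is an assumption for this sketch. -->
  <xsl:key name="addresses" match="Address"
           use="concat(City/@City, '&#xd;', Postcode/@Postcode)"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the content of each Address with the generated id of the
       first Address in document order that has the same key value -->
  <xsl:template match="Address">
    <Address>
      <xsl:value-of select="generate-id(key('addresses',
                              concat(City/@City, '&#xd;', Postcode/@Postcode))[1])"/>
    </Address>
  </xsl:template>

</xsl:stylesheet>
```

The actual identifier strings are chosen by the XSLT processor, so they will not literally read `SomeId_1`, but Contacts 1 and 3 should receive the same value and Contact 2 a different one.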
As for the concatenation that I'm using: I tell my students that using a carriage return as a field delimiter reduces the likelihood of an unintended value collision to almost nothing, since there are very few hard carriage returns in XML content. (Carriage returns that are part of end-of-line sequences are normalized to a line feed and so do not appear in the data.)
Edited to add: an entity technique may improve maintenance, since it confines the lookup expression to a single declaration in the stylesheet, so that it cannot accidentally be written differently in two different parts of the stylesheet.
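The entity technique might look like the following sketch. The entity name `lookup` and the key name `addresses` are assumptions. The expression is declared once in the internal DTD subset and referenced from both the key declaration and the lookup call. The carriage return is written as `&#38;#xd;` inside the entity declaration so that the entity's replacement text is the character reference itself, which still expands to a real carriage return when it is used inside an attribute.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE xsl:stylesheet [
  <!-- Single declaration of the lookup expression (assumed entity name).
       The doubly escaped reference keeps the delimiter a carriage return
       instead of letting attribute normalization turn it into a space. -->
  <!ENTITY lookup "concat(City/@City, '&#38;#xd;', Postcode/@Postcode)">
]>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="utf-8"/>

  <!-- The key value and the lookup below expand from the same entity -->
  <xsl:key name="addresses" match="Address" use="&lookup;"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Emit the generated id of the first Address with the same value -->
  <xsl:template match="Address">
    <Address>
      <xsl:value-of select="generate-id(key('addresses', &lookup;)[1])"/>
    </Address>
  </xsl:template>

</xsl:stylesheet>
```

Because the XML parser expands the entity before the XSLT processor sees the stylesheet, this behaves like writing the same `concat()` expression in both places.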
XSLT's generate-id() is intended for generating values such as @xml:id attributes, which are generally meant to uniquely identify a node in the document. So every time you call generate-id() on a different node, you should be getting a different value.
The identifiers you want to generate are just data, and have nothing to do with what generate-id() does.
If you want an identifier whose value is based on the value of some other data, then you should just generate it from that data, for example by concatenating those values together, which produces the same identifier wherever the values are the same.
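As a hedged illustration of the concatenation approach, the sketch below builds the identifier directly from the `City` and `Postcode` attribute values. The underscore separator is an assumption chosen for readability, not something taken from the answer.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="utf-8"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Build the identifier from the address data itself;
       the '_' separator is an assumed choice -->
  <xsl:template match="Address">
    <Address>
      <xsl:value-of select="concat(City/@City, '_', Postcode/@Postcode)"/>
    </Address>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input in the question, Contacts 1 and 3 both get `<Address>Wien_LSP-123</Address>` and Contact 2 gets `<Address>Toronto_LKT-947</Address>`.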
If you have some other requirements for the identifier, then you can write a function or use a lookup table to map from those keys to some other identifiers.