Select unique nodes based on a combination of two attribute values

Posted 2019-03-30 20:16

Question:

I have some XML that looks something like this:

<Root>
    <Documents>
        <Document id="1"/>
    </Documents>
    <People>
        <Person id="1"/>
        <Person id="2"/>
    </People>
    <Links>
        <Link personId="1" documentId="1"/>
        <Link personId="1" documentId="1"/>
        <Link personId="2" documentId="1"/>
    </Links>
</Root>

I want to keep only the 'Link' elements that have a distinct combination of 'personId' and 'documentId', which would leave these two links:

<Root>
    <Links>
        <Link personId="1" documentId="1"/>
        <Link personId="2" documentId="1"/>
    </Links>
</Root>

How might I go about doing that? I have found this question, though I feel mine is slightly more complex and it may not apply... I presume I am going to need to use the key() function somewhere...

Thanks in advance.

Answer 1:

This stylesheet:

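<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Look up Document and Person elements by their id -->
    <xsl:key name="kDocAndPeoById" match="Document|Person" use="@id"/>
    <!-- Group Link elements by their personId/documentId pair -->
    <xsl:key name="kLinksByIds" match="Link"
             use="concat(@personId,'++',@documentId)"/>
    <!-- Identity template: copy everything not matched below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <!-- Drop Documents and People, plus any Link that is not the first of
         its group or that refers to a missing Person or Document -->
    <xsl:template match="Documents|People|
     Link[count(.|key('kLinksByIds',concat(@personId,'++',@documentId))[1])!=1
          or not(key('kDocAndPeoById',@personId)/self::Person)
          or not(key('kDocAndPeoById',@documentId)/self::Document)]"/>
</xsl:stylesheet>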

Output:

<Root>
    <Links>
        <Link personId="1" documentId="1"></Link>
        <Link personId="2" documentId="1"></Link>
    </Links>
</Root>

If you do not need to check that a Document or Person with the referenced @id actually exists, this stylesheet is enough:

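<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Group Link elements by their personId/documentId pair -->
    <xsl:key name="kLinksByIds" match="Link"
             use="concat(@personId,'++',@documentId)"/>
    <!-- Identity template: copy everything not matched below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <!-- Drop Documents and People, plus any Link that is not the first of its group -->
    <xsl:template match="Documents|People|
     Link[count(.|key('kLinksByIds',concat(@personId,'++',@documentId))[1])!=1]"/>
</xsl:stylesheet>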

Output:

<Root>
    <Links>
        <Link personId="1" documentId="1"></Link>
        <Link personId="2" documentId="1"></Link>
    </Links>
</Root>


Answer 2:

You can combine several attribute tests in a single XPath predicate; it doesn't have to be just one attribute=value pair.

Find through multiple attributes in XML
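
As a rough illustration of a predicate that tests two attributes at once, the sketch below copies only the links for one particular pair of IDs. The literal values '1' and the wrapping <Links> element are assumptions made for the example. Note that a predicate like this selects by value but does not remove duplicates by itself: against the sample input it matches both links that have personId="1" and documentId="1".

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <Links>
            <!-- Both conditions must hold for a Link to be selected -->
            <xsl:copy-of select="/Root/Links/Link[@personId='1' and @documentId='1']"/>
        </Links>
    </xsl:template>
</xsl:stylesheet>
```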



Answer 3:

You need to filter the <Link>s with something like this, where the current() function returns the <Link> you're checking for uniqueness.

self::Link[not(preceding-sibling::Link[@personId   = current()/@personId and
                                       @documentId = current()/@documentId])]

The preceding-sibling:: axis is used to find earlier <Link> elements and the part in square brackets checks for matching ID numbers. The not() wrapping the whole expression means the entire bracketed expression is true only if NO such preceding sibling matches, i.e. there is no prior <Link> with the same person and document IDs.

My XSLT knowledge is rusty so I'll leave that part to you. What I'm thinking is you first find all links with, say, //Link, and then filter them in a second step with the above XPath. I tried hard but couldn't think of any way to do it all in one step since this relies on the current() function to work.
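
One way the two-step idea might look in XSLT 1.0 is sketched below, assuming the input has the shape shown in the question. The outer xsl:for-each visits every Link, and the xsl:if applies the preceding-sibling test; inside xsl:for-each, current() refers to the Link being visited. The test sits in an xsl:if rather than in a template match pattern because XSLT 1.0 does not allow current() in patterns. Suppressing Documents and People is borrowed from Answer 1 so that only the links are left.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Identity template: copy everything not matched below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <!-- Leave Documents and People out of the result -->
    <xsl:template match="Documents|People"/>
    <xsl:template match="Links">
        <xsl:copy>
            <!-- Step 1: visit every Link in document order -->
            <xsl:for-each select="Link">
                <!-- Step 2: keep it only if no earlier sibling has the same pair of IDs -->
                <xsl:if test="not(preceding-sibling::Link[@personId = current()/@personId and
                                                          @documentId = current()/@documentId])">
                    <xsl:copy-of select="."/>
                </xsl:if>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Because every Link scans all of its preceding siblings, this approach is likely to be slower on large inputs than the key-based grouping in Answer 1.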



Tags: xslt xpath