XPATH To filter out records with letters

Posted 2019-07-18 07:40

Question:

I am looking for an XPATH expression that will perform a search to ensure a field does not have a letter in it. For example, given this input XML:

<?xml version="1.0" encoding="UTF-8"?>
<payload>
    <records>
        <record>
            <number>123</number>
        </record>
        <record>
            <number>456</number>
        </record> 
        <record>
            <number>78A</number>
        </record> 
    </records>
</payload>

I want it to filter out the third result, as it has a letter in the tag. So it should return this:

<?xml version="1.0" encoding="UTF-8"?>
<payload>
    <records>
        <record>
            <number>123</number>
        </record>
        <record>
            <number>456</number>
        </record> 
    </records>
</payload>

Is that possible to do in a simple XPATH?

So something like /payload/records/record[reg expression here?]

@Cylian

This is what I mean:

<?xml version="1.0" encoding="UTF-8"?>
<payload>
    <records>
        <record>
            <number>123</number>
            <time>12pm</time>
            <zome>UK</zome>
        </record>
        <record>
            <number>456</number>
            <time>12pm</time>
            <zome>UK</zome>
        </record> 
        <record>
            <number>78A</number>
            <time>12pm</time>
            <zome>UK</zome>
        </record> 
    </records>
</payload>

Answer 1:

XPath (both 1.0 and 2.0) is a query language for XML documents.

As such, an XPath expression only selects sets of nodes (or extracts other data); it cannot alter the structure of the XML document (for example, by deleting a node).

Therefore, it is not possible to construct an XPath expression that alters the provided XML document to the wanted one.

This task can be accomplished easily with XSLT (and, somewhat less easily, with XQuery):

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="record[translate(number, '0123456789', '')]"/>
</xsl:stylesheet>

When this transformation is applied on the provided XML document:

<payload>
    <records>
        <record>
            <number>123</number>
        </record>
        <record>
            <number>456</number>
        </record>
        <record>
            <number>78A</number>
        </record>
    </records>
</payload>

the wanted, correct result is produced:

<payload>
   <records>
      <record>
         <number>123</number>
      </record>
      <record>
         <number>456</number>
      </record>
   </records>
</payload>
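
For readers who do not have an XSLT processor at hand, the same filtering can be sketched with Python's standard library. This is only an illustration under stated assumptions, not part of the answer above: the helper name `drop_non_numeric_records` and the `SAMPLE` string are made up for the example, and because ElementTree's XPath subset has no `translate()` function, the digits-only check is done with a regular expression instead.

```python
import re
import xml.etree.ElementTree as ET

# Sample input, taken from the question (name SAMPLE is hypothetical).
SAMPLE = """<payload>
    <records>
        <record><number>123</number></record>
        <record><number>456</number></record>
        <record><number>78A</number></record>
    </records>
</payload>"""

# Zero or more ASCII digits: a record is kept exactly when
# translate(number, '0123456789', '') would be empty in the stylesheet.
DIGITS_ONLY = re.compile(r"[0-9]*")


def drop_non_numeric_records(xml_text):
    """Parse xml_text and remove every <record> whose <number> has a non-digit."""
    root = ET.fromstring(xml_text)
    for records in list(root.iter("records")):
        # Copy the child list first, because elements are removed while looping.
        for record in list(records.findall("record")):
            number = record.findtext("number", default="")
            if not DIGITS_ONLY.fullmatch(number):
                records.remove(record)
    return root


filtered = drop_non_numeric_records(SAMPLE)
kept = [r.findtext("number") for r in filtered.iter("record")]
print(kept)  # ['123', '456']
```

Other children of a record (such as `time` or `zome` in the extended example) are left untouched, since only the `number` child is inspected.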


Answer 2:

You can easily delete the nodes using an XQuery Update expression, too:

for $record in doc('payload.xml')//record
where xs:string(number($record/number)) = 'NaN'
return delete node $record


Answer 3:

Try this (XPath 2.0), which selects only the records whose number contains no letter:

/payload/records/record[not(matches(number, '\p{L}'))]
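
A predicate for this task has to assert that no letter occurs anywhere in the value. Searching for a single non-letter character, as a negated class such as `[^\p{L}]` does, also accepts values like `78A`. The difference can be sketched in Python. The two helper names are hypothetical, and `str.isalpha()` is used here as a stand-in for the Unicode letter category `\p{L}`:

```python
def has_non_letter(text):
    # True if at least one character is not a letter
    # (what a search for a negated letter class tests).
    return any(not ch.isalpha() for ch in text)


def has_no_letter(text):
    # True if no character at all is a letter
    # (the condition the question asks for).
    return not any(ch.isalpha() for ch in text)


for value in ("123", "456", "78A"):
    print(value, has_non_letter(value), has_no_letter(value))
```

For `78A` the first check is true (because of the `7` and `8`) while the second is false, so only the second one filters that record out.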


Tags: xml xslt xpath