
XML::LibXML, namespaces and findvalue

Posted 2020-02-06 03:03

Question:

I'm using XML::LibXML to parse an XML document with a namespace. I therefore use XML::LibXML::XPathContext to findnodes using the XPath //u:model. This correctly returns 3 nodes.

I now would like to use findvalue on the 3 returned XML::LibXML::Element objects, but am unable to determine a working method/xpath. As an alternative, I iterate on the children and match against the nodeName directly, but this is less than ideal:

use strict;
use warnings;

use XML::LibXML;
use XML::LibXML::XPathContext;

my $dom = XML::LibXML->load_xml( IO => \*DATA );
my $context = XML::LibXML::XPathContext->new( $dom->documentElement() );
$context->registerNs( 'u' => 'http://www.ca.com/spectrum/restful/schema/response' );

for my $node ( $context->findnodes('//u:model') ) {
    #my $mh = $node->findvalue('mh');
    my ($mh)
        = map { $_->textContent() }
        grep  { $_->nodeName() eq 'mh' } $node->childNodes();

    #my $attr = $node->findvalue('attribute');
    my ($attr)
        = map { $_->textContent() }
        grep  { $_->nodeName() eq 'attribute' } $node->childNodes();

    print "mh = $mh, attr = $attr\n";
}

__DATA__
<root xmlns="http://www.ca.com/spectrum/restful/schema/response">
  <error>EndOfResults</error>
  <throttle>86</throttle>
  <total-models>86</total-models>
  <model-responses>
    <model>
      <mh>0x100540</mh>
      <attribute id="0x1006e">wltvbswfc02</attribute>
    </model>
    <model>
      <mh>0x100c80</mh>
      <attribute id="0x1006e">wltvsutm1ds02</attribute>
    </model>
    <model>
      <mh>0x100c49</mh>
      <attribute id="0x1006e">wltvsdora03</attribute>
    </model>
  </model-responses>
</root>

Outputs:

mh = 0x100540, attr = wltvbswfc02
mh = 0x100c80, attr = wltvsutm1ds02
mh = 0x100c49, attr = wltvsdora03

Is there a way to use the commented out lines to find the nodes instead of the indirect method of iterating on the children? Or is there another way to approach this problem to get the paired values?

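Answer 1:

You can't use $node->findvalue('mh') here because XPath 1.0 has no concept of a default namespace: an unprefixed name in an expression only matches elements in no namespace, whereas the xmlns declaration on <root> puts mh and attribute in the http://www.ca.com/spectrum/restful/schema/response namespace. However, you can reuse your XML::LibXML::XPathContext object, passing each node as the context node, to find the values you want: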

for my $node ( $context->findnodes('//u:model') ) {
   my $mh   = $context->findvalue('u:mh', $node);
   my $attr = $context->findvalue('u:attribute', $node);
   print "mh = $mh, attr = $attr\n";
}
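
To make the difference concrete, here is a minimal sketch that runs both lookups against a reduced copy of the question's document (one model only, loaded from a string instead of DATA). The variable names $plain, $prefixed and $id are just for illustration.

```perl
use strict;
use warnings;

use XML::LibXML;
use XML::LibXML::XPathContext;

# Reduced copy of the question's document: one model in the default namespace.
my $xml = <<'XML';
<root xmlns="http://www.ca.com/spectrum/restful/schema/response">
  <model-responses>
    <model>
      <mh>0x100540</mh>
      <attribute id="0x1006e">wltvbswfc02</attribute>
    </model>
  </model-responses>
</root>
XML

my $dom     = XML::LibXML->load_xml( string => $xml );
my $context = XML::LibXML::XPathContext->new($dom);
$context->registerNs( 'u' => 'http://www.ca.com/spectrum/restful/schema/response' );

my ($node) = $context->findnodes('//u:model');

# Unprefixed 'mh' means "mh in no namespace", so nothing matches
# and findvalue returns the empty string.
my $plain = $node->findvalue('mh');

# The registered prefix resolves to the document's namespace.
my $prefixed = $context->findvalue( 'u:mh', $node );

# Unprefixed attributes are in no namespace, so '@id' needs no 'u:' prefix.
my $id = $context->findvalue( 'u:attribute/@id', $node );

print "plain = '$plain', prefixed = '$prefixed', id = '$id'\n";
```

This should print plain = '', prefixed = '0x100540', id = '0x1006e'. The prefix u is only a local alias: any prefix works as long as it is registered against the same namespace URI.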


Answer 2:

XPath allows ignoring namespaces by using the function local-name:

use strict;
use warnings;

use XML::LibXML;

my $dom = XML::LibXML->load_xml( IO => \*DATA );

for my $node ( $dom->findnodes('//*[local-name()="model"]') ) {
    my $mh   = $node->findvalue('*[local-name()="mh"]');
    my $attr = $node->findvalue('*[local-name()="attribute"]');

    print "mh = $mh, attr = $attr\n";
}

This removes the need to register a namespace and create a context for a single-namespace document like the one in the question.
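
One caveat if the document ever mixes namespaces: local-name() matches an element of that name in any namespace. The sketch below assumes a hypothetical second namespace (http://example.com/other, with an invented value 0xdead, neither of which is in the question's data) and shows how adding a namespace-uri() test narrows the match again.

```perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical two-namespace document; the 'o' namespace and its
# content are invented purely for illustration.
my $xml = <<'XML';
<root xmlns="http://www.ca.com/spectrum/restful/schema/response"
      xmlns:o="http://example.com/other">
  <model><mh>0x100540</mh></model>
  <o:model><o:mh>0xdead</o:mh></o:model>
</root>
XML

my $dom = XML::LibXML->load_xml( string => $xml );

# local-name() alone ignores the namespace, so both <model> elements match.
my @any = $dom->findnodes('//*[local-name()="model"]');

# Adding namespace-uri() keeps only elements from the intended namespace.
my $ns     = 'http://www.ca.com/spectrum/restful/schema/response';
my @strict = $dom->findnodes(
    qq{//*[local-name()="model" and namespace-uri()="$ns"]}
);

printf "any = %d, strict = %d\n", scalar @any, scalar @strict;
```

With this input the first query should find two elements and the second only one. At that point the expression is longer than the prefixed form, so registering a prefix on an XPathContext is usually the tidier choice for multi-namespace documents.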

Reference: Re^2: XML::LibXML and namespaces