Perl - XML::LibXML - getting elements that have ce

Posted 2019-08-06 15:48

Question:

I have a problem I am hoping someone can help with...

I have the following example xml structure:

<library>
    <book>
       <title>Perl Best Practices</title>
       <author>Damian Conway</author>
       <isbn>0596001738</isbn>
       <pages>542</pages>
       <image src="http://www.oreilly.com/catalog/covers/perlbp.s.gif"
            width="145" height="190" />
    </book>
    <book>
       <title>Perl Cookbook, Second Edition</title>
       <author>Tom Christiansen</author>
       <author>Nathan Torkington</author>
       <isbn>0596003137</isbn>
       <pages>964</pages>
       <image src="http://www.oreilly.com/catalog/covers/perlckbk2.s.gif"
            width="145" height="190" />
    </book>
    <book>
       <title>Guitar for Dummies</title>
       <author>Mark Phillips</author>
       <author>John Chappell</author>
       <isbn>076455106X</isbn>
       <pages>392</pages>
       <image src="http://media.wiley.com/product_data/coverImage/6X/0750/0766X.jpg"
           width="100" height="125" />
    </book>
</library>

Code that I thought should work:

use warnings;
use strict;

use XML::LibXML;

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file('/path/to/xmlfile.xml');

my $width = "145";

my $query = "//book/image[\@width/text() = '$width']/author/text()";

foreach my $data ($xmldoc->findnodes($query)) {
    print "Results: $data\n";
}

Expected output:

Damian Conway Tom Christiansen

but I do not get anything returned.

I thought this would match the text content of any "author" elements within a "book" element which also contains an "image" element with an attribute 'width' that has a value of 145.

I'm sure I'm overlooking something very obvious here but cannot work out what I am doing wrong.

Your help is much appreciated thanks

Answer 1:

You were almost there. Note that author is not a child of image, so the predicate has to go on book rather than on image. Attributes do not have text() children; you can compare their values directly with strings. Also, call toString to print the values rather than the node objects themselves.

#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file('1.xml');

my $width = "145";

my $query = "//book[image/\@width = '$width']/author/text()";

foreach my $data ($xmldoc->findnodes($query)) {
    print "Results: ", $data->toString, "\n";
}
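
For readers who want to try the corrected query without a file on disk, here is a minimal sketch that parses a trimmed copy of the question's XML from a string. The heredoc, the use of load_xml(string => ...) and the @authors array are illustration choices, not part of the original answer; it selects the author elements and reads them with textContent instead of selecting text() nodes.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of the question's data (isbn, pages and image src left out).
my $xml = <<'XML';
<library>
  <book>
    <title>Perl Best Practices</title>
    <author>Damian Conway</author>
    <image width="145" height="190"/>
  </book>
  <book>
    <title>Perl Cookbook, Second Edition</title>
    <author>Tom Christiansen</author>
    <author>Nathan Torkington</author>
    <image width="145" height="190"/>
  </book>
  <book>
    <title>Guitar for Dummies</title>
    <author>Mark Phillips</author>
    <author>John Chappell</author>
    <image width="100" height="125"/>
  </book>
</library>
XML

my $doc   = XML::LibXML->load_xml(string => $xml);
my $width = '145';

# The predicate sits on book; author is then a child of the matching book.
my @authors = map { $_->textContent }
    $doc->findnodes("//book[image/\@width = '$width']/author");

print "Results: $_\n" for @authors;
```

With this data the loop prints three lines: Damian Conway, Tom Christiansen and Nathan Torkington. The second book has two authors and both match, so the result is one name longer than the expected output listed in the question.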


Answer 2:

[Building on choroba's answer]

In a situation where it's not safe to interpolate $width (e.g. if it might contain a '), you can use:

for my $book ($xmldoc->findnodes('/library/book')) {
    # findvalue returns an empty string when the book has no image/@width
    my $image_width = $book->findvalue('image/@width');
    next if $image_width ne $width;    # compare in Perl, not inside the XPath

    for my $data ($book->findnodes('author/text()')) {
        print "Results: ", $data->toString, "\n";
    }
}
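
If you would rather keep a single XPath query, another option is to quote the value yourself. XPath 1.0 string literals have no escape character, so a value containing both quote characters has to be assembled with concat(). The sketch below is only an illustration: xpath_literal and build_query are hypothetical helper names, not functions provided by XML::LibXML.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an arbitrary Perl string into an XPath 1.0
# string expression that evaluates to exactly that string.
sub xpath_literal {
    my ($s) = @_;
    return "'$s'"   if $s !~ /'/;    # no single quote: wrap in single quotes
    return qq{"$s"} if $s !~ /"/;    # no double quote: wrap in double quotes
    # Both kinds present: split on ' and glue the pieces back with "'".
    my @parts = map { "'$_'" } split /'/, $s, -1;
    return 'concat(' . join(q{, "'", }, @parts) . ')';
}

# Hypothetical helper: build the query from the accepted answer for any width.
sub build_query {
    my ($width) = @_;
    return '//book[image/@width = ' . xpath_literal($width) . ']/author/text()';
}

print build_query($_), "\n" for '145', q{14'5}, q{1'4"5};
```

The resulting string can be passed to $xmldoc->findnodes like any other query.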


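Answer 3:

XML attributes don't have text nodes, so compare the attribute itself instead of \@width/text(). Since author is a child of book rather than of image, the predicate also has to move up to book, so your $query should have been "//book[image/\@width='$width']/author/text()"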
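
A quick way to see both problems is to count how many nodes each variant of the query matches. This is a small sketch against a cut-down, two-book version of the data; the inline XML and the %count hash are assumptions made for the demonstration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Two-book sample: one image with width 145, one with width 100.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<library>
  <book>
    <author>Damian Conway</author>
    <image width="145" height="190"/>
  </book>
  <book>
    <author>Mark Phillips</author>
    <image width="100" height="125"/>
  </book>
</library>
XML

my %count;
for my $query (
    q{//image[@width/text() = '145']},        # attribute has no text() child
    q{//image[@width = '145']},               # direct comparison
    q{//book/image[@width = '145']/author},   # author is not inside image
    q{//book[image/@width = '145']/author},   # predicate on book
) {
    my @nodes = $doc->findnodes($query);
    $count{$query} = @nodes;                  # array in scalar context: match count
    printf "%-42s %d\n", $query, scalar @nodes;
}
```

Only the second and fourth queries match anything here (one node each); the first and third match nothing.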