Perl and XPath: considering hierarchy

Posted 2019-09-07 05:31

I would like to extract the attribute values from an XML document, taking the hierarchy level into account:

<?xml version="1.0" encoding="UTF-8"?>
<database>
  <row1s>
   <row1 name="fox" category="mammal">
       <row2s>
         <row2 type="1"/>
         <row2 type="2"/>
       </row2s>
   </row1>
   <row1 name="horse" category="mammal">
       <row2s>
         <row2 type="3"/>
       </row2s>
   </row1>
   <row1 name="bee" category="insect"> 
       <row2s/>
   </row1>
   <row1 name="wasp" category="insect">
       <row2s/>
   </row1>
  </row1s>
</database>

This is the Perl code I use to extract the values:

use strict;
use DBI;
use XML::XPath;
use XML::XPath::XMLParser;

my $xrow1;
my $xrow2;

my $xp = XML::XPath->new (filename => "animals3.xml");

my $node_list1 = $xp->find ("//row1s/row1");

foreach my $row1 ($node_list1->get_nodelist ())  {
    $xrow1 = $row1->getAttribute("name");
    print "Level row1 gives: $xrow1\n";

    my $node_list2 = $xp->find ("//row2s/row2");

    foreach my $row2 ($node_list2->get_nodelist ()) {
        $xrow2 = $row2->getAttribute("type");
        print "Level row2 gives: $xrow2\n";
    }
}

What I get is:

Level row1 gives: fox   
Level row2 gives: 1   
Level row2 gives: 2   
Level row2 gives: 3   
Level row1 gives: horse   
Level row2 gives: 1   
Level row2 gives: 2   
Level row2 gives: 3   
Level row1 gives: bee   
Level row2 gives: 1   
Level row2 gives: 2   
Level row2 gives: 3   
Level row1 gives: wasp   
Level row2 gives: 1   
Level row2 gives: 2   
Level row2 gives: 3   

For each level-1 element I get all attribute values from level 2. This is not what I want: I would like to output only the level-2 entries that belong to the corresponding level-1 element. What I want is:

Level row1 gives: fox   
Level row2 gives: 1   
Level row2 gives: 2   
Level row1 gives: horse   
Level row2 gives: 3   
Level row1 gives: bee   
Level row1 gives: wasp   

I would appreciate any hint how to solve this problem.

Thanks.

2 answers
够拽才男人
Reply #2 · 2019-09-07 05:59

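The following fixes and simplifies your script. The key change is that the inner search is called on $row1 with a relative path (.//row2s/row2), so only the row2 elements below that row1 are matched: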

use strict;
use warnings;

use XML::XPath;
use XML::XPath::XMLParser;

#my $xp = XML::XPath->new( filename => "animals3.xml" );
my $xp = XML::XPath->new( ioref => \*DATA );

for my $row1 ( $xp->findnodes('//row1s/row1') ){
    printf "Level row1 gives: %s\n", $row1->getAttribute("name");

    for my $row2 ( $row1->findnodes('.//row2s/row2') ) {
        printf "Level row2 gives: %s\n", $row2->getAttribute("type");
    }
}

__DATA__
<?xml version="1.0" encoding="UTF-8"?>
<database>
  <row1s>
   <row1 name="fox" category="mammal">
       <row2s>
         <row2 type="1"/>
         <row2 type="2"/>
       </row2s>
   </row1>
   <row1 name="horse" category="mammal">
       <row2s>
         <row2 type="3"/>
       </row2s>
   </row1>
   <row1 name="bee" category="insect"> 
       <row2s/>
   </row1>
   <row1 name="wasp" category="insect">
       <row2s/>
   </row1>
  </row1s>
</database>

Outputs:

Level row1 gives: fox
Level row2 gives: 1
Level row2 gives: 2
Level row1 gives: horse
Level row2 gives: 3
Level row1 gives: bee
Level row1 gives: wasp
SAY GOODBYE
Reply #3 · 2019-09-07 06:12

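A leading / (here //) makes the path absolute: it is evaluated from the document root, so it matches every row2 in the document rather than only those under the current row1.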

my $node2 = $xp->find("//row2s/row2");

should be

my $node2 = $xp->find("row2s/row2", $row1);

Comments:

  • Neither $node1 nor $node2 is a node; each holds a node set. Pick better names.

  • Declaring your variables where you did partially defeats the purpose of declaring them. They should be declared within the appropriate loops.
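
Putting both points together, here is a minimal sketch of how the loop might look with a relative path, an explicit context node, and variables declared inside the loops. It assumes XML::XPath is installed. The inline XML is a trimmed copy of the question's data, and the names $name, $type and @lines are illustrative only, not taken from the original script.

```perl
use strict;
use warnings;
use XML::XPath;

# Trimmed copy of the question's document, inlined so the sketch is self-contained.
my $xml = <<'XML';
<database>
  <row1s>
    <row1 name="fox"><row2s><row2 type="1"/><row2 type="2"/></row2s></row1>
    <row1 name="horse"><row2s><row2 type="3"/></row2s></row1>
    <row1 name="bee"><row2s/></row1>
  </row1s>
</database>
XML

my $xp = XML::XPath->new( xml => $xml );

my @lines;    # collected output lines (illustrative helper)

my $row1_set = $xp->find('/database/row1s/row1');
for my $row1 ( $row1_set->get_nodelist ) {
    # Declared inside the loop, so it only lives for this iteration.
    my $name = $row1->getAttribute('name');
    push @lines, "Level row1 gives: $name";

    # No leading slash: the path is relative and is evaluated
    # with $row1 as the context node (second argument to find).
    my $row2_set = $xp->find( 'row2s/row2', $row1 );
    for my $row2 ( $row2_set->get_nodelist ) {
        my $type = $row2->getAttribute('type');
        push @lines, "Level row2 gives: $type";
    }
}

print "$_\n" for @lines;
```

Passing the context node to find is what keeps each inner search inside the current row1. The variable names ending in _set reflect that find returns a node set here, not a single node.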
