Perl - remove node from XML file

Posted 2019-07-10 07:43

Question:

I have an XML file, and I want to read it, remove a node, and save it. I run Perl from the terminal (perl script.pl).

example XML (filename.xml):

<?xml version="1.0" encoding="UTF-8"?>
<twice>
    <inner>
        <twice>
            <name>John</name>
            <surname>Smith</surname>
        </twice>
        <twice>
            <name>Alan</name>
            <surname>Bowler</surname>
        </twice>
        <twice>
            <name>Michael</name>
            <surname>Deck</surname>
        </twice>
    </inner>
</twice>

example perl script (script.pl):

use strict;
use warnings;
use XML::LibXML;
my $filename = "filename.xml";
my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file($filename);
for my $dead ($xmldoc->findnodes(q{/twice/inner/twice[surname = "Bowler"]})) {
    $dead->unbindNode;
}
print $xmldoc->toString;

This prints the expected result to the terminal, but it does not save the file.
Expected result (filename.xml):

<?xml version="1.0" encoding="UTF-8"?>
<twice>
    <inner>
        <twice>
            <name>John</name>
            <surname>Smith</surname>
        </twice>
        <twice>
            <name>Michael</name>
            <surname>Deck</surname>
        </twice>
    </inner>
</twice>

I have searched for many hours and couldn't find anything, so sorry if this is a duplicate!
This is my first experience with Perl, so any help would be welcome. Thanks.

Answer 1:

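When using toString, the docs say to print the result to a filehandle like this:

open my $out_fh, '>', 'somefile.xml' or die "Cannot open somefile.xml: $!";
print {$out_fh} $xmldoc->toString;
close $out_fh or die "Cannot close somefile.xml: $!";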

You can also use the toFile() method to save it directly:

$xmldoc->toFile("someFile.xml");

Edit: To quote the docs (which is all I did), you can also pass a format parameter to these methods:

If $format is 0, then the document is dumped as it was originally parsed.

If $format is 1, libxml2 will add ignorable white space, so that node content is easier to read. Existing text nodes will not be altered.

If $format is 2 (or higher), libxml2 will act as with $format == 1, but it adds a leading and a trailing line break to each text node.

Giving you:

$xmldoc->toFile("someFile.xml", $format);

or

print {$out_fh} $xmldoc->toString($format);
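
Putting the pieces together, here is a minimal sketch of the whole read, remove, save cycle. The helper name remove_nodes is hypothetical (it is not part of XML::LibXML), and the demo works on a throwaway temporary copy of the question's data rather than on filename.xml, so treat it as an illustration under those assumptions rather than a definitive implementation.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;

# Hypothetical helper (not part of XML::LibXML): parse $file, detach every
# node matching $xpath, and write the document back to the same file.
# Returns the number of nodes removed.
sub remove_nodes {
    my ($file, $xpath) = @_;

    my $doc = XML::LibXML->load_xml(location => $file);

    my $removed = 0;
    for my $node ($doc->findnodes($xpath)) {
        $node->unbindNode;    # detach the node from the document tree
        $removed++;
    }

    # toString on a document returns bytes in the document's encoding,
    # so write them through a raw (binary) filehandle.
    open my $out_fh, '>:raw', $file or die "Cannot write $file: $!";
    print {$out_fh} $doc->toString;
    close $out_fh or die "Cannot close $file: $!";

    return $removed;
}

# Demo on a temporary file holding a shortened copy of the sample data.
my ($tmp_fh, $tmp_name) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$tmp_fh} <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<twice>
    <inner>
        <twice><name>John</name><surname>Smith</surname></twice>
        <twice><name>Alan</name><surname>Bowler</surname></twice>
    </inner>
</twice>
XML
close $tmp_fh or die "Cannot close $tmp_name: $!";

my $removed = remove_nodes($tmp_name, q{/twice/inner/twice[surname = "Bowler"]});
print "Removed $removed node(s) from $tmp_name\n";
```

To apply this to the real file, call remove_nodes('filename.xml', ...) with the same XPath expression. Overwriting the input in place is convenient, but writing to a second file first is the safer choice if the data matters.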


Answer 2:

You could also use App::Xml_grep2 to do this from the command line:

xml_grep2 -v '/twice/inner/twice[surname = "Bowler"]' input.xml > output_xml