Perl XML/SVG Parser unable to findnodes

Posted 2019-07-02 02:38

In the following code, I am trying to parse an SVG file and delete all <text> elements in it. However, it does not work: execution never enters the foreach loop over findnodes. What am I doing wrong? I tried both an XML::XPath and an XML::LibXML version of the code, and neither worked. Both parse and dump the file fine, but findnodes matches nothing.

#!/usr/bin/perl

use strict;
use warnings;

use XML::XPath;
use XML::XPath::XMLParser;

my $num_args=$#ARGV+1;
if($num_args != 1) { print "Usage: $0 <filename>\n"; exit(1); }


my $file=$ARGV[0];


my $doc = XML::XPath->new(filename => $file);

foreach my $dead ($doc->findnodes('/svg/text')) {
    print "Found Text Node\n";
    $dead->unbindNode;
}

The first few lines of the SVG file:

<svg
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:cc="http://creativecommons.org/ns#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:svg="http://www.w3.org/2000/svg"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   version="1.1"
   width="675"
   height="832.5"
   id="svg2"
   xml:space="preserve"><metadata
     id="metadata8"><rdf:RDF><cc:Work
         rdf:about=""><dc:format>image/svg+xml</dc:format><dc:type
           rdf:resource="http://purl.org/dc/dcmitype/StillImage" /></cc:Work></rdf:RDF></metadata><defs
     id="defs6" /><g
     transform="matrix(1.25,0,0,-1.25,0,832.5)"
     id="g10"><path
       d="m 54,608.663 450,0 M 54,129.052 l 450,0"
       inkscape:connector-curvature="0"
       id="path12"
       style="fill:none;stroke:#231f20;stroke-width:0.5;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:10;stroke-opacity:1;stroke-dasharray:none" /><text
       transform="matrix(1,0,0,-1,229.0848,615.9133)"
       id="text14"><tspan


1 answer
走好不送
#2 · 2019-07-02 03:27

/svg/text selects only text elements that are direct children of the svg root element. That is not what you have here: in your file the text elements sit inside a g element. It looks like what you want is text elements anywhere in the document, which is //text. This should work with XML::XPath.
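
To make the difference between the two paths concrete, here is a minimal sketch. It assumes a tiny, hypothetical, namespace-free snippet rather than the real file, so it only illustrates child versus descendant selection and says nothing about namespaces:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::XPath;

# Hypothetical stand-in for the SVG: <text> is nested inside <g>,
# not placed directly under the <svg> root. No namespace is declared here.
my $xml = '<svg><g><text id="text14">hello</text></g></svg>';

my $xp = XML::XPath->new(xml => $xml);

# /svg/text: only <text> elements that are direct children of the root <svg>
my @direct = $xp->findnodes('/svg/text');

# //text: <text> elements at any depth in the document
my @anywhere = $xp->findnodes('//text');

printf "/svg/text matched %d node(s)\n", scalar @direct;
printf "//text matched %d node(s)\n",    scalar @anywhere;
```

With this snippet the first query finds nothing and the second finds the single nested element, which is the same shape of problem as in the question.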

If you want to use XML::LibXML, which you should, since it is a much better module than XML::XPath (better maintained, more efficient, more powerful), then you have to pay attention to namespaces: the whole document is in a default namespace (the xmlns="http://www.w3.org/2000/svg" declaration in the opening tag). You need to register that namespace under a prefix with XML::LibXML::XPathContext and then use the prefix in the XPath expression:

#!/usr/bin/perl

use strict;
use warnings;

use XML::LibXML;
use XML::LibXML::XPathContext;

# it's easier to test directly @ARGV in scalar context than to use $#ARGV
if(@ARGV != 1) { print "Usage: $0 <filename>\n"; exit(1); }

my $file=$ARGV[0];

my $doc = XML::LibXML->load_xml( location => $file);

my $xpc = XML::LibXML::XPathContext->new( $doc);     # create the XPath evaluator
$xpc->registerNs(x => 'http://www.w3.org/2000/svg'); # declare the namespace as x

# the query now uses x as the prefix for the svg namespace
foreach my $dead ($xpc->findnodes('//x:text')) {
    print "Found Text Node\n";
    $dead->unbindNode;
}
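
The script above removes the nodes from the in-memory document only. As a rough sketch of the whole round trip, the following uses a small inline SVG (a hypothetical stand-in for the real file) to show that an unprefixed //text matches nothing once the default namespace is present, that the prefixed query does match, and that the modified document can then be serialized:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical minimal SVG using the same default namespace as the question's file
my $svg = <<'XML';
<svg xmlns="http://www.w3.org/2000/svg"><g><path d="m 0,0 1,1"/><text id="text14"><tspan>label</tspan></text></g></svg>
XML

my $doc = XML::LibXML->load_xml(string => $svg);

# An unprefixed name in XPath 1.0 only matches elements in no namespace,
# so this finds nothing in a document with a default namespace.
my @plain = $doc->findnodes('//text');

# Register the SVG namespace under the (arbitrary) prefix x and query with it
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(x => 'http://www.w3.org/2000/svg');
my @prefixed = $xpc->findnodes('//x:text');

printf "//text matched %d, //x:text matched %d\n",
    scalar @plain, scalar @prefixed;

# Detach each <text> element (and its <tspan> children) from the tree
$_->unbindNode for @prefixed;

# Serialize the modified document to STDOUT
print $doc->toString;
```

To write the result to disk instead of printing it, $doc->toFile('out.svg') should do; the output file name here is only an example.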