Extracting data from an XML document that uses nam

I have some XML files from which I want to extract information. I have written code that reads those files and then checks for certain conditions.

The problem is that these XML files begin with

   <SquishReport version="2.1" xmlns="http://www.froglogic.com/XML2">

and Perl cannot read them (at least not with my code!). But when I add these lines at the top of the XML file

   <?xml version="1.0" encoding="UTF-8"?>
   <?xml-stylesheet type="text/xsl"?>

it works very well.

Some lines from my XML file test.xml:

<SquishReport version="2.1" xmlns="http://www.froglogic.com/XML2">
   <test name="TEST">
      <prolog time="2015-10-01T03:45:22+02:00"/>
      <test name="tst_start_app">
          <prolog time="2015-02-01T03:45:23+02:00"/>
          <message line="38" type="LOG" file="C:\squish\test\sources.py" time="2015-02-01T03:45:23+02:00">
              <description>
                <![CDATA[>>  >>  >> start: init (global) - testcase C:\squish\test\tst_start_app]]></description>
          </message>
       </test>
   </test>
</SquishReport>

and the Perl code for reading the XML file is:

use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Parse the XML
my $xml = XML::LibXML->load_xml(location => 'test.xml');

# Iterate the entries
for my $entry ($xml->findnodes('/SquishReport/test/test')) {
    my $key = $entry->findvalue('@name');
    say "$key";
}

Tags: xml perl xml-libxml

2 Answers

做个烂人

#2 · 2019-01-18 06:11

Perl has so many excellent XML tools that, thanks to all the module developers and libxml2, XML almost seems easy. One of those tools is XML::Dataset, a convenience "scaffolding" module that builds on XML::LibXML and uses a "profile" markup language to grab data from XML sources (NB: the profile markup is sensitive to whitespace and line endings).

e.g.:

use strict;
use warnings;

use XML::Dataset;
use DDP;

# Slurp the whole XML file into a string
my $xml = "Squish.xml";
open my $fh, "<", $xml or die "Cannot open $xml: $!";
my $test_data = do { local $/; <$fh> };
close $fh;

# describe the data using XML::Dataset simplified markup:
my $data_profile
    = q(
          SquishReport
            test
              test
                 name = dataset:name);

# parse it with XML::Dataset profile
my $parsed_data = parse_using_profile($test_data, $data_profile);

# view the result with Data::Printer (loaded as DDP);
# $parsed_data->{name} holds a single array reference, so the loop runs once
foreach my $element ($parsed_data->{name}) {
    p $element;
}

Squish.xml:

<SquishReport version="2.1" xmlns="http://www.froglogic.com/XML2">
   <test name="TEST">
      <prolog time="2015-10-01T03:45:22+02:00"/>
      <test name="tst_start_app">
          <prolog time="2015-02-01T03:45:23+02:00"/>
          <message line="38" type="LOG" file="C:\squish\test\sources.py" time="2015-02-01T03:45:23+02:00">
              <description>
                <![CDATA[>>  >>  >> start: init (global) - testcase C:\squish\test\tst_start_app]]></description>
          </message>
       </test>
   </test>
</SquishReport>

Output:

\ [
    [0] {
        name   "tst_start_app"
    }
]


smile是对你的礼貌

#3 · 2019-01-18 06:21

The root node of that document is an element which has name SquishReport in the http://www.froglogic.com/XML2 namespace. Concisely, we can say the root node is a

{http://www.froglogic.com/XML2}SquishReport

When one uses SquishReport (as opposed to prefix:SquishReport) in an XPath, it tries to match an element named SquishReport in the null namespace. Concisely, we can say it attempts to match a

{}SquishReport
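
The mismatch described above can be observed directly. The following is a minimal sketch, assuming an inline string that stands in for test.xml (trimmed to the elements that matter here). It prints the two parts of the root element's name and then counts the nodes matched by the unprefixed XPath from the question.

```perl
use strict;
use warnings;
use feature qw( say );

use XML::LibXML qw( );

# Inline stand-in for test.xml (an assumption made for this sketch).
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<SquishReport version="2.1" xmlns="http://www.froglogic.com/XML2">
   <test name="TEST">
      <test name="tst_start_app"/>
   </test>
</SquishReport>
XML

my $root = $doc->documentElement;

# An element name has two parts: a namespace URI and a local name.
say $root->namespaceURI;
say $root->localname;

# An unprefixed step only matches elements in the null namespace,
# so this query finds nothing in a document with a default namespace.
my @unprefixed = $doc->findnodes('/SquishReport/test/test');
say scalar @unprefixed;
```

The last line prints 0, which is why the loop in the question never runs its body.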

To specify the namespace, one uses prefixes defined in a context, as follows:

use strict;
use warnings;
use feature qw( say );

use XML::LibXML               qw( );
use XML::LibXML::XPathContext qw( );

my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs(sr => 'http://www.froglogic.com/XML2');

my $doc = XML::LibXML->load_xml( location => 'test.xml' );
for my $entry ($xpc->findnodes('/sr:SquishReport/sr:test/sr:test', $doc)) {
    my $key = $entry->findvalue('@name');
    say $key;
}

Note: The prefixes used in the XPath have no relation to the prefixes used in the XML document (if any). You are expected to know the namespace in which the elements you are searching for reside, but not the prefixes used by a given document.
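
To illustrate the note, here is a small sketch using a hypothetical variant of the report in which the document binds the namespace to the prefix q (that prefix and the inline string are assumptions, not taken from the original file). The XPath still uses sr, and the match succeeds because only the namespace URI is compared.

```perl
use strict;
use warnings;
use feature qw( say );

use XML::LibXML               qw( );
use XML::LibXML::XPathContext qw( );

# Hypothetical document: same namespace URI, but bound to the prefix "q".
my $doc = XML::LibXML->load_xml( string => <<'XML' );
<q:SquishReport version="2.1" xmlns:q="http://www.froglogic.com/XML2">
   <q:test name="TEST">
      <q:test name="tst_start_app"/>
   </q:test>
</q:SquishReport>
XML

# The XPath side picks its own prefix ("sr") for the same URI.
my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs(sr => 'http://www.froglogic.com/XML2');

my @names = map { $_->findvalue('@name') }
    $xpc->findnodes('/sr:SquishReport/sr:test/sr:test', $doc);
say for @names;
```

The attribute test @name needs no prefix because unprefixed attributes are not in any namespace.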
