Validate XML using LibXML

Posted 2019-04-07 17:04

Question:

I am currently using the XML::LibXML Perl module to validate an XML file against an XML Schema. When validation fails, I get a list of errors telling me, for example, that a certain element was not expected and what was expected instead. My XML file contains many elements with the same name, nested at various places in the document.

My question: is there any way to output the XPath location of each element that causes a validation error?

My XML file is quite big, and it is hard to debug a validation failure because the element name shown in the error may occur many times in various places in the file.

Below is my code for validating an XML file against a schema with LibXML.

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $schema_file = 'MySchema.xml';
my $document    = 'MyFile.xml';

# Compile the XML Schema from the schema file.
my $schema = XML::LibXML::Schema->new(location => $schema_file);

# Parse the document to be validated.
my $parser = XML::LibXML->new;
my $doc    = $parser->parse_file($document);

# validate() dies on failure; re-throw the error so the script stops.
eval { $schema->validate($doc) };
die $@ if $@;

print "$document validated successfully\n";

Answer 1:

I have just stumbled on the same problem and found that the XML parser does not store line numbers by default. You can tell it to do so with the XML_LIBXML_LINENUMBERS parameter of the constructor.

The following script reports the actual line numbers for errors instead of 0:

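use Modern::Perl;
use XML::LibXML;

# Usage: script.pl <instance.xml> <schema.xsd>
my ($instance, $schema) = @ARGV;

# Ask the parser to record the source line number of each node.
my $doc = XML::LibXML->new(XML_LIBXML_LINENUMBERS => 1)->parse_file($instance);
my $xmlschema = XML::LibXML::Schema->new( location => $schema );

# validate() dies on failure, leaving the error in $@.
my $res = eval { $xmlschema->validate( $doc ); };

say "error: $@" if $@;
say "res: ", $res//'undef';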
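
Once line numbers are recorded, one way to get closer to the XPath the question asks for is to look up the elements that sit on each reported line and print their nodePath. The following is only a sketch under a few assumptions: error_xpaths is a hypothetical helper name, the inline schema and document are made-up sample data, $@ is assumed to be an XML::LibXML::Error object (older releases may give a plain string, in which case the helper returns nothing), and I use the line_numbers parser option, which I believe is the documented spelling of the same switch. Several elements can share a line, so the paths are candidates rather than exact answers.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of XML::LibXML): for every error in the
# chain, list the XPaths of all elements that start on the reported line.
sub error_xpaths {
    my ($doc, $err) = @_;
    my @report;
    # Errors are chained through _prev, newest first; unshift restores order.
    for (my $e = $err; ref $e && $e->isa('XML::LibXML::Error'); $e = $e->_prev) {
        my $line = $e->line // 0;
        (my $msg = $e->message // '') =~ s/\s+\z//;
        my @paths = map  { $_->nodePath }
                    grep { $_->line_number == $line } $doc->findnodes('//*');
        unshift @report, { line => $line, message => $msg, paths => \@paths };
    }
    return @report;
}

# Made-up sample schema: <root> holds <item> elements, each with one <name>.
my $xsd = <<'XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD

# Made-up sample document: the second <item> contains an unexpected element.
my $xml = <<'XML';
<root>
  <item>
    <name>ok</name>
  </item>
  <item>
    <bogus/>
  </item>
</root>
XML

# line_numbers => 1 makes the parser keep each element's source line.
my $doc    = XML::LibXML->new(line_numbers => 1)->parse_string($xml);
my $schema = XML::LibXML::Schema->new(string => $xsd);

# validate() dies on failure, so collect the report from $@ in that case.
my @report;
unless (eval { $schema->validate($doc); 1 }) {
    @report = error_xpaths($doc, $@);
}

for my $r (@report) {
    print "line $r->{line}: $r->{message}\n";
    print "  candidate: $_\n" for @{ $r->{paths} };
}
```

To use this with real files, replace parse_string and the string => argument with parse_file and location =>, as in the script above. If a line holds several elements, comparing the element name in the error message with the last step of each candidate path usually narrows it down.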


Answer 2:

You might want to look at XML::Validate to get the line number and column number of the error.



Answer 3:

See the source of Padre::Task::SyntaxChecker::XML. This module is used by the Padre IDE to syntax-check XML files. See also t/01-valid.t in the Padre-Plugin-XML distribution for a usage example that includes line numbers.