How to validate metadata.xml against .dtd in Gentoo

Posted 2019-07-13 01:56

Question:

I am trying to validate metadata.xml against www.gentoo.org/dtd/metadata.dtd using xmllint from the =dev-libs/libxml2-2.9.3 ebuild.

I tried the commands (some from here):

$ xmllint --noout --valid  metadata.xml
error : Unknown IO error
metadata.xml:2: warning: failed to load external entity "http://www.gentoo.org/dtd/metadata.dtd"

The same happens with xmllint metadata.xml --dtdvalid metadata.dtd and xmllint --loaddtd http://www.gentoo.org/dtd/metadata.dtd

$ xmllint --valid  metadata.xml --schema metadata.dtd
metadata.dtd:1: parser error : StartTag: invalid element name

I need xmllint rather than mono-xmltool (from C#/CLI) because xmllint is what the repoman -d command uses, and repoman is used for Gentoo overlay validation in Travis CI.

How do I validate the XML with xmllint properly?

UPD: The site returns "HTTP/1.1 301 Moved Permanently", which is why the load fails.

Part of the strace output:

recvfrom(3, "HTTP/1.1 301 Moved Permanently\r\n"..., 4096, 0, NULL, NULL) = 446
recvfrom(3, "", 4096, 0, NULL, NULL)    = 0
close(3)                                = 0
write(2, "error : ", 8error : )                 = 8
write(2, "Unknown IO error\n", 17Unknown IO error

Probably libxml2 doesn't support HTTPS.

USE="icu ipv6 python readline -debug -examples -lzma -static-libs {-test}"

libxml2 uses nanoHTTP, and nanoHTTP cannot work with HTTPS.

Answer 1:

Your assumption was right: the problem is HTTPS. To work around this, and to save bandwidth and time, repoman validates against a local copy of the DTD, which it fetches first if the file is not found. The default location is either REPO_ROOT/metadata/dtd/metadata.dtd or DISTDIR/metadata.dtd. To get the exact arguments repoman passes to xmllint, you have to look at its source code (here). As you can see, it is:

xmllint --nonet --noout --dtdvalid <metadata.dtd> metadata.xml
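The same approach can be reproduced by hand outside repoman. The following is only a minimal sketch: the function names are hypothetical, the use of curl as the HTTPS-capable fetcher is an assumption (repoman has its own fetch logic), and checking the REPO_ROOT location before the DISTDIR one is an assumed order, not something confirmed by repoman's source.

```shell
#!/bin/sh
# Sketch: validate metadata.xml against a locally cached copy of the DTD.
# Function names and the curl-based download are assumptions for illustration.

DTD_URL="https://www.gentoo.org/dtd/metadata.dtd"

# Print the DTD path to use: the copy inside the repository if it exists,
# otherwise the DISTDIR location (which also serves as the download target).
find_dtd() {
    # $1 = repository root, $2 = DISTDIR
    if [ -f "$1/metadata/dtd/metadata.dtd" ]; then
        printf '%s\n' "$1/metadata/dtd/metadata.dtd"
    else
        printf '%s\n' "$2/metadata.dtd"
    fi
}

# Download the DTD with a client that speaks HTTPS, unless it is already cached.
fetch_dtd() {
    # $1 = target path
    [ -f "$1" ] || curl -fsSL -o "$1" "$DTD_URL"
}

# Run the xmllint invocation quoted in this answer.
validate_metadata() {
    # $1 = DTD path, $2 = metadata.xml path
    xmllint --nonet --noout --dtdvalid "$1" "$2"
}
```

After sourcing these functions, a call could look like: dtd=$(find_dtd /path/to/overlay /tmp) && fetch_dtd "$dtd" && validate_metadata "$dtd" metadata.xml (both paths here are placeholders).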

The xmllint command above still prints:

metadata.xml:2: warning: failed to load external entity "https://www.gentoo.org/dtd/metadata.dtd"
<!DOCTYPE pkgmetadata SYSTEM "https://www.gentoo.org/dtd/metadata.dtd">

or in case of HTTP:

I/O error : Attempt to load network entity http://www.gentoo.org/dtd/metadata.dtd
metadata.xml:2: warning: failed to load external entity "http://www.gentoo.org/dtd/metadata.dtd"
<!DOCTYPE pkgmetadata SYSTEM "http://www.gentoo.org/dtd/metadata.dtd">

But this is only a warning, so the command exits with status 0.
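Since the warning does not affect the exit status, a CI step can rely on that status alone. The wrapper below is a hedged sketch, not part of repoman: check_metadata and the XMLLINT override variable are hypothetical names introduced here so the logic can be exercised without xmllint installed.

```shell
#!/bin/sh
# Sketch of a CI check that depends only on xmllint's exit status.
# XMLLINT is a hypothetical override; it defaults to the real binary.
XMLLINT="${XMLLINT:-xmllint}"

check_metadata() {
    # $1 = DTD path, $2 = metadata.xml path
    if "$XMLLINT" --nonet --noout --dtdvalid "$1" "$2"; then
        echo "OK: $2"
    else
        # In the else branch, $? still holds the validator's exit status.
        status=$?
        echo "FAIL: $2 (xmllint exit status $status)"
        return "$status"
    fi
}
```

Warnings such as the "failed to load external entity" message go to stderr and leave the status at 0, so a call like check_metadata /tmp/metadata.dtd metadata.xml would fail the build only when xmllint itself reports a non-zero status.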