format xml, pretty print

Posted 2020-06-18 04:09

Question:

I know of two ways to "pretty print", or format, xml:

shell tools
Hack 38 Pretty-Print XML Using a Generic Identity Stylesheet and Xalan

What other free (as in beer) formatters are there (aside from using JavaScript)?

Answer 1:

Well, the identity transform you linked to is portable to any XSLT processor (Saxon, MSXML, etc.).
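
For reference, here is a minimal sketch of what such an identity stylesheet usually looks like, wrapped in a small Python helper that writes it to disk. This is the common textbook form with indent="yes" added, not necessarily the exact stylesheet from the linked hack; the file name identity-indent.xsl and the helper name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# A generic identity stylesheet: copy every node and attribute unchanged,
# and ask the processor to indent the serialized output.
IDENTITY_XSLT = """<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""


def write_identity_stylesheet(path="identity-indent.xsl"):
    """Check that the stylesheet is well-formed XML, then write it to disk.

    The default file name is a hypothetical choice for this example.
    """
    ET.fromstring(IDENTITY_XSLT)  # raises ParseError if the text is malformed
    target = Path(path)
    target.write_text(IDENTITY_XSLT, encoding="utf-8")
    return target
```

The resulting file can be handed to whichever XSLT processor is installed, for example xsltproc identity-indent.xsl input.xml. Note that xsl:strip-space drops whitespace-only text nodes, which is usually what you want for pretty printing but may matter for mixed-content documents.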

Additionally, you could look at xmllint, which is part of the libxml2 toolkit. Its --format option pretty-prints the input. Similar functionality exists in XMLStarlet (which uses libxml2 under the hood, IIRC).
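
If you want to call xmllint from a script, something like the following sketch should work, assuming xmllint is installed and on PATH. The function name xmllint_format is invented for this example; the only xmllint features used are --format and "-" to read from standard input.

```python
import shutil
import subprocess

# "-" tells xmllint to read the document from standard input.
XMLLINT_CMD = ["xmllint", "--format", "-"]


def xmllint_format(xml_text):
    """Return xml_text re-indented by xmllint (hypothetical helper)."""
    if shutil.which(XMLLINT_CMD[0]) is None:
        raise FileNotFoundError("xmllint not found on PATH")
    result = subprocess.run(
        XMLLINT_CMD,
        input=xml_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the input is not well-formed
    )
    return result.stdout
```

From a shell the equivalent is simply xmllint --format input.xml.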



Answer 2:

xmlstarlet fo is what I use for pretty printing. It has a number of options:

$ xmlstarlet fo --help
XMLStarlet Toolkit: Format XML document
Usage: xml fo [<options>] <xml-file>
where <options> are
  -n or --noindent            - do not indent
  -t or --indent-tab          - indent output with tabulation
  -s or --indent-spaces <num> - indent output with <num> spaces
  -o or --omit-decl           - omit xml declaration <?xml version="1.0"?>
  -R or --recover             - try to recover what is parsable
  -D or --dropdtd             - remove the DOCTYPE of the input docs
  -C or --nocdata             - replace cdata section with text nodes
  -N or --nsclean             - remove redundant namespace declarations
  -e or --encode <encoding>   - output in the given encoding (utf-8, unicode...)
  -H or --html                - input is HTML

A good XML engineer should be able to wield xmlstarlet.
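
As a rough illustration of how those options combine, here is a small sketch that assembles an xmlstarlet fo command line from a few of the flags listed above. The function and its parameter names are made up for this example, and the executable name is an assumption: some installations call the binary xml (as in the usage line above) rather than xmlstarlet.

```python
def xmlstarlet_fo_command(xml_file, indent_spaces=None, indent_tab=False,
                          omit_decl=False, executable="xmlstarlet"):
    """Build an argument list for `xmlstarlet fo` (hypothetical helper)."""
    cmd = [executable, "fo"]
    if indent_tab:
        # Tabs and a space count are alternatives; tabs win if both are given.
        cmd.append("--indent-tab")
    elif indent_spaces is not None:
        cmd += ["--indent-spaces", str(indent_spaces)]
    if omit_decl:
        cmd.append("--omit-decl")
    cmd.append(xml_file)
    return cmd
```

The returned list can be passed straight to subprocess.run, e.g. subprocess.run(xmlstarlet_fo_command("in.xml", indent_spaces=4), check=True).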



Answer 3:

You can use http://prettydiff.com/?m=beautify. Unfortunately, it is written in JavaScript, but it is a complete application, so you never have to deal with that. It runs inside your browser without downloading or installing anything.



Answer 4:

I like the Java library XOM for XML manipulation. It has a nice pretty printer that provides a lot of control over the output.



Answer 5:

When using libxml2 in Python:

with open(pathToSaveResult, 'w') as fd:
    xmlParsed.saveTo(fd, format=libxml2.XML_SAVE_FORMAT)

Edit: It looks like there is a bug in libxml2: pretty printing is actually triggered by the flag libxml2.XML_SAVE_NO_EMPTY rather than libxml2.XML_SAVE_FORMAT.