I have a Java String that contains XML, with no line feeds or indentations. I would like to turn it into a String with nicely formatted XML. How do I do this?
String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);
Note: My input is a String. My output is a String.
(Basic) mock result:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<tag>
<nested>hello</nested>
</tag>
</root>
I have found that in Java 1.6.0_32 the normal method to pretty print an XML string (using a Transformer with a null or identity xslt) does not behave as I would like if tags are merely separated by whitespace, as opposed to having no separating text. I tried using
<xsl:strip-space elements="*"/>
in my template to no avail. The simplest solution I found was to strip the space the way I wanted using a SAXSource and XML filter. Since my solution was for logging I also extended this to work with incomplete XML fragments. Note the normal method seems to work fine if you use a DOMSource but I did not want to use this because of the incompleteness and memory overhead.Try this:
Hmmm... faced something like this and it is a known bug ... just add this OutputProperty ..
Hope this helps ...
there is a very nice command line xml utility called xmlstarlet(http://xmlstar.sourceforge.net/) that can do a lot of things which a lot of people use.
Your could execute this program programatically using Runtime.exec and then readin the formatted output file. It has more options and better error reporting than a few lines of Java code can provide.
download xmlstarlet : http://sourceforge.net/project/showfiles.php?group_id=66612&package_id=64589
As an alternative to the answers from max, codeskraps, David Easley and milosmns, have a look at my lightweight, high-performance pretty-printer library: xml-formatter
Sometimes, like when running mocked SOAP services directly from file, it is good to have a pretty-printer which also handles already pretty-printed XML:
As some have commented, pretty-printing is just a way of presenting XML in a more human-readable form - whitespace strictly does not belong in your XML data.
The library is intended for pretty-printing for logging purposes, and also includes functions for filtering (subtree removal / anonymization) and pretty-printing of XML in CDATA and Text nodes.
Regarding comment that "you must first build a DOM tree": No, you need not and should not do that.
Instead, create a StreamSource (new StreamSource(new StringReader(str)), and feed that to the identity transformer mentioned. That'll use SAX parser, and result will be much faster. Building an intermediate tree is pure overhead for this case. Otherwise the top-ranked answer is good.