StAX XML formatting in Java

2019-01-07 19:39发布

问题:

Is it possible using StAX (specifically woodstox) to format the output xml with newlines and tabs, i.e. in the form:

<element1>
  <element2>
   someData
  </element2>
</element1>

instead of:

<element1><element2>someData</element2></element1>

If this is not possible in woodstox, is there any other lightweight libs that can do this?

回答1:

Via the JDK: transformer.setOutputProperty(OutputKeys.INDENT, "yes");.



回答2:

There is com.sun.xml.txw2.output.IndentingXMLStreamWriter

XMLOutputFactory xmlof = XMLOutputFactory.newInstance();
XMLStreamWriter writer = new IndentingXMLStreamWriter(xmlof.createXMLStreamWriter(out));


回答3:

If you're using the StAX cursor API, you can indent the output by wrapping the XMLStreamWriter in an indenting proxy. I tried this in my own project and it worked nicely.



回答4:

Rather than relying on a com.sun...class that might go away (or get renamed com.oracle...class), I recommend downloading the StAX utility classes from java.net. This package contains a IndentingXMLStreamWriter class that works nicely. (Source and javadoc are included in the download.)



回答5:

Using the JDK Transformer:

public String transform(String xml) throws XMLStreamException, TransformerException
{
    Transformer t = TransformerFactory.newInstance().newTransformer();
    t.setOutputProperty(OutputKeys.INDENT, "yes");
    t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
    Writer out = new StringWriter();
    t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
    return out.toString();
}


回答6:

How about StaxMate:

http://www.cowtowncoder.com/blog/archives/2006/09/entry_21.html

Works well with Woodstox, fast, low-memory usage (no in-memory tree built), and indents like so:


SMOutputFactory sf = new SMOutputFactory(XMLOutputFactory.newInstance());
SMOutputDocument doc = sf.createOutputDocument(new FileOutputStream("output.xml"));
doc.setIndentation("\n ", 1, 2); // for unix linefeed, 2 spaces per level    
// write doc like:    
SMOutputElement root = doc.addElement("element1");    
root.addElement("element2").addCharacters("someData");    
doc.closeRoot(); // important, flushes, closes output



回答7:

If you're using the iterating method (XMLEventReader), can't you just attach a new line '\n' character to the relevant XMLEvents when writing to your XML file?



回答8:

Not sure about stax, but there was a recent discussion about pretty printing xml here

pretty print xml from java

this was my attempt at a solution

How to pretty print XML from Java?

using the org.dom4j.io.OutputFormat.createPrettyPrint() method



回答9:

With Spring Batch this requires a subclass since this JIRA BATCH-1867

public class IndentingStaxEventItemWriter<T> extends StaxEventItemWriter<T> {

  @Setter
  @Getter
  private boolean indenting = true;

  @Override
  protected XMLEventWriter createXmlEventWriter( XMLOutputFactory outputFactory, Writer writer) throws XMLStreamException {
    if ( isIndenting() ) {
      return new IndentingXMLEventWriter( super.createXmlEventWriter( outputFactory, writer ) );
    }
    else {
      return super.createXmlEventWriter( outputFactory, writer );
    }
  }

}

But this requires an additionnal dependency because Spring Batch does not include the code to indent the StAX output:

<dependency>
  <groupId>net.java.dev.stax-utils</groupId>
  <artifactId>stax-utils</artifactId>
  <version>20070216</version>
</dependency>


回答10:

if you are using XMLEventWriter, then an easier way to do that is:

XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        XMLEventWriter writer = outputFactory.createXMLEventWriter(w);
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();
        Characters newLine = eventFactory.createCharacters("\n"); 
        writer.add(startRoot);
        writer.add(newLine);