How to disable/avoid Ampersand-Escaping in Java-XML

Posted 2020-03-02 05:09

Question:

I want to create an XML document where blanks are replaced by &#160;. But the Java Transformer escapes the ampersand, so the output is &amp;#160;.

Here is my sample code:

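import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class Test {

    // "throws Exception" covers the checked parser and transformer exceptions
    public static void main(String[] args) throws Exception {

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.newDocument();

        Element element = document.createElement("element");
        // The text is the literal string "&#160;", so its ampersand gets escaped
        element.setTextContent("&#160;");
        document.appendChild(element);

        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StreamResult streamResult = new StreamResult(stream);
        transformer.transform(new DOMSource(document), streamResult);
        System.out.println(stream.toString());

    }

}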

And this is the output of my sample code:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<element>&amp;#160;</element>

Any ideas on how to fix or avoid that? Thanks a lot!

Answer 1:

Set the text content directly to the character you want, and the serializer will escape it for you if necessary:

element.setTextContent("\u00A0");
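
For a fuller picture, here is a minimal sketch of this approach. The class name `NbspDemo` and the helper `serialize` are made up for illustration; the only API used beyond the question's code is `Transformer.setOutputProperty` with `OutputKeys.ENCODING`. With an ASCII output encoding the serializer cannot write U+00A0 directly and should fall back to a numeric character reference, which is presumably the `&#160;` the question is after. With UTF-8 it can write the character itself.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NbspDemo {

    // Hypothetical helper: builds <element> containing a real U+00A0
    // and serializes it using the given output encoding.
    static String serialize(String encoding) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element element = document.createElement("element");
            // The character itself, not the six-character string "&#160;"
            element.setTextContent("\u00A0");
            document.appendChild(element);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

            ByteArrayOutputStream stream = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(document), new StreamResult(stream));
            // Decode the bytes with the same encoding they were written in
            return stream.toString(encoding);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("US-ASCII"));
        System.out.println(serialize("UTF-8"));
    }
}
```

Either way the ampersand is never part of the text content, so there is nothing for the serializer to escape into `&amp;`.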


Answer 2:

The solution is very funny:

Node disableEscaping = document.createProcessingInstruction(StreamResult.PI_DISABLE_OUTPUT_ESCAPING, "&");
Element element = document.createElement("element");
element.setTextContent("&#160;");
document.appendChild(disableEscaping);
document.appendChild(element);
Node enableEscaping = document.createProcessingInstruction(StreamResult.PI_ENABLE_OUTPUT_ESCAPING, "&");
document.appendChild(enableEscaping);

So basically, you need to place your content between the processing instructions that disable and re-enable output escaping.



Answer 3:

Try to use

element.appendChild (document.createCDATASection ("&#160;"));

instead of

element.setTextContent(...);

You'll get this in your XML:

<element><![CDATA[&#160;]]></element>

It may work, if I understand correctly what you're trying to do.
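
To see what the CDATA variant actually produces, the following sketch serializes the element and then parses the result back. The class name `CdataDemo` and its two helpers are hypothetical. Note that a CDATA section holds literal text, so a parser reading the result sees the six characters `&#160;` rather than a non-breaking space; whether that is acceptable depends on whoever consumes the XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CdataDemo {

    // Hypothetical helper: wraps the string "&#160;" in a CDATA section and serializes it.
    static String serializeWithCdata() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element element = document.createElement("element");
            element.appendChild(document.createCDATASection("&#160;"));
            document.appendChild(element);

            StringWriter writer = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: parses the XML again and returns the root element's text.
    static String textAfterReparse(String xml) {
        try {
            Document parsed = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = serializeWithCdata();
        System.out.println(xml);
        // The re-parsed text is the literal string, not U+00A0
        System.out.println(textAfterReparse(xml).length());
    }
}
```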



Answer 4:

As an add-on to forty-two's answer:

If, like me, you're trying the code in an unpatched Eclipse IDE, you're likely to see some weird A's appearing instead of the non-breaking space. This is because the encoding of the Eclipse console does not match the UTF-8 encoding of the output.

Adding -Dfile.encoding=UTF-8 to your eclipse.ini should solve this.

Cheers, Wim
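
The effect described above can be reproduced without Eclipse. This small sketch (the class name `EncodingDemo` and the helper `misdecode` are made up) shows that the two UTF-8 bytes of U+00A0, when decoded with a single-byte charset such as ISO-8859-1, turn into `Â` followed by a non-breaking space. It also prints through an explicitly UTF-8 `PrintStream`, which is one way to stop depending on the default encoding, assuming the console itself displays UTF-8.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Decodes the UTF-8 bytes of the given text as ISO-8859-1,
    // mimicking a console that assumes the wrong encoding.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String xml = "<element>\u00A0</element>";

        // Shows the stray character in front of the non-breaking space
        System.out.println(misdecode(xml));

        // Writes the same text as UTF-8 bytes regardless of file.encoding
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(xml);
    }
}
```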