Given XML like this:
I would like to select everything under item/xmlText and print the content of that node, including its tags (someTag, otherTag).
I would prefer to handle this with XPath, but this is part of a Java program, so if Java offers a mechanism for this, I could use that as well.
Use XSLT for this:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
  <xsl:copy-of select="/container/item/xmlText/node()"/>
 </xsl:template>
</xsl:stylesheet>
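Since the question mentions a Java program, here is a rough sketch of how a stylesheet like this one could be run through JAXP. The class name `ApplyStylesheet`, the helper `innerXml`, and the sample input are made up for illustration (the original XML is not shown here), and the stylesheet is inlined without `indent` and `strip-space` just to keep the output easy to compare.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ApplyStylesheet {
    // The stylesheet, inlined as a string for this sketch.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\">"
      + "<xsl:copy-of select=\"/container/item/xmlText/node()\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical helper: returns the children of xmlText, serialized with their tags.
    static String innerXml(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input, not the asker's actual document.
        String xml = "<container><item><xmlText>Hello <someTag>a</someTag>"
                   + "<otherTag>b</otherTag></xmlText></item></container>";
        System.out.println(innerXml(xml));
    }
}
```

Writing to a `StringWriter` sidesteps any byte-encoding question, because the result never leaves character form.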
When this transformation is applied to the provided XML document (corrected to be well-formed!):
the wanted, correct result is produced:
Suppose this is your Element, retrieved with XPath:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
Element element = (Element) xpath.evaluate(
"/container/item/xmlText", document, XPathConstants.NODE);
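Then, you can do something along these lines:
java.io.ByteArrayOutputStream data = new java.io.ByteArrayOutputStream();
java.io.PrintStream ps = new java.io.PrintStream(data);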
// These classes are part of Xerces. But you will find them in your JDK,
// as well, in a different package. Use any encoding here:
org.apache.xml.serialize.OutputFormat of =
new org.apache.xml.serialize.OutputFormat("XML", "ISO-8859-1", true);
org.apache.xml.serialize.XMLSerializer serializer =
new org.apache.xml.serialize.XMLSerializer(ps, of);
// Here, serialize the element that you obtained using your XPath expression.
serializer.asDOMSerializer();
serializer.serialize(element);
// The output stream now holds serialized XML data, including tags/attributes...
return data.toString();
This is more concise and avoids relying on Xerces internals. It's the same idea as Dimitre's solution, just done entirely in Java rather than with an XSLT stylesheet:
ByteArrayOutputStream out = new ByteArrayOutputStream();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Source source = new DOMSource(element);
Result target = new StreamResult(out);
transformer.transform(source, target);
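// Decode with the same encoding the Transformer writes by default (UTF-8),
// rather than the platform default charset.
return out.toString("UTF-8");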
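Note that serializing the element itself also prints the surrounding xmlText tags. If only the content is wanted, one option is to select the child nodes with XPath and serialize each of them in turn. This is a minimal, self-contained sketch under assumptions: the class name `InnerXml`, the helper `innerXml`, and the sample document are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InnerXml {
    // Hypothetical helper: serializes every node matched by the XPath expression.
    static String innerXml(String xml, String expression) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // node() matches text nodes as well as elements, so mixed content is kept.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);

        // A Transformer created without a stylesheet performs an identity copy.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        for (int i = 0; i < nodes.getLength(); i++) {
            transformer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input, not the asker's actual document.
        String xml = "<container><item><xmlText>Hello <someTag>a</someTag>"
                   + " and <otherTag>b</otherTag></xmlText></item></container>";
        System.out.println(innerXml(xml, "/container/item/xmlText/node()"));
    }
}
```

Passing "/container/item/xmlText" instead would serialize the element together with its own tags, which is what the snippet above does.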