I have something like this:
<node TEXT=" txt A "/>
<node TEXT="
txt X
"/>
<node>
<html>
<p>
txt Y
</p>
</html>
</node>
<node TEXT="txt B"/>
and I want to use XSLT to get this:
txt A
txt X
txt Y
txt B
I want to strip all superfluous whitespace and line breaks from the @TEXT values and CDATA. Only the <node> tags in the XML input give structure to the output.
The following transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="*">
<xsl:apply-templates select="@TEXT | node()"/>
</xsl:template>
<xsl:template match="node/@TEXT | text()">
<xsl:if test="normalize-space(.)">
<xsl:value-of select=
"concat(normalize-space(.), '&#xA;')"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied to this XML document:
<t>
<node TEXT=" txt A "/>
<node TEXT=" txt X"/>
<node>
<html>
<p> txt Y </p>
</html>
</node>
<node TEXT="txt B"/>
</t>
produces the wanted result:
txt A
txt X
txt Y
txt B
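Do note the use of the standard XPath function normalize-space(), which strips all leading and trailing whitespace and replaces every internal run of whitespace characters with a single space. For a whitespace-only node it returns the empty string, which is false in the xsl:if test, so such nodes produce no output line.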
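To see normalize-space() in isolation, here is a minimal sketch that can be run against any XML document. The sample strings are made up for illustration and are not taken from the question's input.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:template match="/">
  <!-- Leading and trailing whitespace is removed; the inner run
       (spaces, newline, tab) collapses to one space -->
  <xsl:value-of
   select="concat('[', normalize-space('  txt &#xA;&#x9;  A  '), ']&#xA;')"/>
  <!-- A whitespace-only string normalizes to '', which is false in a test -->
  <xsl:if test="not(normalize-space(' &#xA; '))">
   <xsl:text>whitespace-only input is skipped&#xA;</xsl:text>
  </xsl:if>
 </xsl:template>
</xsl:stylesheet>
```

The first line of output is [txt A], followed by the "skipped" message. Note that the line feed is written as the character reference &#xA;. A literal line break inside an attribute value would be turned into a space by the XML parser before XSLT ever sees it.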
You probably want
<xsl:strip-space elements="node"/>
explained here. And this article has a lot more details.
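As a rough sketch of how xsl:strip-space could be combined with the transformation above (this is an assumption about how the two suggestions fit together, not something either answer states): xsl:strip-space only removes whitespace-only text nodes from the source tree, so normalize-space() is still needed to trim the values themselves. The sketch uses elements="*" rather than elements="node", because whitespace-only text also occurs inside other elements such as <html> and <t>.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Drop whitespace-only text nodes from every element of the source tree -->
 <xsl:strip-space elements="*"/>

 <!-- Visit the TEXT attribute (if any), then the children of each element -->
 <xsl:template match="*">
  <xsl:apply-templates select="@TEXT | node()"/>
 </xsl:template>

 <!-- Emit each remaining value on its own line, whitespace-normalized -->
 <xsl:template match="node/@TEXT | text()">
  <xsl:value-of select="concat(normalize-space(.), '&#xA;')"/>
 </xsl:template>
</xsl:stylesheet>
```

With the whitespace-only text nodes already stripped, the xsl:if guard is no longer needed for text nodes. One difference to keep in mind: a TEXT attribute that contains only whitespace would now produce an empty line, since xsl:strip-space does not affect attributes.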