Here's one for you XSLT gurus :-)
I have to deal with XML output from a Java program I cannot control.
In the docs output by this app the html tags remain as
<u><i><b><em>
etc, instead of
&lt;u&gt;&lt;i&gt;&lt;b&gt;&lt;em&gt; and so on.
That's not a massive problem, as I use XSLT to fix that, but using normalize-space to remove excess whitespace also removes the spaces before and after these html tags.
Example
<Locator Precode="7">
<Text LanguageId="7">The next word is <b>bold</b> and is correctly spaced
around the html tag,
but the sentence has extra whitespace and
line breaks</Text>
</Locator>
If I run the XSLT script we use to remove extra white space, of which this is the relevant part
<xsl:template match="text()">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
In the resulting output the xslt has correctly removed the extra whitespace and the line breaks, but it has also removed the space before the tag resulting in this output :-
The next word isboldand is correctly spaced around the html tag, but the sentence has extra whitespace and line breaks.
The spacing before and after the word "bold" has been stripped as well.
Does anyone have any ideas how to prevent this from happening? I'm pretty well at my wits' end, so any help will be greatly appreciated!
:-)
Hi again,
Yes of course, here's the full stylesheet. We have to deal with the html tags and spacing in one pass:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="no" encoding="UTF-8"/>
<xsl:strip-space elements="*" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Text//*">
<xsl:value-of select="concat('&lt;',name(),'&gt;')" />
<xsl:apply-templates />
<xsl:value-of select="concat('&lt;/',name(),'&gt;')" />
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Instruction//*">
<xsl:value-of select="concat('&lt;',name(),'&gt;')" />
<xsl:apply-templates />
<xsl:value-of select="concat('&lt;/',name(),'&gt;')" />
</xsl:template>
<xsl:template match="Title//*">
<xsl:value-of select="concat('&lt;',name(),'&gt;')" />
<xsl:apply-templates />
<xsl:value-of select="concat('&lt;/',name(),'&gt;')" />
</xsl:template>
</xsl:stylesheet>
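One possible way around this, offered as a sketch rather than a tested fix: normalize-space() is applied to each text node separately, so the space at the edge of a text node that sits next to a tag is treated as leading or trailing whitespace and dropped. The text() template below still collapses whitespace, but puts back a single space where the original text node began or ended with whitespace and has a sibling on that side. The combined match pattern and the variable name norm are my own choices, not something from the original stylesheet.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="no" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Turn inline tags into escaped text (the three original templates merged into one) -->
  <xsl:template match="Text//* | Instruction//* | Title//*">
    <xsl:value-of select="concat('&lt;', name(), '&gt;')"/>
    <xsl:apply-templates/>
    <xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
  </xsl:template>

  <xsl:template match="text()">
    <!-- Collapsed text: inner runs of whitespace become one space, edges are trimmed -->
    <xsl:variable name="norm" select="normalize-space(.)"/>
    <!-- Restore one leading space if the node started with whitespace and follows a sibling -->
    <xsl:if test="$norm != '' and preceding-sibling::node()
                  and normalize-space(substring(., 1, 1)) = ''">
      <xsl:text> </xsl:text>
    </xsl:if>
    <xsl:value-of select="$norm"/>
    <!-- Restore one trailing space if the node ended with whitespace and precedes a sibling -->
    <xsl:if test="$norm != '' and following-sibling::node()
                  and normalize-space(substring(., string-length(.))) = ''">
      <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input above, the content of Text should come out as "The next word is &lt;b&gt;bold&lt;/b&gt; and is correctly spaced around the html tag, but the sentence has extra whitespace and line breaks". One caveat: because of xsl:strip-space, a text node that is only whitespace between two inline tags (for example between a closing b and an opening i) is removed before the template ever sees it, so that particular space would still be lost.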