I've got an XSL template that outputs text as opposed to XML.
In this text, I need to include the ASCII character 0x10 in a certain position.
I understand this character is not allowed in an XML document, but I'm going to output text, so why am I not allowed to use it anyway?
I also understand it will not be possible to put this character literally into the template, neither within a CDATA section nor as &#x10;. But why does on-the-fly generation not work either? I tried, for instance, to define a function that returns this char and used it as <xsl:value-of select="z:get_char(16)"/>, but that produces an Invalid character exception as well.
Is there a way?
It is true that the Microsoft .NET Framework does not support XML 1.1, but it has its own (non-portable) way to use control characters in XML 1.0 documents: you can write &#x10; as a numeric character reference if you set CheckCharacters to false on your XmlReaderSettings/XmlWriterSettings.
Here is an example stylesheet and some .NET code tested with .NET 3.5 that does not throw an illegal character exception:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:text>&#x10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Let the reader accept &#x10; while parsing the stylesheet.
XmlReaderSettings xrs = new XmlReaderSettings();
xrs.CheckCharacters = false;

XslCompiledTransform proc = new XslCompiledTransform();
using (XmlReader xr = XmlReader.Create(@"sheet.xslt", xrs))
{
    proc.Load(xr);
}

using (XmlReader xr = XmlReader.Create(new StringReader("<foo/>")))
{
    // Start from the stylesheet's output settings (method="text")
    // and switch off character checking on the writer as well.
    XmlWriterSettings xws = proc.OutputSettings.Clone();
    xws.CheckCharacters = false;
    using (XmlWriter xw = XmlWriter.Create(@"result.txt", xws))
    {
        proc.Transform(xr, null, xw);
    }
}
Since the XSLT file is an XML file, you cannot include that character reference. I don't think you can do this in a pure XSLT solution.
The ASCII character HEX 10/DEC 16 is the Data Link Escape (DLE) control character.
The XML spec allows only three control characters, all of them whitespace: tab, carriage return, and line feed.
Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646.
Everything else under 0x20 is not allowed.
Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
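To make the production concrete, here is a minimal Java sketch that tests a code point against the ranges quoted above. The class and method names (XmlCharCheck, isXml10Char) are made up for illustration and are not part of any XML library.

```java
final class XmlCharCheck {
    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD   // tab, line feed, carriage return
            || (cp >= 0x20 && cp <= 0xD7FF)          // up to the surrogate blocks
            || (cp >= 0xE000 && cp <= 0xFFFD)        // skips FFFE and FFFF
            || (cp >= 0x10000 && cp <= 0x10FFFF);    // supplementary planes
    }

    public static void main(String[] args) {
        int[] samples = {0x9, 0xA, 0xD, 0x10, 0x20};
        for (int cp : samples) {
            System.out.printf("U+%04X allowed in XML 1.0: %b%n", cp, isXml10Char(cp));
        }
    }
}
```

For 0x10 (DLE) the check returns false, which is why an XML 1.0 parser rejects it even when it is written as a character reference.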
One option is to put a placeholder token value for that character in your output, and then use an external process to find/replace your token with the character.
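As a rough sketch of that post-processing step, something like the following Java could run after the transformation. The token @@DLE@@ and the names DlePlaceholder and restore are assumptions chosen for illustration; use any token that cannot occur in your real output.

```java
final class DlePlaceholder {
    // Hypothetical placeholder the stylesheet emits instead of the real character.
    static final String TOKEN = "@@DLE@@";

    // The actual DLE control character (0x10), built without putting it in the source.
    static final String DLE = String.valueOf((char) 0x10);

    /** Replaces every occurrence of the placeholder with the DLE character. */
    static String restore(String text) {
        return text.replace(TOKEN, DLE);
    }

    public static void main(String[] args) {
        // Stand-in for the text produced by the XSLT transformation.
        String fromXslt = "HEADER" + TOKEN + "PAYLOAD";
        String fixed = restore(fromXslt);
        System.out.println((int) fixed.charAt(6)); // prints 16
    }
}
```

In practice you would read the transformation result from a file or stream, pass it through restore, and write it back out using the same encoding.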
If you can use XML 1.1 (which allows such characters in an XML document as character references), then the following should work; at least, it works for me with Sun Java 6 and Saxon 9.2:
<?xml version="1.1" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="text"/>
  <xsl:template name="main">
    <xsl:text>&#x10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
In the past, I have used this technique to put a linefeed into a textarea in generated XHTML. If I didn't put at least one character in it, the textarea would self-close (causing browser issues). Notice that the character is wrapped in <xsl:text>. Also, the original source was on one line, but I formatted it for readability.
<textarea name="qry" rows="4" cols="50" id="query">
<xsl:value-of select="$qry" /><xsl:text>
</xsl:text>
</textarea>