For a given XmlElement, I need to be able to set the inner text to an escaped version of the Unicode string, despite the document ultimately being encoded in UTF-8. Is there any way of achieving this?
Here's a simple version of the code:
const string text = "&#xF1;";
var document = new XmlDocument {PreserveWhitespace = true};
var root = document.CreateElement("root");
root.InnerXml = text;
document.AppendChild(root);
var settings = new XmlWriterSettings {Encoding = Encoding.UTF8, OmitXmlDeclaration = true};
using (var stream = new FileStream("out.xml", FileMode.Create))
using (var writer = XmlWriter.Create(stream, settings))
document.WriteTo(writer);
Expected:
<root>&#xF1;</root>
Actual:
<root>ñ</root>
Using an XmlWriter directly and calling WriteRaw(text) works, but I only have access to an XmlDocument, and the serialization happens later. On the XmlElement, InnerText escapes the & to &amp;, as expected, and setting Value throws an exception.
Is there some way of setting the inner text of an XmlElement to the escaped ASCII text, regardless of the encoding that is ultimately used? I feel like I must be missing something obvious, or it's just not possible.
If you ask XmlWriter to produce ASCII output, it should give you character references for all non-ASCII content.
The output is still valid UTF-8, because ASCII is a subset of UTF-8.
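As a rough sketch of that suggestion, the question's code could be adapted as below. It assumes you control the XmlWriterSettings at serialization time. The class name AsciiXmlDemo and the helper Serialize are made up for illustration, and a MemoryStream stands in for the file so the result can be inspected as a string.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class AsciiXmlDemo
{
    // Hypothetical helper: serializes the document through an ASCII-encoded XmlWriter.
    public static string Serialize(XmlDocument document)
    {
        // Only the encoding differs from the settings used in the question.
        var settings = new XmlWriterSettings { Encoding = Encoding.ASCII, OmitXmlDeclaration = true };
        using (var stream = new MemoryStream())
        {
            // Dispose the writer before reading the stream so everything is flushed.
            using (var writer = XmlWriter.Create(stream, settings))
            {
                document.WriteTo(writer);
            }
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var document = new XmlDocument { PreserveWhitespace = true };
        var root = document.CreateElement("root");
        // The parser resolves the character reference, so the DOM holds the single character U+00F1.
        root.InnerXml = "&#xF1;";
        document.AppendChild(root);

        // The writer cannot represent U+00F1 in ASCII, so it falls back to a character reference.
        Console.WriteLine(Serialize(document));
    }
}
```

The DOM itself still stores the real character. The escaping happens only when the writer encodes its output, so this does not help if whoever serializes the document later insists on Encoding.UTF8.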