I want to send the XML of an XmlDocument object to the HTTP client, but I'm concerned that the suggested solution might not honor the encoding that the Response has been set to use:
public void ProcessRequest(HttpContext context)
{
    XmlDocument doc = GetXmlToShow(context);
    context.Response.ContentType = "text/xml";
    context.Response.ContentEncoding = System.Text.Encoding.UTF8;
    context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
    context.Response.Cache.SetAllowResponseInBrowserHistory(true);
    doc.Save(context.Response.OutputStream);
}
What if I changed the encoding to something else, Unicode for instance:
public void ProcessRequest(HttpContext context)
{
    XmlDocument doc = GetXmlToShow(context);
    context.Response.ContentType = "text/xml";
    context.Response.ContentEncoding = System.Text.Encoding.Unicode;
    context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
    context.Response.Cache.SetAllowResponseInBrowserHistory(true);
    doc.Save(context.Response.OutputStream);
}
Will the Response.OutputStream translate the binary data that's being written to it on the fly, and make it Unicode?
Or is the Response.ContentEncoding just informative?
If the ContentEncoding is just informative, what content encoding will the following text strings come back in?
context.Response.ContentEncoding = System.Text.Encoding.Unicode;
context.Response.Write("Hello World");
context.Response.ContentEncoding = System.Text.Encoding.UTF8;
context.Response.Write("Hello World");
context.Response.ContentEncoding = System.Text.Encoding.UTF32;
context.Response.Write("Hello World");
context.Response.ContentEncoding = System.Text.Encoding.ASCII;
context.Response.Write("Hello World");
context.Response.ContentEncoding = System.Text.Encoding.BigEndianUnicode;
context.Response.Write("Hello World");
I found it.
The answer is no: The XmlDocument will not honor the ContentEncoding of the response stream it's writing to.
Update: the proper way to do it
Use Response.Output, and NOT Response.OutputStream.
Both write to the response, but Output is a TextWriter rather than a raw Stream.
When an XmlDocument saves itself to a TextWriter, it uses the encoding specified by that TextWriter. The XmlDocument will automatically change any xml declaration node, e.g.:
<?xml version="1.0" encoding="ISO-8859-1"?>
to match the encoding used by Response.Output.
The Response.Output TextWriter's encoding setting comes from the Response.ContentEncoding value.
Use doc.Save, not Response.Write(doc.ToString()) or Response.Write(doc.InnerXml).
You DON'T want to save the xml to a string, or stuff the xml into a string, and Response.Write that, because that:
- doesn't follow the encoding specified
- wastes memory
To sum up: by saving to a TextWriter, the XML declaration node, the XML contents, and the HTTP response's content encoding will all match.
Sample code:
public class Handler : IHttpHandler, System.Web.SessionState.IRequiresSessionState
{
    // Note: IRequiresSessionState is added so that context.Session is available;
    // otherwise it would be null.
    public void ProcessRequest(HttpContext context)
    {
        XmlDocument doc = GetXmlToShow(context); // GetXmlToShow looks for parameters in the context
        if (doc != null)
        {
            context.Response.ContentType = "text/xml"; // must be "text/xml"
            context.Response.ContentEncoding = System.Text.Encoding.UTF8; // we'd like UTF-8
            // The document saves itself to the TextWriter, using the TextWriter's
            // encoding (which comes from Response.ContentEncoding).
            doc.Save(context.Response.Output);
        }
        #region Notes
        /*
         * 1. Use Response.Output, and NOT Response.OutputStream.
         *    Both write to the response, but Output is a TextWriter.
         *    When an XmlDocument saves itself to a TextWriter, it will use the encoding
         *    specified by the TextWriter. The XmlDocument will automatically change any
         *    xml declaration node, e.g.:
         *        <?xml version="1.0" encoding="ISO-8859-1"?>
         *    to match the encoding used by the Response.Output's encoding setting.
         * 2. The Response.Output TextWriter's encoding setting comes from the
         *    Response.ContentEncoding value.
         * 3. Use doc.Save, not Response.Write(doc.ToString()) or Response.Write(doc.InnerXml).
         * 4. You DON'T want to save the xml to a string, or stuff the xml into a string,
         *    and Response.Write that, because that
         *    - doesn't follow the encoding specified
         *    - wastes memory
         *
         * To sum up: by saving to a TextWriter, the XML declaration node, the XML contents,
         * and the HTTP response's content encoding will all match.
         */
        #endregion Notes
    }

    public bool IsReusable { get { return false; } }
}
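The difference between saving to a Stream and saving to a TextWriter can also be checked outside ASP.NET. The following is a minimal console sketch, not the handler itself: a MemoryStream stands in for Response.OutputStream, a StreamWriter stands in for Response.Output, and the names SaveEncodingDemo, BuildDoc, SaveToStream and SaveToWriter are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class SaveEncodingDemo
{
    // Builds a small document whose declaration names ISO-8859-1.
    public static XmlDocument BuildDoc()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><foo>caf\u00e9</foo>");
        return doc;
    }

    // Save(Stream): the document picks the encoding itself, from its declaration.
    public static byte[] SaveToStream(XmlDocument doc)
    {
        using (var ms = new MemoryStream())
        {
            doc.Save(ms);
            return ms.ToArray();
        }
    }

    // Save(TextWriter): the writer's encoding is used (stand-in for Response.Output).
    public static byte[] SaveToWriter(XmlDocument doc, Encoding encoding)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new StreamWriter(ms, encoding))
            {
                doc.Save(writer);
            }
            // ToArray still works after the writer has closed the stream.
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        XmlDocument doc = BuildDoc();

        // Decode each result with the encoding we expect it to be in, then print it.
        string viaStream = Encoding.GetEncoding("ISO-8859-1").GetString(SaveToStream(doc));
        string viaWriter = Encoding.UTF8.GetString(SaveToWriter(doc, Encoding.UTF8));

        Console.WriteLine(viaStream); // declaration should still name ISO-8859-1
        Console.WriteLine(viaWriter); // declaration should now name utf-8
    }
}
```

Passing a different encoding to SaveToWriter (for example Encoding.Unicode) should change both the bytes and the declaration together, which is the behavior the handler relies on.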
The encoding that the XmlDocument will use when saving to a stream depends on the encoding specified in the xml declaration node. e.g.:
<?xml version="1.0" encoding="UTF-8"?>
If "UTF-8" encoding is specified in the xml declaration, then Save(stream) will use UTF-8 encoding.
If no encoding is specified, e.g.:
<?xml version="1.0"?>
or the xml declaration node is omitted entirely, then the XmlDocument will default to UTF-8 unicode encoding. (Reference)
If an encoding attribute is not included, UTF-8 encoding is assumed when the document is written or saved out.
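A quick way to see the UTF-8 default is to save a document that has no xml declaration to a stream and look at the bytes. This is a small sketch under that assumption; DefaultEncodingDemo and SaveWithoutDeclaration are illustrative names, and a MemoryStream stands in for whatever stream you actually write to.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DefaultEncodingDemo
{
    // Saves a document with no xml declaration to a stream and returns the raw bytes.
    public static byte[] SaveWithoutDeclaration()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<foo>caf\u00e9</foo>"); // no declaration, one non-ASCII character
        using (var ms = new MemoryStream())
        {
            doc.Save(ms);
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        byte[] bytes = SaveWithoutDeclaration();

        // In UTF-8 the e-acute is the two-byte sequence C3 A9.
        Console.WriteLine(BitConverter.ToString(bytes));
        Console.WriteLine(Encoding.UTF8.GetString(bytes));
    }
}
```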
Some common encoding strings that you could also use in the xml declaration are:
- UTF-8
- UTF-16
- ISO-10646-UCS-2
- ISO-10646-UCS-4
- ISO-8859-1
- ISO-8859-2
- ISO-8859-3
- ISO-8859-4
- ISO-8859-5
- ISO-8859-6
- ISO-8859-7
- ISO-8859-8
- ISO-8859-9
- ISO-2022-JP
- Shift_JIS
- EUC-JP
Note: The encoding attribute is not case sensitive:
Unlike most XML attributes, encoding attribute values are not case-sensitive. This is because encoding character names follow ISO and Internet Assigned Numbers Authority (IANA) standards.
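The same case-insensitivity shows up in .NET's own encoding lookup. The sketch below only exercises Encoding.GetEncoding; I am assuming the XML classes resolve declaration names in a comparable way, and EncodingNameDemo and SameEncoding are names invented for this example.

```csharp
using System;
using System.Text;

public static class EncodingNameDemo
{
    // True when both names resolve to the same code page.
    public static bool SameEncoding(string a, string b)
    {
        return Encoding.GetEncoding(a).CodePage == Encoding.GetEncoding(b).CodePage;
    }

    public static void Main()
    {
        // Upper-case and lower-case spellings of the same encoding name.
        Console.WriteLine(SameEncoding("UTF-8", "utf-8"));
        Console.WriteLine(SameEncoding("ISO-8859-1", "iso-8859-1"));
    }
}
```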
If you loaded your XML from a string or a file, and it did not contain an xml declaration node, you can manually add one to the XmlDocument using:
// Create an XML declaration.
XmlDeclaration xmldecl;
xmldecl = doc.CreateXmlDeclaration("1.0", null, null);
xmldecl.Encoding="UTF-8";
// Add the new node to the document.
XmlElement root = doc.DocumentElement;
doc.InsertBefore(xmldecl, root);
If the XmlDocument does not have an xml declaration, or if the xml declaration does not have an encoding attribute, the document saved to a stream will not have one either.
Note: If the XmlDocument is saving to a TextWriter, then the encoding that will be used is taken from the TextWriter object. Additionally, the xml declaration node encoding attribute (if present) will be replaced with the encoding of the TextWriter as the contents are written to the TextWriter. (Reference)
The encoding on the TextWriter determines the encoding that is written out (The encoding of the XmlDeclaration node is replaced by the encoding of the TextWriter). If there was no encoding specified on the TextWriter, the XmlDocument is saved without an encoding attribute.
If saving to a string, the encoding used is determined by the xml declaration node's encoding attribute, if present.
In my specific example, I am writing back to an HTTP client through ASP.NET. I want to set Response.ContentEncoding to an appropriate value, and I need it to match what the XML itself will contain.
The appropriate way to do this is to save the xml to Response.Output, rather than Response.OutputStream. Response.Output is a TextWriter, whose Encoding value follows what you set for Response.ContentEncoding.
In other words:
context.Response.ContentEncoding = System.Text.Encoding.ASCII;
doc.Save(context.Response.Output);
results in xml:
<?xml version="1.0" encoding="us-ascii" ?>
<foo>Hello, world!</foo>
while:
context.Response.ContentEncoding = System.Text.Encoding.UTF8;
doc.Save(context.Response.Output);
results in xml:
<?xml version="1.0" encoding="utf-8" ?>
<foo>Hello, world!</foo>
and
context.Response.ContentEncoding = System.Text.Encoding.Unicode;
doc.Save(context.Response.Output);
results in xml:
<?xml version="1.0" encoding="utf-16" ?>
<foo>Hello, world!</foo>
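To try the three cases above without a web server, here is a rough sketch that swaps Response.Output for a StringWriter reporting a chosen encoding. EncodingStringWriter, DeclarationDemo and SaveWith are hypothetical helpers written for this example, not part of ASP.NET.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical stand-in for Response.Output: a StringWriter that reports
// whatever encoding it is given.
public sealed class EncodingStringWriter : StringWriter
{
    private readonly Encoding _encoding;

    public EncodingStringWriter(Encoding encoding)
    {
        _encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return _encoding; }
    }
}

public static class DeclarationDemo
{
    // Saves a document (declared as ISO-8859-1) through a writer with the given encoding.
    public static string SaveWith(Encoding encoding)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><foo>Hello, world!</foo>");
        using (var writer = new EncodingStringWriter(encoding))
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Print the saved XML for each encoding; the declaration should follow the writer.
        foreach (Encoding enc in new[] { Encoding.ASCII, Encoding.UTF8, Encoding.Unicode })
        {
            Console.WriteLine(SaveWith(enc));
        }
    }
}
```

The encoding name written into the declaration is the writer encoding's WebName, which is why Encoding.Unicode shows up as utf-16.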
The first two links from Google:
How to: Select an Encoding for ASP.NET Web Page Globalization:
http://msdn.microsoft.com/en-us/library/hy4kkhe0.aspx
globalization Element (ASP.NET Settings Schema):
http://msdn.microsoft.com/en-us/library/hy4kkhe0.aspx