I have the following code:
MemoryStream ms = new MemoryStream();
XmlWriter w = XmlWriter.Create(ms);
w.WriteStartDocument(true);
w.WriteStartElement("data");
w.WriteElementString("child", "myvalue");
w.WriteEndElement();//data
w.Close();
ms.Close();
string test = UTF8Encoding.UTF8.GetString(ms.ToArray());
The XML is generated correctly; however, the first character of the string 'test' is ï (char #239), which makes it invalid for some XML parsers. Where is this coming from? What exactly am I doing incorrectly?
I know I can resolve the issue by just starting after the first character, but I'd rather know why it's there than simply patching over the problem.
Thanks!
All of these are slightly off if you care about the byte order mark, which editors rely on (Visual Studio, for example, uses it to detect UTF-8 encoded XML and apply syntax highlighting properly).
Here's a solution:
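The original snippet is not reproduced here, so what follows is only a sketch of one way to do it, not necessarily what the answer contained. It assumes you want the BOM to stay in the stream's bytes (so editors can still detect the encoding if you save them to a file) but not in the string. It decodes through a StreamReader, which recognises the UTF-8 preamble and skips it. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomAwareRead
{
    // Hypothetical helper: writes the question's document, then reads it back.
    public static string BuildXml()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Default settings: UTF-8 output, with the BOM bytes EF BB BF written first.
            using (XmlWriter w = XmlWriter.Create(ms))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }

            // Rewind and decode through a StreamReader, which detects the
            // UTF-8 preamble and leaves it out of the returned text.
            ms.Position = 0;
            using (StreamReader reader = new StreamReader(ms, Encoding.UTF8, true))
            {
                return reader.ReadToEnd();
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuildXml());
    }
}
```

The bytes in the MemoryStream still start with the BOM; only the decoded string is free of it.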
I've got 2 snippets in full here
The problem is that the XML generated by the writer is UTF-16 while you use UTF-8 to convert it to a string. Try this instead:
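The snippet for this answer is missing, so this is a guess at the kind of thing that was meant rather than the original code. One way to avoid the byte-to-string conversion altogether is to let the XmlWriter write into a StringBuilder. There are no bytes involved, so no BOM can end up in the string. Note that the XML declaration will then say encoding="utf-16", because .NET strings are UTF-16. The class and method names are invented for the example.

```csharp
using System;
using System.Text;
using System.Xml;

public static class StringBuilderXml
{
    // Hypothetical helper: builds the question's document directly as text.
    public static string BuildXml()
    {
        StringBuilder sb = new StringBuilder();

        // Writing to a StringBuilder produces characters, not bytes,
        // so there is nothing to decode afterwards.
        using (XmlWriter w = XmlWriter.Create(sb))
        {
            w.WriteStartDocument(true);
            w.WriteStartElement("data");
            w.WriteElementString("child", "myvalue");
            w.WriteEndElement(); // data
        }

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildXml());
    }
}
```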
Found one solution here: http://www.timvw.be/generating-utf-8-with-systemxmlxmlwriter/
I was missing this at the top:
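The snippet that followed did not survive, so here is a minimal sketch of what that kind of fix usually looks like, assuming the missing piece was an XmlWriterSettings object whose encoding is a UTF8Encoding constructed with false (meaning "do not emit the UTF-8 identifier"). The class and method names are placeholders.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class NoBomUtf8
{
    // Hypothetical helper: same document as in the question, but without a BOM.
    public static string BuildXml()
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        // The constructor argument false tells the encoding not to emit a BOM.
        settings.Encoding = new UTF8Encoding(false);

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }

            // The writer has been flushed and disposed; decode the raw bytes.
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuildXml());
    }
}
```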
Thanks for the help everyone!
Check
You can change encodings like this:
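The code that went with this answer is missing. As a rough sketch of the idea, assuming it referred to XmlWriterSettings.Encoding, the helper below (a made-up name) takes the encoding as a parameter and decodes the bytes with that same encoding. The two encodings used in Main are just examples that do not emit a BOM.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class EncodingDemo
{
    // Hypothetical helper: writes the document using the given encoding
    // and decodes the resulting bytes with that same encoding.
    public static string WriteXml(Encoding encoding)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = encoding;

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter w = XmlWriter.Create(ms, settings))
            {
                w.WriteStartDocument(true);
                w.WriteStartElement("data");
                w.WriteElementString("child", "myvalue");
                w.WriteEndElement(); // data
            }

            return encoding.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        // ASCII has no preamble, so no BOM is written.
        Console.WriteLine(WriteXml(Encoding.ASCII));
        // UTF-16 little-endian, explicitly constructed without a BOM.
        Console.WriteLine(WriteXml(new UnicodeEncoding(false, false)));
    }
}
```

The encoding name in the XML declaration follows whatever you pass in, so the declaration and the actual bytes stay consistent.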