I'm generating an utf-8 XML file using XDocument
.
XDocument xml_document = new XDocument(
new XDeclaration("1.0", "utf-8", null),
new XElement(ROOT_NAME,
new XAttribute("note", note)
)
);
...
xml_document.Save(@file_path);
The file is generated correctly and validated with an xsd file with success.
When I try to upload the XML file to an online service, the service says that my file is wrong at line 1
; I have discovered that the problem is caused by the BOM on the first bytes of the file.
Do you know why the BOM is appended to the file and how can I save the file without it?
As stated in Byte order mark Wikipedia article:
While Unicode standard allows BOM in UTF-8 it does not require or recommend it. Byte order has no meaning in UTF-8 so a BOM only serves to identify a text stream or file as UTF-8 or that it was converted from another format that has a BOM
Is it an XDocument
problem or should I contact the guys of the online service provider to ask for a parser upgrade?
The most expedient way to get rid of the BOM character when using XDocument is to just save the document, then do a straight File read as a file, then write it back out. The File routines will strip the character out for you:
(it's hoky, but it works for the sake of expediency - at least you'll have a well-formed file to upload to your online provider) ;)
Use an
XmlTextWriter
and pass that to the XDocument's Save() method, that way you can have more control over the type of encoding used:The
UTF8Encoding
class constructor has an overload that specifies whether or not to use the BOM (Byte Order Mark) with a boolean value, in your casefalse
.The result of this code was verified using Notepad++ to inspect the file's encoding.
First of all: the service provider MUST handle it, according to XML spec, which states that BOM may be present in case of UTF-8 representation.
You can force to save your XML without BOM like this:
(Googled from here: http://social.msdn.microsoft.com/Forums/en/xmlandnetfx/thread/ccc08c65-01d7-43c6-adf3-1fc70fdb026a)