
XElement fails to load file with accented characters

Posted 2019-06-24 20:14

Question:

I have a rather curious problem. I'm using the XElement.Load method to load an HTML document (well-formed, checked with HTML Tidy). This works perfectly for English documents, but with French and Spanish documents I get an XML exception:

XML Exception
Invalid character in the given encoding. Line 23, position 43.

The method call

XElement doc = XElement.Load("example1.html", LoadOptions.None);

Snippet of the HTML document

<font face="Arial" size="3" color="#ffffff">
Le test <b> exemple français, qui devrait éventuellement être suivie d'un texte en langue espagnole. </b>
</font>

I realise my HTML does not declare an encoding at the start of the file. Is there a way around this?

Answer 1:

Because you're using XElement rather than XDocument, you can't set the character encoding. Use XDocument instead and set the encoding to UTF-8.

http://msdn.microsoft.com/en-us/library/bb387063.aspx
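
One possible way to control the decoding at load time, sketched here under assumptions: neither XElement.Load nor XDocument.Load takes an encoding argument for a file path, but both accept a TextReader, so the file can be opened through a StreamReader with an explicit Encoding. When a file has no encoding declaration and no byte order mark, the XML parser treats it as UTF-8, which would explain the exception if the file was actually saved in a single-byte encoding. The class name EncodedXmlLoader is hypothetical, and ISO-8859-1 is only a guess at how the file was saved; substitute whatever encoding the file really uses.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper, not part of LINQ to XML.
public static class EncodedXmlLoader
{
    // Decodes the file with the caller's encoding instead of letting the
    // XML parser fall back to UTF-8 when no declaration is present.
    public static XElement Load(string path, Encoding encoding)
    {
        // StreamReader turns bytes into characters using the given encoding;
        // XElement.Load(TextReader, LoadOptions) then parses those characters.
        using (var reader = new StreamReader(path, encoding))
        {
            return XElement.Load(reader, LoadOptions.None);
        }
    }
}
```

Usage, assuming the file is ISO-8859-1 (Latin-1):

```csharp
XElement doc = EncodedXmlLoader.Load("example1.html", Encoding.GetEncoding("ISO-8859-1"));
```

If the file can be edited, re-saving it as UTF-8 or adding an XML declaration that names its real encoding should also let the original XElement.Load call work unchanged.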



Tags: .net xelement