XElement.Load Error reading ampersand symbols and

Posted 2020-04-20 18:39

I'm having problems reading the ampersand symbol from an XML file:

XElement xmlElements = XElement.Load(Path_Xml_Data_File);

I get an error when I have:

<Name>Patrick & Phill</Name>

Error: A System.Xml.XmlException was thrown: "Name cannot begin with the ' ' character, hexadecimal value 0x20."

Or with special Portuguese characters:

<Extra>Direc&ccedil;&atilde;o Assistida</Extra> (= <Extra>Direcção Assistida</Extra>)

Error: Reference to undeclared entity 'ccedil'

Any idea how to solve this problem?

1 answer
欢心
#2 · 2020-04-20 19:42

I'm afraid that you're dealing with malformed XML.

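To represent the ampersand, the data that you're loading should use the "&amp;" entity. A bare "&" tells the parser that an entity reference is starting, so it expects an entity name next; finding a space instead is what produces the "Name cannot begin with the ' ' character" error.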
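As a quick illustration of the point above, here is a minimal sketch showing that the escaped form loads correctly and that LINQ to XML escapes the ampersand by itself when it writes. The class and method names (AmpersandExample, ReadName, WriteName) are made up for this example.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
public static class AmpersandExample
{
    // Parsing well-formed input: "&amp;" is decoded back to "&" in the element value.
    public static string ReadName()
    {
        XElement name = XElement.Parse("<Name>Patrick &amp; Phill</Name>");
        return name.Value;
    }

    // Writing: XElement escapes the ampersand when it serializes the text.
    public static string WriteName()
    {
        return new XElement("Name", "Patrick & Phill").ToString();
    }
}
```

So if you control the program that produces the file, building it with XElement (or any XML writer) rather than string concatenation should avoid the problem at the source.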

The &ccedil; (ç) and &atilde; (ã) named entities are not part of the XML standard, which predefines only &amp;, &lt;, &gt;, &quot; and &apos;. They are more commonly found in HTML, although they can be declared for XML by means of a DTD.
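If you are able to add a DOCTYPE to the data, the DTD route could look roughly like the sketch below. It assumes an internal DTD subset that declares the two entities, and a reader configured to process DTDs; the class and method names (DtdEntityExample, ReadExtra) and the 1024-character limit are arbitrary choices for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper class, for illustration only.
public static class DtdEntityExample
{
    public static string ReadExtra()
    {
        // The internal DTD subset declares the two HTML entity names
        // as numeric character references (231 = ç, 227 = ã).
        string xml =
            "<!DOCTYPE Extra [\n" +
            "  <!ENTITY ccedil \"&#231;\">\n" +
            "  <!ENTITY atilde \"&#227;\">\n" +
            "]>\n" +
            "<Extra>Direc&ccedil;&atilde;o Assistida</Extra>";

        var settings = new XmlReaderSettings
        {
            // DTD processing is prohibited by default; enable it explicitly.
            DtdProcessing = DtdProcessing.Parse,
            // Cap entity expansion, since DTDs in untrusted input can be abused.
            MaxCharactersFromEntities = 1024
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            XElement extra = XElement.Load(reader);
            return extra.Value;
        }
    }
}
```

This only helps when the files really do carry such a DOCTYPE (or you prepend one); it does nothing for the bare ampersand, which remains malformed either way.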

You could use HtmlTidy to tidy up the data first, or you could write something to convert the bare ampersands into entities on the incoming files.

For example:

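// Requires: using System.Text.RegularExpressions;
public string CleanUpData(string data)
{
    // Escape an ampersand that is followed by whitespace (e.g. "Patrick & Phill").
    // The lookahead leaves the whitespace character itself untouched.
    var r = new Regex(@"&(?=\s)");
    string output = r.Replace(data, "&amp;");
    // Replace the HTML named entities with the literal characters.
    output = output.Replace("&ccedil;", "ç");
    output = output.Replace("&atilde;", "ã");
    return output;
}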
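The function above only covers an ampersand followed by whitespace and the two entities from the question. A slightly more general variant, offered as a sketch rather than a complete solution, could escape any ampersand that does not start an entity reference and decode the remaining HTML named entities with System.Net.WebUtility.HtmlDecode. The class name XmlCleaner and method name Clean are made up for this example.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class XmlCleaner
{
    // An ampersand that is NOT the start of a named or numeric reference.
    private static readonly Regex BareAmpersand =
        new Regex(@"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)");

    // Any named reference such as &ccedil; or &amp;.
    private static readonly Regex NamedEntity =
        new Regex(@"&([A-Za-z][A-Za-z0-9]*);");

    public static string Clean(string data)
    {
        string output = BareAmpersand.Replace(data, "&amp;");

        return NamedEntity.Replace(output, m =>
        {
            string name = m.Groups[1].Value;

            // The five entities predefined by XML must stay escaped.
            if (name == "amp" || name == "lt" || name == "gt" ||
                name == "quot" || name == "apos")
            {
                return m.Value;
            }

            // HtmlDecode returns the input unchanged for names it does not know.
            string decoded = WebUtility.HtmlDecode(m.Value);

            // Unknown names are escaped so that the XML parser does not reject them.
            return decoded == m.Value ? "&amp;" + name + ";" : decoded;
        });
    }
}
```

It would then be used along the lines of XElement.Parse(XmlCleaner.Clean(File.ReadAllText(Path_Xml_Data_File))). Note that text-level patching like this does not understand CDATA sections or comments, so it is best treated as a stopgap until the source of the files can be fixed.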