
How to stop XMLReader throwing Invalid XML Charact

Posted 2019-04-08 17:33

Question:

So I have some XML:

<key>my tag</key><value>my tag value &#xB;and my invalid Character</value>

and an XMLReader:

using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
{
     while (reader.Read())
     {
         //do my thing
     }
}

I have implemented the CleanInvalidCharacters method from here, but because the invalid character is still written as the character reference "&#xB;" rather than as a raw character, it doesn't get removed.

The error is thrown at the reader.Read() line, with the exception:

hexadecimal value 0x0B, is an invalid character.

Answer 1:

The problem is that you don't have XML: you have a string that looks like XML but doesn't qualify, because XML 1.0 does not allow the character 0x0B, even when it is written as a character reference such as &#xB;. Fortunately, you can tell XmlReader to be more lenient:

using (XmlReader reader = XmlReader.Create(new StringReader(xml), new XmlReaderSettings { CheckCharacters = false }))
{
     while (reader.Read())
     {
         //do my thing
     }
}

Note that the invalid characters are still present in the data you read, and serializing them back out may cause problems further down the line, so you may want to filter them out anyway as you read.
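If you do want to filter while reading, something along these lines might work. This is only a rough sketch: the names LenientXml, StripInvalidXmlChars and ReadCleanText are made up for illustration, it only looks at text nodes (not attributes or CDATA), and it assumes the input has a single root element so that it parses as a document.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
public static class LenientXml
{
    // Drops every char that is not a legal XML 1.0 character,
    // while keeping valid surrogate pairs intact.
    public static string StripInvalidXmlChars(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (XmlConvert.IsXmlChar(c))
            {
                sb.Append(c);
            }
            else if (i + 1 < text.Length
                     && XmlConvert.IsXmlSurrogatePair(text[i + 1], c))
            {
                // c is the high surrogate, text[i + 1] the low one.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            // Anything else (such as 0x0B) is silently skipped.
        }
        return sb.ToString();
    }

    // Reads the XML leniently and returns the cleaned value of each text node.
    public static List<string> ReadCleanText(string xml)
    {
        // CheckCharacters = false lets references like &#xB; through.
        var settings = new XmlReaderSettings { CheckCharacters = false };
        var result = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text)
                {
                    result.Add(StripInvalidXmlChars(reader.Value));
                }
            }
        }
        return result;
    }
}
```

For example, calling LenientXml.ReadCleanText on the sample elements wrapped in an assumed <item> root should give back the two text values with the 0x0B character removed from the second one. Whether silently dropping the character is the right policy (as opposed to replacing it with a space, say) depends on what consumes the data afterwards.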