Consider the following code:
XmlDocument xdoc = new XmlDocument();
String xml = @"<!DOCTYPE lolz [" +
    "<!ENTITY lol \"lol\">" +
    "<!ENTITY lol2 \"&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;\">" +
    "<!ENTITY lol3 \"&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;\">" +
    "<!ENTITY lol4 \"&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;\">" +
    "<!ENTITY lol5 \"&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;\">" +
    "<!ENTITY lol6 \"&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;\">" +
    "<!ENTITY lol7 \"&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;\">" +
    "<!ENTITY lol8 \"&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;\">" +
    "<!ENTITY lol9 \"&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;\">" +
    "]>" +
    "<lolz>&lol9;</lolz>";
xdoc.LoadXml(xml);
.NET 4.0: this code throws an exception with the message "The input document has exceeded a limit set by MaxCharactersFromEntities."
.NET 2.0/3.5: this code does not throw any exception; the expanded XML keeps growing until the memory limit is reached.
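For a sense of scale, here is a rough count of what &lol9; expands to. This is only a back-of-the-envelope sketch: it assumes every level is fully expanded and counts only the final text. The notation |lol_k| for the expanded length of entity lolk (with lol_1 standing for lol) is introduced here and is not part of the code above.

```latex
\begin{align*}
|\mathrm{lol}_1| &= 3 \\
|\mathrm{lol}_k| &= 10 \cdot |\mathrm{lol}_{k-1}| \quad (k = 2, \dots, 9) \\
|\mathrm{lol}_9| &= 3 \cdot 10^{8} = 300{,}000{,}000 \text{ characters}
\end{align*}
```

Since .NET strings are UTF-16, that is about 600 MB for the text alone, before any DOM overhead, which fits the observation that memory keeps growing on 2.0/3.5.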
Can somebody explain the reason for this difference?
Research done so far
I disassembled System.Xml v2.0 and v4.0, and the only change I saw was in the method RegisterConsumedCharacters.
v2.0 definition
private void RegisterConsumedCharacters(long characters, bool inEntityReference)
{
    if (this.maxCharactersInDocument > 0L)
    {
        long num = this.charactersInDocument + characters;
        if (num < this.charactersInDocument)
        {
            this.ThrowWithoutLineInfo("XmlSerializeErrorDetails", new string[] { "MaxCharactersInDocument", "" });
        }
        else
        {
            this.charactersInDocument = num;
        }
        if (this.charactersInDocument > this.maxCharactersInDocument)
        {
            this.ThrowWithoutLineInfo("XmlSerializeErrorDetails", new string[] { "MaxCharactersInDocument", "" });
        }
    }
    if ((this.maxCharactersFromEntities > 0L) && inEntityReference)
    {
        long num2 = this.charactersFromEntities + characters;
        if (num2 < this.charactersFromEntities)
        {
            this.ThrowWithoutLineInfo("XmlSerializeErrorDetails", new string[] { "MaxCharactersFromEntities", "" });
        }
        else
        {
            this.charactersFromEntities = num2;
        }
        if ((this.charactersFromEntities > this.maxCharactersFromEntities) && XmlTextReaderSection.LimitCharactersFromEntities)
        {
            this.ThrowWithoutLineInfo("XmlSerializeErrorDetails", new string[] { "MaxCharactersFromEntities", "" });
        }
    }
}
v4.0 definition
private void RegisterConsumedCharacters(long characters, bool inEntityReference)
{
    if (this.maxCharactersInDocument > 0L)
    {
        long num = this.charactersInDocument + characters;
        if (num < this.charactersInDocument)
        {
            this.ThrowWithoutLineInfo("Xml_LimitExceeded", "MaxCharactersInDocument");
        }
        else
        {
            this.charactersInDocument = num;
        }
        if (this.charactersInDocument > this.maxCharactersInDocument)
        {
            this.ThrowWithoutLineInfo("Xml_LimitExceeded", "MaxCharactersInDocument");
        }
    }
    if ((this.maxCharactersFromEntities > 0L) && inEntityReference)
    {
        long num2 = this.charactersFromEntities + characters;
        if (num2 < this.charactersFromEntities)
        {
            this.ThrowWithoutLineInfo("Xml_LimitExceeded", "MaxCharactersFromEntities");
        }
        else
        {
            this.charactersFromEntities = num2;
        }
        if (this.charactersFromEntities > this.maxCharactersFromEntities)
        {
            this.ThrowWithoutLineInfo("Xml_LimitExceeded", "MaxCharactersFromEntities");
        }
    }
}
The only differences I see are the changed parameters of ThrowWithoutLineInfo and the removal of XmlTextReaderSection.LimitCharactersFromEntities from the final condition in v4.0, but I can't make much of that and have hit a block here.
The default value for XmlReaderSettings.MaxCharactersFromEntities is 0, which means "no limit", as the MSDN documentation says. But there is a nasty trick the documentation does not point out: in .NET 4, if you don't pass an XmlReaderSettings to your XmlTextReader, the limit is set not to 0 but to 10,000,000. The relevant source code is here, complete with a comment pointing out that this is a breaking change: https://referencesource.microsoft.com/#System.Xml/System/Xml/Core/XmlTextReaderImpl.cs,385
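To avoid depending on that implicit default, one option is to pass an explicit XmlReaderSettings so the caller chooses the limit. The following is only a minimal sketch under stated assumptions: the class EntityLimitDemo and the helper LoadWithEntityLimit are hypothetical names made up for illustration, and DtdProcessing is the .NET 4.0 property (on 2.0/3.5 the corresponding switch is ProhibitDtd = false).

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of System.Xml.
static class EntityLimitDemo
{
    // Loads xml into an XmlDocument through a reader whose entity-expansion
    // limit is set explicitly instead of relying on the framework default.
    public static XmlDocument LoadWithEntityLimit(string xml, long maxCharsFromEntities)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        // The DTD must be parsed for the entity declarations to be seen at all.
        settings.DtdProcessing = DtdProcessing.Parse;
        // 0 means "no limit"; any positive value caps the characters
        // produced by expanding entities.
        settings.MaxCharactersFromEntities = maxCharsFromEntities;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(reader); // expected to throw XmlException once the limit is exceeded
            return doc;
        }
    }
}
```

Calling EntityLimitDemo.LoadWithEntityLimit(xml, 1024) on the document from the question should then fail fast with an XmlException on either runtime, whereas passing 0 removes the limit and brings back the unbounded growth.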
So, you have found the culprit. In v2.0, XmlTextReaderSection.LimitCharactersFromEntities returns false, so the exception is not thrown. In v4.0, XmlTextReaderSection.LimitCharactersFromEntities is omitted from the condition check, so the exception is thrown.
The question is what XmlTextReaderSection.LimitCharactersFromEntities does and why it returns false. When we disassemble it, we see the following: the internal static bool member calls the private bool one, and the private bool one calls the internal string one. What the internal string member does is try to get the dictionary entry "limitCharactersFromEntities" from the ConfigurationSection parent class. It probably returns null, so the value cannot be converted to bool (check the XmlConvert.TryToBoolean source), and in the end XmlTextReaderSection.LimitCharactersFromEntities returns false.
I checked further into the source code, but I could not find how or where the LimitCharactersFromEntities configuration property is initialized in v2.0 (I couldn't check the whole of .NET, so what I did was CTRL + F to search for that string). However, I've seen that LimitCharactersFromEntities is omitted entirely in v4.0 and does not exist as one of the configuration property names in XmlConfiguration.cs.
So I guess we can conclude that, in v2.0, that's a bug.
Note: I have no real knowledge of XmlDocument or the System.Xml namespace. I've just tried to make some educated guesses from reading the source code.