Occasionally I like to spend some time looking at the .NET code just to see how things are implemented behind the scenes. I stumbled upon this gem while looking at the `String.Equals` method via Reflector.
C#
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
    public override bool Equals(object obj)
    {
        string strB = obj as string;
        if ((strB == null) && (this != null))
        {
            return false;
        }
        return EqualsHelper(this, strB);
    }
IL
    .method public hidebysig virtual instance bool Equals(object obj) cil managed
    {
        .custom instance void System.Runtime.ConstrainedExecution.ReliabilityContractAttribute::.ctor(valuetype System.Runtime.ConstrainedExecution.Consistency, valuetype System.Runtime.ConstrainedExecution.Cer) = { int32(3) int32(1) }
        .maxstack 2
        .locals init (
            [0] string str)
        L_0000: ldarg.1
        L_0001: isinst string
        L_0006: stloc.0
        L_0007: ldloc.0
        L_0008: brtrue.s L_000f
        L_000a: ldarg.0
        L_000b: brfalse.s L_000f
        L_000d: ldc.i4.0
        L_000e: ret
        L_000f: ldarg.0
        L_0010: ldloc.0
        L_0011: call bool System.String::EqualsHelper(string, string)
        L_0016: ret
    }
What is the reasoning for checking `this` against `null`? I have to assume there is a purpose, otherwise this probably would have been caught and removed by now.
If the argument (`obj`) does not cast to a string, then `strB` will be null and the result should be false. Example:
writes `false`.
Remember that the `string.Equals()` method is called for any argument type, not only for other strings.
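A minimal sketch of that behaviour. The values and the class name here are made up for illustration and are not taken from the original answer:

```csharp
using System;

class NonStringArgumentDemo
{
    static void Main()
    {
        object notAString = 42;   // hypothetical value; any non-string object behaves the same
        object nothing = null;

        // Both arguments make "obj as string" yield null inside String.Equals(object).
        Console.WriteLine("abc".Equals(notAString));    // False
        Console.WriteLine("abc".Equals(nothing));       // False

        // A boxed string with the same contents still compares equal.
        Console.WriteLine("abc".Equals((object)"abc")); // True
    }
}
```

Declaring the arguments as `object` keeps overload resolution on `Equals(object)`, which is the method under discussion.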
Let's see... `this` is the first string you're comparing. `obj` is the second object. So it looks like it's an optimization of sorts. It's first casting `obj` to a string type. And if that fails, then `strB` is null. And if `strB` is null while `this` isn't, then they're definitely not equal and the `EqualsHelper` function can be skipped.
That will save a function call. Beyond that, perhaps a better understanding of the `EqualsHelper` function might shed some light on why this optimization is needed.
EDIT: Ah, so the `EqualsHelper` function is accepting a `(string, string)` as parameters. If `strB` is null, then that essentially means that it was either a null object to begin with, or it couldn't successfully be cast into a string. If the reason for `strB` being null is that the object was a different type that couldn't be converted to a string, then you wouldn't want to call `EqualsHelper` with essentially two null values (that'll return true). The `Equals` function should return false in this case. So this if statement is more than an optimization, it actually ensures proper functionality as well.
The reason why is that it is indeed possible for `this` to be `null`. There are 2 IL op codes which can be used to invoke a function: `call` and `callvirt`. The `callvirt` instruction causes the CLR to perform a null check when invoking the method. The `call` instruction does not, and hence allows a method to be entered with `this` being `null`.
Sound scary? Indeed it is a bit. However, most compilers ensure this doesn't ever happen. The `call` instruction is only ever emitted when `null` is not a possibility (I'm pretty sure that C# always uses `callvirt`).
This isn't true for all languages though, and for reasons I don't exactly know the BCL team chose to further harden the `System.String` class in this instance.
Another case where this can pop up is in reverse P/Invoke calls.
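The difference between the two op codes can also be poked at from C# without hand-editing IL, by emitting the instruction through `System.Reflection.Emit.DynamicMethod`. This is only a sketch under assumptions: `Probe`, `ThisIsNull` and `Build` are hypothetical names, the runtime is assumed to support dynamic methods, and what the `call` case prints may depend on the runtime and JIT, so no output is claimed for it.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical type used only for this sketch.
public class Probe
{
    // Non-virtual instance method that never dereferences 'this'.
    public bool ThisIsNull()
    {
        return this == null;
    }
}

public static class CallVersusCallvirt
{
    // Builds a delegate that pushes a null receiver and invokes
    // Probe.ThisIsNull with the given opcode (Call or Callvirt).
    static Func<bool> Build(OpCode invoke)
    {
        MethodInfo target = typeof(Probe).GetMethod("ThisIsNull");
        var method = new DynamicMethod(
            "InvokeOnNull", typeof(bool), Type.EmptyTypes, typeof(Probe).Module, true);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldnull);   // null 'this'
        il.Emit(invoke, target);   // call or callvirt
        il.Emit(OpCodes.Ret);
        return (Func<bool>)method.CreateDelegate(typeof(Func<bool>));
    }

    public static void Main()
    {
        // 'call' does no null check, so the method body is expected to be entered.
        Console.WriteLine(Build(OpCodes.Call)());

        // 'callvirt' checks the receiver first and is expected to throw.
        try
        {
            Console.WriteLine(Build(OpCodes.Callvirt)());
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException from callvirt");
        }
    }
}
```

The same `Build` helper could be pointed at `String.Equals(object)` instead, but the result there depends on which framework version's implementation is running.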
I assume you were looking at the .NET 3.5 implementation? I believe the .NET 4 implementation is slightly different.
However, I have a sneaking suspicion that this is because it's possible to call even virtual instance methods non-virtually on a null reference. Possible in IL, that is. I'll see if I can produce some IL which would call `null.Equals(null)`.
EDIT: Okay, here's some interesting code:
I got this by compiling the following C# code:
... and then disassembling with `ildasm` and editing. Note this line:
Originally, that was `callvirt` instead of `call`.
So, what happens when we reassemble it? Well, with .NET 4.0 we get this:
Hmm. What about with .NET 2.0?
Now that's more interesting... we've clearly managed to get into `EqualsHelper`, which we wouldn't have normally expected.
Enough of string... let's try to implement reference equality ourselves, and see whether we can get `null.Equals(null)` to return true:
Same procedure as before - disassemble, change `callvirt` to `call`, reassemble, and watch it print `true`...
Note that although another answer references this C++ question, we're being even more devious here... because we're calling a virtual method non-virtually. Normally even the C++/CLI compiler will use `callvirt` for a virtual method. In other words, I think in this particular case, the only way for `this` to be null is to write the IL by hand.
EDIT: I've just noticed something... I wasn't actually calling the right method in either of our little sample programs. Here's the call in the first case:
here's the call in the second:
In the first case, I meant to call `System.String::Equals(object)`, and in the second, I meant to call `Test::Equals(object)`. From this we can see three things:
- `object.Equals(object)` is happy to compare a null "this" reference
If you add a bit of console output to the C# override, you can see the difference - it won't be called unless you change the IL to call it explicitly, like this:
So, there we are. Fun and abuse of instance methods on null references.
If you've made it this far, you might also like to look at my blog post about how value types can declare parameterless constructors... in IL.
The source code has this comment:
The short answer is that languages like C# force you to create an instance of this class before calling the method, but the Framework itself does not. There are two¹ different ways in CIL to call a function: `call` and `callvirt`.... Generally speaking, C# will always emit `callvirt`, which requires `this` to not be null. But other languages (C++/CLI comes to mind) could emit `call`, which doesn't have that expectation.
(¹okay, it's more like five if you count calli, newobj etc, but let's keep it simple)
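To see the C# side of that, here is a small sketch with a hypothetical `Greeter` class (not from the original answers). `Hello()` is non-virtual and never touches `this`, so nothing inside the method itself would fail on a null receiver, yet the call from C# should still end up in the catch branch because of the null check at the call site.

```csharp
using System;

// Hypothetical type used only for this sketch.
class Greeter
{
    // Non-virtual, and never touches 'this'.
    public string Hello()
    {
        return "hello";
    }
}

static class CompilerGuardDemo
{
    // Returns null through a method so the receiver is not a literal null.
    static Greeter NoGreeter()
    {
        return null;
    }

    static void Main()
    {
        Greeter g = NoGreeter();
        try
        {
            // The null check happens at the call site, before Hello() is entered.
            Console.WriteLine(g.Hello());
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException before Hello() ran");
        }
    }
}
```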