Extract from CLR via C# on Boxing / Unboxing value types ...
On Boxing: If the nullable instance is not null, the CLR takes the value out of the nullable instance and boxes it. In other words, a Nullable<Int32> with a value of 5 is boxed into a boxed-Int32 with a value of 5.
On Unboxing: Unboxing is simply the act of obtaining a reference to the unboxed portion of a boxed object. The problem is that a boxed value type cannot be simply unboxed into a nullable version of that value type because the boxed value doesn't have the boolean hasValue field in it. So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing).
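The boxing rule quoted above can be checked directly. The following is a minimal sketch, not taken from the book; the class name BoxingDemo and the helper DescribeBoxed are hypothetical names introduced only for illustration.

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical helper: boxes an int? and reports what actually ends up on the heap.
    // Returns "null" when boxing produced a null reference.
    public static string DescribeBoxed(int? value)
    {
        object boxed = value; // boxing conversion from Nullable<int>
        return boxed == null ? "null" : boxed.GetType().FullName;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeBoxed(5));    // System.Int32
        Console.WriteLine(DescribeBoxed(null)); // null
    }
}
```

A non-empty int? comes out as a plain boxed Int32, and an empty one comes out as a null reference; no boxed Nullable<Int32> is ever observed.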
Why did the CLR team go through so much trouble for Nullable types? Why was it not simply boxed into a Nullable<Int32> in the first place?
I remember this behavior was kind of a last-minute change. In early betas of .NET 2.0, Nullable<T> was a "normal" value type. Boxing a null-valued int? turned it into a boxed int? with a boolean flag. I think the reason they decided to choose the current approach is consistency. Say:
int? test = null;
object obj = test;
if (test != null)
    Console.WriteLine("test is not null");
if (obj != null)
    Console.WriteLine("obj is not null");
In the former approach (box null -> boxed Nullable<T>), you wouldn't get "test is not null" but you would get "obj is not null", which is weird.
Additionally, if they had boxed a nullable value to a boxed-Nullable<T>:
int? val = 42;
object obj = val;
if (obj != null) {
    // Our object is not null, so intuitively it's an `int` value:
    int x = (int)obj; // ...but this would have failed.
}
Besides that, I believe the current behavior makes perfect sense for scenarios like nullable database values (think SQL-CLR...).
Clarification:
The whole point of providing nullable types is to make it easy to deal with variables that have no meaningful value. They didn't want to provide two distinct, unrelated types. An int? should behave more or less like a simple int. That's why C# provides lifted operators.
"So, when unboxing a value type into a nullable version, the CLR must allocate a Nullable<T> object, initialize the hasValue field to true, and set the value field to the same value that is in the boxed value type. This impacts your application performance (memory allocation during unboxing)."
This is not true. The CLR has to allocate memory on the stack to hold the variable whether or not it's nullable. Allocating space for an extra boolean field is not a performance issue.
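To make the unboxing side concrete, here is a small sketch of what unboxing into a nullable yields. Nullable<T> is a struct, so the result is an ordinary value held in a local or return slot rather than a new heap object. The class name UnboxingDemo and the helper UnboxToNullable are hypothetical names used only for this illustration.

```csharp
using System;

public static class UnboxingDemo
{
    // Hypothetical helper: unboxes an object reference into an int?.
    // A boxed Int32 becomes an int? with HasValue == true;
    // a null reference becomes an empty int?.
    public static int? UnboxToNullable(object boxed)
    {
        return (int?)boxed;
    }

    public static void Main()
    {
        object boxedFive = 5;
        int? a = UnboxToNullable(boxedFive);
        int? b = UnboxToNullable(null);

        Console.WriteLine(a.HasValue); // True
        Console.WriteLine(a.Value);    // 5
        Console.WriteLine(b.HasValue); // False
    }
}
```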
I think it makes sense to box a null value to a null reference. Having a boxed value saying "I know I would be an Int32 if I had a value, but I don't" seems unintuitive to me. Better to go from the value type version of "not a value" (a value with HasValue as false) to the reference type version of "not a value" (a null reference).
I believe this change was made on the feedback of the community, btw.
This also allows an interesting use of as even for value types:
object mightBeADouble = GetMyValue();
double? unboxed = mightBeADouble as double?;
if (unboxed != null)
{
    ...
}
This is more consistent with the way "uncertain conversions" are handled with reference types, than the previous:
object mightBeADouble = GetMyValue();
if (mightBeADouble is double)
{
    double unboxed = (double) mightBeADouble;
    ...
}
(It may also perform better, as there's only a single execution time type check.)
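A self-contained version of the as pattern might look like the sketch below. The class name AsDemo and the helper TryGetDouble are hypothetical and stand in for the GetMyValue call above, whose implementation is not shown.

```csharp
using System;

public static class AsDemo
{
    // Hypothetical helper: yields the double when the object is a boxed double,
    // and an empty double? for anything else (including a null reference).
    public static double? TryGetDouble(object candidate)
    {
        return candidate as double?;
    }

    public static void Main()
    {
        Console.WriteLine(TryGetDouble(44.4).HasValue);   // True
        Console.WriteLine(TryGetDouble("text").HasValue); // False
        Console.WriteLine(TryGetDouble(null).HasValue);   // False
        Console.WriteLine(TryGetDouble(1).HasValue);      // False: a boxed int is not a boxed double
    }
}
```

Note that as performs no numeric conversion: a boxed int does not become a double, it simply fails the type check and produces an empty double?.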
A thing that you gain via this behavior is that the boxed version implements all interfaces supported by the underlying type. (The goal is to make Nullable<int> appear the same as int for all practical purposes.) Boxing to a boxed-Nullable<int> instead of a boxed-int would prevent this behavior.
From the MSDN page:
double? d = 44.4;
object iBoxed = d;
// Access IConvertible interface implemented by double.
IConvertible ic = (IConvertible)iBoxed;
int i = ic.ToInt32(null);
string str = ic.ToString();
Also, getting the int from a boxed version of a Nullable<int> is straightforward. Usually you can't unbox to a type other than the original source type.
float f = 1.5f;
object boxed_float = f;
int int_value = (int) boxed_float; // throws InvalidCastException: a boxed float cannot be unboxed to an int, you *must* unbox to a float first.
float? nullableFloat = 1.4f;
boxed_float = nullableFloat;
float fValue = (float) boxed_float; // can unbox a float? to a float
Console.WriteLine(fValue);
Here you do not have to know whether the original version was the plain value type or the Nullable version of it. (Plus you get some perf too: you save the space of storing the hasValue boolean in the boxed object.)
I guess that is basically what it does. The description given includes your suggestion (i.e. boxing into a Nullable<T>). The extra is that it sets the hasValue field after boxing.
I would posit that the reason for the behavior stems from the behavior of Object.Equals, most notably the fact that if the first object is null and the second object is not, Object.Equals returns false rather than calling the Equals method on the second object.
If Object.Equals had called the Equals method on the second object in the case where the first object was null but the second was not, then an object which was a null-valued Nullable<T> could have returned true when compared to null. Personally, I think the proper remedy would have been to make the HasValue property of a Nullable<T> have nothing to do with the concept of a null reference. With regard to the overhead involved with storing a boolean flag on the heap, one could have provided that for every type Nullable<T> there would be a static boxed empty version, and then provided that unboxing the static boxed empty copy would yield an empty Nullable<T>, and unboxing any other instance would yield a populated one.
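For reference, this is how the static Object.Equals behaves under the boxing rules that actually shipped. It is only a sketch of the current behavior, not of the alternative design proposed above; the class name EqualsDemo and the helper BoxedEqualsNull are hypothetical.

```csharp
using System;

public static class EqualsDemo
{
    // Hypothetical helper: boxes an int? and compares the result to a null
    // reference through the static Object.Equals(object, object).
    public static bool BoxedEqualsNull(int? value)
    {
        object boxed = value; // an empty int? boxes to a null reference
        return Object.Equals(boxed, null);
    }

    public static void Main()
    {
        int? empty = null;

        Console.WriteLine(BoxedEqualsNull(empty)); // True: both arguments are null references
        Console.WriteLine(BoxedEqualsNull(5));     // False: a boxed Int32 is not null
        Console.WriteLine(empty.Equals(null));     // True: Nullable<T>.Equals treats an empty value as equal to null
    }
}
```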