So I was just testing the CLR Profiler from Microsoft, and I wrote a little program that created a List with 1,000,000 doubles in it. I checked the heap, and it turns out the List<> size was around 124KB (I don't remember exactly, but it was around that). This really rocked my world: how could it be 124KB if it had 1 million doubles in it? Anyway, after that I decided to check a double[1000000]. And to my surprise (well, not really, since this is what I expected with the List<> =P), the array size is 7.6MB. HUGE difference!!
How come they're different? How does the List<> manage its items so that it's so (incredibly) memory efficient? I mean, it's not like the other 7.5 MB were somewhere else, because the size of the application was only around 3 or 4 KB bigger after I created the 1 million doubles.
List<T> uses an array to store values/references, so I doubt there will be any difference in size apart from what little overhead List<T> adds.
Given the code below
var size = 1000000;
var numbers = new List<double>(size);
for (int i = 0; i < size; i++) {
    numbers.Add(0d);
}
the heap looks like this for the relevant object
0:000> !dumpheap -type Generic.List
Address MT Size
01eb29a4 662ed948 24
total 1 objects
Statistics:
MT Count TotalSize Class Name
662ed948 1 24 System.Collections.Generic.List`1[[System.Double, mscorlib]]
Total 1 objects
0:000> !objsize 01eb29a4 <=== Get the size of List<Double>
sizeof(01eb29a4) = 8000036 ( 0x7a1224) bytes (System.Collections.Generic.List`1[[System.Double, mscorlib]])
0:000> !do 01eb29a4
Name: System.Collections.Generic.List`1[[System.Double, mscorlib]]
MethodTable: 662ed948
EEClass: 65ad84f8
Size: 24(0x18) bytes
(C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
65cd1d28 40009d8 4 System.Double[] 0 instance 02eb3250 _items <=== The array holding the data
65ccaaf0 40009d9 c System.Int32 1 instance 1000000 _size
65ccaaf0 40009da 10 System.Int32 1 instance 1000000 _version
65cc84c0 40009db 8 System.Object 0 instance 00000000 _syncRoot
65cd1d28 40009dc 0 System.Double[] 0 shared static _emptyArray
>> Domain:Value dynamic statics NYI
00505438:NotInit <<
0:000> !objsize 02eb3250 <=== Get the size of the array holding the data
sizeof(02eb3250) = 8000012 ( 0x7a120c) bytes (System.Double[])
So the List<double> is 8,000,036 bytes in total, and the underlying array is 8,000,012 bytes. This fits well with the usual 12 bytes of overhead for an array (a reference type) in this 32-bit process plus 1,000,000 times 8 bytes for the doubles. On top of that, List<T> adds another 24 bytes of overhead for the fields shown above.
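To make the arithmetic explicit, here is a small sketch that recomputes the two totals. The constants are simply the numbers read off the 32-bit dump above (12 bytes of array overhead, 24 bytes for the List object itself); the class and method names are made up for illustration, and the constants would differ on a 64-bit runtime.

```csharp
using System;

static class SizeCheck
{
    // Values taken from the 32-bit dump above; not valid for 64-bit runtimes.
    const long ArrayOverhead = 12;   // per-array overhead reported in the answer
    const long ListObjectSize = 24;  // "Size" shown by !do for the List itself
    const long Count = 1000000;      // number of doubles stored

    // Overhead plus 8 bytes per double.
    public static long ArrayBytes()
    {
        return ArrayOverhead + Count * sizeof(double);
    }

    // The List object plus the array it references.
    public static long ListBytes()
    {
        return ListObjectSize + ArrayBytes();
    }

    static void Main()
    {
        Console.WriteLine(ArrayBytes()); // 8000012, matches !objsize on _items
        Console.WriteLine(ListBytes());  // 8000036, matches !objsize on the List
    }
}
```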
Conclusion: I don't see any evidence that List<double> will take up less space than double[] for the same number of elements.
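If you want to check this without a debugger, a rough sketch along these lines should do. It compares GC.GetTotalMemory before and after each allocation; the helper names (Measure, MakeList) are my own, and the figures it prints are only approximate since the GC reports estimates and the runtime may allocate other things in between. I'd expect both numbers to land near 8,000,000 bytes, but treat that as an expectation rather than a guarantee.

```csharp
using System;
using System.Collections.Generic;

static class MemoryCompare
{
    const int Size = 1000000;

    // Approximate number of managed bytes retained by whatever `allocate` returns.
    public static long Measure(Func<object> allocate)
    {
        long before = GC.GetTotalMemory(true);  // force a collection first
        object held = allocate();
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(held);                     // keep the object alive until measured
        return after - before;
    }

    // Same construction as in the answer: pre-sized list, then 1,000,000 Adds.
    public static object MakeList()
    {
        var numbers = new List<double>(Size);
        for (int i = 0; i < Size; i++)
        {
            numbers.Add(0d);
        }
        return numbers;
    }

    static void Main()
    {
        long arrayBytes = Measure(() => new double[Size]);
        long listBytes = Measure(MakeList);
        Console.WriteLine("double[]:     {0:N0} bytes", arrayBytes);
        Console.WriteLine("List<double>: {0:N0} bytes", listBytes);
    }
}
```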
Please note that the List grows dynamically, usually doubling its internal buffer every time the buffer fills up. Hence, a new list would start with something like a 4-element array, and after you add the first 4 elements, the 5th would cause an internal reallocation, doubling the buffer to 8 (4 * 2) elements.
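The growth behavior described above can be observed through the public Capacity property. This is just a sketch (ObserveCapacities is a name I made up); the exact starting capacity and growth factor are implementation details of the runtime you happen to be on, so the code records whatever it sees rather than assuming specific values.

```csharp
using System;
using System.Collections.Generic;

static class GrowthDemo
{
    // Returns each distinct Capacity value seen while adding `count` items
    // to a list created with the default constructor.
    public static List<int> ObserveCapacities(int count)
    {
        var numbers = new List<double>();
        var seen = new List<int>();
        seen.Add(numbers.Capacity);

        for (int i = 0; i < count; i++)
        {
            numbers.Add(0d);
            // Record the capacity only when a reallocation changed it.
            if (numbers.Capacity != seen[seen.Count - 1])
            {
                seen.Add(numbers.Capacity);
            }
        }
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ObserveCapacities(20)));
    }
}
```

Because of this, a list built without a capacity hint can end up holding a buffer noticeably larger than its Count. Passing the expected size to the constructor, as new List<double>(size) does in the code earlier in this thread, avoids both the intermediate reallocations and the slack.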