I'm curious - why does the sys.getsizeof call return a smaller number for a list than the sum of its elements?
import sys
lst = ["abcde", "fghij", "klmno", "pqrst", "uvwxy"]
print("Element sizes:", [sys.getsizeof(el) for el in lst])
print("Sum of sizes: ", sum([sys.getsizeof(el) for el in lst]))
print("Size of list: ", sys.getsizeof(lst))
The above prints
Element sizes: [42, 42, 42, 42, 42]
Sum of sizes: 210
Size of list: 112
How come?
You are getting the size of the actual list object. Since a list stores only pointers to its elements, its memory size is bound to differ from (and here be lower than) the sum of the elements' sizes.
By analogy, it’s like getting the size of an array of pointers in C.
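To make the pointer picture concrete, here is a minimal sketch (assuming a 64-bit CPython build, where each pointer slot is 8 bytes; exact header sizes and over-allocation vary between versions):

import sys

lst = ["abcde", "fghij", "klmno", "pqrst", "uvwxy"]
# The list's size is a fixed header plus one pointer slot per element
# (possibly a few spare slots from over-allocation); it does not grow
# with the size of the objects those pointers reference.
print(sys.getsizeof([]))                       # empty-list header
print(sys.getsizeof(lst) - sys.getsizeof([]))  # roughly 8 bytes per element
print(len(lst) * 8)                            # lower bound: 40

Replacing the five short strings with five very long ones leaves sys.getsizeof(lst) unchanged, which is exactly the point.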
As per the documentation, sys.getsizeof accounts only for the memory directly attributed to the object itself, not the memory of the objects it refers to. So only for very primitive built-in types are you ever really going to get accurate results. Even for built-in container types, you usually need some sort of recursive function to find the "total" size of the container (list, dictionary, etc.). Keep in mind, though, that a Python list really is just a resizable array of pointers, so in that sense 112 is an accurate number.
However, you are looking for something like this:
https://code.activestate.com/recipes/577504/
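For reference, here is a minimal sketch of such a recursive size function (a simplified take on the idea, not the linked recipe itself; it only handles a few common container types):

import sys

def total_size(obj, seen=None):
    # Recursively sum sys.getsizeof over an object and everything it
    # references, tracking ids so shared objects are counted only once.
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(el, seen) for el in obj)
    return size

lst = ["abcde", "fghij", "klmno", "pqrst", "uvwxy"]
print(total_size(lst))  # list header + pointers + the five strings themselves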
Also, note that:
Every numpy object (or any object, for that matter) has some overhead, and when you assign an np.array as a list element, you create a new object. Counting only the memory of the array contents therefore misses the overhead of the whole object. The memory used by a numpy array a's data buffer can be obtained with a.nbytes, whereas sys.getsizeof shows "only the memory consumption directly attributed to the object [...], not the memory consumption of objects it refers to" (according to the documentation). In your case the array object itself does not hold all the data, which can be seen from a.flags: it reports OWNDATA : False, while for the first array it is OWNDATA : True.
The OWNDATA field being False explains why sys.getsizeof reports only 128 bytes: that array is a view onto data owned by another object, so only the array object itself is counted, not the shared buffer.
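To illustrate with arrays that are not yours (a minimal sketch; exact byte counts depend on the numpy version and platform):

import sys
import numpy as np

a = np.arange(10_000, dtype=np.float64)  # owns its 80,000-byte buffer
v = a[::2]                               # a view: shares a's buffer, owns no data

print(a.flags['OWNDATA'], v.flags['OWNDATA'])  # True False
print(a.nbytes, v.nbytes)                      # bytes of data visible through each array
print(sys.getsizeof(a))   # header plus buffer, since the array owns its data
print(sys.getsizeof(v))   # small, roughly constant header; shared buffer not counted

Here sys.getsizeof(v) stays small no matter how large a is, just as in your case.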