A tuple takes less memory space in Python:

>>> a = (1,2,3)
>>> a.__sizeof__()
48

whereas a list takes more memory space:

>>> b = [1,2,3]
>>> b.__sizeof__()
64

What happens internally in Python's memory management?
MSeifert's answer covers it broadly; to keep it simple you can think of it this way:

- A tuple is immutable. Once it is set, you can't change it, so you know in advance how much memory needs to be allocated for that object.
- A list is mutable. You can add items to it or remove items from it, so it has to track its size internally (for the internal implementation) and resize as needed.

There are no free meals - these capabilities come with a cost. Hence the memory overhead for lists.
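A minimal illustration of that difference (plain Python, nothing version-specific):

```python
t = (1, 2, 3)
lst = [1, 2, 3]

# A tuple rejects modification - its size and contents are fixed at creation.
try:
    t[0] = 0
except TypeError:
    print("tuples can't be modified")

# A list resizes itself as needed.
lst.append(4)
lst.remove(1)
print(lst)  # [2, 3, 4]
```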
The size of a tuple is fixed up front: at tuple initialization the interpreter allocates exactly enough space for the contained data, and that's the end of it, since the tuple is immutable (it can't be modified). A list, being a mutable object, implies dynamic memory allocation: to avoid allocating new space (and copying the data over) every time you append to or modify the list, it allocates additional space for future appends, modifications, and so on. That pretty much sums it up.
I assume you're using 64-bit CPython (I got the same results on my CPython 2.7 64-bit). There could be differences in other Python implementations or with a 32-bit Python.
Regardless of the implementation, lists are variable-sized while tuples are fixed-size.

So tuples can store their elements directly inside the struct; lists, on the other hand, need a layer of indirection (they store a pointer to their elements). This layer of indirection is a pointer, and on 64-bit systems a pointer is 64 bits, hence 8 bytes.

But there's another thing that lists do: they over-allocate. Otherwise list.append would always be an O(n) operation - to make it amortized O(1) (much faster!) it over-allocates. But then it has to keep track of both the allocated size and the filled size (tuples only need to store one size, because allocated and filled size are always identical). That means each list has to store another "size", which on 64-bit systems is a 64-bit integer, again 8 bytes.

So lists need at least 16 bytes more memory than tuples. Why did I say "at least"? Because of over-allocation. Over-allocation means allocating more space than needed. How much is over-allocated depends on "how" you created the list and its append/deletion history.

Images
I decided to create some images to accompany the explanation above. Maybe they're helpful.

This is how your example is (schematically) stored in memory. I highlighted the differences with red (free-hand) circles:

That's actually just an approximation, because int objects are also Python objects and CPython even reuses small integers, so a probably more accurate (although less readable) representation of the objects in memory would be:

Useful links:

- tuple struct in the CPython repository for Python 2.7
- list struct in the CPython repository for Python 2.7
- int struct in the CPython repository for Python 2.7

Note that __sizeof__ doesn't really return the "correct" size! It only returns the size of the stored values. However, when you use sys.getsizeof the result is different: 72 instead of 48 for the tuple, and 88 instead of 64 for the list.

There are 24 "extra" bytes. These are real - that's the garbage-collector overhead that isn't accounted for in the __sizeof__ method. That's because you're generally not supposed to call magic methods directly; use the functions that know how to handle them, in this case sys.getsizeof (which actually adds the GC overhead to the value returned from __sizeof__).

I'll take a deeper dive into the CPython codebase so we can see how the sizes are actually calculated. In your specific example, no over-allocation has been performed, so I won't touch on that.
I'm going to use 64-bit values here, as you are.
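As a quick sanity check before digging into the source, the question's numbers and the GC overhead mentioned above can be reproduced (a sketch assuming 64-bit CPython; the exact overhead differs between versions - 24 bytes on 2.7, 16 on recent 3.x):

```python
import sys

a = (1, 2, 3)
b = [1, 2, 3]

# __sizeof__ reports only the object's own size...
print(a.__sizeof__(), b.__sizeof__())   # e.g. 48 64 on a 64-bit build

# ...while sys.getsizeof adds the garbage-collector header on top.
print(sys.getsizeof(a) - a.__sizeof__())
print(sys.getsizeof(b) - b.__sizeof__())
```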
The size for lists is calculated by the list_sizeof function, which computes res = _PyObject_SIZE(Py_TYPE(self)) + self->allocated * sizeof(void*).

Here Py_TYPE(self) is a macro that grabs the ob_type of self (returning PyList_Type), while _PyObject_SIZE is another macro that grabs tp_basicsize from that type. tp_basicsize is defined as sizeof(PyListObject), where PyListObject is the instance struct.

The PyListObject structure has three fields: PyObject_VAR_HEAD, the PyObject **ob_item pointer, and the Py_ssize_t allocated counter. They have comments (which I trimmed) explaining what they are; follow the link above to read them.

PyObject_VAR_HEAD expands into three 8-byte fields (ob_refcount, ob_type and ob_size), so a 24-byte contribution.

So for now res is sizeof(PyListObject), or: 24 + 8 + 8 = 40 bytes.

If the list instance has elements allocated, the second part calculates their contribution: self->allocated, as its name implies, holds the number of allocated elements, each costing one 8-byte pointer.

Without any elements, the size of a list is therefore calculated to be 40 bytes, i.e. the size of the instance struct.
tuple objects don't define a tuple_sizeof function. Instead, they use object_sizeof to calculate their size. This, as for lists, grabs the tp_basicsize and, if the object has a non-zero tp_itemsize (meaning it has variable-length instances), multiplies the number of items in the tuple (which it gets via Py_SIZE) by tp_itemsize.

tp_basicsize again uses sizeof(PyTupleObject), where the PyTupleObject struct contains PyObject_VAR_HEAD plus a one-element PyObject *ob_item[1] array. So, without any elements (that is, Py_SIZE returns 0), the size of an empty tuple should be equal to sizeof(PyTupleObject), i.e. 24 + 8 = 32 bytes - yet ().__sizeof__() reports 24... huh?

Well, here's an oddity which I haven't found an explanation for: the tp_basicsize of tuples is actually defined as sizeof(PyTupleObject) - sizeof(PyObject *). Why an additional 8 bytes is removed from tp_basicsize is something I haven't been able to find out. (See MSeifert's comment for a possible explanation.)

But this is basically the difference in your specific example.
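The tuple arithmetic can be checked the same way (a sketch assuming a 64-bit build; expected_tuple_sizeof is a hypothetical helper):

```python
VAR_HEAD = 24   # ob_refcount + ob_type + ob_size
POINTER = 8     # sizeof(PyObject *)

def expected_tuple_sizeof(n_items):
    # tp_basicsize = sizeof(PyTupleObject) - sizeof(PyObject *) = 24,
    # plus tp_itemsize (one pointer) per element.
    return VAR_HEAD + n_items * POINTER

print(().__sizeof__(), expected_tuple_sizeof(0))         # 24 24
print((1, 2, 3).__sizeof__(), expected_tuple_sizeof(3))  # 48 48
```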
Lists also keep around the number of allocated elements, which helps determine when to over-allocate again.

Now, when additional elements are added, lists do indeed perform this over-allocation in order to achieve O(1) appends. This results in greater sizes, as MSeifert covers nicely in his answer.
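The over-allocation pattern is easy to watch (a sketch; the exact step sizes depend on the CPython version):

```python
import sys

lst = []
sizes = []
for i in range(17):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# The reported size stays flat while appends fill pre-allocated slots,
# then jumps when the list has to grow again.
print(sizes)
```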