As far as I was aware, using [], {} or () to instantiate objects returns a new instance of list, dict or tuple respectively; a new object with a new identity.
This was pretty clear to me until I actually tested it and noticed that () is () actually returns True instead of the expected False:
>>> () is (), [] is [], {} is {}
(True, False, False)
As expected, this behavior is also manifested when creating objects with list(), dict() and tuple() respectively:
>>> tuple() is tuple(), list() is list(), dict() is dict()
(True, False, False)
The only relevant piece of information I could find in the docs for tuple() states:
[...] For example, tuple('abc') returns ('a', 'b', 'c') and tuple([1, 2, 3]) returns (1, 2, 3). If no argument is given, the constructor creates a new empty tuple, ().
Suffice to say, this isn't sufficient for answering my question.
So, why do empty tuples have the same identity whilst others like lists or dictionaries do not?
In short:
Python internally creates a C list of tuple objects whose first element contains the empty tuple. Every time tuple() or () is used, Python will return the existing object contained in the aforementioned C list and not create a new one.
Such a mechanism does not exist for dict or list objects which are, on the contrary, recreated from scratch every time.
This is most likely related to the fact that immutable objects (like tuples) cannot be altered and, as such, are guaranteed not to change during execution. This is further supported by the fact that frozenset() is frozenset() returns True; like (), an empty frozenset is considered a singleton in the implementation of CPython. With mutable objects, such guarantees are not in place and, as such, there's no incentive to cache their zero-element instances (i.e. their contents could change with the identity remaining the same).
Take note: This isn't something one should depend on, i.e. one shouldn't consider empty tuples to be singletons. No such guarantees are explicitly made in the documentation, so one should assume it is implementation dependent.
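
To make the warning above concrete, here is a minimal sketch of how one might probe the behaviour without relying on it. The helper name shares_identity is made up for this illustration; the tuple result is whatever the running interpreter happens to do, so no output is claimed for that loop.

```python
def shares_identity(factory):
    """Return True if two separate calls to factory() yield the very same object."""
    first = factory()
    second = factory()
    return first is second


for factory in (tuple, list, dict):
    # The tuple result is an implementation detail; list and dict always give
    # distinct objects because both instances are alive at the same time.
    print(factory.__name__, shares_identity(factory))

# Equality and truthiness are what the language actually guarantees.
empty = tuple()
print(empty == ())  # True on any implementation
print(not empty)    # True: an empty tuple is falsy
```

In real code, comparing with == or testing truthiness is the safer choice, since neither depends on how the interpreter caches objects.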
How it is done:
In the most common case, the implementation of CPython is compiled with two macros, PyTuple_MAXFREELIST and PyTuple_MAXSAVESIZE, set to positive integers. The positive value for these macros results in the creation of an array of tuple objects with size PyTuple_MAXSAVESIZE.
When PyTuple_New is called with the parameter size == 0, it makes sure to add a new empty tuple to the list if it doesn't already exist. Then, if a new empty tuple is requested, the one that is located in the first position of this list is returned instead of a new instance.
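
The caching idea can be modelled in a few lines of Python. This is only a toy sketch of the behaviour described above, not CPython's C code: ToyTuple, toy_tuple_new, free_list and the value 20 are all hypothetical names and numbers chosen for illustration.

```python
MAXSAVESIZE = 20  # illustrative size, not read from any CPython build

# Toy stand-in for the array of cached tuples; only slot 0 is used here.
free_list = [None] * MAXSAVESIZE


class ToyTuple:
    """Hypothetical stand-in for the C-level tuple object."""

    def __init__(self, size):
        self.items = [None] * size


def toy_tuple_new(size):
    """Mimic the described behaviour: reuse the cached empty tuple if present."""
    if size == 0 and free_list[0] is not None:
        return free_list[0]      # hand back the existing empty tuple
    obj = ToyTuple(size)
    if size == 0:
        free_list[0] = obj       # first request: remember it for later
    return obj


print(toy_tuple_new(0) is toy_tuple_new(0))  # True: same cached object
print(toy_tuple_new(3) is toy_tuple_new(3))  # False: two fresh objects
```

The point of the sketch is simply that the first zero-size request populates slot 0, and every later zero-size request gets that same object back.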
One additional reason to do this is the fact that function calls construct a tuple to hold the positional arguments that are going to be used. This can be seen in the load_args function in ceval.c, which is called via do_call in the same file. If the number of arguments na is zero, an empty tuple is returned.
In essence, this might be an operation that's performed frequently, so it makes sense not to reconstruct an empty tuple every single time.
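
From the Python side, the argument tuple can be observed with a variadic function. The sketch below uses a made-up function named collect; it shows that a call with no positional arguments receives an empty tuple, while leaving open whether it is the same object each time.

```python
def collect(*args):
    """Return the tuple Python built to hold the positional arguments."""
    return args


no_args = collect()
print(type(no_args).__name__, len(no_args))  # tuple 0
print(no_args == ())                          # True

# Whether the same object comes back each time is implementation dependent,
# so no particular output is promised for this line.
print(collect() is collect())
```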
Further reading:
A couple more answers shed light on CPython's caching behaviour with immutables: