I spent some time investigating collections.namedtuple a few weeks ago. It is a factory function that substitutes the dynamic data (the name of the new namedtuple class and its attribute names) into a very large string. exec is then called with that string (which represents the code) as its argument, and the new class is returned.
Does anyone know why it was done this way, when there is a specific tool for this kind of thing readily available, namely the metaclass? I haven't tried to do it myself, but it seems like everything that happens in namedtuple could have been easily accomplished using a namedtuple metaclass, like so:
class namedtuple(type):
    etc etc.
There are some hints in issue 3974. The author proposed a new way to create named tuples, which was rejected with the following comments:
It seems the benefit of the original version is that it's faster,
thanks to hardcoding critical methods.
- Antoine Pitrou
There is nothing unholy about using exec. Earlier versions used other
approaches and they proved unnecessarily complex and had unexpected
problems. It is a key feature for named tuples that they are exactly
equivalent to a hand-written class.
- Raymond Hettinger
Additionally, here is part of the description of the original namedtuple recipe:
... the recipe has evolved to its current exec-style where we get all
of Python's high-speed builtin argument checking for free. The new
style of building and exec-ing a template made both the __new__ and
__repr__ functions faster and cleaner than in previous versions of this recipe.
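To make the exec style concrete, here is a minimal sketch of the idea. It is not the standard library's code: the template, the helper name make_namedtuple, and the namespace name are all hypothetical, and it skips the validation the real implementation performs. The point it illustrates is that the parameter list of __new__ is written out as literal source, so the interpreter's ordinary argument checking applies.

```python
from operator import itemgetter

# Hypothetical, heavily simplified template. The real collections.namedtuple
# validates names and generates far more than this.
_TEMPLATE = '''
class {typename}(tuple):
    __slots__ = ()
    _fields = {fields!r}

    def __new__(cls, {args}):
        # The parameters are spelled out in the source, so Python itself
        # rejects missing or surplus arguments with a TypeError.
        return tuple.__new__(cls, ({args},))

    def __repr__(self):
        return '{typename}({repr_fmt})' % tuple(self)
'''


def make_namedtuple(typename, field_names):
    """Build a tuple subclass by filling in a source template and exec-ing it.

    Sketch only: nothing is validated, so never pass untrusted strings.
    """
    fields = tuple(field_names.split())
    source = _TEMPLATE.format(
        typename=typename,
        fields=fields,
        args=', '.join(fields),
        repr_fmt=', '.join(f'{name}=%r' for name in fields),
    )
    namespace = {'__name__': f'sketch_{typename}'}
    exec(source, namespace)
    cls = namespace[typename]
    # Named access is just a property reading a fixed position.
    for index, name in enumerate(fields):
        setattr(cls, name, property(itemgetter(index)))
    return cls


Point = make_namedtuple('Point', 'x y')
p = Point(1, y=2)
print(p, p.x, p.y)
```

Calling Point(1) here fails with the same kind of TypeError a hand-written class would give, which is the "exactly equivalent to a hand-written class" property the comments above refer to.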
If you're looking for some alternative implementations:
As a side note: the other objection I see most often against using exec is that some locations (read: companies) disable it for security reasons.
Besides an advanced Enum and NamedConstant, the aenum library* also has NamedTuple, which is metaclass-based.
* aenum is written by the author of enum and the enum34 backport.
Here is another approach.
""" Subclass of tuple with named fields """
from operator import itemgetter
from inspect import signature

class MetaTuple(type):
    """ metaclass for NamedTuple """
    def __new__(mcs, name, bases, namespace):
        cls = type.__new__(mcs, name, bases, namespace)
        names = signature(cls._signature).parameters.keys()
        for i, key in enumerate(names):
            setattr(cls, key, property(itemgetter(i)))
        return cls

class NamedTuple(tuple, metaclass=MetaTuple):
    """ Subclass of tuple with named fields """
    @staticmethod
    def _signature():
        " Override in subclass "

    def __new__(cls, *args):
        new = super().__new__(cls, *args)
        if len(new) == len(signature(cls._signature).parameters):
            return new
        return new._signature(*new)

if __name__ == '__main__':
    class Point(NamedTuple):
        " Simple test "
        @staticmethod
        def _signature(x, y, z):  # pylint: disable=arguments-differ
            " Three coordinates "
    print(Point((1, 2, 4)))
If this approach has any virtue at all, it's the simplicity. It would be simpler yet without NamedTuple.__new__, which serves only the purpose of enforcing the element count. Without that, it happily allows additional anonymous elements past the named ones, and the primary effect of omitting elements is the IndexError on omitted elements when accessing them by name (with a little work that could be translated to an AttributeError). The error message for an incorrect element count is a bit strange, but it gets the point across. I wouldn't expect this to work with Python 2.
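As a rough illustration of the variant without NamedTuple.__new__, the sketch below drops the element-count check and translates the IndexError into an AttributeError. The helper _field and its error message are my own hypothetical additions, not part of the code above.

```python
from inspect import signature


def _field(index, name):
    """Property reading position `index`; hypothetical helper for this sketch."""
    def getter(self):
        try:
            return self[index]
        except IndexError:
            # Report a missing element the way a missing attribute is reported.
            raise AttributeError(
                f'{type(self).__name__!r} object has no value for field {name!r}'
            ) from None
    return property(getter)


class MetaTuple(type):
    """Metaclass that turns the parameters of _signature into named fields."""
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        for i, key in enumerate(signature(cls._signature).parameters):
            setattr(cls, key, _field(i, key))
        return cls


class NamedTuple(tuple, metaclass=MetaTuple):
    """No __new__ here, so the element count is deliberately not enforced."""
    @staticmethod
    def _signature():
        "Override in subclass"


class Point(NamedTuple):
    @staticmethod
    def _signature(x, y, z):
        "Three coordinates"


short = Point((1, 2))        # z is missing
extra = Point((1, 2, 3, 4))  # one anonymous trailing element
print(short.x, short.y, hasattr(short, 'z'))
print(extra.z, extra[3])
```

A side effect of raising AttributeError is that hasattr and getattr with a default behave sensibly on short tuples.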
There is room for further complication, such as a __repr__ method. I have no idea how the performance compares to other implementations (caching the signature length might help), but I much prefer the calling convention as compared to the native namedtuple implementation.
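For what it's worth, here is one way the two complications mentioned above might look: the field names are inspected once at class creation and cached, and a __repr__ uses them. The _fields attribute name and the explicit TypeError message are assumptions of this sketch, and I have not measured whether the caching makes a noticeable difference.

```python
from inspect import signature
from operator import itemgetter


class MetaTuple(type):
    """Metaclass that caches the field names taken from _signature."""
    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        # Inspect the signature once, here, instead of on every instantiation.
        cls._fields = tuple(signature(cls._signature).parameters)
        for i, key in enumerate(cls._fields):
            setattr(cls, key, property(itemgetter(i)))
        return cls


class NamedTuple(tuple, metaclass=MetaTuple):
    """Subclass of tuple with named fields, a cached field list, and a repr."""
    @staticmethod
    def _signature():
        "Override in subclass"

    def __new__(cls, *args):
        new = super().__new__(cls, *args)
        if len(new) != len(cls._fields):
            # Explicit message instead of relying on the call to _signature.
            raise TypeError(
                f'{cls.__name__} expects {len(cls._fields)} elements, '
                f'got {len(new)}'
            )
        return new

    def __repr__(self):
        pairs = ', '.join(
            f'{name}={value!r}' for name, value in zip(self._fields, self)
        )
        return f'{type(self).__name__}({pairs})'


class Point(NamedTuple):
    @staticmethod
    def _signature(x, y, z):
        "Three coordinates"


print(Point((1, 2, 4)))  # Point(x=1, y=2, z=4)
```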