Any way to bypass namedtuple 255 arguments limitation?

Posted 2020-07-24 12:03

Question:

I'm using a namedtuple to hold sets of strings and their corresponding values. I'm not using a dictionary, because I want the strings accessible as attributes.

Here's my code:

from collections import namedtuple


# Shortened for readability :-)
strings = namedtuple("strings", ['a0', 'a1', 'a2', ..., 'a400'])
my_strings = strings(value0, value1, value2, ..., value400)

Ideally, once my_strings is initialized, I should be able to do this:

print(my_strings.a1) 

and get value1 printed back.

However, I get the following error instead:

strings(value0, value1, value2, ...value400)

   ^
SyntaxError: more than 255 arguments

It seems Python functions (including namedtuple's init()) do not accept more than 255 arguments when called. Is there any way to bypass this issue and have named tuples with more than 255 items? Why is there a 255-argument limit anyway?

Answer 1:

This is a limit to CPython function definitions; in versions before Python 3.7, you cannot specify more than 255 explicit arguments to a callable. This applies to any function definition, not just named tuples.

Note that this limit has been lifted in Python 3.7 and newer, where the new limit is effectively sys.maxsize (sys.maxint does not exist in Python 3). See "What is a maximum number of arguments in a Python function?"

It is the generated code for the class that hits this limit. Before Python 3.7 you cannot define a function with more than 255 arguments, so the generated __new__ method of the resulting class cannot be compiled by CPython.
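To illustrate the point about the lifted limit, here is a minimal sketch, assuming CPython 3.7 or newer. The names Strings, fields and values are placeholders rather than anything from the question. On older interpreters the namedtuple() call itself is expected to fail, which is why the sketch guards on the version.

```python
import sys
from collections import namedtuple

# Placeholder field names and values standing in for the question's data.
fields = ['a{}'.format(i) for i in range(400)]
values = list(range(400))

if sys.version_info >= (3, 7):
    # On 3.7+ the generated __new__ may take more than 255 parameters.
    Strings = namedtuple('Strings', fields)
    # Unpacking with * avoids writing 400 explicit arguments in the call.
    s = Strings(*values)
    print(s.a1, s.a399)
else:
    print('This interpreter predates the change; use a manual subclass.')
```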

You'll have to ask yourself, however, if you really should be using a different structure instead. It looks like you have a list-like piece of data to me; 400 numbered names is a sure sign of your data bleeding into your names.
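As a rough illustration of the list-like alternative, the sketch below assumes the values really are just a numbered sequence. The placeholder strings stand in for value0 to value400 from the question.

```python
# Placeholder data standing in for value0 ... value400 from the question.
values = ['value{}'.format(i) for i in range(401)]

# The number that was baked into the name a1 becomes a plain index.
first = values[1]

# The whole collection can be processed without naming each item.
lengths = [len(v) for v in values]

print(first)
```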

You can work around this by creating your own subclass, manually:

from operator import itemgetter
from collections import OrderedDict

class strings(tuple):
    __slots__ = ()
    _fields = tuple('a{}'.format(i) for i in range(400))
    def __new__(cls, *args, **kwargs):
        req = len(cls._fields)
        if len(args) + len(kwargs) > req:
            raise TypeError(
                '__new__() takes {} positional arguments but {} were given'.format(
                    req, len(args) + len(kwargs)))
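        # Any keyword that is not a field name is an error; a superset test
        # (>) can never be true here because of the length check above.
        if kwargs.keys() - set(cls._fields):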
            raise TypeError(
                '__new__() got an unexpected keyword argument {!r}'.format(
                    (kwargs.keys() - set(cls._fields)).pop()))
        missing = req - len(args)
        if kwargs.keys() & set(cls._fields[:-missing]):
            raise TypeError(
                '__new__() got multiple values for argument {!r}'.format(
                    (kwargs.keys() & set(cls._fields[:-missing])).pop()))
        try:
            for field in cls._fields[-missing:]:
                args += (kwargs[field],)
                missing -= 1
        except KeyError:
            pass
        if len(args) < req:
            raise TypeError('__new__() missing {} positional argument{}: {}'.format(
                missing, 's' if missing > 1 else '',
                ' and '.join(filter(None, [', '.join(map(repr, cls._fields[-missing:-1])), repr(cls._fields[-1])]))))
        return tuple.__new__(cls, args)

    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new strings object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != len(cls._fields):
            raise TypeError('Expected %d arguments, got %d' % (len(cls._fields), len(result)))
        return result

    def __repr__(self):
        'Return a nicely formatted representation string'
        format = '{}({})'.format(self.__class__.__name__, ', '.join('{}=%r'.format(n) for n in self._fields))
        return format % self

    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))

    __dict__ = property(_asdict)

    def _replace(self, **kwds):
        'Return a new strings object replacing specified fields with new values'
        result = self._make(map(kwds.pop, self._fields, self))
        if kwds:
            raise ValueError('Got unexpected field names: %r' % list(kwds))
        return result

    def __getnewargs__(self):
        'Return self as a plain tuple.  Used by copy and pickle.'
        return tuple(self)

    def __getstate__(self):
        'Exclude the OrderedDict from pickling'
        return None

for i, name in enumerate(strings._fields):
    setattr(strings, name, 
            property(itemgetter(i), doc='Alias for field number {}'.format(i)))

This version of the named tuple avoids the long argument lists altogether, but otherwise behaves exactly like the original. The somewhat verbose __new__ method is not strictly needed but does closely emulate the original behaviour when arguments are incomplete. Note the construction of the _fields attribute; replace this with your own to name your tuple fields.

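Because __new__ takes the values as separate positional arguments, unpack an iterable with * to set them:

s = strings(*(i for i in range(400)))

or if you have a list of values:

s = strings(*list_of_values)

You can also hand the iterable itself to strings._make(). Either technique bypasses the limits on function signatures and function call argument counts, since the 255 limit only applied to arguments written out explicitly.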

Demo:

>>> s = strings(*(i for i in range(400)))
>>> s
strings(a0=0, a1=1, a2=2, a3=3, a4=4, a5=5, a6=6, a7=7, a8=8, a9=9, a10=10, a11=11, a12=12, a13=13, a14=14, a15=15, a16=16, a17=17, a18=18, a19=19, a20=20, a21=21, a22=22, a23=23, a24=24, a25=25, a26=26, a27=27, a28=28, a29=29, a30=30, a31=31, a32=32, a33=33, a34=34, a35=35, a36=36, a37=37, a38=38, a39=39, a40=40, a41=41, a42=42, a43=43, a44=44, a45=45, a46=46, a47=47, a48=48, a49=49, a50=50, a51=51, a52=52, a53=53, a54=54, a55=55, a56=56, a57=57, a58=58, a59=59, a60=60, a61=61, a62=62, a63=63, a64=64, a65=65, a66=66, a67=67, a68=68, a69=69, a70=70, a71=71, a72=72, a73=73, a74=74, a75=75, a76=76, a77=77, a78=78, a79=79, a80=80, a81=81, a82=82, a83=83, a84=84, a85=85, a86=86, a87=87, a88=88, a89=89, a90=90, a91=91, a92=92, a93=93, a94=94, a95=95, a96=96, a97=97, a98=98, a99=99, a100=100, a101=101, a102=102, a103=103, a104=104, a105=105, a106=106, a107=107, a108=108, a109=109, a110=110, a111=111, a112=112, a113=113, a114=114, a115=115, a116=116, a117=117, a118=118, a119=119, a120=120, a121=121, a122=122, a123=123, a124=124, a125=125, a126=126, a127=127, a128=128, a129=129, a130=130, a131=131, a132=132, a133=133, a134=134, a135=135, a136=136, a137=137, a138=138, a139=139, a140=140, a141=141, a142=142, a143=143, a144=144, a145=145, a146=146, a147=147, a148=148, a149=149, a150=150, a151=151, a152=152, a153=153, a154=154, a155=155, a156=156, a157=157, a158=158, a159=159, a160=160, a161=161, a162=162, a163=163, a164=164, a165=165, a166=166, a167=167, a168=168, a169=169, a170=170, a171=171, a172=172, a173=173, a174=174, a175=175, a176=176, a177=177, a178=178, a179=179, a180=180, a181=181, a182=182, a183=183, a184=184, a185=185, a186=186, a187=187, a188=188, a189=189, a190=190, a191=191, a192=192, a193=193, a194=194, a195=195, a196=196, a197=197, a198=198, a199=199, a200=200, a201=201, a202=202, a203=203, a204=204, a205=205, a206=206, a207=207, a208=208, a209=209, a210=210, a211=211, a212=212, a213=213, a214=214, a215=215, a216=216, a217=217, a218=218, a219=219, a220=220, 
a221=221, a222=222, a223=223, a224=224, a225=225, a226=226, a227=227, a228=228, a229=229, a230=230, a231=231, a232=232, a233=233, a234=234, a235=235, a236=236, a237=237, a238=238, a239=239, a240=240, a241=241, a242=242, a243=243, a244=244, a245=245, a246=246, a247=247, a248=248, a249=249, a250=250, a251=251, a252=252, a253=253, a254=254, a255=255, a256=256, a257=257, a258=258, a259=259, a260=260, a261=261, a262=262, a263=263, a264=264, a265=265, a266=266, a267=267, a268=268, a269=269, a270=270, a271=271, a272=272, a273=273, a274=274, a275=275, a276=276, a277=277, a278=278, a279=279, a280=280, a281=281, a282=282, a283=283, a284=284, a285=285, a286=286, a287=287, a288=288, a289=289, a290=290, a291=291, a292=292, a293=293, a294=294, a295=295, a296=296, a297=297, a298=298, a299=299, a300=300, a301=301, a302=302, a303=303, a304=304, a305=305, a306=306, a307=307, a308=308, a309=309, a310=310, a311=311, a312=312, a313=313, a314=314, a315=315, a316=316, a317=317, a318=318, a319=319, a320=320, a321=321, a322=322, a323=323, a324=324, a325=325, a326=326, a327=327, a328=328, a329=329, a330=330, a331=331, a332=332, a333=333, a334=334, a335=335, a336=336, a337=337, a338=338, a339=339, a340=340, a341=341, a342=342, a343=343, a344=344, a345=345, a346=346, a347=347, a348=348, a349=349, a350=350, a351=351, a352=352, a353=353, a354=354, a355=355, a356=356, a357=357, a358=358, a359=359, a360=360, a361=361, a362=362, a363=363, a364=364, a365=365, a366=366, a367=367, a368=368, a369=369, a370=370, a371=371, a372=372, a373=373, a374=374, a375=375, a376=376, a377=377, a378=378, a379=379, a380=380, a381=381, a382=382, a383=383, a384=384, a385=385, a386=386, a387=387, a388=388, a389=389, a390=390, a391=391, a392=392, a393=393, a394=394, a395=395, a396=396, a397=397, a398=398, a399=399)
>>> s.a391
391
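As the answer says, the verbose __new__ is not strictly needed. Here is a minimal sketch without it, assuming you can live without argument validation and keyword arguments. MiniStrings is a hypothetical name used only for this illustration. Because tuple's own constructor takes a single iterable, a generator expression can be passed directly.

```python
from operator import itemgetter


class MiniStrings(tuple):
    """Hypothetical stripped-down variant with no custom __new__."""
    __slots__ = ()
    _fields = tuple('a{}'.format(i) for i in range(400))

    def __repr__(self):
        pairs = ', '.join(
            '{}={!r}'.format(name, value)
            for name, value in zip(self._fields, self))
        return '{}({})'.format(type(self).__name__, pairs)


# Attach one read-only property per field name, as in the full version.
for i, name in enumerate(MiniStrings._fields):
    setattr(MiniStrings, name, property(itemgetter(i)))

# tuple.__new__ consumes the iterable; nothing checks that it has 400 items.
s = MiniStrings(i for i in range(400))
print(s.a391)
```

The trade-off is that a sequence of the wrong length is accepted silently and only fails later, when a missing field is accessed.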


Answer 2:

namedtuple out of the box doesn't support what you are trying to do.

The following might achieve the goal, whether the number of fields grows from 400 to 450 or shrinks to something smaller and saner.

def customtuple(*keys):
    class string:
        _keys = keys

        def __init__(self, *args):
            if len(args) != len(self._keys):
                raise TypeError("Expected %d arguments, got %d"
                                % (len(self._keys), len(args)))
            # Bypass the blocking __setattr__ below and give each instance
            # its own dict; a class-level dict would be shared by all of them.
            object.__setattr__(self, "_dict", dict(zip(self._keys, args)))

        def __setattr__(self, *args):
            raise AttributeError("Not allowed")

        def __getattr__(self, arg):
            # Only called when normal attribute lookup fails.
            try:
                return self.__dict__["_dict"][arg]
            except KeyError:
                raise AttributeError("Name not defined")

        def __repr__(self):
            return ("string(%s)"
                    % ", ".join("%s=%r" % (key, self._dict[key])
                                for key in self._keys))

    return string

>>> strings = customtuple(*['a'+str(x) for x in range(1, 401)])
>>> s = strings(*['a'+str(x) for x in range(2, 402)])
>>> s.a1
'a2'
>>> s.a1 = 1
Traceback (most recent call last):
  ...
AttributeError: Not allowed




Answer 3:

Here is my version of a replacement for namedtuple that supports more than 255 arguments. The idea was not to be functionally equivalent but rather to improve on some aspects (in my opinion). This is for Python 3.4+ only:

class SequenceAttrReader(object):
    """ Class to function similar to collections.namedtuple but allowing more than 255 keys.
        Initialize with an attribute string (space separated), then load data via a sequence, then access the values as attributes.

        e.g.

        csv_line = SequenceAttrReader('a b c')
        csv_line.load_data([1, 2, 3])

        print(csv_line.b)

        >> 2
    """

    _attr_string = None
    _attr_list = []
    _data_list = []

    def __init__(self, attr_string):
        if not attr_string:
            raise AttributeError('SequenceAttrReader not properly initialized, please use a non-empty string')

        self._attr_string = attr_string
        self._attr_list = attr_string.split(' ')

    def __getattr__(self, name):
        if not self._attr_string or not self._attr_list or not self._data_list:
            raise AttributeError('SequenceAttrReader not properly initialized or loaded')

        try:
            index = self._attr_list.index(name)
        except ValueError:
            raise AttributeError("'{name}' not in attribute string".format(name=name)) from None

        try:
            value = self._data_list[index]
        except IndexError:
            raise AttributeError("No data loaded for attribute '{name}'".format(name=name)) from None

        return value

    def __str__(self):
        return str(self._data_list)

    def __repr__(self):
        return 'SequenceAttrReader("{attr_string}")'.format(attr_string=self._attr_string)

    def load_data(self, data_list):
        if not self._attr_list:
            raise AttributeError('SequenceAttrReader not properly initialized')

        if not data_list:
            raise AttributeError('SequenceAttrReader needs to load a non-empty sequence')

        self._data_list = data_list

This is probably not the most efficient approach if you are doing a lot of individual lookups, because each attribute access does a linear search through the attribute list; converting it internally to a dict may be better. I'll work on an optimized version once I have more time, or at least measure the performance difference.
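One way the dict idea could look is sketched below, under my own assumptions; it is not the author's optimized version. DictAttrReader is a hypothetical name, and unlike the class above its load_data returns self so calls can be chained.

```python
class DictAttrReader:
    """Hypothetical dict-backed variant of SequenceAttrReader."""

    def __init__(self, attr_string):
        names = attr_string.split()
        if not names:
            raise ValueError('attribute string must not be empty')
        # Map each name to its position once, so lookups avoid list.index().
        self._index = {name: i for i, name in enumerate(names)}
        self._data = ()

    def load_data(self, data):
        self._data = tuple(data)
        return self

    def __getattr__(self, name):
        # Only runs when normal lookup fails; read through __dict__ so a
        # half-initialized instance cannot recurse into __getattr__.
        try:
            position = self.__dict__['_index'][name]
            return self.__dict__['_data'][position]
        except (KeyError, IndexError):
            raise AttributeError(name) from None


csv_line = DictAttrReader('a b c').load_data([1, 2, 3])
print(csv_line.b)
```

Whether this is measurably faster depends on the number of fields and the access pattern, so it is worth timing against the list-based version before switching.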