How to convert generator or iterator to list recur

2019-04-26 02:49发布

I want to convert generator or iterator to list recursively.
I wrote a code in below, but it looks naive and ugly, and may be dropped case in doctest.

Q1. Help me good version.
Q2. How to specify object is immutable or not?

import itertools

def isiterable(datum):
    return hasattr(datum, '__iter__')

def issubscriptable(datum):
    return hasattr(datum, "__getitem__")

def eagerlize(obj):
    """ Convert generator or iterator to list recursively.
    return a eagalized object of given obj.
    This works but, whether it return a new object, break given one.

    test 1.0 iterator

    >>> q = itertools.permutations('AB',  2)
    >>> eagerlize(q)
    [('A', 'B'), ('B', 'A')]
    >>>

    test 2.0 generator in list

    >>> q = [(2**x for x in range(3))]
    >>> eagerlize(q)
    [[1, 2, 4]]
    >>>

    test 2.1 generator in tuple

    >>> q = ((2**x for x in range(3)),)
    >>> eagerlize(q)
    ([1, 2, 4],)
    >>>

    test 2.2 generator in tuple in generator

    >>> q = (((x, (y for y in range(x, x+1))) for x in range(3)),)
    >>> eagerlize(q)
    ([(0, [0]), (1, [1]), (2, [2])],)
    >>>

    test 3.0 complex test

    >>> def test(r):
    ...     for x in range(3):
    ...         r.update({'k%s'%x:x})
    ...         yield (n for n in range(1))
    >>>
    >>> def creator():
    ...     r = {}
    ...     t = test(r)
    ...     return r, t
    >>>
    >>> a, b = creator()
    >>> q = {'b' : a, 'a' : b}
    >>> eagerlize(q)
    {'a': [[0], [0], [0]], 'b': {'k2': 2, 'k1': 1, 'k0': 0}}
    >>>

    test 3.1 complex test (other dict order)

    >>> a, b = creator()
    >>> q = {'b' : b, 'a' : a}
    >>> eagerlize(q)
    {'a': {'k2': 2, 'k1': 1, 'k0': 0}, 'b': [[0], [0], [0]]}
    >>>

    test 4.0 complex test with tuple

    >>> a, b = creator()
    >>> q = {'b' : (b, 10), 'a' : (a, 10)}
    >>> eagerlize(q)
    {'a': ({'k2': 2, 'k1': 1, 'k0': 0}, 10), 'b': ([[0], [0], [0]], 10)}
    >>>

    test 4.1 complex test with tuple (other dict order)

    >>> a, b = creator()
    >>> q = {'b' : (b, 10), 'a' : (a, 10)}
    >>> eagerlize(q)
    {'a': ({'k2': 2, 'k1': 1, 'k0': 0}, 10), 'b': ([[0], [0], [0]], 10)}
    >>>

    """
    def loop(obj):
        if isiterable(obj):
            for k, v in obj.iteritems() if isinstance(obj, dict) \
                         else enumerate(obj):
                if isinstance(v, tuple):
                    # immutable and iterable object must be recreate, 
                    # but realy only tuple?
                    obj[k] = tuple(eagerlize(list(obj[k])))
                elif issubscriptable(v):
                    loop(v)
                elif isiterable(v):
                    obj[k] = list(v)
                    loop(obj[k])

    b = [obj]
    loop(b)
    return b[0]

def _test():
    import doctest
    doctest.testmod()

if __name__=="__main__":
    _test()

1条回答
萌系小妹纸
2楼-- · 2019-04-26 03:45

To avoid badly affecting the original object, you basically need a variant of copy.deepcopy... subtly tweaked because you need to turn generators and iterators into lists (deepcopy wouldn't deep-copy generators anyway). Note that some effect on the original object is unfortunately inevitable, because generators and iterators are "exhausted" as a side effect of iterating all the way on them (be it to turn them into lists or for any other purpose) -- therefore, there is simply no way you can both leave the original object alone and have that generator or other iterator turned into a list in the "variant-deepcopied" result.

The copy module is unfortunately not written to be customized, so the alternative ares, either copy-paste-edit, or a subtle (sigh) monkey-patch hinging on (double-sigh) the private module variable _deepcopy_dispatch (which means your patched version might not survive a Python version upgrade, say from 2.6 to 2.7, hypothetically). Plus, the monkey-patch would have to be uninstalled after each use of your eagerize (to avoid affecting other uses of deepcopy). So, let's assume we pick the copy-paste-edit route instead.

Say we start with the most recent version, the one that's online here. You need to rename module, of course; rename the externally visible function deepcopy to eagerize at line 145; the substantial change is at lines 161-165, which in said version, annotated, are:

161 :               copier = _deepcopy_dispatch.get(cls)
162 :               if copier:
163 :                   y = copier(x, memo)
164 :               else:
165 :   tim_one 18729           try:

We need to insert between line 163 and 164 the logic "otherwise if it's iterable expand it to a list (i.e., use the function _deepcopy_list as the copier". So these lines become:

161 :               copier = _deepcopy_dispatch.get(cls)
162 :               if copier:
163 :                   y = copier(x, memo)
                     elif hasattr(cls, '__iter__'):
                         y = _deepcopy_list(x, memo)
164 :               else:
165 :   tim_one 18729           try:

That's all: just there two added lines. Note that I've left the original line numbers alone to make it perfectly clear where exactly these two lines need to be inserted, and not numbered the two new lines. You also need to rename other instances of identifier deepcopy (indirect recursive calls) to eagerize.

You should also remove lines 66-144 (the shallow-copy functionality that you don't care about) and appropriately tweak lines 1-65 (docstrings, imports, __all__, etc).

Of course, you want to work off a copy of the plaintext version of copy.py, here, not the annotated version I've been referring to (I used the annotated version just to clarify exactly where the changes were needed!-).

查看更多
登录 后发表回答