How to extract items of sublists in a one-line-compre

Posted 2019-07-09 01:31

Question:

I am currently learning about list comprehensions in Python. However, I run into trouble when the list I am iterating over contains sublists of equal or different lengths. For example, I would like to turn the body of union_set() into a one-line comprehension:

def union_set(L):
    S_union = set()

    for i in range(len(L)):
        S_union.update(set(L[i]))

    return S_union


L1 = [1, 2, 3]
L2 = [4, 5, 6]
L3 = [7, 8, 9]

L = [L1, L2, L3]
print(L)

print(union_set(L))

I am pretty sure this should be possible (maybe by somehow unpacking the sublists' contents?), but I am afraid that I am missing something here. Can anyone help?

Answer 1:

Using a list comprehension, you can do something like this:

>>> L1 = [1, 2, 3]
>>> L2 = [4, 5, 6]
>>> L3 = [7, 8, 9]
>>> L = [L1, L2, L3]
>>> s=set([x for y in L for x in y])
>>> s
set([1, 2, 3, 4, 5, 6, 7, 8, 9])

Here y iterates over the sublists of L, while x iterates over the items of each y.
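
Since the end result is a set anyway, the intermediate list can be skipped with a set comprehension. The following is a minimal sketch, assuming Python 2.7 or later (where set comprehensions exist); the name union_set_comprehension is made up for illustration and does not appear in the question.

```python
def union_set_comprehension(L):
    # The for clauses read left to right in the same order as the
    # equivalent nested for statements: outer loop first, inner loop second.
    return {x for sub in L for x in sub}


# Sublists of different lengths, with a repeated item, work the same way.
L = [[1, 2, 3], [3, 4], [5]]
result = union_set_comprehension(L)
print(sorted(result))  # [1, 2, 3, 4, 5]
```

The loop variable order is the part that usually trips people up: writing `for x in sub for sub in L` fails because sub is not yet defined when the first clause runs.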



Answer 2:

Start from an empty set and call .union on it:

L1 = [1, 2, 3]
L2 = [4, 5, 6]
L3 = [7, 8, 9]

print(set().union(L1, L2, L3))

In your code, that becomes:

L = [L1, L2, L3]

def union_set(L):
    return set().union(*L)
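
As a quick check of how this version behaves on less regular input, here is a small sketch. The sample inputs (a tuple among the sublists, an empty outer list) are assumptions chosen for illustration, not cases from the question.

```python
def union_set(L):
    # set().union accepts any number of iterables, so *L passes each sublist
    # as a separate argument; the arguments do not have to be sets.
    return set().union(*L)


# Sublists of different lengths and mixed sequence types are fine.
mixed = union_set([[1, 2, 3], (3, 4), [5]])
print(sorted(mixed))  # [1, 2, 3, 4, 5]

# With no sublists at all, the call reduces to set().union() and
# simply returns an empty set.
empty = union_set([])
print(len(empty))  # 0
```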


Answer 3:

Use * for unpacking and pass the unpacked items to set.union:

>>> L = [L1, L2, L3]
>>> set.union(*(set(x) for x in L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
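
The reason each sublist is converted with set(x) is that set.union is called on the class here, so the first unpacked argument takes the place of self and must be a set. The sketch below illustrates this; raises_type_error is a hypothetical helper written only for this example.

```python
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def raises_type_error(func):
    # Hypothetical helper for this sketch: True if calling func raises TypeError.
    try:
        func()
    except TypeError:
        return True
    return False


# Works: the first unpacked argument is a set and serves as "self".
print(sorted(set.union(*(set(x) for x in L))))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Fails: the first unpacked argument would be a list, not a set.
print(raises_type_error(lambda: set.union(*L)))  # True

# Fails: with no sublists there is no first argument at all.
print(raises_type_error(lambda: set.union(*[])))  # True

# Only the first argument has to be a set; the others can stay lists.
print(sorted(set.union(set(L[0]), *L[1:])))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```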

More efficient versions using itertools:

>>> from itertools import islice
>>> set.union(set(L[0]),*islice(L,1,None))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> from itertools import chain
>>> set(chain.from_iterable(L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])

Timing comparisons:

>>> L = [L1, L2, L3]*10**5

>>> %timeit set.union(*(set(x) for x in L))
1 loops, best of 3: 416 ms per loop

>>> %timeit set(chain.from_iterable(L))               # winner
1 loops, best of 3: 69.4 ms per loop

>>> %timeit set.union(set(L[0]),*islice(L,1,None))
1 loops, best of 3: 78.6 ms per loop

>>> %timeit set().union(*L)
1 loops, best of 3: 105 ms per loop

>>> %timeit set(chain(*L))
1 loops, best of 3: 79.2 ms per loop

>>> %timeit s=set([x for y in L for x in y])
1 loops, best of 3: 151 ms per loop
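
These numbers come from IPython's %timeit and will vary with the machine and the Python version. A rough way to rerun the comparison with the standard timeit module is sketched below; the candidates dict, the repeat counts, and the smaller input size are choices made for this sketch, not part of the original measurements.

```python
import timeit
from itertools import chain, islice

# Smaller than the 10**5 repetitions used above, to keep the run short.
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] * 10**4

# Each entry maps a label to a zero-argument callable that builds the union.
candidates = {
    "set.union(*(set(x) for x in L))": lambda: set.union(*(set(x) for x in L)),
    "set(chain.from_iterable(L))": lambda: set(chain.from_iterable(L)),
    "set.union(set(L[0]), *islice(L, 1, None))": lambda: set.union(set(L[0]), *islice(L, 1, None)),
    "set().union(*L)": lambda: set().union(*L),
    "set(chain(*L))": lambda: set(chain(*L)),
    "set([x for y in L for x in y])": lambda: set([x for y in L for x in y]),
}

# All variants must agree before their timings are worth comparing.
results = {label: func() for label, func in candidates.items()}
assert all(r == set(range(1, 10)) for r in results.values())

for label, func in candidates.items():
    # Best of 3 runs of 5 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    print("%-45s %8.2f ms" % (label, best * 1000))
```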


Answer 4:

You could use itertools.chain like this:

>>> L1 = [1, 2, 3]
>>> L2 = [4, 5, 6]
>>> L3 = [7, 8, 9]
>>> L = [L1,L2,L3]

>>> import itertools
>>> set(itertools.chain(*L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])

* unpacks the list into separate arguments, and chain returns an iterator over the items of all the sublists, which set() then consumes.
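
To make the behaviour of chain concrete, here is a short sketch comparing chain(*L) with chain.from_iterable(L). The variable names are illustrative only.

```python
from itertools import chain

L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# chain(*L) unpacks L up front: it is the same call as chain(L[0], L[1], L[2]).
via_unpacking = chain(*L)

# chain.from_iterable(L) receives L itself and pulls one sublist at a time.
via_from_iterable = chain.from_iterable(L)

# Both are iterators, not lists; list() or set() is needed to materialize them.
flat_list = list(via_unpacking)
print(flat_list)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(set(via_from_iterable) == set(flat_list))  # True
```

For a small list of sublists the two forms are interchangeable; chain.from_iterable mainly matters when the outer iterable is large or is itself a generator, because it does not need to be unpacked into arguments first.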