I am currently learning about list comprehensions in Python. However, I run into serious problems when the list I am iterating over contains sublists of equal or different lengths. For example, I would like to turn the code for union_set() into a one-line comprehension:
def union_set(L):
    S_union = set()
    for i in range(len(L)):
        S_union.update(set(L[i]))
    return S_union
L1 = [1, 2, 3]
L2 = [4, 5, 6]
L3 = [7, 8, 9]
L = [L1, L2, L3]
print(L)
print(union_set(L))
I am pretty sure this should be possible (maybe by somehow unpacking the sublists' contents?), but I am afraid that I am missing something here. Can anyone help?
Using a list comprehension, you can do something like this:
>>> L1 = [1, 2, 3]
>>> L2 = [4, 5, 6]
>>> L3 = [7, 8, 9]
>>> L = [L1, L2, L3]
>>> s=set([x for y in L for x in y])
>>> s
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
y iterates over the sublists of L, while x iterates over the items in each y.
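If the goal is a one-line comprehension, a set comprehension gives the same result without building the intermediate list. This is a small sketch, assuming Python 2.7 or later (where set comprehensions exist); the ragged sublists are only example data.

```python
# Sublists of different lengths, with some repeated values (example data).
L = [[1, 2, 3], [3, 4], [5], []]

# Outer loop first (sublist), inner loop second (item), as in nested for-loops.
union = {item for sublist in L for item in sublist}

print(sorted(union))  # [1, 2, 3, 4, 5]
```

The order of the two for clauses matches the order you would write them as ordinary nested loops.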
Use an empty set and .union it:
L1 = [1, 2, 3]
L2 = [4, 5, 6]
L3 = [7, 8, 9]
print(set().union(L1, L2, L3))
Used in your code as:
L = [L1, L2, L3]
def union_set(L):
    return set().union(*L)
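Since union accepts any iterables, the sublists do not need to have the same length or be converted to sets first. A quick check with made-up data (the name ragged is only for illustration):

```python
def union_set(L):
    # set().union(*L) passes every sublist as a separate argument to union.
    return set().union(*L)

# A list, a tuple and a string of different lengths (example data).
ragged = [[1, 2, 3], (3, 4), "ab"]

print(union_set(ragged) == {1, 2, 3, 4, "a", "b"})  # True
print(union_set([]) == set())  # True: an empty outer list gives an empty set
```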
Use * for unpacking and pass the unpacked items to set.union:
>>> L = [L1, L2, L3]
>>> set.union(*(set(x) for x in L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
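One thing to watch for with this form: set.union is called on the class, so it needs at least one set to act as self. With an empty L it raises a TypeError, whereas set().union(*L) returns an empty set. A small sketch of the difference, assuming Python 3 for the print calls; the helper names are made up for illustration.

```python
def union_unbound(L):
    # Hypothetical helper: calls set.union on the class, so L must not be empty.
    return set.union(*(set(x) for x in L))

def union_bound(L):
    # Hypothetical helper: starts from an empty set, so an empty L is fine.
    return set().union(*L)

print(union_bound([]) == set())  # True

try:
    union_unbound([])
except TypeError as exc:
    # Reached because set.union received no arguments at all.
    print("TypeError:", exc)
```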
Efficient versions using itertools:
>>> from itertools import islice
>>> set.union(set(L[0]),*islice(L,1,None))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> from itertools import chain
>>> set(chain.from_iterable(L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
Timing comparisons:
>>> L = [L1, L2, L3]*10**5
>>> %timeit set.union(*(set(x) for x in L))
1 loops, best of 3: 416 ms per loop
>>> %timeit set(chain.from_iterable(L)) # winner
1 loops, best of 3: 69.4 ms per loop
>>> %timeit set.union(set(L[0]),*islice(L,1,None))
1 loops, best of 3: 78.6 ms per loop
>>> %timeit set().union(*L)
1 loops, best of 3: 105 ms per loop
>>> %timeit set(chain(*L))
1 loops, best of 3: 79.2 ms per loop
>>> %timeit s=set([x for y in L for x in y])
1 loops, best of 3: 151 ms per loop
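%timeit is an IPython magic; outside IPython, a similar comparison can be approximated with the standard timeit module. The sketch below uses a smaller input and fewer repetitions so that it finishes quickly. Absolute numbers depend on the machine and Python version, so no timings are shown here.

```python
import timeit
from itertools import chain

# Smaller than the 10**5 repetitions used above, to keep the run short.
L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] * 10**4

candidates = {
    "set(chain.from_iterable(L))": lambda: set(chain.from_iterable(L)),
    "set().union(*L)": lambda: set().union(*L),
    "set([x for y in L for x in y])": lambda: set([x for y in L for x in y]),
}

results = {}
for label, func in candidates.items():
    # Best of 3 runs with 5 calls per run.
    best = min(timeit.repeat(func, number=5, repeat=3))
    results[label] = func()  # keep the value to confirm all variants agree
    print("%-32s %.4f s per 5 calls" % (label, best))
```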
You could use itertools.chain like this:
>>> L1 = [1, 2, 3]
>>> L2 = [4, 5, 6]
>>> L3 = [7, 8, 9]
>>> L = [L1,L2,L3]
>>> import itertools
>>> set(itertools.chain(*L))
set([1, 2, 3, 4, 5, 6, 7, 8, 9])
* unpacks the list into separate arguments, and chain returns an iterator over the items of all the sublists, which set() then consumes.
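Note that chain returns a lazy iterator over the items rather than building a list. Here is a short sketch of that behaviour and of how chain(*L) relates to chain.from_iterable(L); the variable names are only illustrative.

```python
from itertools import chain

L = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = chain(*L)     # same as chain([1, 2, 3], [4, 5, 6], [7, 8, 9])
first = next(flat)   # items are produced one at a time
rest = list(flat)    # consuming the iterator yields the remaining items

print(first)  # 1
print(rest)   # [2, 3, 4, 5, 6, 7, 8, 9]

# chain.from_iterable takes the outer list itself, so no * unpacking is needed.
union = set(chain.from_iterable(L))
print(union == set(range(1, 10)))  # True
```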