All possibilities to split a list into two lists

2020-02-12 01:53发布

问题:

I have a list with some elements and want to iterate over all possible ways to divide this list into two lists. By that I mean all combinations, so the order doesn't matter (i.e. Element 1 and 3 could be in the one list and Element 2 in the other). Currently I do it like this, where facs is my initial list:

patterns = []
for i in range(2**(len(facs)-1)):
    pattern = []
    for j in range((len(facs)-1)):
        pattern.append(i//(2**j)%2)
    patterns.append(pattern)

for pattern in patterns:
    l1 = [facs[-1]]
    l2 = []
    for i in range(len(pattern)):
        if pattern[i] == 1:
            l1.append(facs[i])
        else:
            l2.append(facs[i])

So I basically create a list of length 2^(len(facs)-1) and fill it with every possible combination of ones and zeros. I then 'overlay' every pattern with facs, except for the last element of facs which is always in l1, as I'd otherwise get every result twice, as I handle two lists the same, no matter what lists is l1 or l2.

Is there a faster and more elegant (shorter/more pythonic) way to do this?

回答1:

itertools has product() which could be used to generate the masks and izip() which could combine the lists for easy filtering. As a bonus, since they return iterators, they don't use much memory.

from itertools import *

facs = ['one','two','three']

l1 = []
l2 = []
for pattern in product([True,False],repeat=len(facs)):
    l1.append([x[1] for x in izip(pattern,facs) if x[0]])
    l2.append([x[1] for x in izip(pattern,facs) if not x[0]])


回答2:

Just extending @Ouroborus solution using filters and keeping the results together:

import itertools as it

# itertools recipe
def partition(pred, iterable):
    t1, t2 = it.tee(iterable)
    return it.filterfalse(pred, t1), filter(pred, t2)

>>> facs = ['one','two','three']
>>> [[[x[1] for x in f] for f in partition(lambda x: x[0], zip(pattern, facs))]
...  for pattern in product([True, False], repeat=len(facs))]
[[[], ['one', 'two', 'three']],
 [['three'], ['one', 'two']],
 [['two'], ['one', 'three']],
 [['two', 'three'], ['one']],
 [['one'], ['two', 'three']],
 [['one', 'three'], ['two']],
 [['one', 'two'], ['three']],
 [['one', 'two', 'three'], []]]


回答3:

First part may be one-lined with nested list comprehensions like this:

patterns = [ [ i//(2**j)%2 for j in range(len(facs)-1) ] for i in range(2**(len(facs)-1)) ]

For the second part, you cannot do a list comprehension since there are 2 lists, but you can do a ternary expression to select the list to append to.

And you can zip both pattern and facs lists to avoid playing with indexes:

for pattern in patterns:
    l1 = [facs[-1]]
    l2 = []
    for fac,pat in zip(facs,pattern):
        (l1 if pat == 1 else l2).append(fac)

of course you have to use l1 and l2 during the iteration, because you reset them each time.



回答4:

For completeness, you can also fold the powerset in half to produce the desired results. For example, consider the powerset of {A, B, C} in colexicographic order according to each subset's corresponding bitmask:

{}, {A}, {B}, {A, B} | {C}, {A, C}, {B, C}, {A, B, C}

If you rotate the first half clockwise by 90 degrees, and rotate the second half counterclockwise by 90 degrees, and then line them up, you have two columns of subsets and each row forms a partition of the original set. We can implement this "folding" by zipping the powerset with the reverse of itself and taking half of the generated pairs of subsets. Taking half ensures that only unique partitions are produced (e.g. [['two', 'three'], ['one']] and [['one'], ['two', 'three']] are the same partition), assuming the original sequence is itself distinct.

import itertools

def binary_splits(a):
    partitions = zip(powerset_colex(a), powerset_colex(a, reverse = True))
    return itertools.islice(partitions, 1 << len(a) >> 1)

def powerset_colex(a, reverse = False):
    n = len(a)
    bitmasks = range(1 << n)[::-1] if reverse else range(1 << n)
    return (list(itertools.compress(a, iter_bits(bits, n))) for bits in bitmasks)

def iter_bits(n, k):
    return (n >> i & 1 for i in range(k))

While it's not very useful, it makes for a cute solution. Here's a couple of variations that--instead of running two powerset iterators in reverse--just directly generate the complement for each subset.

def binary_splits_1(a):
    n = len(a)

    for bitmask in range(1 << n >> 1):
        subset     = itertools.compress(a, iter_bits(+bitmask, n))
        complement = itertools.compress(a, iter_bits(~bitmask, n))
        yield list(subset), list(complement)

def binary_splits_2(a):
    n = len(a)

    def dual_compress(items, bitmask):
        buckets = [], []

        for item, bit in zip(items, iter_bits(bitmask, n)):
            buckets[1 - bit].append(item)

        return buckets

    return (dual_compress(a, bitmask) for bitmask in range(1 << n >> 1))