How do I convert Python's itertools.product li

2019-07-29 09:20发布

问题:

According to http://docs.python.org/2/library/itertools.html#itertools.product the following function is equivalent to using their library (I removed a few things I don't need from it):

def product(*args):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    pools = map(tuple, args)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

In my case I'm passing the product function 3 lists, but I need to add some conditional checks, so it doesn't mix certain items from one list with items in another list if they don't meet the requirements. So what I figured I need to do is convert:

result = [x+[y] for x in result for y in pool]

into "normal" FOR loops (not sure how to refer to them), so I can add several IF checks to verify whether the items in the lists should be mixed together or not.

What mainly confuses me is that "x" is iterating through the "result" list which is empty, but items are added to it as it iterates, so I think this is what is complicating the conversion to normal loops for me.

Here is one of my attempts:

def product(*args):
    pools = map(tuple, args)
    result = [[]]
    for pool in pools:
        for x in result:
            for y in pool:
                result.append(x+[y])
    for prod in result:
        yield tuple(prod)

Any help is greatly appreciated!

回答1:

You're very close: the right hand side of a nested list comprehension is written in the same order that you'd write the for loops, so you've got that right. However, in the listcomp version, first the RHS of the assignment is computed and then it's bound to the name on the LHS. So

result = [x+[y] for x in result for y in pool]

needs to become

new_result = []
for x in result:
    for y in pool:
        new_result.append(x+[y])
result = new_result

So that you're not modifying result as you iterate over it. If you wanted to forbid certain arrangements -- and you can write your constraint in such a way that it works for that iteration order, which fills in from left-to-right -- then you could do this:

def filtered_product(args, filter_fn):
    pools = map(tuple, args)
    result = [[]]
    for pool in pools:
        new_result = []
        for x in result:
            for y in pool:
                new_val = x+[y]
                if filter_fn(new_val):
                    new_result.append(x+[y])
        result = new_result
        print 'intermediate result:', result
    for prod in result:
        yield tuple(prod)

which gives

In [25]: list(filtered_product([[1,2,3], [4,5,6], [7,8,9]], lambda x: sum(x) % 3 != 2))
intermediate result: [[1], [3]]
intermediate result: [[1, 5], [1, 6], [3, 4], [3, 6]]
intermediate result: [[1, 5, 7], [1, 5, 9], [1, 6, 8], [1, 6, 9], [3, 4, 8], [3, 4, 9], [3, 6, 7], [3, 6, 9]]
Out[25]: 
[(1, 5, 7),
 (1, 5, 9),
 (1, 6, 8),
 (1, 6, 9),
 (3, 4, 8),
 (3, 4, 9),
 (3, 6, 7),
 (3, 6, 9)]

Whether or not this gives you any benefit over simply using (p for p in itertools.product(whatever) if condition(p)) will depend upon how many branches you can prune, because as you can see it constructs all the intermediate lists in memory.



回答2:

The product function is generically multiplying the lists together in a form of reduction operation, which is probably unhelpful for you if you're trying to filter the results as you go. Instead you should write a product function that takes a fixed number of lists:

for x in list1:
    for y in list2:
        for z in list3:
            if condition(x, y, z):
                yield tuple(x, y, z)


回答3:

Note that in the line result = [x+[y] for x in result for y in pool], result appears twice but it is not relevant. This expression builds a list using the old result, and then it assigns this new list into result.

That's probably what got you confused. An equivalent expanded version would be:

def product(*args):
    pools = map(tuple, args)
    result = [[]]
    for pool in pools:
        tmp = []
        for x in result:   # note that it's the old 'result' here
            for y in pool:
                tmp.append(x+[y])
        result = tmp
    for prod in result:
        yield tuple(prod)