Itertools Combinations No Repeats: Where rgb is eq

2020-03-31 02:14发布

问题:

I'm trying to use itertools.combinations to return unique combinations. I've searched through several similar questions but have not been able to find an answer.

An example:

>>> import itertools
>>> e = ['r','g','b','g']
>>> list(itertools.combinations(e,3))
[('r', 'g', 'b'), ('r', 'g', 'g'), ('r', 'b', 'g'), ('g', 'b', 'g')]

For my purposes, (r,g,b) is identical to (r,b,g) and so I would want to return only (rgb),(rgg) and (gbg).

This is just an illustrative example and I would want to ignore all such 'duplicates'. The list e could contain up to 5 elements. Each individual element would be either r, g or b. Always looking for combinations of 3 elements from e.

To be concrete, the following are the only combinations I wish to call 'valid': (rrr), (ggg), (bbb), (rgb).

So perhaps the question boils down to how to treat any variation of (rgb) as equal to (rgb) and therefore ignore it.

Can I use itertools to achieve this or do I need to write my own code to drop the 'dupliates' here? If no itertools solution then I can just easily check if each is a variation of (rgb), but this feels a bit 'un-pythonic'.

回答1:

According to your definition of "valid outputs", you can directly build them like this:

from collections import Counter

# Your distinct values
values = ['r', 'g', 'b']

e = ['r','g','b','g', 'g']

count = Counter(e)
# Counter({'g': 3, 'r': 1, 'b': 1})

# If x appears at least 3 times, 'xxx' is a valid combination  
combinations = [x*3 for x in values if count[x] >=3]

# If all values appear at least once, 'rgb' is a valid combination
if all([count[x]>=1 for x in values]):
    combinations.append('rgb')

print(combinations)
#['ggg', 'rgb']

This will be more efficient than creating all possible combinations and filtering the valid ones afterwards.



回答2:

You can use a set to discard duplicates.

In your case the number of characters is the way you identify duplicates so you could use collections.Counter. In order to save them in a set you need to convert them to frozensets though (because Counter isn't hashable):

>>> import itertools
>>> from collections import Counter
>>> e = ['r','g','b','g']
>>> result = []
>>> seen = set()
>>> for comb in itertools.combinations(e,3):
...     cnts = frozenset(Counter(comb).items())
...     if cnts in seen:
...         pass
...     else:
...         seen.add(cnts)
...         result.append(comb)
>>> result
[('r', 'g', 'b'), ('r', 'g', 'g'), ('g', 'b', 'g')]

If you want to convert them to strings use:

result.append(''.join(comb))  # instead of result.append(comb)

and it will give:

['rgb', 'rgg', 'gbg']

The approach is a variation of the unique_everseen recipe (itertools module documentation) - so it's probably "quite pythonic".



回答3:

It is not completely clear what you want to return. It depends on what comes first when iterating. For example if gbr is found first, then rgb will be discarded as a duplicate:

import itertools

e = ['r','g','b','g']       
s = set(e)
v = [s] * len(s)

solns = []

for c in itertools.product(*v):
    in_order = sorted(c)
    if in_order not in solns:
        solns.append(in_order)

print solns  

This would give you:

[['r', 'r', 'r'], ['b', 'r', 'r'], ['g', 'r', 'r'], ['b', 'b', 'r'], ['b', 'g', 'r'], ['g', 'g', 'r'], ['b', 'b', 'b'], ['b', 'b', 'g'], ['b', 'g', 'g'], ['g', 'g', 'g']]