Merging the results of itertools.product?

Posted 2020-04-20 12:14

Question:

I am trying to create a list of numbers from 0-9999 using itertools.product. I am able to create a list from 0000-9999 by doing the following:

import itertools
numbers = ['0','1','2','3','4','5','6','7','8','9']
itertools.product(numbers,numbers,numbers,numbers)

And while I want entries like 0001, I would also like to get 001, 01, and 1.

What would be the most effective way to include these? Should I make calls to itertools.product(numbers,numbers,numbers) and itertools.product(numbers,numbers) and then somehow combine these with the original or is there a cleaner way?

If I should make two other calls and combine, can someone point me towards how this would be done? I attempted to use .append(), but it throws this error:

'itertools.product' object has no attribute 'append'

Thanks for any help.

Answer 1:

You could use a nested listcomp or genexp (reduced in size here for display purposes):

>>> from itertools import product
>>> numbers = ['0','1','2']
>>> [''.join(p) for n in range(1,4) for p in product(numbers, repeat=n)]
['0', '1', '2', '00', '01', '02', '10', '11', '12', '20', '21', '22', '000', '001', '002', '010', '011', '012', '020', '021', '022', '100', '101', '102', '110', '111', '112', '120', '121', '122', '200', '201', '202', '210', '211', '212', '220', '221', '222']
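
Applied to the full problem from the question (digits 0-9, lengths 1 through 4), the same comprehension might look like the sketch below. The name `all_codes` is only an illustrative choice, not something from the original answer.

```python
from itertools import product

digits = '0123456789'  # a string works as well as a list of single characters

# Lengths 1 through 4: '0'..'9', then '00'..'99', and so on up to '9999'.
all_codes = [''.join(p) for n in range(1, 5) for p in product(digits, repeat=n)]

print(len(all_codes))                 # 10 + 100 + 1000 + 10000 entries
print(all_codes[:3], all_codes[-1])   # first few entries and the last one
```

Because every length is generated separately, padded forms such as '01', '001' and '0001' all appear alongside '1'.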


Answer 2:

import itertools
numbers = ['0','1','2','3','4','5','6','7','8','9']
list(''.join(subl) for subl in itertools.chain.from_iterable(itertools.product(numbers, repeat=i) for i in range(1,5)))
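
The `.append()` error in the question arises because `itertools.product` returns an iterator, not a list. One way to read the one-liner above is as `itertools.chain` joining several such iterators end to end. A small sketch, using a two-character alphabet chosen here only to keep the output short:

```python
import itertools

alphabet = ['0', '1']  # tiny alphabet, for illustration only

# Explicitly chaining separate product() calls...
explicit = itertools.chain(
    itertools.product(alphabet, repeat=1),
    itertools.product(alphabet, repeat=2),
)
# ...yields the same sequence as letting chain.from_iterable pull them from a generator.
generated = itertools.chain.from_iterable(
    itertools.product(alphabet, repeat=i) for i in range(1, 3)
)

explicit_result = [''.join(t) for t in explicit]
generated_result = [''.join(t) for t in generated]
print(explicit_result)  # ['0', '1', '00', '01', '10', '11']
```

The `from_iterable` form is convenient when the number of lengths is not fixed in advance.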


Answer 3:

Performance improvement on existing answers:

from itertools import chain, product

list(map(''.join, chain.from_iterable(product(numbers, repeat=i) for i in range(1, 5))))
# Or on Python 3.5+ with additional unpacking generalizations:
[*map(''.join, chain.from_iterable(product(numbers, repeat=i) for i in range(1, 5)))]

Omit the list()/[*...] wrapping if you're just iterating over the results.
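
For instance, when only the first few results are needed, the lazy form can be sliced without ever building the full list. This is a sketch; `numbers` is assumed to hold the ten digit characters, as in the question.

```python
from itertools import chain, islice, product

numbers = '0123456789'  # assumed: same digits as in the question

# No list() wrapper: nothing is computed until the iterator is consumed.
lazy = map(''.join, chain.from_iterable(product(numbers, repeat=i) for i in range(1, 5)))

first_twelve = list(islice(lazy, 12))
print(first_twelve)  # '0' through '9', then '00' and '01'
```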

The performance improves significantly (not so much in this case, but dramatically for larger products) on the CPython reference interpreter, for two reasons (both are implementation details):

  1. It pushes the vast majority of the work to the C layer, avoiding bytecode interpreter loop overhead.
  2. product has an optimization that reuses the result tuple (including not needing to set the majority of the values in it) if no references to it exist when the next result is requested. This optimization isn't available to listcomps and genexprs (the loop structure keeps a reference to the resulting tuple alive just long enough that a reference still exists when product checks whether it can reuse the tuple for the next result), but map(''.join, ...) avoids that (it only holds the reference to the tuple long enough to call the mapper function, discarding it before it yields the result of the mapper).

Even in this case, the speedup is significant percentage-wise, as demonstrated with IPython microbenchmarks (here on a Linux x64 Python 3.6 install):

>>> %timeit -r5 [''.join(p) for n in range(1, 5) for p in product(numbers, repeat=n)]
24.9 μs ± 95.2 ns per loop (mean ± std. dev. of 5 runs, 10000 loops each)
>>> %timeit -r5 list(map(''.join, chain.from_iterable(product(numbers, repeat=i) for i in range(1, 5))))
18.2 μs ± 41.2 ns per loop (mean ± std. dev. of 5 runs, 100000 loops each)

As noted, the gains are large only in percentage terms here (~27% runtime reduction); 6.7 μs is pretty trivial in the grand scheme of things. But if the range to cover gets larger and/or the set of numbers to take the product over gets bigger, it matters more: for numbers = '0123456789' and range(1, 8), the runtime drops from 2.54 s to 1.67 s. Asymptotically the savings appear to be roughly a third, and when the total cost is measured in seconds, cutting it by a third is meaningful.
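
To reproduce a comparison like this outside IPython, something along these lines could be used with the standard `timeit` module. The helper names `with_listcomp` and `with_map` and the repeat counts are illustrative choices, not part of the original benchmark; absolute timings will differ by machine and Python version.

```python
import timeit
from itertools import chain, product

numbers = '0123456789'

def with_listcomp(max_len):
    # Nested list comprehension, as in the first answer.
    return [''.join(p) for n in range(1, max_len + 1) for p in product(numbers, repeat=n)]

def with_map(max_len):
    # map + chain.from_iterable, keeping the per-item work in C.
    return list(map(''.join, chain.from_iterable(
        product(numbers, repeat=i) for i in range(1, max_len + 1))))

# Both approaches must produce identical output before timing means anything.
same_output = with_listcomp(4) == with_map(4)

# Best of 5 runs, 20 calls each; max_len=4 keeps the run short.
t_listcomp = min(timeit.repeat(lambda: with_listcomp(4), number=20, repeat=5))
t_map = min(timeit.repeat(lambda: with_map(4), number=20, repeat=5))
print(f"listcomp: {t_listcomp:.4f} s, map: {t_map:.4f} s, same output: {same_output}")
```

Raising `max_len` toward 7 should approach the multi-second case described above, at the cost of a much longer run and more memory.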