The problem:
I would like to generate a list of permutations of strings in lexigraphical but excluding string inversions. For instance, if I have the following string: abc, I would like to generate the following list
abc
acb
bac
instead of the typical
abc
acb
bac
bca
cab
cba
An alternative example would look something like this:
100
010
instead of
100
010
001
Currently, I can generate the permutations using perl, but I am not sure on how to best remove the reverse duplicates.
I had thought of applying something like the following:
create map with the following:
1) 100
2) 010
3) 001
then perform the reversion/inversion on each element in the map and create a new map with:
1') 001
2') 010
3') 100
then compare and if the primed list value matches the original value, leave it in place, if it is different, if it's index is greater than the median index, keep it, else remove.
Trouble is, I am not sure if this is an efficient approach or not.
Any advice would be great.
The answer in the suggested duplicate Generating permutations which are not mirrors of each other doesn't deal with repeated elements (because that wasn't part of that question) so naively following it would include e.g. both 0100 and 0010. So this isn't an exact duplicate. But the idea applies.
Generate all the permutations but filter only for those with
$_ le reverse $_
. I think this is essentially what you suggest in the question, but there's no need to compute a map when a simple expression applied to each permutation will tell you whether to include it or not.Two possibilities represented by examples are for permutations where all elements are different (
abcd
), or for variations of two symbols where one appears exactly once (1000
). More general cases are addressed as well.Non-repeating elements (permutations)
Here we can make use of Algorithm::Permute, and of the particular observation: Each permutation where the first element is greater than its last need be excluded. It comes from this post, brought up in the answer by ysth.
This rule holds as follows. Consider substrings of a string without its first and last elements. For each such substring, all permutations of the string must contain its inverse. One of these, padded with last and first, is thus the inverse of the string. By construction, for each substring there is exactly one inverse. Thus permutations with swapped first and last elements of each string need be excluded.
Problems with repetead elements (
abbcd
) can be treated the exact same way as above, and we need to also prune duplicates as permutations ofb
generateabbcd
andabbcd
(same)Doing this during construction would not reduce complexity nor speed things up.
The
permute
is quoted as the fastest method in the module, by far. It is about an order of magnitude faster than the other modules I tested (below), taking about 1 second for 10 elements on my system. But note that this problem's complexity is factorial in size. It blows up really fast.Two symbols, where one appears exactly once (variations)
This is different and the above module is not meant for it, nor would the exclusion criterion work. There are other modules, see at the end. However, the problem here is very simple.
Start from
(1,0,0,...)
and 'walk'1
along the list, up to the "midpoint" – which is the half for even sized list (4 for 8-long), or next past half for odd sizes (5 for 9-long). All strings obtained this way, by moving1
by one position up to midpoint, form the set. The second "half" are their inversions.We need to stop before the last index since it has been filled in the previous iteration.
This can be modified if both symbols (
0,1
) may appear repeatedly, but it is far simpler to use a module and then exclude inverses. The Algorithm::Combinatorics has routines for all needs here. For all variations of0
and1
of lenght$size
, where both may repeatInverse elements can then be excluded by a brute-force search, with O(N2) complexity at worst.
Also note Math::Combinatorics.