Finding the index of a given permutation

2020-02-06 02:28发布

问题:

I'm reading the numbers 0, 1, ..., (N - 1) one by one in some order. My goal is to find the lexicography index of this given permutation, using only O(1) space.

This question was asked before, but all the algorithms I could find used O(N) space. I'm starting to think that it's not possible. But it would really help me a lot with reducing the number of allocations.

回答1:

Considering the following data:

chars = [a, b, c, d]
perm = [c, d, a, b]
ids = get_indexes(perm, chars) = [2, 3, 0, 1]

A possible solution for permutation with repetitions goes as follows:

len = length(perm)         (len = 4)
num_chars = length(chars)  (len = 4)

base = num_chars ^ len     (base = 4 ^ 4 = 256)
base = base / len          (base = 256 / 4 = 64)

id = base * ids[0]         (id = 64 * 2 = 128)
base = base / len          (base = 64 / 4 = 16)

id = id + (base * ids[1])  (id = 128 + (16 * 3) = 176)
base = base / len          (base = 16 / 4 = 4)

id = id + (base * ids[2])  (id = 176 + (4 * 0) = 176)
base = base / len          (base = 4 / 4 = 1)

id = id + (base * ids[3])  (id = 176 + (1 * 1) = 177)

Reverse process:

id = 177
(id / (4 ^ 3)) % 4 = (177 / 64) % 4 =   2 % 4 = 2 -> chars[2] -> c
(id / (4 ^ 2)) % 4 = (177 / 16) % 4 =  11 % 4 = 3 -> chars[3] -> d
(id / (4 ^ 1)) % 4 = (177 / 4)  % 4 =  44 % 4 = 0 -> chars[0] -> a
(id / (4 ^ 0)) % 4 = (177 / 1)  % 4 = 177 % 4 = 1 -> chars[1] -> b

The number of possible permutations is given by num_chars ^ num_perm_digits, having num_chars as the number of possible characters, and num_perm_digits as the number of digits in a permutation.

This requires O(1) in space, considering the initial list as a constant cost; and it requires O(N) in time, considering N as the number of digits your permutation will have.

Based on the steps above, you can do:

function identify_permutation(perm, chars) {

    for (i = 0; i < length(perm); i++) {
        ids[i] = get_index(perm[i], chars);
    }

    len = length(perm);
    num_chars = length(chars);

    index = 0;
    base = num_chars ^ len - 1;
    for (i = 0; i < length(perm); i++) {
        index += base * ids[i];
        base = base / len;
    }

}

It's a pseudocode, but it's also quite easy to convert to any language (:



回答2:

If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.

I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:

  1. Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.

  2. Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.

  3. Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.

  4. Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.

  5. The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.

  6. There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.

To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

The following tested code will iterate through each unique combinations:

public void Test10Choose5()
{
   String S;
   int Loop;
   int N = 10;  // Total number of elements in the set.
   int K = 5;  // Total number of elements in each group.
   // Create the bin coeff object required to get all
   // the combos for this N choose K combination.
   BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
   int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
   // The Kindexes array specifies the indexes for a lexigraphic element.
   int[] KIndexes = new int[K];
   StringBuilder SB = new StringBuilder();
   // Loop thru all the combinations for this N choose K case.
   for (int Combo = 0; Combo < NumCombos; Combo++)
   {
      // Get the k-indexes for this combination.  
      BC.GetKIndexes(Combo, KIndexes);
      // Verify that the Kindexes returned can be used to retrive the
      // rank or lexigraphic order of the KIndexes in the table.
      int Val = BC.GetIndex(true, KIndexes);
      if (Val != Combo)
      {
         S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
         Console.WriteLine(S);
      }
      SB.Remove(0, SB.Length);
      for (Loop = 0; Loop < K; Loop++)
      {
         SB.Append(KIndexes[Loop].ToString());
         if (Loop < K - 1)
            SB.Append(" ");
      }
      S = "KIndexes = " + SB.ToString();
      Console.WriteLine(S);
   }
}

You should be able to port this class over fairly easily to the language of your choice. You probably will not have to port over the generic part of the class to accomplish your goals. Depending on the number of combinations you are working with, you might need to use a bigger word size than 4 byte ints.



回答3:

There is a java solution to this problem on geekviewpoint. It has a good explanation for why it's true and the code is easy to follow. http://www.geekviewpoint.com/java/numbers/permutation_index. It also has a unit test that runs the code with different inputs.



回答4:

There are N! permutations. To represent index you need at least N bits.



回答5:

Here is a way to do it if you want to assume that arithmetic operations are constant time:

def permutationIndex(numbers):
  n=len(numbers)
  result=0
  j=0
  while j<n:
    # Determine factor, which is the number of possible permutations of
    # the remaining digits.
    i=1
    factor=1
    while i<n-j:
      factor*=i
      i+=1
    i=0
    # Determine index, which is how many previous digits there were at
    # the current position.
    index=numbers[j]
    while i<j:
      # Only the digits that weren't used so far are valid choices, so
      # the index gets reduced if the number at the current position
      # is greater than one of the previous digits.
      if numbers[i]<numbers[j]:
        index-=1
      i+=1
    # Update the result.
    result+=index*factor
    j+=1
  return result

I've purposefully written out certain calculations that could be done more simply using some Python built-in operations, but I wanted to make it more obvious that no extra non-constant amount of space was being used.

As maxim1000 noted, the number of bits required to represent the result will grow quickly as n increases, so eventually big integers will be required, which no longer have constant-time arithmetic, but I think this code addresses the spirit of your question.



回答6:

Nothing really new in the idea but a fully matricial method with no explicit loop or recursion (using Numpy but easy to adapt):

import numpy as np
import math
vfact = np.vectorize(math.factorial, otypes='O')

def perm_index(p):
    return np.dot( vfact(range(len(p)-1, -1, -1)),
                   p-np.sum(np.triu(p>np.vstack(p)), axis=0) )


回答7:

I just wrote a code using Visual Basic and my program can directly calculate every index or every corresponding permutation to a given index up to 17 elements (this limit is due to the approximation of the scientific notation of numbers over 17! of my compiler).

If you are interested I can I can send the program or publish it somewhere for download. It works fine and It can be useful for testing and paragon the output of your codes.

I used the method of James D. McCaffrey called factoradic and you can read about it here and something also here (in the discussion at the end of the page).