Iterating shuffled [0..n) without arrays

2019-02-09 10:09发布

I know of a couple of routines that work as follows:

Xn+1 = Routine(Xn, max)

For example, something like a LCG generator:

Xn+1 = (a*Xn + c) mod m

There isn't enough parameterization in this generator to generate every sequence.

Dream Function:

Xn+1 = Routine(Xn, max, permutation number)

This routine, parameterized by an index into the set of all permutations, would return the next number in the sequence. The sequence may be arbitrarily large (so storing the array and using factoradic numbers is impractical.

Failing that, does anyone have pointers to similar functions that are either stateless or have a constant amount of state for arbitrary 'max', such that they will iterate a shuffled list.

6条回答
The star\"
2楼-- · 2019-02-09 10:49

From my response to another question:

It is actually possible to do this in space proportional to the number of elements selected, rather than the size of the set you're selecting from, regardless of what proportion of the total set you're selecting. You do this by generating a random permutation, then selecting from it like this:

Pick a block cipher, such as TEA or XTEA. Use XOR folding to reduce the block size to the smallest power of two larger than the set you're selecting from. Use the random seed as the key to the cipher. To generate an element n in the permutation, encrypt n with the cipher. If the output number is not in your set, encrypt that. Repeat until the number is inside the set. On average you will have to do less than two encryptions per generated number. This has the added benefit that if your seed is cryptographically secure, so is your entire permutation.

I wrote about this in much more detail here.

Of course, there's no guarantee that every permutation can be generated (and depending on your block size and key size, that may not even be possible), but the permutations you can get are highly random (if they weren't, it wouldn't be a good cipher), and you can have as many of them as you want.

查看更多
劫难
3楼-- · 2019-02-09 10:49

Code that unpacks a permutation index into an array, with a certain mapping from index to permutation. There are loads of others, but this one is convenient.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char index_t;
typedef unsigned int permutation;

static void permutation_to_array(index_t *indices, index_t n, permutation p)
{
    index_t used = 0;
    for (index_t i = 0; i < n; ++i) {
        index_t left = n - i;
        index_t digit = p % left;
        for (index_t j = 0; j <= digit; ++j) {
            if (used & (1 << j)) {
                digit++;
            }
        }
        used |= (1 << digit);
        indices[i] = digit;
        p /= left;
    }
}

static void dump_array(index_t *indices, index_t n)
{
    fputs("[", stdout);
    for (index_t i = 0; i < n; ++i) {
        printf("%d", indices[i]);
        if (i != n - 1) {
            fputs(", ", stdout);
        }
    }
    puts("]");
}

static int factorial(int n)
{
    int prod = 1;
    for (int i = 1; i <= n; ++i) {
        prod *= i;
    }
    return prod;
}

int main(int argc, char **argv)
{
    const index_t n = 4;
    const permutation max = factorial(n);
    index_t *indices = malloc(n * sizeof (*indices));
    for (permutation p = 0; p < max; ++p) {
        permutation_to_array(indices, n, p);
        dump_array(indices, n);
    }
    free(indices);
}
查看更多
甜甜的少女心
4楼-- · 2019-02-09 10:51

Code that uses an iterate interface. Time complexity is O(n^2), Space complexity has an overhead of: copy of n (log n bits), an iteration variable (log n bits), keeping track of n-i (log n bits), , copy of current value (log n bits), copy of p (n log n bits), creation of next value (log n bits), and a bit set of used values (n bits). You can't avoid an overhead of n log n bits. Timewise, this is also O(n^2), for setting the bits. This can be reduced a bit, but at the cost of using a decorated tree to store the used values.

This can be altered to use arbitrary precision integers and bit sets by using calls to the appropriate libraries instead, and the above bounds will actually start to kick in, rather than being capped at N=8, portably (an int can be the same as a short, and as small as 16 bits). 9! = 362880 > 65536 = 2^16

#include <math.h>
#include <stdio.h>

typedef signed char index_t;
typedef unsigned int permutation;

static index_t permutation_next(index_t n, permutation p, index_t value)
{
    permutation used = 0;
    for (index_t i = 0; i < n; ++i) {
        index_t left = n - i;
        index_t digit = p % left;
        p /= left;
        for (index_t j = 0; j <= digit; ++j) {
            if (used & (1 << j)) {
                digit++;
            }
        }
        used |= (1 << digit);
        if (value == -1) {
            return digit;
        }
        if (value == digit) {
            value = -1;
        }
    }
    /* value not found */
    return -1;
}

static void dump_permutation(index_t n, permutation p)
{
    index_t value = -1;
    fputs("[", stdout);
    value = permutation_next(n, p, value);
    while (value != -1) {
        printf("%d", value);
        value = permutation_next(n, p, value);
        if (value != -1) {
            fputs(", ", stdout);
        }
    }
    puts("]");
}

static int factorial(int n)
{
    int prod = 1;
    for (int i = 1; i <= n; ++i) {
        prod *= i;
    }
    return prod;
}

int main(int argc, char **argv)
{
    const index_t n = 4;
    const permutation max = factorial(n);
    for (permutation p = 0; p < max; ++p) {
        dump_permutation(n, p);
    }
}
查看更多
forever°为你锁心
5楼-- · 2019-02-09 10:52

There are n! permutations of n elements. Storing which one you're using requires at least log(n!) / log(2) bits. By Stirling's approximation, this takes roughly n log(n) / log (2) bits.

Explicitly storing one index takes log(n) / log(2) bits. Storing all n, as in an array of indices takes n times as many, or again n log(n) / log(2). Information-theoretically, there is no better way than explicitly storing the permutation.

In other words, the index you pass in of what permutation in the set you want takes the same asymptotic storage space as just writing out the permutation. If, for, example, you limit the index of the permutation to 32 bit values, you can only handle permutations of up to 12 elements. 64 bit indices only get you up to 20 elements.

As the index takes the same space as the permutation would, either change your representation to just use the permutation directly, or accept unpacking into an array of size N.

查看更多
再贱就再见
6楼-- · 2019-02-09 11:01

Is it possible to index a set of permutations without previously computing and storing the whole thing in memory? I tried something like this before and didn't find a solution - I think it is impossible (in the mathematical sense).

Disclaimer: I may have misunderstood your question...

查看更多
Deceive 欺骗
7楼-- · 2019-02-09 11:11

If you are wanting a function that takes up less stack space, then you should look into using an iterated version, rather than a function. You can also use a datastructure like a TreeMap, and have it stored on disk, and read on an as needed basis.

X(n+1) = Routine(Xn, max, permutation number)
for(i = n; i > 0; i--)
 {
   int temp = Map.lookup(i) 
   otherfun(temp,max,perm)
 }
查看更多
登录 后发表回答