How do I make a permutation function that accepts

2020-07-22 18:40发布

问题:

I am looking for a function that accepts 1 number from an interval 0,1....N and returns a permuted value from the same interval.

An example for 0, 1, 2, 3, 4, 5 and f(x) would be:

f(0)=5;
f(1)=1;
f(2)=0;
f(3)=4;
f(4)=2;
f(5)=3;

From what I researched/understood this is a cyclic group where f(x) is what its all based upon. If have found the function f(x) = 911 * x % N to be an example of what I want, however, patterns come up when using this. 911 is a large prime number and by changing that, I get another permutations but still, patters arise. My wish is for the results to be random-ish.

I know I can pregenerate the permutation using something like [ 0, 1, 2...N ].shuffle() or something but in my case I cannot do such a thing.

I have a service which takes for input a numeric value and returns the permuted value/position. N is set serverside, so is the function I am looking for.

回答1:

Please keep in mind, that the algorithm I describe here is based on the list [1, 2, ... N-1] (with length N-1). If you insist using the list [0, 1, ..., N] (with length N+1), please apply the needed minor modifications. Moreover, for brevity, I am using the % operand in a slightly different manner than most programming languages do: a % b takes values between 1 and b, not between 0 and b - 1, but of course the main idea behind is not changed, so the value of a % b is the integer between 1 and b, which is congruent with a, modulo b.

If you read this through, it will be obvious for you, that the shuffle generated is not random at all. However, with well chosen parameters, the pattern won't be easy to recognize, (my basic idea of modular exponentiation comes from cryptography, where it is important to have non-recognizable patterns and non-revertable functions).

And this is much more the language-agnostic description of the algorithm, than an actual programming solution. I do not go into details of effective implementations and pitfalls you may come across. I hope it still helps. I also coded some parts of this in python, so I can give further help and even share my code if it is needed, but that would need some completing and refactoring before.

Using exponentiation instead of multiplication to get rid of patterns

Your initial trial of f(x) = t * x % N (where you chose t to be 911) shows some patterns because multiplication holds linearity (in the 'modular' meaning of it).

You can give much more random feel if you use exponentiation instead of multiplication. Something like f(x) = t ^ x % N. However, t has to be chosen wisely (as it was in the multiplication case, to be a coprime to N), and the output given by this formula won't provide distinct numbers for distinct x values only in the case of N is prime.

University level math is coming, bear with me, I will try to keep it simple.

We will need to use primitive roots. The related Wikipedia article may help a lot, but the basic idea is that the remainders of powers of a well chosen base take all the values between 1 and N-1. For example

3^1 = 3
3^2 = 9 = 2 (mod 7)
3^3 = 27 = 6 (mod 7)
3^4 = 81 = 4 (mod 7)
3^5 = 243 = 5 (mod 7)
3^6 = 729 = 1 (mod 7)

are all different (from this point, the values are repeating from the beginning: 3^7 = 3^1 (mod 7), 3^8 = 3^2 (mod 7), and so on).

So, if your N is 7, then 3 will work fine to be t. You can use f(x) = (3 ^ x) % 7 for x values between 1 and 6.

This results in the following f:

f(1) = 3
f(2) = 2
f(3) = 6
f(4) = 4
f(5) = 5
f(6) = 1

Introducing a shift, providing some additional random effect

If you are playing with this a little bit, you can notice, that N-1 is always shuffled to 1. If you want to avoid this, we can go a step further, and choose an arbitrary number k to add after the exponentiation. That is, using g(x) = (f(x) + k) % (N-1) = ((t ^ x) % N + k) % (N-1), in our example let k be 2, resulting in the permutation:

g(1) = 5
g(2) = 4
g(3) = 2
g(4) = 6
g(5) = 1
g(6) = 3

How to choose the base

Now you get the basic feel. But how to use this generally, when N is not 7?

The key to the problem is choosing the parameter t, which was 3 in our example.

Unfortunately this is generally a hard question (mathematician's call it finding a primitive root), and there isn't any easy-to-interpret, out-of-the-box solution I am aware of.

But this is only one part of the problem. Even more sadly, a primitive root won't even work if N is a composite number. For example, if N=6, there isn't any number t for which the expression t^x modulo 6 takes all the values between 1 and 5.

But it is not too hard to solve this part.

What to do with a composite N

If N is composite, we should find a prime P, which is barely larger, and build upon the algorithm based on that one, by replacing out-of-bound numbers with their after-the-shuffle values (and iterate, if needed).

For example, if N is 6, we can choose P to be 7 and use our previously constructed g(x).

g(1) = 5 ok (5<=N-1 holds)                          h(1) = 5
g(2) = 4 ok                                         h(2) = 4
g(3) = 2 ok                                    =>   h(3) = 2
g(4) = 6 too large, using g(g(4)) = g(6) = 3        h(4) = 3
g(5) = 1 ok                                         h(5) = 1

Just to be on the safe side, I give an other example with N=4, where we use our previously calculated solution for P=7.

g(1) = 5, g(5) = 1                                  h(1) = 1
g(2) = 4, g(4) = 6, g(6) = 3                   =>   h(2) = 3
g(3) = 2                                            h(3) = 2

This should be clear now. It is wise to choose P close to N, so there aren't too many recalculations needed for out-of-bound values of g.

Back to finding t

So our only problem left is to find the primitive root which can be used as the base of the exponentiation.

If the maths on the pages I linked before cause some visceral disgust, I have some good news for you: possible good values of t are dense in the interval [2, N-1], so even random guessing may help.

There are some details how to verify effectively if a randomly chosen t is really good for us on the linked pages, but unless you are working with really huge numbers, you can just do the exponentiations and check if the number 1 appears earlier than the (N-1)th power of t (maybe you remember I noted that t^x=1 (mod N) holds always in the case of x=N-1, so an earlier appearance of 1 would break uniqueness).

I would recommend to look for a suitable t around N/2 (meaning the order of magnitude - for P=91367, t=54949 works perfectly). If you choose t to be too low (for example t=2), you can easily spot the pattern caused by exponentiation on some neighbouring x values (2+k, 4+k, 8+k, ... will appear next to each other). If t is too close to N, a similar phenomena may appear if you look at f(x) in consecutive x values of the same parity. A good choice of t should cover these patterns, and end with a result randomish enough.

Summary

So once again, here are the steps of the algorithm

(N given)

find a P prime slightly larger than N

choose an arbitrary number k between 1 and P-1

find t which is a primitive root for P

(for a given x the output shuffle h(x) is)

calculate

f(x) = (t ^ x) % P

calculate

g(x) = (f(x) + k) % (P-1)

calculate

h(x) = g(x)                                       if g(x)<=N-1,
       iterate the calculations with x = g(x)     otherwise


回答2:

This very closely related question (How to generate a predictable shuffling of a sequence without generating the whole sequence in advance?) and its accepted answer (and therein linked blog article and its referenced paper) might also be interesting and should solve your problem.



回答3:

Based on your comment:

Function accepts 1 number =x, math happens, 1 number is returned which is a random position AND UNIQUE within the same interval where x comes from

I probably still don't understand the problem, but I give it another try. I hope it will give you some clue/help anyway!

<?php
$control = array();

$f = array(5,1,0,4,2,3); //Example

$x = 5; //Function accepts 1 number =x


for($n=1;$n<41;$n++) {
    echo doTheMath($control, $f, $x); //Echo just for example
}

function doTheMath(Array &$control, Array &$f, $x) {
    $n = count($f); //Interval count

    //1 number is returned 
    //AND
    //UNIQUE within the same interval where x comes from.

    while (!in_array($random = array_rand($f, 1), $control)) { //Get another random value from array-element if it's not in control-array

        if (!in_array($random, $control)) {
            $control[] = $random;

            //The are no more unique values, stop executing php
            if (count($control) == $n) {
                exit;
            }

            $result = 911 * $random * $n;
            return $result . '<hr />';
        }
    }


    return null;

}
?>

Example output:

10932
0
21864
27330
16398


标签: math