可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Random question.

I am attempting to create a program which would generate a pseudo-random distribution. I am trying to find the right pseudo-random algorithm for my needs. These are my concerns:

1) I need one input to generate the same output every time it is used.

2) It needs to be random enough that a person who looks at the output from input 1 sees no connection between that and the output from input 2 (etc.), but there is no need for it to be cryptographically secure or truly random.

3)Its output should be a number between 0 and (29^3200)-1, with every possible integer in that range a possible and equally (or close to it) likely output.

4) I would like to be able to guarantee that every possible permutation of sequences of 410 outputs is also a potential output of consecutive inputs. In other words, all the possible groupings of 410 integers between 0 and (29^3200)-1 should be potential outputs of sequential inputs.

5) I would like the function to be invertible, so that I could take an integer, or a series of integers, and say which input or series of inputs would produce that result.

The method I have developed so far is to run the input through a simple halson sequence:

boost::multiprecision::mpz_int denominator = 1;
boost::multiprecision::mpz_int numerator = 0;

while (input>0) {
    denominator *=3;
    numerator = numerator * 3 + (input%3);
    input = input/3;
}

and multiply the result by 29^3200. It meets requirements 1-3, but not 4. And it is invertible only for single integers, not for series (since not all sequences can be produced by it). I am working in C++, using boost multiprecision.

Any advice someone can give me concerning a way to generate a random distribution meeting these requirements, or just a class of algorithms worth researching towards this end, would be greatly appreciated. Thank you in advance for considering my question.

----UPDATE----

Since multiple commenters have focused on the size of the numbers in question, I just wanted to make clear that I recognize the practical problems that working with such sets poses but in asking this question I'm interested only in the theoretical or conceptual approach to the problem - for example, imagine working with a much smaller set of integers like 0 to 99, and the permutations of sets of 10 of output sequences. How would you design an algorithm to meet these five conditions - 1)input is deterministic, 2)appears random (at least to the human eye), 3)every integer in the range is a possible output, 4)not only all values, but also all permutations of value sequences are possible outputs, 5)function is invertible.

---second update---

with many thanks to @Severin Pappadeux I was able to invert an lcg. I thought I'd add a little bit about what I did to hopefully make it easier for anyone seeing this in the future. First of all, these are excellent sources on inverting modular functions:

https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/modular-inverses

https://www.khanacademy.org/computer-programming/discrete-reciprocal-mod-m/6253215254052864

If you take the equation next=ax+c%m, using the following code with your values for a and m will print out the euclidean equations you need to find ainverse, as well as the value of ainverse:

    int qarray[12];
    qarray[0]=0;
    qarray[1]=1;
    int i =2;
    int reset = m;
    while (m % a >0) {
      int remainder=m%a;
      int quotient=m/a;
      std::cout << m << " = " << quotient << "*" << a << " + " << remainder << "\n";
      qarray[i] =qarray[i-2]-(qarray[i-1]*quotient);
      m=a;
      a=remainder;
      i++;
  }
if (qarray[i-1]<0) {qarray[i-1]+=reset;}
std::cout << qarray[i-1] << "\n";

The other thing it took me a while to figure out is that if you get a negative result, you should add m to it. You should add a similar term to your new equation:

prev = (ainverse(next-c))%m;
if (prev<0) {prev+=m;}

I hope that helps anyone who ventures down this road in the future.

回答1:

Ok, I'm not sure if there is a general answer, so I would concentrate on random number generator having, say, 64bit internal state/seed, producing 64bit output and having 2^64-1 period. In particular, I would look at linear congruential generator (aka LCG) in the form of

next = (a * prev + c) mod m

where a and m are primes to each other

So:

1) Check

2) Check

3) Check (well, for 64bit space of course)

4) Check (again, except 0 I believe, but each and every permutation of 64bits is output of LCG starting with some seed)

5) Check. LCG is known to be reversible, i.e. one could get

prev = (next - c) * a_inv mod m

where a_inv could be computed from a, m using Euclid's algorithm

Well, if it looks ok to you, you could try to implement LCG in your 15546bits space

UPDATE

And quick search shows reversible LCG discussion/code here

Reversible pseudo-random sequence generator

回答2:

In your update, "appears random (to the human eye)" is the phrasing you use. The definition of "appears random" is not a well agreed upon topic. There are varying degrees of tests for "randomness."

However, if you're just looking to make it appear random to the human eye, you can just use ring multiplication.

Start with the idea of generating N! values between 0 and M (N>=410, M>=29^3200)
Group this together into one big number. we're going to generate a single number ranging from 0 to *M^N!. If we can show that the pseudorandom number generator generates every value from 0 to M^N!, we guarantee your permutation rule.
Now we need to make it "appear random." To the human eye, Linear Congruent Generators are enough. Pick a LCG with a period greater than or equal to 410!*M^N satisfying the rules to ensure a complete period. Easiest way to ensure fairness is to pick a LCG in the form x' = (ax+c) mod M^N!

That'll do the trick. Now, the hard part is proving that what you did was worth your time. Consider that the period of just a 29^3200 long sequence is outside the realm of physical reality. You'll never actually use it all. Ever. Consider that a superconductor made of Josephine junctions (10^-12kg processing 10^11bits/s) weighing the mass of the entire universe 3*10^52kg) can process roughly 10^75bits/s. A number that can count to 29^3200 is roughly 15545 bits long, so that supercomputer can process roughly 6.5x10^71 numbers/s. This means it will take roughly 10^4600s to merely count that high, or somewhere around 10^4592 years. Somewhere around 10^12 years from now, the stars are expected to wink out, permanently, so it could be a while.

回答3:

There are M**N sequences of N numbers between 0 and M-1. You can imagine writing all of them one after the other in a (pseudorandom) sequence and placing your read pointer randomly in the resulting loop of N*(M**N) numbers between 0 and M-1...

def output(input):
    total_length = N*(M**N)
    index = input % total_length
    permutation_index = shuffle(index / N, M**N)
    element = input % N
    return (permutation_index / (N**element)) % M

Of course for every permutation of N elements between 0 and M-1 there is a sequence of N consecutive inputs that produces it (just un-shuffle the permutation index). I'd also say (just using symmetry reasoning) that given any starting input the output of next N elements is equally probable (each number and each sequence of N numbers is equally represented in the total period).