I'm need a pseudo-random generator which takes a number as input and returns another number witch is reproducible and seems to be random.
- Each input number should match to exactly one output number and vice versa
- same input numbers always result in same output numbers
- sequential input numbers that are close together (eg. 1 and 2) should produce completely different output numbers (eg. 1 => 9783526, 2 => 283)
It must not be perfect, it's just to create random but reproducible test data.
I use C#.
I wrote this funny piece of code some time ago which produced something random.
public static long Scramble(long number, long max)
{
// some random values
long[] scramblers = { 3, 5, 7, 31, 343, 2348, 89897 };
number += (max / 7) + 6;
number %= max;
// shuffle according to divisibility
foreach (long scrambler in scramblers)
{
if (scrambler >= max / 3) break;
number = ((number * scrambler) % max)
+ ((number * scrambler) / max);
}
return number % max;
}
I would like to have something better, more reliable, working with any size of number (no max argument).
Could this probably be solved using a CRC algorithm? Or some bit shuffling stuff.
A tausworthe generator is simple to implement and pretty fast. The following pseudocode implementation has full cycle (2**31 - 1, because zero is a fixed point):
I don't know C#, but I'm assuming it has XOR (
^
) and bit shift (<<
,>>
) operators as in C.Set an initial seed value, and invoke with
seed = tausworthe(seed)
.You (maybe) can do this easily in C# using the Random class:
Since you're explicitly seeding Random with the input, you will get the same output every time given the same input value.
The first two rules suggest a fixed or input-seeded permutation of the input, but the third rule requires a further transform.
Is there any further restriction on what the outputs should be, to guide that transform? - e.g. is there an input set of output values to choose from?
If the only guide is "no max", I'd use the following...
Apply a hash algorithm to the whole input to get the first output item. A CRC might work, but for more "random" results, use a crypto hash algorithm such as MD5.
Use a next permutation algorithm (plenty of links on Google) on the input.
Repeat the hash-then-next-permutation until all required outputs are found.
The next permutation may be overkill though, you could probably just increment the first input (and maybe, on overflow, increment the second and so on) before redoing the hash.
For crypto-style hashing, you'll need a key - just derive something from the input before you start.
I remove the microsoft code from this answer, the GNU code file is a lot longer but basically it contains this from http://cs.uccs.edu/~cs591/bufferOverflow/glibc-2.2.4/stdlib/random_r.c :
for your purpose, the seed is state[0] so it would look more like