I'm reading c standard library rand() function implementation with glibc source code.
stdlib/random_r.c, line 359
int
__random_r (buf, result)
struct random_data *buf;
int32_t *result;
{
int32_t *state;
if (buf == NULL || result == NULL)
goto fail;
state = buf->state;
if (buf->rand_type == TYPE_0)
{
int32_t val = state[0];
val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
state[0] = val;
*result = val;
}
else
{
int32_t *fptr = buf->fptr;
int32_t *rptr = buf->rptr;
int32_t *end_ptr = buf->end_ptr;
int32_t val;
val = *fptr += *rptr;
/* Chucking least random bit. */
*result = (val >> 1) & 0x7fffffff;
++fptr;
if (fptr >= end_ptr)
{
fptr = state;
++rptr;
}
else
{
++rptr;
if (rptr >= end_ptr)
rptr = state;
}
buf->fptr = fptr;
buf->rptr = rptr;
}
return 0;
fail:
__set_errno (EINVAL);
return -1;
}
I don't understand how random_r generate random number when (buf->rand_type != TYPE_0)
, anyone please explain? Thanks.
glibc
rand()
has two different generator implementations:
A simple linear congruential generator (LCG), defined by the following equation:
val = ((state * 1103515245) + 12345) & 0x7fffffff
(& 0x7fffffff
throws away the least random most significant bit)
This is a very simple, single state LCG. It has some drawbacks. The most important one is that, because it is a single state generator, it does not generate a fully pseudorandom number on each separate rand()
call. What it really does is that it traverses the whole range (2^31) in a pseudorandom order. This has a meaningful implication: when you obtain some number it means that you will not obtain that number again in the present period. You will obtain that number again exactly in the next 2^31 rand()
call, no sooner, no later.
This generator is called the TYPE_0
in the glibc
source.
A slightly more advanced, additive feedback generator. That generator has many states, which means that it does not have the "traverse property" described above. You can get the same number twice (or more times) during the same period.
You can find an excellent description of that algorithm here.
This generator is called the TYPE_1
, TYPE_2
, TYPE_3
or TYPE_4
in the glibc
source.
Coming back to your question, that is how it generates values:
seeding_stage() // (code omitted here, see the description from above link)
for (i=344; i<MAX; i++)
{
r[i] = r[i-31] + r[i-3];
val = ((unsigned int) r[i]) >> 1;
}
The code after the else
in your question is simply the above code, but written in a different way (using pointers to the array containing previous values).
Which generator is used depends on the size of the initial state set with the initstate()
function. The first (LCG) generator is used only when state size is 8 bytes. When it is bigger, the second generator is used. When you set your seed using srand()
the size of the state is 128 bytes by default, so the second generator is used. Everything is written in comments in the glibc
source file referenced by you in your question.
In case anyone else needs a simple reimplementation of the GNU C Library's srand()/rand() functions, this C# class reproduces the generated random numbers exactly. The unchecked keyword is to explicitly allow integer overflow. (Based on Piotr Jurkiewicz's answer.)
public class GnuRand
{
private uint[] r;
private int n;
public GnuRand(uint seed)
{
r = new uint[344];
unchecked
{
r[0] = seed;
for (int i = 1; i < 31; i++)
{
r[i] = (uint)((16807 * (ulong)r[i - 1]) % 2147483647);
}
for (int i = 31; i < 34; i++)
{
r[i] = r[i - 31];
}
for (int i = 34; i < 344; i++)
{
r[i] = r[i - 31] + r[i - 3];
}
}
n = 0;
}
public int Next()
{
unchecked
{
uint x = r[n % 344] = r[(n + 313) % 344] + r[(n + 341) % 344];
n = (n + 1) % 344;
return (int)(x >> 1);
}
}
}
The two implementations are exactly the same, except that they use different random data.
The TYPE_0
always uses the magic numbers 1103515245
and 12345
, together with the current state.
Otherwise, it uses magic numbers taken from a pool of random data. Each time it is called it walks a bit further through the pool. As it goes it replaces the data with new pseudo-random numbers, based on the original ones, so that it gets new numbers when it wraps around and starts the walk again.