This function is from java.util.Random
. It returns a pseudorandom int
uniformly distributed between 0
and the given n
. Unfortunately I did not get it.
public int nextInt(int n) {
if (n <= 0)
throw new IllegalArgumentException("n must be positive");
if ((n & -n) == n) // i.e., n is a power of 2
return (int)((n * (long)next(31)) >> 31);
int bits, val;
do {
bits = next(31);
val = bits % n;
} while (bits - val + (n-1) < 0);
return val;
}
My questions are:
- Why does it treat the case where
n
is a power of two specially ? Is it just for performance ?
- Why doest it reject numbers that
bits - val + (n-1) < 0
?
next
generates random bits.
When n
is a power of 2, a random integer in that range can be generated just by generating random bits (I assume that always generating 31 and throwing some away is for reproducibility). This code path is simpler and I guess it's a more commonly used case so it's worth making a special "fast path" for this case.
When n
isn't a power of 2, it throws away numbers at the "top" of the range so that the random number is evenly distributed. E.g. imagine we had n=3
, and imagine we were using 3 bits rather than 31 bits. So bits
is a randomly generated number between 0 and 7. How can you generate a fair random number there? Answer: if bits
is 6 or 7, we throw it away and generate a new one.
It does this in order to assure an uniform distribution of values between 0
and n
. You might be tempted to do something like:
int x = rand.nextInt() % n;
but this will alter the distribution of values, unless n is a divisor of 2^31
, i.e. a power of 2. This is because the modulo operator would produce equivalence classes whose size is not the same.
For instance, let's suppose that nextInt()
generates an integer between 0 and 6 inclusive and you want to draw 0,1 or 2. Easy, right?
int x = rand.nextInt() % 3;
No. Let's see why:
0 % 3 = 0
1 % 3 = 1
2 % 3 = 2
3 % 3 = 0
4 % 3 = 1
5 % 3 = 2
6 % 3 = 0
So you have 3 values that map on 0 and only 2 values that map on 1 and 2. You have a bias now, as 0 is more likely to be returned than 1 or 2.
As always, the javadoc documents this behaviour:
The hedge "approximately" is used in the foregoing description only
because the next method is only approximately an unbiased source of
independently chosen bits. If it were a perfect source of randomly
chosen bits, then the algorithm shown would choose int values from the
stated range with perfect uniformity.
The algorithm is slightly tricky. It rejects values that would result
in an uneven distribution (due to the fact that 2^31 is not divisible
by n). The probability of a value being rejected depends on n. The
worst case is n=2^30+1, for which the probability of a reject is 1/2,
and the expected number of iterations before the loop terminates is 2.
The algorithm treats the case where n is a power of two specially: it
returns the correct number of high-order bits from the underlying
pseudo-random number generator. In the absence of special treatment,
the correct number of low-order bits would be returned. Linear
congruential pseudo-random number generators such as the one
implemented by this class are known to have short periods in the
sequence of values of their low-order bits. Thus, this special case
greatly increases the length of the sequence of values returned by
successive calls to this method if n is a small power of two.
The emphasis is mine.