I have two questions regarding the implementation of the Random class in .NET Framework 4.6 (code available here):
What is the rationale for setting the Seed argument to 1 at the end of the constructor? It seems to be copy-pasted from Numerical Recipes in C (2nd Ed.), where it made some sense, but it serves no purpose in C#.
It is directly stated in the book (Numerical Recipes in C (2nd Ed.)) that the inextp field is set to the value 31 because:
The constant 31 is special; see Knuth.
However, in the .NET implementation this field is set to the value 21. Why? The rest of the code seems to closely follow the code from the book except for this detail.
Regarding the inextp issue, this is a bug, one which Microsoft has acknowledged and refused to fix due to backwards compatibility concerns.
Indeed, you have discovered a genuine problem with the Random implementation.
We have discussed it within the team and with some of our partners and concluded that we unfortunately cannot fix the problem right now. The reason is that some applications rely on the fact that when initialised with the same seed, the generator produces the same pseudo random sequence. Even if the change is for the better, it will break the applications that made this assumption once they have migrated to the “fixed” version.
For some more context:
A while back I fully analysed this implementation. I found a few differences.
The first one (perfectly fine) is a different large value (MBIG). Numerical Recipes claims that Knuth makes it clear that any large value should work, so that is not an issue, and Microsoft reasonably chose to use the largest value of a signed 32-bit integer.
The second one was the constant you mentioned. That one is a big deal. At a minimum it will substantially decrease the period. There have been reports that the effects are actually worse than that.
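To see why the offset between the two indices can matter for the period at all, here is a toy model in Python (not the .NET code). It assumes a 7-entry table with arithmetic mod 2 instead of the real 55-entry table mod MBIG, so that every cycle can be walked by brute force. The name cycle_length, the table size and the seed state are all made up for illustration. It says nothing directly about 21 versus 31; it only shows that one offset can reach the full period while a neighbouring offset cannot.

```python
from collections import deque


def cycle_length(size, offset, modulus=2):
    """Count steps until a toy subtractive generator returns to its start state.

    The window holds the last `size` outputs, oldest first. Each step emits
    (oldest - window[offset]) mod modulus, which mirrors the
    ma[inext] - ma[inextp] update with inextp running `offset` slots ahead.
    """
    # Arbitrary non-zero seed state (an assumption, not the real seeding).
    start = tuple([1] + [0] * (size - 1))
    window = deque(start)
    steps = 0
    while True:
        value = (window[0] - window[offset]) % modulus
        window.popleft()
        window.append(value)
        steps += 1
        if tuple(window) == start:
            return steps


good = cycle_length(size=7, offset=3)  # reaches the full period of 2**7 - 1
bad = cycle_length(size=7, offset=2)   # gets stuck in a shorter cycle
print(good, bad)
```

In this toy, offset 3 visits all 127 non-zero states before repeating, while offset 2 repeats sooner. The real generator is far too large to check this way, so treat this only as intuition for why the constant is "special".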
But then comes one other particularly nasty difference. It is literally guaranteed to bias the output (since it does so directly), and will also likely affect the period of the RNG.
So what is this second issue? When .NET first came out, Microsoft did not realize that the RNG they coded was inclusive at both ends, and they documented it as exclusive at the maximum end. To fix this, the security team added a rather evil line of code: if (retVal == MBIG) retVal--;
This is very unfortunate, as the correct fix would literally be only 4 added characters (plus whitespace).
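The bias is easy to see on a scaled-down model. The Python sketch below is not the .NET code: it assumes MBIG is a tiny stand-in value (10) and that the raw generator output is uniform over 0..MBIG inclusive, then applies a patch of the same shape and counts where each raw value ends up. The helper name clamp_top is hypothetical.

```python
from collections import Counter

MBIG = 10  # tiny stand-in for int.MaxValue so every raw value can be enumerated


def clamp_top(ret_val):
    # Same shape as the patch: fold the top value onto its neighbour.
    if ret_val == MBIG:
        ret_val -= 1
    return ret_val


# Feed each possible raw value through the patch exactly once.
counts = Counter(clamp_top(v) for v in range(MBIG + 1))
print(counts[MBIG - 1], counts[0])
```

Every value is produced by exactly one raw input except MBIG - 1, which is produced by two, so under the uniformity assumption it comes out twice as often as any other value.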
The correct fix would have been to change MBIG to int.MaxValue-1, but switch Sample() to use MBIG+1 (i.e. to keep using int.MaxValue). That would guarantee that Sample has the range [0.0, 1.0) without introducing any bias, and only changes the value of MBIG, which Numerical Recipes said Knuth said is perfectly fine.
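As a quick sanity check of the ranges, here is a Python sketch of the arithmetic only. It assumes the raw value lies in 0..MBIG inclusive, as described above, and uses plain division in place of whatever scaling the real Sample() performs. The function names are hypothetical.

```python
INT_MAX = 2**31 - 1  # value of int.MaxValue

MBIG_ORIGINAL = INT_MAX
MBIG_PROPOSED = INT_MAX - 1


def sample_original(ret_val):
    # Divide by MBIG: the top raw value maps to exactly 1.0.
    return ret_val / MBIG_ORIGINAL


def sample_proposed(ret_val):
    # Divide by MBIG + 1 (still int.MaxValue): the top raw value stays below 1.0.
    return ret_val / (MBIG_PROPOSED + 1)


top_original = sample_original(MBIG_ORIGINAL)
top_proposed = sample_proposed(MBIG_PROPOSED)
print(top_original, top_proposed < 1.0)
```

With the original constants the largest raw value lands on 1.0, which is what the decrement was papering over. With the proposed constants the largest raw value lands just below 1.0 and no value has to be folded onto another.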