From this question: Random number generator which gravitates numbers to any given number in range? I did some research since I've come across such a random number generator before. All I remember was the name "Mueller", so I guess I found it, here:
I can find numerous implementations of it in other languages, but I can't seem to implement it correctly in C#.
This page, for instance, The Box-Muller Method for Generating Gaussian Random Numbers says that the code should look like this (this is not C#):
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
double gaussian(void)
{
static double v, fac;
static int phase = 0;
double S, Z, U1, U2, u;
if (phase)
Z = v * fac;
else
{
do
{
U1 = (double)rand() / RAND_MAX;
U2 = (double)rand() / RAND_MAX;
u = 2. * U1 - 1.;
v = 2. * U2 - 1.;
S = u * u + v * v;
} while (S >= 1);
fac = sqrt (-2. * log(S) / S);
Z = u * fac;
}
phase = 1 - phase;
return Z;
}
Now, here's my implementation of the above in C#. Note that the transform produces 2 numbers, hence the trick with the "phase" above. I simply discard the second value and return the first.
public static double NextGaussianDouble(this Random r)
{
double u, v, S;
do
{
u = 2.0 * r.NextDouble() - 1.0;
v = 2.0 * r.NextDouble() - 1.0;
S = u * u + v * v;
}
while (S >= 1.0);
double fac = Math.Sqrt(-2.0 * Math.Log(S) / S);
return u * fac;
}
My question is with the following specific scenario, where my code doesn't return a value in the range of 0-1, and I can't understand how the original code can either.
- u = 0.5, v = 0.1
- S becomes
0.5*0.5 + 0.1*0.1
=0.26
- fac becomes ~
3.22
- the return value is thus ~
0.5 * 3.22
or ~1.6
That's not within 0 .. 1
.
What am I doing wrong/not understanding?
If I modify my code so that instead of multiplying fac
with u
, I multiply by S
, I get a value that ranges from 0 to 1, but it has the wrong distribution (seems to have a maximum distribution around 0.7-0.8 and then tapers off in both directions.)
This is a monte carlo method so you can't clamp the result, but what you can do is ignore samples.
I am no mathematician, or statistician, but if I think about this I would not expect a Gaussian distribution to return numbers in an exact range. Given your implementation the mean is 0 and the standard deviation is 1 so I would expect values distributed on the bell curve with 0 at the center and then reducing as the numbers deviate from 0 on either side. So the sequence would definitely cover both +/- numbers.
Then since it is statistical, why would it be hard limited to -1..1 just because the std.dev is 1? There can statistically be some play on either side and still fulfill the statistical requirement.
Your code is fine. Your mistake is thinking that it should return values exclusively within
[0, 1]
. The (standard) normal distribution is a distribution with nonzero weight on the entire real line. That is, values outside of[0, 1]
are possible. In fact, values within[-1, 0]
are just as likely as values within[0, 1]
, and moreover, the complement of[0, 1]
has about 66% of the weight of the normal distribution. Therefore, 66% of the time we expect a value outside of[0, 1]
.Also, I think this is not the Box-Mueller transform, but is actually the Marsaglia polar method.
I think the function returns polar coordinates. So you need both values to get correct results.
Also, Gaussian distribution is not between
0 .. 1
. It can easily end up as 1000, but probability of such occurrence is extremely low.The uniform random variate is indeed within 0..1, but the gaussian random variate (which is what Box-Muller algorithm generates) can be anywhere on the real line. See wiki/NormalDistribution for details.