I want to generate random numbers with a range (n to m, eg 100 to 150), but instead of purely random I want the results to be based on the normal distribution.
By this I mean that in general I want the numbers "clustered" around 125.
I've found this random number package that seems to have a lot of what I need: http://codeproject.com/KB/recipes/Random.aspx
It supports a variety of random generators (including the Mersenne Twister) and can apply the generator to a distribution.
But I'm confused: if I use a normal distribution generator, the random numbers range from roughly -6 to +8 (apparently the true range is float.min to float.max).
How do I scale that to my required range?
A standard normal distribution has mean 0 and standard deviation 1; if you want a distribution with mean m and standard deviation s, simply multiply by s and then add m. Since the normal distribution is theoretically infinite, you can't have a hard cap on your range (e.g. 100 to 150) without explicitly rejecting numbers that fall outside of it, but with an appropriate choice of deviation you can be assured that (e.g.) 99% of your numbers will be within the range.
About 99.7% of a population is within +/- 3 standard deviations of the mean, so if you pick your standard deviation to be about (25/3), it should work well.
So you want something like: (normal * 8.333) + 125
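As a rough illustration of the scaling plus explicit rejection described above, here is a minimal Python sketch. The function name `clustered` and its parameters are made up for this example, and choosing the deviation as one sixth of the range simply follows the 3-sigma suggestion; it is not the only reasonable choice.

```python
import random

def clustered(lo=100.0, hi=150.0, rng=random):
    """Return a normal variate centred on the midpoint of [lo, hi].

    Hypothetical helper: sigma is (hi - lo) / 6, so about 99.7% of raw
    draws already fall inside the range; the rest are rejected and redrawn.
    """
    mean = (lo + hi) / 2.0
    sigma = (hi - lo) / 6.0
    while True:
        # Scale a standard normal variate by sigma, then shift it by the mean.
        value = rng.gauss(0.0, 1.0) * sigma + mean
        if lo <= value <= hi:
            return value
```

Calling `clustered()` gives one number between 100 and 150 that is most likely to be near 125. Rejection makes the result a truncated normal, which differs only slightly from the plain normal at this deviation.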
For the sake of interest, it's pretty straightforward to generate normally distributed random numbers from a uniform RNG (though it must be done in pairs):
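Random rng = new Random();
// Use 1 - NextDouble() so the argument lies in (0, 1] and Log never receives 0.
double r = Math.Sqrt(-2 * Math.Log(1.0 - rng.NextDouble()));
double θ = 2 * Math.PI * rng.NextDouble();
// Convert from polar (r, θ) to cartesian coordinates.
double x = r * Math.Cos(θ);
double y = r * Math.Sin(θ);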
x and y now contain two independent, normally distributed random numbers with mean 0 and variance 1. You can scale and translate them as necessary to get the range you want (as interjay explains).
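For readers who are not using C#, the same idea might look like this in Python, followed by the scale-and-translate step. The name `box_muller` is just a label chosen for this sketch, and the target values 125 and 25/3 are taken from the other answer.

```python
import math
import random

def box_muller(rng=random):
    """Return two independent standard normal variates from two uniform draws."""
    u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log() never sees 0
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    # Convert the polar point (r, theta) to cartesian coordinates.
    return r * math.cos(theta), r * math.sin(theta)

# Scale and translate one of the pair to cluster around 125.
x, y = box_muller()
scaled = x * (25.0 / 3.0) + 125.0
```

Note that `scaled` can still fall outside 100 to 150 on rare occasions, since nothing here rejects outliers.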
Explanation:
This method is called the Box–Muller transform. It uses the property of the two-dimensional unit Gaussian that the density value itself, p = exp(-r^2/2), is uniformly distributed between 0 and 1 (normalisation constant removed for simplicity).
Since you can generate such a value easily using a uniform RNG, you end up with a circular contour of radius r = sqrt(-2 * log(p)). You can then generate a second uniform random variate between 0 and 2*pi to give you an angle θ that defines a unique point on your circular contour. Finally, you can generate two i.i.d. normal random variates by transforming from polar coordinates (r, θ) back into cartesian coordinates (x, y).
This property – that p is uniformly distributed – doesn't hold for other dimensionalities, which is why you have to generate exactly two normal variates at a time.
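The uniformity claim can be checked empirically. The Python sketch below (helper names are invented for this example) draws standard normal vectors, computes exp(-|v|^2/2) for each, and reports what fraction lands in each tenth of the interval from 0 to 1. In two dimensions the fractions should all be close to 0.1; in one dimension they should not.

```python
import math
import random

def density_values(dim, n, rng):
    """Return exp(-|v|^2 / 2) for n standard normal vectors of dimension dim."""
    out = []
    for _ in range(n):
        sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        out.append(math.exp(-sq / 2.0))
    return out

def decile_fractions(values):
    """Fraction of values falling in each tenth of [0, 1]."""
    counts = [0] * 10
    for v in values:
        counts[min(int(v * 10), 9)] += 1
    return [c / len(values) for c in counts]

rng = random.Random(0)
flat = decile_fractions(density_values(2, 50_000, rng))    # 2-D: roughly even
skewed = decile_fractions(density_values(1, 50_000, rng))  # 1-D: piles up near 1
```

This is only a sanity check on sampled data, not a proof, but it makes the difference between the two-dimensional case and the others visible.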
tzaman's answer is correct, but when using the library you linked there is a simpler way than performing the calculation yourself: the NormalDistribution object has writable properties Mu (meaning the mean) and Sigma (standard deviation). So going by tzaman's numbers, set Mu to 125 and Sigma to 8.333.
This may be too simplistic for your needs, but a quick & cheap way to get a random number with a distribution that's weighted toward the center is to simply add 2 (or more) random numbers.
Think of when you roll two 6-sided dice and add them. The sum is most often 7, then 6 and 8, then 5 and 9, etc. and only rarely 2 or 12.
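To make the dice example concrete, the Python sketch below tallies all 36 outcomes of two dice, then applies the same trick to an arbitrary range by averaging several uniform draws. The function `centre_weighted` and its parameter `k` are hypothetical names for this illustration.

```python
import random
from collections import Counter
from itertools import product

# All 36 equally likely outcomes of two six-sided dice, tallied by their sum.
dice_sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

def centre_weighted(lo, hi, k=2, rng=random):
    """Average k uniform draws from [lo, hi]; the result stays in that range.

    Larger k clusters the results more tightly around the midpoint.
    """
    return sum(rng.uniform(lo, hi) for _ in range(k)) / k
```

Here `dice_sums[7]` is 6 while `dice_sums[2]` and `dice_sums[12]` are both 1, which matches the dice intuition. With k = 2 the distribution is triangular rather than bell-shaped; raising k brings it closer to a normal curve while keeping the hard limits.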
Here's another algorithm that doesn't need to calculate Sin/Cos, nor does it need to know Pi. Don't ask me about the theoretical background; I found it somewhere once and it's what I've been using since. I suspect it's some kind of normalisation of the same Box-Muller transform that @Will Vousden mentions. It also produces results in pairs.
The example is in VBScript; it's easy enough to convert into any other language.
Sub calcRandomGauss (ByRef y1, ByRef y2)
    Dim x1, x2, w
    ' Pick a random point inside the unit circle, excluding the origin.
    Do
        x1 = 2.0 * Rnd() - 1.0
        x2 = 2.0 * Rnd() - 1.0
        w = x1 * x1 + x2 * x2
    Loop While w >= 1.0 Or w = 0
    ' Rescale the point so that both coordinates are normally distributed.
    w = Sqr((-2.0 * Log(w)) / w)
    y1 = x1 * w
    y2 = x2 * w
End Sub
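The routine above appears to be what is usually called the polar form of the Box-Muller transform (the Marsaglia polar method). As an example of the conversion mentioned earlier, a Python port might look like the sketch below; the function name `polar_gauss` is invented here.

```python
import math
import random

def polar_gauss(rng=random):
    """Return two independent standard normal variates without sin/cos."""
    while True:
        # Pick a point uniformly in the square [-1, 1) x [-1, 1).
        x1 = 2.0 * rng.random() - 1.0
        x2 = 2.0 * rng.random() - 1.0
        w = x1 * x1 + x2 * x2
        # Keep it only if it lies strictly inside the unit circle, not at the origin.
        if 0.0 < w < 1.0:
            break
    scale = math.sqrt(-2.0 * math.log(w) / w)
    return x1 * scale, x2 * scale
```

About 21% of candidate points fall outside the circle and are discarded, which is the price paid for avoiding the trigonometric calls.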
A different approach to this problem uses the beta distribution (which does have a hard range, unlike the normal distribution) and involves choosing the appropriate parameters such that the distribution has the given mean and standard deviation (square root of variance). See this question.
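One way this could be done in Python is sketched below, using the standard method-of-moments formulas for the beta distribution and `random.betavariate` from the standard library. The helper names are assumptions for this example, and the linked question may well use a different parameterisation.

```python
import random

def beta_params(lo, hi, mean, sd):
    """Return (alpha, beta) so that a beta variate rescaled to [lo, hi]
    has the requested mean and standard deviation."""
    m = (mean - lo) / (hi - lo)    # mean mapped onto [0, 1]
    v = (sd / (hi - lo)) ** 2      # variance mapped onto [0, 1]
    if not 0.0 < m < 1.0 or v >= m * (1.0 - m):
        raise ValueError("mean/sd not achievable by a beta distribution on this range")
    nu = m * (1.0 - m) / v - 1.0
    return m * nu, (1.0 - m) * nu

def beta_in_range(lo, hi, mean, sd, rng=random):
    """Draw one value in [lo, hi] from the matching beta distribution."""
    alpha, beta = beta_params(lo, hi, mean, sd)
    return lo + (hi - lo) * rng.betavariate(alpha, beta)
```

For the range 100 to 150 with mean 125 and standard deviation 25/3, `beta_params` works out to alpha = beta = 4 (up to floating-point rounding), and every draw from `beta_in_range` is guaranteed to lie inside the range without any rejection step.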