OK, so I want to be able to pick values from a normal distribution that only ever fall between 0 and 1. In some cases I want to be able to basically just return a completely random distribution, and in other cases I want to return values that fall in the shape of a gaussian.
At the moment I am using the following function:
    import random

    def blockedgauss(mu, sigma):
        # Keep drawing until a sample lands strictly inside (0, 1).
        while True:
            numb = random.gauss(mu, sigma)
            if 0 < numb < 1:
                break
        return numb
It picks a value from a normal distribution, then discards it if it falls outside of the range 0 to 1, but I feel like there must be a better way of doing this.
I came across this post while searching for a way to return a series of values sampled from a normal distribution truncated between zero and 1 (i.e. probabilities). To help anyone else who has the same problem, I just wanted to note that scipy.stats.truncnorm has the built-in capability ".rvs".
So, if you wanted 100,000 samples with a mean of 0.5 and standard deviation of 0.1:
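A minimal sketch of such a call, assuming the bounds are 0 and 1. The helper name `sample_truncnorm` and the fixed seed are illustrative assumptions. Note that `truncnorm` takes its clip points `a` and `b` in units of standard deviations relative to `loc`, so they have to be rescaled, and that `loc` and `scale` describe the underlying (untruncated) normal:

```python
from scipy.stats import truncnorm

def sample_truncnorm(mu, sigma, lower=0.0, upper=1.0, size=1, seed=None):
    # truncnorm expects the bounds standardized as (bound - loc) / scale.
    a = (lower - mu) / sigma
    b = (upper - mu) / sigma
    return truncnorm.rvs(a, b, loc=mu, scale=sigma, size=size, random_state=seed)

# 100,000 samples from a normal(0.5, 0.1) truncated to [0, 1]
samples = sample_truncnorm(0.5, 0.1, size=100_000, seed=0)
```

Passing the raw bounds (`a=0, b=1`) together with a non-default `loc` or `scale` is a common mistake; it truncates at `loc + a*scale` and `loc + b*scale` instead.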
This gives a behavior very similar to numpy.random.normal, but within the bounds desired. Using the built-in will be substantially faster than looping to gather samples, especially for large values of N.
It sounds like you want a truncated normal distribution. Using scipy, you could use scipy.stats.truncnorm to generate random variates from such a distribution.
The top figure shows the truncated normal distribution; the lower figure shows the normal distribution with the same mean mu and standard deviation sigma.
I have made an example script, as follows. It shows how to use the API to do what we want, such as generating samples with known parameters and computing the CDF, PDF, etc. I also attach an image to show this.
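As a rough illustration of those API calls, here is a short sketch using a "frozen" `scipy.stats.truncnorm` object. The parameter values (`mu = 0.5`, `sigma = 0.2`, bounds 0 and 1) are assumptions picked for the example, not values from the script:

```python
import numpy as np
from scipy.stats import truncnorm

mu, sigma = 0.5, 0.2        # parameters of the underlying normal (illustrative)
lower, upper = 0.0, 1.0     # truncation bounds (illustrative)

# Standardize the bounds, then freeze the distribution so it can be reused.
a, b = (lower - mu) / sigma, (upper - mu) / sigma
dist = truncnorm(a, b, loc=mu, scale=sigma)

x = np.linspace(lower, upper, 5)
pdf_vals = dist.pdf(x)      # density; it is zero outside [lower, upper]
cdf_vals = dist.cdf(x)      # runs from 0 at lower to 1 at upper
samples = dist.rvs(size=10_000, random_state=0)
```

Freezing the distribution once avoids repeating `a`, `b`, `loc` and `scale` in every call.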
In case anybody wants a solution using numpy only, here is a simple implementation using a normal function and a clip (the MacGyver's approach):
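A minimal sketch of what such a normal-plus-clip function might look like. The signature is inferred from the `truncated_normal(...)` calls in this answer; the body is an assumption and not necessarily the author's exact code:

```python
import numpy as np

def truncated_normal(mean, stddev, minval, maxval):
    # Draw from an ordinary normal, then force out-of-range values onto the bounds.
    # (Assumed reconstruction; `mean` may be an array, which sets the output shape.)
    return np.clip(np.random.normal(mean, stddev), minval, maxval)
```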
EDIT: do NOT use this!! This is how you shouldn't do it!! For instance,
    a = truncated_normal(np.zeros(10000), 1, -10, 10)
may look like it works, but
    b = truncated_normal(np.zeros(10000), 100, -1, 1)
will definitely not draw a truncated normal: clipping moves every out-of-range draw onto the nearest bound, so the values pile up at -1 and 1, as you can see in the following histogram:
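A quick numerical check of the clipping problem, followed by one possible numpy-only alternative based on batched rejection sampling. The function name `truncated_normal_rejection`, the scalar-only `mean`, and the fixed seed are assumptions made for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Clipping: every out-of-range draw is moved onto a bound.
clipped = np.clip(rng.normal(0.0, 100.0, size=10_000), -1.0, 1.0)
# Fraction of values sitting exactly on -1 or 1 (the vast majority here).
frac_on_bounds = np.mean((clipped == -1.0) | (clipped == 1.0))

def truncated_normal_rejection(mean, stddev, minval, maxval, size, rng):
    """Hypothetical numpy-only sampler: redraw out-of-range values in batches."""
    out = rng.normal(mean, stddev, size=size)
    bad = (out < minval) | (out > maxval)
    while bad.any():
        # Replace only the rejected entries with fresh draws.
        out[bad] = rng.normal(mean, stddev, size=bad.sum())
        bad = (out < minval) | (out > maxval)
    return out

b = truncated_normal_rejection(0.0, 100.0, -1.0, 1.0, size=10_000, rng=rng)
```

This is the vectorized form of the redraw loop from the question. It needs many rounds when the interval captures little of the normal's mass (as in this example), so `scipy.stats.truncnorm` is likely the better choice when scipy is available.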
Sorry for that, hope nobody got hurt! I guess the lesson is, don't try to emulate MacGyver at coding... Cheers,
Andres