I'm looking for an efficient way to generate random floating-point numbers on the open-open interval (0,1). I currently have an RNG that generates random integers on the closed-closed interval of [0, (2^32)-1]. I've already created a half-open floating point RNG on the interval [0,1) by simply multiplying my result from the integer RNG by 1/((2^32)-1) rather than dividing by (2^32)-1 since it's inefficient.
The way I'm currently going about generating numbers on the interval (0,1) is with a conditional statement like the one below:
float open_open_flt = (closed_open_flt==0) ? closed_open_flt : FLT_MIN;
Unfortunately, this is rather inefficient since it is control code and I feel like it introduces some bias.
Can anybody suggest an alternative?
Given a sample
x
selected randomly from [0, 232), I propose using:Reasoning:
x
produces slightly less than 1-2-25 before rounding, so it is rounded to the largestfloat
less than 1, which is 1-2-24. If we made it any larger, some values would round to 1, which we do not want. If we made it smaller, fewer values would round to 1-2-24, so it would be less represented than we desire (more on this below).x
produces 2-25. This produces some symmetry: The distribution is compelled to stop at the high side 1-2-25 before rounding, as explained above, so we make it symmetric on the bottom side, stopping at 0+2-25. To some extent, it is as if we are binning the real number line in bins of width 2-24 and then removing the bins centered on 0 and 1 (which extend 2-25 to either side of those numbers).float
values show up in the bins, because the resolution offloat
varies. It is finer near 0 and coarser near 1. With this arrangement, each bin is about uniformly represented, but the lower bins will have more samples with lower probability each. The overall distribution remains uniform.As you can see, the floating-point format forces some irregularities in a distribution from 0 to 1. This issue has been raised in other Stack Overflow questions but never thoroughly discussed, to my knowledge. Whether it suits your purposes to leave these irregularities as described above depends on your application.
Potential variations:
float
format is finer.double
. Then there is a 1-1 map from originalx
values to floating-point values, and you can likely get as close to 0 and 1 as desired.Also, contrary to ElKamina’s answer, floating-point comparison (even to zero) is not generally faster than addition. Comparison requires branching on the result, which is an issue in many modern CPUs.
Since the probability of actually observing 0 probability is very small, and checking if a number is equal to 0 is least expensive (as compared to addition or multiplication), I would regenerate the random number repeatedly until it is not equal to 0.
You are already there.
The smallest distance between two floats your current generator produces is 1/(2^32).
So, your generator is efectively producing [0,1-1/(2^32)].
1/(2^32) is greater than FLT_MIN.
Thus if you add FLT_MIN to your generator,
you'll get [FLT_MIN,1-(1/(2^32))+FLT_MIN], which works as a (0,1) generator.
This means that your generator 'tries' to produce 2^32 different values. Problem is, float type is 4 bytes long, thus having less than 2^32 distinct defined values overall. To be precise, there can be only 2^23 values on interval [1/2, 1). Depending on what you need it may be a problem or not.
You may want to use lagged Fibonacci generator (wiki) with iteration
This already produces numbers from [0,1), given that initial values belong to that interval and may be good enough for your purposes.