I implemented this function to generate a Poisson-distributed random variable:
    typedef long unsigned int luint;

    luint poisson(luint lambda) {
        double L = exp(-double(lambda));
        luint k = 0;
        double p = 1;
        do {
            k++;
            p *= mrand.rand();
        } while (p > L);
        return k - 1;
    }
where mrand is the Mersenne Twister random number generator. I find that, as I increase lambda, the sampled distribution comes out wrong, with a mean that saturates at around 750. Is it due to numerical approximations, or did I make a mistake?
In situations like these, you don't need to invoke the random number generator more than once. All you need is a table of cumulative probabilities:

    c[i] = p[0] + p[1] + ... + p[i]

Then generate a random number 0 <= r < 1, and take the first integer X such that c[X] > r. You can find this X with a binary search.

To generate this table, we need the individual probabilities

    p[i] = lambda^i * exp(-lambda) / i!
If lambda is large, computing these probabilities directly becomes wildly inaccurate, as you have found. But we can use a trick here: start at (or near) the largest value, with k = floor(lambda), and pretend for the moment that p[k] is equal to 1. Then calculate p[i] for i > k using the recurrence relation

    p[i+1] = p[i] * lambda / (i+1)

and for i < k using

    p[i-1] = p[i] * i / lambda

This ensures that the largest probabilities have the greatest possible precision.
Now just calculate c[i] using c[i+1] = c[i] + p[i+1], up to the point where c[i+1] is the same as c[i]. Then you can normalise the array by dividing by this limiting value c[i]; or you can leave the array as it is, and use a random number 0 <= r < c[i] instead.

See: http://en.wikipedia.org/wiki/Inverse_transform_sampling
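Putting the recipe together, a sketch in C++ might look like this (assumptions: std::mt19937 stands in for your mrand, and the helper names build_table and sample are mine):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Build the (unnormalised) cumulative table c[i], anchored at p[k] = 1
// for k = floor(lambda), so the largest probabilities keep full precision.
std::vector<double> build_table(double lambda) {
    std::size_t k = static_cast<std::size_t>(std::floor(lambda));
    std::vector<double> p(k + 1);
    p[k] = 1.0;                               // pretend p[k] == 1
    for (std::size_t i = k; i > 0; --i)       // downward: p[i-1] = p[i]*i/lambda
        p[i - 1] = p[i] * static_cast<double>(i) / lambda;

    std::vector<double> c;                    // cumulative sums of p[0..k]
    c.reserve(k + 1);
    double sum = 0.0;
    for (double q : p) { sum += q; c.push_back(sum); }

    double pk = 1.0;                          // upward: p[i+1] = p[i]*lambda/(i+1)
    for (std::size_t i = k; ; ++i) {
        pk *= lambda / static_cast<double>(i + 1);
        double next = sum + pk;
        if (next == sum) break;               // c[i+1] == c[i]: tail has converged
        sum = next;
        c.push_back(sum);
    }
    return c;                                 // total mass is c.back(), not 1
}

// Draw 0 <= r < c.back() and binary-search for the first X with c[X] > r.
unsigned long sample(const std::vector<double>& c, std::mt19937& gen) {
    std::uniform_real_distribution<double> u(0.0, c.back());
    double r = u(gen);
    return static_cast<unsigned long>(
        std::upper_bound(c.begin(), c.end(), r) - c.begin());
}
```

Building the table costs O(lambda) once, but every subsequent draw is only one uniform plus an O(log lambda) binary search.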
From another question I asked earlier, it seems you could also approximate poisson(750) as poisson(375) + poisson(375), since the sum of two independent Poisson variables is itself Poisson with the summed mean.
If you go the "existing library" route, your compiler may already support the C++11 std::random package, which includes std::poisson_distribution. Here is how you use it:
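A sketch of the two usages (the seeds and the PoissonGen wrapper name are my own choices, not part of the standard library):

```cpp
#include <random>

typedef long unsigned int luint;   // same typedef as the question

// (1) Imitating the original poisson(lambda) interface. Note that this
// constructs a new distribution on every call, which wastes its setup work.
luint poisson(luint lambda) {
    static std::mt19937 gen(12345);    // fixed seed, for reproducibility
    std::poisson_distribution<luint> dist(static_cast<double>(lambda));
    return dist(gen);
}

// (2) Keeping one distribution object around for repeated draws with the
// same mean, so any per-mean setup is done only once.
struct PoissonGen {
    std::mt19937 gen;
    std::poisson_distribution<luint> dist;
    explicit PoissonGen(double mean, unsigned seed = 42)
        : gen(seed), dist(mean) {}
    luint operator()() { return dist(gen); }
};
```

With the second form you construct, say, PoissonGen draws(750.0); once and then call draws() in a loop.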
I've used it two ways above:

- I tried to imitate your existing interface.
- If you create a std::poisson_distribution with a fixed mean, it is more efficient to reuse that distribution for repeated draws with the same mean.
Since you only use L in the expression (p > L), you're essentially testing for log(p) > -lambda. On its own that's not a very helpful transformation: sure, you don't need exp(-750) anymore, but p will just underflow instead.

Now, p is just Π(mrand.rand()), so log(p) = log(Π(mrand.rand())) = Σ(log(mrand.rand())). That gives you the necessary transformation: accumulate Σ(log(mrand.rand())) directly and compare the sum against -lambda. A double has only 11 bits of exponent but a 52-bit mantissa, so working with a sum of logs instead of a product is a massive increase in numerical stability. The price paid is that you need a log on every iteration, instead of a single exp up front.

exp(-750) is a very small number, very close to the smallest representable double, so your issue is numerical. In any case, your complexity will be linear in lambda, so the algorithm isn't very efficient for high lambda. Unless you have a great reason to code this yourself, using an existing library implementation probably makes sense, as these numerical algorithms tend to be touchy precisely because of the precision issues you're encountering.
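To make the log-space transformation concrete, here is a sketch of the original loop rewritten to accumulate Σ(log(u)) (std::mt19937 plus std::uniform_real_distribution stand in for mrand.rand()):

```cpp
#include <cmath>
#include <random>

typedef long unsigned int luint;   // same typedef as the question

// Knuth's loop in log space: sum log-uniforms instead of multiplying
// uniforms, so nothing underflows even for lambda in the thousands.
luint poisson_log(luint lambda, std::mt19937& gen) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double s = 0.0;
    luint k = 0;
    do {
        ++k;
        s += std::log(u(gen));     // log of a (0,1) uniform is negative
    } while (s > -static_cast<double>(lambda));
    return k - 1;
}
```

This is still O(lambda) per draw; it fixes the underflow, not the linear complexity.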