I am generating binomially distributed random numbers using STL's 'random'. It becomes extremely slow when the range is big. For the range 40 it takes 12 seconds to generate 100 numbers. For bigger ranges time increases dramatically (I need ranges around 10000). It does not seem to depend on the probability parameter. I am using g++ 4.5.0.
#include <iostream>
#include <random>
using namespace std;
vector<int> v;
default_random_engine gen(123);
binomial_distribution<int> rbin(40,0.7);
int main(){
v.reserve(2000);
for(int i=0; i<100;++i){
v.push_back(rbin(gen));
}
}
Output:
50.~/.../fs/> g++ -std=c++0x q.cpp
51.~/.../fs/> time ./a.out
real 0m12.102s
user 0m12.094s
sys 0m0.002s
52.~/.../fs/>
I could use Normal approximation, but it is bad for extreme values of probability parameter.
Update:
With '-O3' option time becomes ~2 seconds. With g++ 4.6.3 the problem disappears entirely -- there is hardly any dependence of time on the range, and generation of 100 numbers takes 5ms.
For large ranges, libstdc++ will use an efficient rejection algorithm (after Devroye, L. Non-Uniform Random Variates Generation), but only if C99 TR1 math is available (
_GLIBCXX_USE_C99_MATH_TR1
). Otherwise, it will fall back to a simple waiting time method, which will have performance linear in the range.I'd suggest checking the value of
_GLIBCXX_USE_C99_MATH_TR1
and whether performance improves on more recent versions of g++.You should be sure to enable optimization when performance matters.
Also you should take a look at the available random number engines and ensure you're using one that meets you performance/size/quality requirements.
If the problem truly is that
std::binomial_distribution::operator()
is not performing adequately you may have to use a different standard library implementation, or an alternative implementation ofstd::binomial_distribution
. boost should have an alternative implementation of<random>
which you should be able to use without too much trouble, libc++ also has an alternative implementation, but it will be more difficult to use because you'd have to replace the entire standard library implementation.