What are the advantages of using uniform_int_distr

Posted 2019-03-14 08:08

Question:

According to the following results, generating uniform random integers between two numbers using the % operation is almost 3 times faster than using std::uniform_int_distribution. Is there any good reason to use std::uniform_int_distribution?

Code:

#include <iostream>
#include <functional>
#include <vector>
#include <algorithm>
#include <random>

#include <cstdio>
#include <cstdlib>
#include <ctime>   // clock(), clock_t, CLOCKS_PER_SEC

using namespace std;

#define N 100000000

int main()
{

clock_t tic,toc;

for(int trials=0; trials<3; trials++)
{
    cout<<"trial: "<<trials<<endl;

    // uniform_int_distribution
    {
        int res = 0;
        mt19937 gen(1);
        uniform_int_distribution<int> dist(0,999);

        tic = clock();
        for(int i=0; i<N; i++)
        {
            int r = dist(gen);
            res += r;
            res %= 1000;
        }
        toc = clock();
        cout << "uniform_int_distribution: "<<(float)(toc-tic)/CLOCKS_PER_SEC << endl;
        cout<<res<<" "<<endl;

    }

    // simple modulus operation
    {
        int res = 0;
        mt19937 gen(1);

        tic = clock();
        for(int i=0; i<N; i++)
        {
            int r = gen()%1000;
            res += r;
            res %= 1000;
        }
        toc = clock();
        cout << "simple modulus operation: "<<(float)(toc-tic)/CLOCKS_PER_SEC << endl;
        cout<<res<<" "<<endl;

    }

    cout<<endl;
}

}

Output:

trial: 0
uniform_int_distribution: 2.90289
538 
simple modulus operation: 1.0232
575 

trial: 1
uniform_int_distribution: 2.86416
538 
simple modulus operation: 1.01866
575 

trial: 2
uniform_int_distribution: 2.94309
538 
simple modulus operation: 1.01809
575 

Answer 1:

You will get statistical bias when you use modulo (%) to map the range of e.g. rand() to another interval.

For example, suppose rand() maps uniformly (without bias) to [0, 32767] and you want to map to [0, 4] by computing rand() % 5. Because 32768 = 5 * 6553 + 3, the values 0, 1, and 2 will on average be produced 6554 out of 32768 times, but the values 3 and 4 only 6553 times (so that 3 * 6554 + 2 * 6553 = 32768).
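The counts in this example can be checked by brute force. The sketch below is only an illustration: count_residues is a hypothetical helper name, and it assumes the source generator produces every value in [0, range_size) exactly once per cycle, which is an idealization of rand().

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: for every value v in [0, range_size), tally v % modulus.
// The result shows how many source values land on each residue.
std::vector<std::uint64_t> count_residues(std::uint64_t range_size,
                                          std::uint64_t modulus)
{
    std::vector<std::uint64_t> counts(modulus, 0);
    for (std::uint64_t v = 0; v < range_size; ++v)
        ++counts[v % modulus];   // one hit for the residue this value maps to
    return counts;
}
```

Calling count_residues(32768, 5) should give 6554 for residues 0, 1, and 2 and 6553 for residues 3 and 4, matching the numbers above.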

The bias is small (0.01%) but depending on your application that could prove fatal. Watch Stephan T. Lavavej's talk "rand() considered harmful" for more details.
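One common way to remove this kind of bias is rejection sampling: discard the few raw values that would make some residues more likely, then apply %. The sketch below is an assumption-laden illustration, not a description of how any particular standard library implements std::uniform_int_distribution. The name unbiased_below is hypothetical, and the code relies on std::mt19937 producing values in [0, 2^32 - 1].

```cpp
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: returns a value in [0, n) with equal probability
// for each outcome. Requires n >= 1.
std::uint32_t unbiased_below(std::mt19937& gen, std::uint32_t n)
{
    // threshold == 2^32 mod n: the number of raw values that must be
    // rejected so that the accepted count is an exact multiple of n.
    const std::uint32_t threshold =
        (std::numeric_limits<std::uint32_t>::max() - n + 1) % n;

    std::uint32_t x;
    do {
        x = static_cast<std::uint32_t>(gen());   // raw 32-bit draw
    } while (x < threshold);                     // retry on rejected values

    return x % n;   // accepted range splits evenly across the n residues
}
```

The retry loop is part of the reason an unbiased mapping can cost more than a bare gen() % 1000: the extra comparison and the occasional redraw buy uniformity at the price of some speed.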