Random Engine Differences

Posted 2020-02-16 12:10

Question:

The C++11 standard specifies a number of different engines for random number generation: linear_congruential_engine, mersenne_twister_engine, subtract_with_carry_engine and so on. Obviously, this is a large change from the old usage of std::rand.

Obviously, one of the major benefits of (at least some of) these engines is the massively increased period length (it's built into the name of std::mt19937).

However, the differences between the engines are less clear. What are the strengths and weaknesses of the different engines? When should one be used over another? Is there a sensible default that should generally be preferred?

Answer 1:

From the explanations below, the linear congruential engine seems to be faster but less random, while the Mersenne Twister has higher complexity and better randomness. The subtract-with-carry engine is an improvement over the linear engine and is definitely more random. The last reference states that the Mersenne Twister has higher complexity than the subtract-with-carry engine.

Linear congruential random number engine

A pseudo-random number generator engine that produces unsigned integer numbers.

This is the simplest generator engine in the standard library. Its state is a single integer value, with the following transition algorithm:

x = (ax+c) mod m

Where x is the current state value, a and c are their respective template parameters, and m is its respective template parameter if this is greater than 0, or numeric_limits<UIntType>::max() plus 1 otherwise.

Its generation algorithm is a direct copy of the state value.

This makes it an extremely efficient generator in terms of processing and memory consumption, but producing numbers with varying degrees of serial correlation, depending on the specific parameters used.

The random numbers generated by linear_congruential_engine have a period of m. http://www.cplusplus.com/reference/random/linear_congruential_engine/
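For illustration, here is a minimal sketch of that recurrence next to std::minstd_rand, which uses the parameters a = 48271, c = 0, m = 2^31 - 1; the seed value 1729 is arbitrary:

#include <cstdint>
#include <iostream>
#include <random>

int main() {
    // One hand-rolled transition with minstd_rand's parameters
    // (a = 48271, c = 0, m = 2^31 - 1), starting from the seed 1729.
    const std::uint64_t a = 48271, c = 0, m = 2147483647;
    std::uint64_t x = 1729;
    x = (a * x + c) % m;                    // x = (a*x + c) mod m
    std::cout << "hand-rolled:  " << x << '\n';

    // The generation algorithm is a copy of the state, so the standard
    // engine produces the same value.
    std::minstd_rand gen(1729);
    std::cout << "minstd_rand:  " << gen() << '\n';
}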

Mersenne twister random number engine

A pseudo-random number generator engine that produces unsigned integer numbers in the closed interval [0,2^w-1].

The algorithm used by this engine is optimized to compute large series of numbers (such as in Monte Carlo experiments) with an almost uniform distribution in the range.

The engine has an internal state sequence of n integer elements, which is filled with a pseudo-random series generated on construction or by calling member function seed.

The internal state sequence becomes the source for n elements: When the state is advanced (for example, in order to produce a new random number), the engine alters the state sequence by twisting the current value using xor mask a on a mix of bits determined by parameter r that come from that value and from a value m elements away (see operator() for details).

The random numbers produced are tempered versions of these twisted values. The tempering is a sequence of shift and xor operations defined by parameters u, d, s, b, t, c and l applied on the selected state value (see operator()).

The random numbers generated by mersenne_twister_engine have a period of 2^(n*w-r)-1 (the Mersenne prime 2^19937-1 for mt19937). http://www.cplusplus.com/reference/random/mersenne_twister_engine/
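For illustration, a minimal sketch of seeding the engine and drawing from a distribution; the seed values and the dice range are arbitrary:

#include <iostream>
#include <random>

int main() {
    // std::seed_seq spreads the given values over the engine's
    // 624-word internal state sequence.
    std::seed_seq seq{2020, 2, 16};
    std::mt19937 gen(seq);

    // The raw engine output lies in [0, 2^32 - 1]; a distribution maps
    // it onto the range we actually want.
    std::uniform_int_distribution<int> dice(1, 6);
    for (int i = 0; i < 5; ++i)
        std::cout << dice(gen) << ' ';
    std::cout << '\n';
}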

Subtract-with-carry random number engine

A pseudo-random number generator engine that produces unsigned integer numbers.

The algorithm used by this engine is a lagged fibonacci generator, with a state sequence of r integer elements, plus one carry value. http://www.cplusplus.com/reference/random/subtract_with_carry_engine/

Lagged Fibonacci generators have a maximum period of (2^k - 1) * 2^(M-1) if addition or subtraction is used, where k is the larger lag and M is the number of bits in a word. The initialization of LFGs is a very complex problem. The output of LFGs is very sensitive to initial conditions, and statistical defects may appear initially but also periodically in the output sequence unless extreme care is taken. Another potential problem with LFGs is that the mathematical theory behind them is incomplete, making it necessary to rely on statistical tests rather than theoretical performance. http://en.wikipedia.org/wiki/Lagged_Fibonacci_generator
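For illustration, a minimal sketch using std::ranlux24_base, the standard subtract-with-carry engine; seeding from std::random_device here is just one option:

#include <iostream>
#include <random>

int main() {
    // std::ranlux24_base is the standard subtract-with-carry engine:
    // 24-bit words, short lag 10, long lag 24.
    std::ranlux24_base gen(std::random_device{}());

    std::uniform_real_distribution<double> unit(0.0, 1.0);
    for (int i = 0; i < 3; ++i)
        std::cout << unit(gen) << '\n';
}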

And finally: The choice of which engine to use involves a number of tradeoffs: the linear congruential engine is moderately fast and has a very small storage requirement for state. The lagged Fibonacci generators are very fast even on processors without advanced arithmetic instruction sets, at the expense of greater state storage and sometimes less desirable spectral characteristics. The Mersenne twister is slower and has greater state storage requirements but with the right parameters has the longest non-repeating sequence with the most desirable spectral characteristics (for a given definition of desirable). From http://en.cppreference.com/w/cpp/numeric/random



Answer 2:

I think the point is that random generators have different properties, which can make them more or less suitable for a given problem.

  • The period length is one of the properties.
  • The quality of the random numbers can also be important.
  • The performance of the generator can also be an issue.

Depending on your needs, you might pick one generator or another. For example, if you need fast random numbers but do not really care about quality, an LCG might be a good option. If you want better-quality random numbers, the Mersenne Twister is probably the better option.
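As a sketch of one common general-purpose setup (not the only reasonable default): a single std::mt19937 seeded once from std::random_device and reused for all draws.

#include <iostream>
#include <random>

int main() {
    // One engine, seeded once from std::random_device, reused everywhere.
    std::random_device rd;
    std::mt19937 gen(rd());

    std::uniform_real_distribution<double> unit(0.0, 1.0);
    for (int i = 0; i < 3; ++i)
        std::cout << unit(gen) << '\n';
}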

To help you make your choice, there are some standard tests and results (I definitely like the table on p. 29 of this paper).


EDIT: From the paper,

  1. The LCG family (LCG(***) in the paper) contains the fastest generators, but with the poorest quality.
  2. The Mersenne Twister (MT19937) is a little bit slower, but yields better random numbers.
  3. The subtract-with-carry generators (SWB(***), I think) are way slower, but, when well tuned, can have better random properties.


Answer 3:

As the other answers forget about ranlux, here is a small note by an AMD developer who recently ported it to OpenCL:

https://community.amd.com/thread/139236

RANLUX is also one of very few (the only one I know of, actually) PRNGs that has an underlying theory explaining why it generates "random" numbers, and why they are good. Indeed, if the theory is correct (and I don't know of anyone who has disputed it), RANLUX at the highest luxury level produces completely decorrelated numbers down to the last bit, with no long-range correlations as long as we stay well below the period (10^171). Most other generators (like the Mersenne Twister, KISS, etc.) can say very little about their quality; they must rely on passing statistical tests.
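In the standard library, the "luxury" is obtained by wrapping the base subtract-with-carry engines in std::discard_block_engine, which throws away part of each block of raw output. A minimal sketch (the seed 42 is arbitrary):

#include <iostream>
#include <random>

int main() {
    // The standard "luxury" ranlux engines discard part of every block
    // produced by the base subtract-with-carry engines:
    //   std::ranlux24 is std::discard_block_engine<std::ranlux24_base, 223, 23>
    //   std::ranlux48 is std::discard_block_engine<std::ranlux48_base, 389, 11>
    std::ranlux48 gen(42);

    for (int i = 0; i < 3; ++i)
        std::cout << gen() << '\n';
}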

Physicists at CERN are fans of this PRNG. 'nuff said.



Answer 4:

Some of the information in these other answers conflicts with my findings. I've run tests on Windows 8.1 using Visual Studio 2013, and I've consistently found mersenne_twister_engine to be both higher quality and significantly faster than either linear_congruential_engine or subtract_with_carry_engine. This leads me to believe, when the information in the other answers is taken into account, that the specific implementation of an engine has a significant impact on performance.

This comes as no great surprise, I'm sure, but it isn't mentioned in the other answers, where mersenne_twister_engine is said to be slower. I have no test results for other platforms and compilers, but with my configuration mersenne_twister_engine is clearly the superior choice when considering period, quality, and speed. I have not profiled memory usage, so I cannot speak to the space requirements.

Here's the code I'm using to test with (to make it portable, you should only have to replace the windows.h QueryPerformanceXxx() API calls with an appropriate timing mechanism):

// compile with: cl.exe /EHsc
#include <random> 
#include <iostream>
#include <windows.h>

using namespace std;

void test_lc(const int a, const int b, const int s) {
    /*
    typedef linear_congruential_engine<unsigned int, 48271, 0, 2147483647> minstd_rand;
    */
    minstd_rand gen(1729);

    uniform_int_distribution<> distr(a, b);

    for (int i = 0; i < s; ++i) {
        distr(gen);
    }
}

void test_mt(const int a, const int b, const int s) {
    /*
    typedef mersenne_twister_engine<unsigned int, 32, 624, 397,
    31, 0x9908b0df,
    11, 0xffffffff,
    7, 0x9d2c5680,
    15, 0xefc60000,
    18, 1812433253> mt19937;
    */
    mt19937 gen(1729);

    uniform_int_distribution<> distr(a, b);

    for (int i = 0; i < s; ++i) {
        distr(gen);
    }
}

void test_swc(const int a, const int b, const int s) {
    /*
    typedef subtract_with_carry_engine<unsigned int, 24, 10, 24> ranlux24_base;
    */
    ranlux24_base gen(1729);

    uniform_int_distribution<> distr(a, b);

    for (int i = 0; i < s; ++i) {
        distr(gen);
    }
}

int main()
{
    int a_dist = 0;
    int b_dist = 1000;

    int samples = 100000000;

    cout << "Testing with " << samples << " samples." << endl;

    LARGE_INTEGER ElapsedTime;
    double        ElapsedSeconds = 0;

    LARGE_INTEGER Frequency;
    QueryPerformanceFrequency(&Frequency);
    double TickInterval = 1.0 / ((double) Frequency.QuadPart);

    LARGE_INTEGER StartingTime;
    LARGE_INTEGER EndingTime;
    QueryPerformanceCounter(&StartingTime);
    test_lc(a_dist, b_dist, samples);
    QueryPerformanceCounter(&EndingTime);
    ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
    ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
    cout << "linear_congruential_engine time: " << ElapsedSeconds << endl;

    QueryPerformanceCounter(&StartingTime);
    test_mt(a_dist, b_dist, samples);
    QueryPerformanceCounter(&EndingTime);
    ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
    ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
    cout << "   mersenne_twister_engine time: " << ElapsedSeconds << endl;

    QueryPerformanceCounter(&StartingTime);
    test_swc(a_dist, b_dist, samples);
    QueryPerformanceCounter(&EndingTime);
    ElapsedTime.QuadPart = EndingTime.QuadPart - StartingTime.QuadPart;
    ElapsedSeconds = ElapsedTime.QuadPart * TickInterval;
    cout << "subtract_with_carry_engine time: " << ElapsedSeconds << endl;
}

Output:

Testing with 100000000 samples.
linear_congruential_engine time: 10.0821
   mersenne_twister_engine time: 6.11615
subtract_with_carry_engine time: 9.26676
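For a portable variant, the QueryPerformanceXxx() calls can be replaced with std::chrono; a sketch of what that could look like for one of the tests (the checksum only exists to keep the loop from being optimized away):

#include <chrono>
#include <iostream>
#include <random>

// Illustrative helper: time a callable and return elapsed seconds.
template <typename F>
double time_seconds(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

int main() {
    long long sink = 0;   // checksum, so the loop has an observable effect
    const double secs = time_seconds([&] {
        std::mt19937 gen(1729);
        std::uniform_int_distribution<> distr(0, 1000);
        for (int i = 0; i < 100000000; ++i)
            sink += distr(gen);
    });
    std::cout << "mersenne_twister_engine time: " << secs
              << " (checksum " << sink << ")" << std::endl;
}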


Answer 5:

I just saw this answer from Marnos and decided to test it myself. I used std::chrono::high_resolution_clock to time 100000 samples 100 times to produce an average. I measured everything in std::chrono::nanoseconds and ended up with different results:

std::minstd_rand had an average of 28991658 nanoseconds

std::mt19937 had an average of 29871710 nanoseconds

ranlux48_base had an average of 29281677 nanoseconds

This is on a Windows 7 machine. Compiler is Mingw-Builds 4.8.1 64bit. This is obviously using the C++11 flag and no optimisation flags.

When I turn on -O3 optimisations, std::minstd_rand and ranlux48_base actually run faster than what high_resolution_clock can measure; however, std::mt19937 still takes 730045 nanoseconds, or about 3/4 of a millisecond.

So, as he said, it's implementation specific, but at least in GCC the average times seem to match the descriptions in the accepted answer. The Mersenne Twister seems to benefit the least from optimizations, whereas the other two really just throw out random numbers unbelievably fast once you factor in compiler optimizations.

As an aside, I'd been using the Mersenne Twister engine in my noise generation library (it doesn't precompute gradients), so I think I'll switch to one of the others to really see some speed improvements. In my case, the "true" randomness doesn't matter.

Code:

#include <iostream>
#include <iomanip>   // for setprecision
#include <chrono>
#include <random>

using namespace std;
using namespace std::chrono;

int main()
{
    minstd_rand linearCongruentialEngine;
    mt19937 mersenneTwister;
    ranlux48_base subtractWithCarry;
    uniform_real_distribution<float> distro;

    int numSamples = 100000;
    int repeats = 100;

    long long int avgL = 0;
    long long int avgM = 0;
    long long int avgS = 0;

    cout << "results:" << endl;

    for(int j = 0; j < repeats; ++j)
    {
        cout << "start of sequence: " << j << endl;

        auto start = high_resolution_clock::now();
        for(int i = 0; i < numSamples; ++i)
            distro(linearCongruentialEngine);
        auto stop = high_resolution_clock::now();
        auto L = duration_cast<nanoseconds>(stop-start).count();
        avgL += L;
        cout << "Linear Congruential:\t" << L << endl;

        start = high_resolution_clock::now();
        for(int i = 0; i < numSamples; ++i)
            distro(mersenneTwister);
        stop = high_resolution_clock::now();
        auto M = duration_cast<nanoseconds>(stop-start).count();
        avgM += M;
        cout << "Mersenne Twister:\t" << M << endl;

        start = high_resolution_clock::now();
        for(int i = 0; i < numSamples; ++i)
            distro(subtractWithCarry);
        stop = high_resolution_clock::now();
        auto S = duration_cast<nanoseconds>(stop-start).count();
        avgS += S;
        cout << "Subtract With Carry:\t" << S << endl;
    }

    cout << setprecision(10) << "\naverage:\nLinear Congruential: " << (long double)avgL / repeats
    << "\nMersenne Twister: " << (long double)avgM / repeats
    << "\nSubtract with Carry: " << (long double)avgS / repeats << endl;
}
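One possible explanation for the base engines appearing to run faster than high_resolution_clock can measure under -O3 is that the loops above discard their results, so the optimizer may remove them entirely; that is only an assumption, not a verified diagnosis. A sketch that guards against it by accumulating and printing the samples:

#include <chrono>
#include <iostream>
#include <random>

int main() {
    using namespace std::chrono;

    std::minstd_rand gen;
    std::uniform_real_distribution<float> distro;

    // Summing the samples and printing the sum afterwards prevents the
    // optimizer from proving the loop has no observable effect.
    float sink = 0.0f;

    const auto start = high_resolution_clock::now();
    for (int i = 0; i < 100000; ++i)
        sink += distro(gen);
    const auto stop = high_resolution_clock::now();

    std::cout << "minstd_rand: "
              << duration_cast<nanoseconds>(stop - start).count()
              << " ns (sum " << sink << ")" << std::endl;
}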


Answer 6:

In general, the Mersenne Twister is the best (and fastest) RNG, but it requires some space (about 2.5 kilobytes). Which one suits your needs depends on how many times you need to instantiate the generator object. (If you need to instantiate it only once, or a few times, then MT is the one to use. If you need to instantiate it millions of times, then perhaps something smaller.)

Some people report that MT is slower than some of the others. According to my experiments, this depends a lot on your compiler optimization settings. Most importantly, the -march=native setting may make a huge difference, depending on your host architecture.

I ran a small program to test the speed of different generators, and their sizes, and got this:

std::mt19937 (2504 bytes): 1.4714 s
std::mt19937_64 (2504 bytes): 1.50923 s
std::ranlux24 (120 bytes): 16.4865 s
std::ranlux48 (120 bytes): 57.7741 s
std::minstd_rand (4 bytes): 1.04819 s
std::minstd_rand0 (4 bytes): 1.33398 s
std::knuth_b (1032 bytes): 1.42746 s
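The sizes quoted above were presumably obtained with something like sizeof on each engine type; a minimal sketch (the exact byte counts depend on the standard library implementation):

#include <iostream>
#include <random>

int main() {
    // Engine sizes as reported by sizeof; values vary by implementation.
    std::cout << "mt19937:      " << sizeof(std::mt19937)      << " bytes\n"
              << "mt19937_64:   " << sizeof(std::mt19937_64)   << " bytes\n"
              << "ranlux24:     " << sizeof(std::ranlux24)     << " bytes\n"
              << "ranlux48:     " << sizeof(std::ranlux48)     << " bytes\n"
              << "minstd_rand:  " << sizeof(std::minstd_rand)  << " bytes\n"
              << "minstd_rand0: " << sizeof(std::minstd_rand0) << " bytes\n"
              << "knuth_b:      " << sizeof(std::knuth_b)      << " bytes\n";
}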


Answer 7:

It's a trade-off, really. A PRNG like the Mersenne Twister is better because it has an extremely large period and other good statistical properties.

But a large-period PRNG takes up more memory (for maintaining the internal state) and also takes more time to generate a random number (due to complex transitions and post-processing).

Choose a PRNG depending on the needs of your application. When in doubt, use the Mersenne Twister; it's the default in many tools.



Tags: c++ random c++11