is boost::random::uniform_real_distribution suppos

Posted 2019-05-11 22:01

Question:

The following code produces different output on x86 32bit vs 64bit processors.

Is it supposed to be this way? If I replace it with std::uniform_real_distribution and compile with -std=c++11 it produces the same output on both processors.

#include <iostream>
#include <limits>   // std::numeric_limits
#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_real_distribution.hpp>

int main()
{
    boost::mt19937 gen;
    gen.seed(4294653137UL);
    std::cout.precision(1000);
    double lo = - std::numeric_limits<double>::max() / 2 ;
    double hi = + std::numeric_limits<double>::max() / 2 ;
    boost::random::uniform_real_distribution<double> boost_distrib(lo, hi);
    std::cout << "lo " << lo << '\n';
    std::cout << "hi " << hi << "\n\n";
    std::cout << "boost distrib gen " << boost_distrib(gen) << '\n';
}

Answer 1:

BTW, you could have written boost::mt19937 gen(4294653137UL); to avoid seeding with the default seed (5489) in the default constructor first. As written, your code has to loop over all 624 uint32_t elements of the generator's internal state twice: once for the default seed and once for the call to seed().
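
As a quick sanity check that the two ways of seeding are interchangeable, here is a minimal sketch. It uses std::mt19937 from <random> as a stand-in for boost::mt19937 (an assumption, made so it builds without Boost; both are the 32-bit MT19937 generator), and same_stream is a hypothetical helper name, not part of either library.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: returns true if seeding in the constructor and
// default-constructing followed by seed() give the same first n outputs.
bool same_stream(std::uint32_t seed, int n) {
    std::mt19937 direct(seed);   // state initialised once, from `seed`
    std::mt19937 reseeded;       // state initialised from the default seed 5489
    reseeded.seed(seed);         // ...and then initialised a second time
    for (int i = 0; i < n; ++i) {
        if (direct() != reseeded()) return false;
    }
    return true;
}
```

Calling same_stream(4294653137UL, 1000) should return true: the extra pass over the state only costs time, it does not change the output.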


The generator is always fine, and works the same on any machine. The difference only comes from using floating-point to map it to a uniform_real_distribution.
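
One way to convince yourself the integer side is portable: the C++ standard requires that the 10000th output of a default-constructed std::mt19937 is 4123659995. The sketch below checks that value. It uses std::mt19937 rather than boost::mt19937 (an assumption, so it builds without Boost), and ten_thousandth_output is a hypothetical helper name.

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: returns the 10000th value produced by a
// default-constructed (seed 5489) Mersenne Twister.
std::uint32_t ten_thousandth_output() {
    std::mt19937 gen;     // default seed
    gen.discard(9999);    // skip the first 9999 outputs
    return static_cast<std::uint32_t>(gen());
}
```

Any conforming build, 32-bit or 64-bit, should return 4123659995 here, because only integer arithmetic is involved.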

g++ -m32 -msse2 -mfpmath=sse produces identical output to all the other compilers. 32-bit and 64-bit builds differ because 64-bit code uses SSE for floating-point math, so double temporaries are always 64 bits wide. 32-bit x86 g++ defaults to the legacy x87 FPU, where everything is 80 bits wide internally and is only rounded down to a 64-bit double when stored to memory.
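
To see which mode a particular build is in, the FLT_EVAL_METHOD macro from <cfloat> reports how the compiler evaluates floating-point expressions. This is only a sketch: eval_method_description is a hypothetical helper name. I'd expect a default g++ -m32 build to report 2 and an SSE build to report 0, but treat that as an expectation to check on your own toolchain.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes the value of FLT_EVAL_METHOD for this build.
std::string eval_method_description() {
    switch (FLT_EVAL_METHOD) {
        case 0:
            return "0: operations are evaluated in the precision of their own type";
        case 1:
            return "1: float and double are evaluated as double";
        case 2:
            return "2: everything is evaluated as long double";
        default:
            return "other: indeterminate or implementation-defined";
    }
}
```

Printing eval_method_description() from the 32-bit and 64-bit binaries is a cheap way to confirm whether extended-precision temporaries are in play before comparing the random numbers themselves.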

Note that bit-identical FP results in general are NOT guaranteed with different compilers, even on the same platform.

32bit clang still uses SSE math by default, so it gets identical results to 64bit clang or 64bit g++. Telling g++ to do the same solves the problem. -mfpmath=sse tells it to do calculations with SSE (although it doesn't change the ABI, so floating point return values are still in x87 st(0).) -msse2 tells g++ to assume the target machine supports SSE and SSE2. (sse2 added double-precision to sse's single-precision. SSE2 is baseline in the x86-64 architecture, and used to pass/return FP args in the 64bit ABI.)

Without SSE, you could (but don't) use -ffloat-store to precisely follow the C standard and round intermediate results to 32 or 64bits by storing and re-loading them. This adds about 6 cycles of latency to every FP math instruction. (Compared to 3 cycle FP add, 5 cycle FP mul on Intel Haswell.) So don't do this, you'll get horrible code.


debugging steps: I tried it out on Ubuntu 15.10, with g++ 5.2, clang-3.5, and clang-3.8 (from http://llvm.org/apt/).

for i in ./boost-random-seedint*; do echo -ne "$i:\t" ; $i|md5sum ;done
./boost-random-seedint-g++32:           53d99523ca2afeac428eae2c89e69974  -
./boost-random-seedint-g++64:           a59f08c0bc22b8753c474db077b809bd  -
./boost-random-seedint-clang3.5-32:     a59f08c0bc22b8753c474db077b809bd  -
./boost-random-seedint-clang3.5-64:     a59f08c0bc22b8753c474db077b809bd  -
./boost-random-seedint-clang3.8-32:     a59f08c0bc22b8753c474db077b809bd  -
./boost-random-seedint-clang3.8-64:     a59f08c0bc22b8753c474db077b809bd  -

So the only outlier is 32bit g++. All the other outputs have the same hash.

Compiler options:

clang++-3.8 -m32 -O1 -g boost-random-seedint.cpp -o boost-random-seedint-clang3.8-32  # and similar
g++ -m32 -Og -g boost-random-seedint.cpp -o boost-random-seedint32

clang doesn't have a -Og. 32bit g++ with -O0 or -O3 makes binaries that give the same output as the one built with -Og.


Debugging the 32 and 64bit binaries: their state arrays are identical after the default seed and after the call to gen.seed(4294653137UL).
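
If you'd rather not compare the state in a debugger, the engine's stream operator writes out its whole state as text, which you can print from each binary and diff or hash. A minimal sketch, again using std::mt19937 as a stand-in for boost::mt19937 (an assumption, so it builds without Boost); dump_state is a hypothetical helper name.

```cpp
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper: serialises the generator's internal state to text
// using the engine's own operator<<.
std::string dump_state(const std::mt19937& gen) {
    std::ostringstream os;
    os << gen;
    return os.str();
}
```

Two generators that hold the same seed and have produced the same number of outputs give the same string, so piping this output through md5sum works the same way as hashing the program output above.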