I've encountered something a little confusing while trying to deal with a floating-point arithmetic problem.
First, the code. I've distilled the essence of my problem into this example:
#include <iostream>
#include <iomanip>
using namespace std;
typedef union {long long ll; double d;} bindouble;
int main(int argc, char** argv) {
bindouble y, z, tau, xinum, xiden;
y.d = 1.0d;
z.ll = 0x3fc5f8e2f0686eee; // double 0.17165791262311053
tau.ll = 0x3fab51c5e0bf9ef7; // double 0.053358253178712838
// xinum = double 0.16249854626123722 (0x3fc4ccc09aeb769a)
xinum.d = y.d * (z.d - tau.d) - tau.d * (z.d - 1);
// xiden = double 0.16249854626123725 (0x3fc4ccc09aeb769b)
xiden.d = z.d * (1 - tau.d);
cout << hex << xinum.ll << endl << xiden.ll << endl;
}
xinum
and xiden
should have the same value (when y == 1
), but because of floating-point roundoff error they don't. That part I get.
The question came up when I ran this code (actually, my real program) through GDB to track down the discrepancy. If I use GDB to reproduce the evaluations done in the code, it gives a different result for xiden:
$ gdb mathtest
GNU gdb (Gentoo 7.5 p1) 7.5
...
This GDB was configured as "x86_64-pc-linux-gnu".
...
(gdb) break 16
Breakpoint 1 at 0x4008ef: file mathtest.cpp, line 16.
(gdb) run
Starting program: /home/diazona/tmp/mathtest
...
Breakpoint 1, main (argc=1, argv=0x7fffffffd5f8) at mathtest.cpp:16
16 cout << hex << xinum.ll << endl << xiden.ll << endl;
(gdb) print xiden.d
$1 = 0.16249854626123725
(gdb) print z.d * (1 - tau.d)
$2 = 0.16249854626123722
You'll notice that if I ask GDB to calculate z.d * (1 - tau.d)
, it gives 0.16249854626123722 (0x3fc4ccc09aeb769a), whereas the actual C++ code that calculates the same thing in the program gives 0.16249854626123725 (0x3fc4ccc09aeb769b). So GDB must be using a different evaluation model for floating-point arithmetic. Can anyone shed some more light on this? How is GDB's evaluation different from my processor's evaluation?
I did look at this related question asking about GDB evaluating sqrt(3)
to 0, but this shouldn't be the same thing because there are no function calls involved here.
Could be because the x86 FPU works in registers to 80 bits accuracy, but rounds to 64 bits when the value is stored to memory. GDB will be storing to memory on every step of the (interpreted) computation.
GDB's runtime expression evaluation system is certainly not guaranteed to execute the same effective machine code for your floating point operations as the optimized and reordered machine code generated by your compiler to calculate the result of the same symbolic expression. Indeed, it is guaranteed not to execute the same machine code to calculate the value of the given expression z.d * (1 - tau.d)
, as this may be considered a subset of your program for which the isolated expression evaluation is performed at runtime in some arbitrary, "symbolically correct" way.
Floating-point code generation and realization of its output by the CPU is particularly prone to symbolic inconsistency with other implementations (such as a runtime expression evaluator) due to optimization (substitution, reordering, subexpression elimination, etc.), choice of instructions, choice of register allocation, and the floating-point environment. If your snippet contains many automatic variables in temporary expressions (as yours does), the code generation has an especially large amount of freedom with even zero optimization passes, and with that freedom comes the chance of-- in this case-- losing precision in the least-significant bit in a manner that appears inconsistent.
You won't get much insight into why GDB's runtime evaluator executed whatever instructions that it did w/o deep insight into the GDB source code, build settings, and its own compile-time generated code.
You could peak at the generated assembly for your procedure for an idea of how the final stores into z
, tau
, and [in contrast] xiden
work. The data flow for the floating-point operations leading to those stores is probably not as it seems.
Much easier, try making the code generation more deterministic by disabling all compiler optimization (e.g., -O0
on GCC) and rewriting the floating-point expressions to use no temporaries / automatic variables. Then break on every line in GDB and compare.
I wish that I could tell you exactly why that least-significant bit of the mantissa is flipped, but the truth is, the processor doesn't even "know" why something carried a bit and something else did not due to, e.g., order of evaluation without a complete instruction and data trace of both your code and GDB itself.
Its not GDB vs the processor, it's the memory vs the processor. The x64 processor stores more bits of accuracy than the memory actually holds (80ish vs 64 bits). As long as it stays in the CPU and registers, it retains 80ish bits of accuracy, but when it gets sent to memory will determine when and therefore how it gets rounded. If GDB sends all intermittent calculation results out of the CPU (I have no idea if this is the case, or anywhere close), it will do the rounding at every step, which leads to slightly different results.