Do FMA (fused multiply-add) instructions always pr

Posted 2020-07-06 03:20

Question:

I have this assembly (AT&T syntax):

mulsd   %xmm0, %xmm1
addsd   %xmm1, %xmm2

I want to replace it with:

vfmadd231sd %xmm0, %xmm1, %xmm2

Will this transformation always leave equivalent state in all involved registers and flags? Or will the resulting floats differ slightly in some way? (If they differ, why is that?)

(About the FMA instructions: http://en.wikipedia.org/wiki/FMA_instruction_set)

Answer 1:

No. In fact, a major part of the benefit of fused multiply-add is that it does not (necessarily) produce the same result as a separate multiply and add.

As a (somewhat contrived) example, suppose that we have:

double a = 1 + 0x1.0p-52; // 1 + 2**-52
double b = 1 - 0x1.0p-52; // 1 - 2**-52

and we want to compute a*b - 1. The "mathematically exact" value of a*b - 1 is:

(1 + 2**-52)(1 - 2**-52) - 1 = 1 + 2**-52 - 2**-52 - 2**-104 - 1 = -2**-104

but if we first compute a*b using multiplication, the exact product 1 - 2**-104 is not representable as a double and rounds to 1.0, so the subsequent subtraction of 1.0 produces a result of zero.

If we use fma(a,b,-1) instead, we eliminate the intermediate rounding of the product, which allows us to get the "real" answer, -0x1.0p-104.

Please note that not only do we get a different result, but different flags have been set as well; a separate multiply and subtract sets the inexact flag, whereas the fused multiply-add does not set any flags.