Is SSE floating-point arithmetic reproducible?

Posted 2020-01-27 04:37

The x87 FPU is notable for using an internal 80-bit precision mode, which often leads to unexpected and unreproducible results across compilers and machines. In my search for reproducible floating-point math on .NET, I discovered that both major implementations of .NET (Microsoft's and Mono) emit SSE instructions rather than x87 in 64-bit mode.

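SSE(2) performs 32-bit float arithmetic strictly in 32-bit precision and 64-bit float arithmetic strictly in 64-bit precision (the XMM registers themselves are 128 bits wide, but each operation rounds to the width of its operands). Denormals can optionally be flushed to zero by setting the FTZ and DAZ bits in the MXCSR control register.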

It would therefore appear that SSE does not suffer from the precision-related issues of x87, and that the only variable is the denormal behavior, which can be controlled.
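To make the denormal setting concrete, here is a minimal sketch that emulates its effect in software instead of touching the real control register. It is written in Python, whose `float` is an IEEE 754 double on mainstream platforms. `flush_to_zero` and `halve_then_double` are hypothetical helpers introduced for illustration, not part of any SSE or .NET API, and they only mimic flushing a single intermediate result.

```python
import math
import sys

# Smallest positive *normal* double; any nonzero magnitude below it is denormal.
SMALLEST_NORMAL = sys.float_info.min


def flush_to_zero(x: float) -> float:
    """Hypothetical helper: mimic FTZ by replacing a denormal with a signed zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x


def halve_then_double(x: float, flush: bool) -> float:
    """Compute (x / 2) * 2, optionally flushing the intermediate result."""
    half = x / 2.0
    if flush:
        half = flush_to_zero(half)
    return half * 2.0


if __name__ == "__main__":
    # With gradual underflow the round trip is exact; with flushing the value is lost.
    print(halve_then_double(SMALLEST_NORMAL, flush=False) == SMALLEST_NORMAL)
    print(halve_then_double(SMALLEST_NORMAL, flush=True))
```

Two machines running the same instructions with different denormal settings could diverge in exactly this way, so the setting has to be fixed consistently for results to match.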

Leaving aside the matter of transcendental functions (which, unlike on x87, are not natively provided by SSE), does using SSE guarantee reproducible results across machines and compilers? Could compiler optimizations, for instance, change the results? I found some conflicting opinions:

If you have SSE2, use it and live happily ever after. SSE2 supports both 32b and 64b operations and the intermediate results are of the size of the operands. - Yossi Kreinin, http://www.yosefk.com/blog/consistency-how-to-defeat-the-purpose-of-ieee-floating-point.html

...

The SSE2 instructions (...) are fully IEEE754-1985 compliant, and they permit better reproducibility (thanks to the static rounding precision) and portability with other platforms. - Muller et al., Handbook of Floating-Point Arithmetic, p. 107

However:

Also, you can't use SSE or SSE2 for floating point, because it's too under-specified to be deterministic. - John Watte http://www.gamedev.net/topic/499435-floating-point-determinism/#entry4259411

2 answers
神经病院院长
#2 · 2020-01-27 05:23

SSE is fully specified*. Muller is an expert in floating point arithmetic; who are you going to trust, him or some guy on a gamedev forum?

(*) There are actually a few exceptions for non-IEEE-754 operations like rsqrtss, where Intel never fully specified the behavior, but that doesn't affect the IEEE-754 basic operations. More importantly, their behavior can't actually change at this point, because that would break binary compatibility for too many things, so they're as good as specified.

Juvenile、少年°
#3 · 2020-01-27 05:26

As Stephen noted, results produced by a given piece of SSE assembly code will be reproducible; you feed the same code the same input and you get the same output at the end. (That is, John Watte's quote is flat-out wrong.)

You threw the word "compilers" in there, though. That's a different ball game entirely. Many compilers are still quite bad at preserving the correctness of floating-point code. (The ATLAS errata page mentions that clang "fails to produce correct code for some operations.") If you use special functions in your code, you're also, to some extent, at the mercy of whoever implemented your math library.
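One concrete way a compiler can change a result without any hardware misbehaving is by reassociating additions, which optimizers may do under relaxed floating-point settings (GCC's `-ffast-math` is one example). The sketch below is a Python illustration, assuming `float` is an IEEE 754 double as it is on mainstream platforms. The function names are hypothetical, and the two functions stand in for two orderings a compiler might emit for the same source expression.

```python
def add_left_first(a: float, b: float, c: float) -> float:
    """Evaluate (a + b) + c, rounding after each addition."""
    return (a + b) + c


def add_right_first(a: float, b: float, c: float) -> float:
    """Evaluate a + (b + c), the reassociated form of the same sum."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 1e16, -1e16, 1.0
    # (a + b) is exactly 0.0, so adding c gives 1.0.
    print(add_left_first(a, b, c))
    # (b + c) is not representable and rounds back to -1e16, so the sum is 0.0.
    print(add_right_first(a, b, c))
```

Every individual addition here is correctly rounded and fully reproducible. The difference comes only from the order of operations, which is a decision made by the compiler rather than by the SSE unit.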
