
Why do x86 FP compares set CF like unsigned intege

Posted 2020-04-13 04:17

Question:

The following documentation is provided in the Intel Instruction Reference for the COMISD instruction:

Compares the double-precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal).

The purpose of CF is not really clear here, since it is normally associated with arithmetic operations on unsigned integers. By contrast, the documentation concerns floating-point values, which are signed by definition. I ran a few experiments, like

mov rax, 0x123
movq xmm0, rax

mov rax, 0x124
movq xmm1, rax

ucomisd xmm0, xmm1 ;CF is set here like if
                    ;we would compare uints 0x123 and 0x124

So the instruction treats the operands as unsigned integers when setting the carry flag, even though the operands are double-precision floating-point values?

To me it looks a bit strange.

Answer 1:

Modern x86 SSE/AVX scalar FP compares set EFLAGS the same way as the original 8086 + 8087 sequence fcom + fstsw ax (see Footnote 1) + sahf.

  • fcom since the 8087 (the 8086's FP coprocessor)
  • fcomi new in PPro, sets EFLAGS directly
  • [u]comis[sd] new in SSE/SSE2, also sets EFLAGS directly.

After ruling out "unordered", the "above" (>), "below" (<), and "equal" (==) conditions for jcc/setcc/cmovcc/fcmovcc all have the appropriate semantic meaning. (And combinations of them like jae.)

Keeping the flag-setting the same made it easier for programmers and compiler developers to drop in scalar SSE code in place of scalar x87 code, without having to redo any logic about which way unordered compares (PF=ZF=CF=1) would go. Tricks like ja (CF==0 and ZF==0) being taken only for > (not for unordered, equal, or below) still work identically with the same branches.

See http://www.ray.masmcode.com/tutorial/fpuchap7.htm for x87 FP comparisons. Also related: "x86 assembler: floating point compare", for more about flag-setting and how you can sometimes get away without a jp to rule out the unordered case.

Note that the SIMD compare instructions like cmppd and cmpsd that produce a mask still use lt for less than in the names of their comparison predicates. (Since AVX, there are more detailed predicate names like LT_OQ (ordered, quiet: a QNaN doesn't raise an exception) vs. LT_OS (ordered, signaling: a QNaN has its usual effect) vs. NLT_US (not-less-than, unordered: also true when the comparison is unordered).) Since they have to produce an all-zeros or all-ones result for each element, those SIMD compare instructions take a predicate to check, instead of just doing a compare and setting flags.


Also, unsigned conditions (CF) allow more optimizations. So changing to signed conditions would have been worse.

x86 has more instructions that do things with CF than with any other flag. For example, you can do tmp += (x > 10) with ucomisd / adc eax, 0 (with the operands ordered so that CF is set for the case you want to count; note that CF is also set for unordered). If SSE/SSE2 had instead set SF (and cleared OF), you'd need sets or another setcc to feed an add instruction.


Why did x87 use ja/jbe instead of jg/jle?

OF is outside the low 8 bits of FLAGS, so sahf can't set it. And using popf to set the whole FLAGS register could set or clear other non-condition flags like IF (interrupts enabled) or TF (single-step trap after every instruction). It's also generally less convenient to use because it goes through the stack and modifies SP.

Signed flag-conditions are based on SF!=OF or SF==OF, so it was impossible for the original 8086 FP branching mechanism to have used signed conditions. Instead, they lined up the C0, C2, and C3 bits in the FP status word with CF, PF, and ZF in FLAGS. This answer has an ASCII-art diagram.


Footnote 1: Actually fstsw ax was new in 286, according to NASM's appendix B. In actual 8086+8087 code, you'd use something like fstsw [bp-2] / mov ax, [bp-2] / sahf or whatever scratch space you wanted to use.


"So the instruction treats the operands as unsigned integers"

No, definitely not. They are interpreted as sign/magnitude IEEE binary64 FP bit-patterns.

Unsigned integer comparison would give a different result for negative floating point numbers: High bit set => higher unsigned integer, but represents a negative FP value.

With the high bit set, 0x8...4 is unsigned-integer above 0x8...3, but as an FP bit pattern, it represents a more-negative (lower) number.

Forget about the "unsigned" association of the "above" / "below" conditions when using them for FP. That's just what x86 calls the conditions that test the carry flag.

FP comparisons set the carry flag by a completely different mechanism than actual integer subtraction.



Answer 2:

You should look at the "Operation" section found lower on the page. To summarize:

  • PF is set when unordered, clear if ordered
  • ZF is set if equal or unordered, clear if not equal
  • CF is set if less than or unordered, clear if greater than or equal

And yes, the bit-patterns are interpreted as IEEE 754 binary64 double-precision floating point numbers.

0x123 and 0x124 are the bit-patterns for positive subnormal values very close to 0.0. If DAZ were set in MXCSR, they would be interpreted as exactly 0.0. But by default, subnormals are handled as per IEEE 754.


If you want to test that it's really a sign/magnitude FP compare, instead of an unsigned integer compare, test with negative FP values. (High bit set => higher unsigned integer, but represents a more-negative lower FP value).

Fun fact: other than the sign bit, the biased-exponent representation that IEEE FP uses does make it possible to compare the integer bit-patterns of FP numbers. And to implement nextafter as an integer increment of the bit-pattern, after checking the sign to figure out whether to add +1 or -1.