What does ordered / unordered comparison mean?

Posted 2019-01-23 11:13

Looking at the SSE operators

CMPORDPS - ordered compare packed singles
CMPUNORDPS - unordered compare packed singles

What do ordered and unordered mean? I looked for equivalent instructions in the x86 instruction set, and it only seems to have unordered (FUCOM).

Tags: assembly x86 sse
4 answers
来,给爷笑一个
Reply #2 · 2019-01-23 11:49

TL;DR: "Unordered" is a relation two FP values can have. The "Unordered" in FUCOM means it doesn't raise an FP exception when the comparison result is unordered because of a quiet NaN, while FCOM does. This is the same as the distinction between the OQ and OS cmpps predicates.


ORD and UNORD are two choices of predicate for the cmppd / cmpps / cmpss / cmpsd instructions (the full tables are in the cmppd entry, which is alphabetically first). That HTML extract has readable table formatting, but Intel's official PDF original is somewhat better. (See the tag wiki for links.)

Two floating point operands are ordered with respect to each other if neither is NaN. They're unordered if either is NaN. i.e. ordered = (x>y) | (x==y) | (x<y);. That's right, with floating point it's possible for none of those things to be true. For more Floating Point madness, see Bruce Dawson's excellent series of articles.
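As a minimal sketch of that definition in standard C++: the helper names is_ordered, is_unordered and demo_ordered_relation are made up for illustration, and the code assumes IEEE 754 comparison semantics (so it will not hold when compiled with options like -ffast-math).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered: one of <, ==, > holds. Spelled out exactly as in the text.
bool is_ordered(double x, double y) {
    return (x > y) | (x == y) | (x < y);
}

// Unordered: at least one operand is NaN.
bool is_unordered(double x, double y) {
    return std::isnan(x) || std::isnan(y);
}

// Hypothetical self-check showing the relation on a few values.
void demo_ordered_relation() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(is_ordered(1.0, 2.0));
    assert(!is_ordered(nan, 1.0));
    assert(!is_ordered(nan, nan));          // NaN is unordered even with itself
    assert(!(nan == nan) && (nan != nan));  // == is false, != is true
    // The standard library offers the same test as std::isunordered.
    assert(is_unordered(nan, 1.0) == std::isunordered(nan, 1.0));
}
```

The two helpers are exact complements of each other for every pair of inputs, including infinities and signed zeros.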

cmpps takes a predicate and produces a vector of results, instead of doing a comparison between two scalars and setting flags so you can check any predicate you want after the fact. So it needs specific predicates for everything you can check.
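Because the result is a per-lane mask rather than flags, it is usually consumed with bitwise operations. A small sketch, assuming an x86 target with SSE; the function names zero_nan_lanes and demo_zero_nan_lanes are hypothetical:

```cpp
#include <cassert>
#include <cmath>
#include <xmmintrin.h>  // SSE: _mm_cmpord_ps, _mm_and_ps

// Zero every lane of the input that holds NaN.
// cmpordps(v, v) yields all-ones where the lane is not NaN and all-zeros
// where it is, so ANDing with the mask keeps ordered lanes and clears the rest.
void zero_nan_lanes(const float in[4], float out[4]) {
    __m128 v = _mm_loadu_ps(in);
    __m128 mask = _mm_cmpord_ps(v, v);
    _mm_storeu_ps(out, _mm_and_ps(v, mask));
}

// Hypothetical self-check.
void demo_zero_nan_lanes() {
    const float in[4] = {1.0f, NAN, -2.5f, NAN};
    float out[4];
    zero_nan_lanes(in, out);
    assert(out[0] == 1.0f && out[1] == 0.0f);
    assert(out[2] == -2.5f && out[3] == 0.0f);
}
```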


The scalar equivalent is comiss / ucomiss, which set ZF/PF/CF from the FP comparison result. They work like the x87 compare instructions (see the last section of this answer), but on the low element of XMM registers.

To check for unordered, look at PF. If the comparison is ordered, you can look at the other flags to see whether the operands were greater, equal, or less (using the same conditions as for unsigned integers, like jae for Above or Equal).
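The flag encoding can be modelled in plain C++. This is only a sketch of the documented results (unordered sets ZF, PF and CF; greater clears all three; less sets CF; equal sets ZF), not real flag access; the struct Flags and the functions compare_flags, jp, ja, jae, jbe and demo_compare_flags are illustrative names.

```cpp
#include <cassert>
#include <cmath>

// Model of the EFLAGS bits written by comiss / ucomiss.
struct Flags { bool zf, pf, cf; };

Flags compare_flags(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) return {true, true, true};  // unordered
    if (a > b) return {false, false, false};                        // greater
    if (a < b) return {false, false, true};                         // less
    return {true, false, false};                                    // equal
}

// Conditions as the corresponding conditional jumps would evaluate them.
bool jp(Flags f)  { return f.pf; }            // parity set: unordered
bool ja(Flags f)  { return !f.cf && !f.zf; }  // above
bool jae(Flags f) { return !f.cf; }           // above or equal
bool jbe(Flags f) { return f.cf || f.zf; }    // below or equal

// Hypothetical self-check.
void demo_compare_flags() {
    Flags u = compare_flags(NAN, 1.0f);
    assert(jp(u));
    assert(!ja(u) && !jae(u));  // "above" conditions are false on unordered
    assert(jbe(u));             // "below or equal" is true on unordered
    assert(ja(compare_flags(2.0f, 1.0f)));
    assert(jae(compare_flags(1.0f, 1.0f)) && !ja(compare_flags(1.0f, 1.0f)));
}
```

The model shows why PF has to be checked first when "below" style conditions are used: an unordered result looks like "below or equal" if only CF and ZF are inspected.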


The COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid numeric exception only if a source operand is an SNaN.

Normally FP exceptions are masked, so this doesn't actually interrupt your program; it just sets the bit in the MXCSR which you can check later.
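A sketch of "check the sticky bit later" using the portable <cfenv> interface instead of reading the MXCSR directly. It assumes a typical toolchain where scalar double math sets the flags that std::fetestexcept reports, and it uses 0.0 / 0.0 as the invalid operation because compilers do not reliably preserve the comiss / ucomiss distinction for C++ comparison operators. The function names are hypothetical.

```cpp
#include <cassert>
#include <cfenv>

// Returns true if an invalid operation left its sticky flag set.
// The exception is masked by default, so execution simply continues.
bool invalid_flag_set_by_zero_over_zero() {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double zero = 0.0;       // volatile keeps the division at run time
    volatile double r = zero / zero;  // 0/0 is an invalid operation, result is NaN
    (void)r;
    return std::fetestexcept(FE_INVALID) != 0;
}

// Hypothetical self-check.
void demo_invalid_flag() {
    assert(invalid_flag_set_by_zero_over_zero());
}
```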

This is the same as the OQ/UQ vs. OS/US flavours of predicate for cmpps / vcmpps. The AVX versions of the cmp[ps][sd] instructions have an expanded choice of predicates, so they needed a naming convention to keep track of them.

The O vs. U tells you whether the predicate is true when the operands are unordered.

The Q vs. S tells you whether #I will be raised if either operand is a Quiet NaN. #I will always be raised if either operand is a Signalling NaN, but those are not "naturally occurring". You don't get them as outputs from other operations, only by creating the bit pattern yourself (e.g. as an error-return value from a function, to ensure detection of problems later).
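The O vs. U half of the naming scheme can be sketched as a scalar truth model. This only models the result of each predicate, not the Q vs. S exception behaviour; the function names are made up, and the predicate names in the comments are the ones from Intel's AVX predicate table.

```cpp
#include <cassert>
#include <cmath>

bool unordered(float a, float b) { return std::isnan(a) || std::isnan(b); }

// Ordered predicates: false whenever the operands are unordered.
bool eq_o(float a, float b)  { return !unordered(a, b) && a == b; }  // EQ_OQ / EQ_OS
bool neq_o(float a, float b) { return !unordered(a, b) && a != b; }  // NEQ_OQ / NEQ_OS

// Unordered predicates: true whenever the operands are unordered.
bool eq_u(float a, float b)  { return unordered(a, b) || a == b; }   // EQ_UQ / EQ_US
bool neq_u(float a, float b) { return unordered(a, b) || a != b; }   // NEQ_UQ / NEQ_US

// Hypothetical self-check.
void demo_predicates() {
    assert(!eq_o(NAN, 1.0f) && !neq_o(NAN, 1.0f));
    assert(eq_u(NAN, 1.0f) && neq_u(NAN, 1.0f));
    // With ordered operands the O and U flavours agree.
    assert(eq_o(1.0f, 1.0f) && eq_u(1.0f, 1.0f));
    assert(neq_o(1.0f, 2.0f) && neq_u(1.0f, 2.0f));
    // C++ == behaves like the ordered EQ, and != like the unordered NEQ.
    assert((NAN == NAN) == eq_o(NAN, NAN));
    assert((NAN != NAN) == neq_u(NAN, NAN));
}
```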


The x87 equivalent is using fcom or fucom to set the FPU status word -> fstsw ax -> sahf, or preferably fucomi to set EFLAGS directly like comiss.

The U / non-U distinction is the same for the x87 instructions as for comiss / ucomiss.

在下西门庆
Reply #3 · 2019-01-23 11:51

An ordered comparison checks if neither operand is NaN. Conversely, an unordered comparison checks if either operand is a NaN.

This page gives some more information on this:

The idea here is that comparisons with NaN are indeterminate (the result can't be decided), so an ordered/unordered comparison checks whether this is (or isn't) the case.

#include <emmintrin.h>   // SSE2: __m128d, _mm_cmpord_pd, _mm_cmpunord_pd
#include <cstdint>
#include <iostream>
using namespace std;

// Low 64-bit lane of a compare mask as a signed integer.
// (Portable replacement for the MSVC-only union member .m128i_i64[0].)
static int64_t lane0(__m128d m) {
    int64_t v;
    _mm_storel_epi64(reinterpret_cast<__m128i*>(&v), _mm_castpd_si128(m));
    return v;
}

int main() {
    double a = 0.;
    double b = 0.;

    __m128d x = _mm_set1_pd(a / b);     //  NaN
    __m128d y = _mm_set1_pd(1.0);       //  1.0
    __m128d z = _mm_set1_pd(1.0);       //  1.0

    __m128d c0 = _mm_cmpord_pd(x,y);    //  NaN vs. 1.0
    __m128d c1 = _mm_cmpunord_pd(x,y);  //  NaN vs. 1.0
    __m128d c2 = _mm_cmpord_pd(y,z);    //  1.0 vs. 1.0
    __m128d c3 = _mm_cmpunord_pd(y,z);  //  1.0 vs. 1.0
    __m128d c4 = _mm_cmpord_pd(x,x);    //  NaN vs. NaN
    __m128d c5 = _mm_cmpunord_pd(x,x);  //  NaN vs. NaN

    // An all-ones mask prints as -1, an all-zeros mask as 0.
    cout << lane0(c0) << endl;
    cout << lane0(c1) << endl;
    cout << lane0(c2) << endl;
    cout << lane0(c3) << endl;
    cout << lane0(c4) << endl;
    cout << lane0(c5) << endl;
}

Result:

0
-1
-1
0
0
-1
  • Ordered comparison of NaN and 1.0 gives false.
  • Unordered comparison of NaN and 1.0 gives true.
  • Ordered comparison of 1.0 and 1.0 gives true.
  • Unordered comparison of 1.0 and 1.0 gives false.
  • Ordered comparison of NaN and NaN gives false.
  • Unordered comparison of NaN and NaN gives true.
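In practice the mask is often reduced to an integer so it can be branched on. A short sketch, assuming SSE2; nan_lane_mask and demo_nan_lane_mask are hypothetical names. Comparing a vector with itself is unordered exactly in the lanes that hold NaN, and _mm_movemask_pd collects the top bit of each lane.

```cpp
#include <cassert>
#include <cmath>
#include <emmintrin.h>  // SSE2: _mm_cmpunord_pd, _mm_movemask_pd

// Bit i of the result is set when lane i of v is NaN.
int nan_lane_mask(__m128d v) {
    return _mm_movemask_pd(_mm_cmpunord_pd(v, v));
}

// Hypothetical self-check. Note that _mm_set_pd takes the high lane first.
void demo_nan_lane_mask() {
    assert(nan_lane_mask(_mm_set_pd(2.0, 1.0)) == 0);
    assert(nan_lane_mask(_mm_set_pd(NAN, 1.0)) == 2);  // only lane 1 is NaN
    assert(nan_lane_mask(_mm_set1_pd(NAN)) == 3);      // both lanes are NaN
}
```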
神经病院院长
Reply #4 · 2019-01-23 12:01

Perhaps this page on Visual C++ intrinsics can be of help? :)

CMPORDPS

r0 := (a0 ord? b0) ? 0xffffffff : 0x0
r1 := (a1 ord? b1) ? 0xffffffff : 0x0
r2 := (a2 ord? b2) ? 0xffffffff : 0x0
r3 := (a3 ord? b3) ? 0xffffffff : 0x0

CMPUNORDPS

r0 := (a0 unord? b0) ? 0xffffffff : 0x0
r1 := (a1 unord? b1) ? 0xffffffff : 0x0
r2 := (a2 unord? b2) ? 0xffffffff : 0x0
r3 := (a3 unord? b3) ? 0xffffffff : 0x0
Melony?
Reply #5 · 2019-01-23 12:05

This Intel guide: http://intel80386.com/simd/mmx2-doc.html contains examples of the two which are fairly straightforward:

CMPORDPS Compare Ordered Parallel Scalars

Opcode        Cycles   Instruction
0F C2 .. 07   2 (3)    CMPORDPS xmm reg,xmm reg/mem128

CMPORDPS op1, op2

op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values

op1[0] = (op1[0] != NaN) && (op2[0] != NaN)
op1[1] = (op1[1] != NaN) && (op2[1] != NaN)
op1[2] = (op1[2] != NaN) && (op2[2] != NaN)
op1[3] = (op1[3] != NaN) && (op2[3] != NaN)

TRUE  = 0xFFFFFFFF
FALSE = 0x00000000

CMPUNORDPS Compare Unordered Parallel Scalars

Opcode        Cycles   Instruction
0F C2 .. 03   2 (3)    CMPUNORDPS xmm reg,xmm reg/mem128

CMPUNORDPS op1, op2

op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values

op1[0] = (op1[0] == NaN) || (op2[0] == NaN)
op1[1] = (op1[1] == NaN) || (op2[1] == NaN)
op1[2] = (op1[2] == NaN) || (op2[2] == NaN)
op1[3] = (op1[3] == NaN) || (op2[3] == NaN)

TRUE  = 0xFFFFFFFF
FALSE = 0x00000000

The difference is AND (ordered) vs OR (unordered).
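Note that the guide's "!= NaN" and "== NaN" are pseudocode: in real code a comparison against NaN never detects NaN, so the test has to be written with std::isnan (or x != x). A scalar reference sketch of both operations follows. The names cmpordps_ref, cmpunordps_ref and demo_reference are hypothetical, and unlike the real instructions, which overwrite op1, these return a new array.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<float, 4>;
using Mask  = std::array<std::uint32_t, 4>;

// Per lane: all-ones if neither operand is NaN, else zero.
Mask cmpordps_ref(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (!std::isnan(a[i]) && !std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Per lane: all-ones if either operand is NaN, else zero.
Mask cmpunordps_ref(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (std::isnan(a[i]) || std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Hypothetical self-check.
void demo_reference() {
    const Lanes a{1.0f, NAN, 3.0f, NAN};
    const Lanes b{1.0f, 2.0f, NAN, NAN};
    const Mask ord = cmpordps_ref(a, b);
    const Mask unord = cmpunordps_ref(a, b);
    assert((ord == Mask{0xFFFFFFFFu, 0u, 0u, 0u}));
    assert((unord == Mask{0u, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu}));
    // By De Morgan's law each mask is the bitwise complement of the other.
    for (std::size_t i = 0; i < 4; ++i)
        assert(ord[i] == static_cast<std::uint32_t>(~unord[i]));
}
```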
