I'm trying to write GCC inline asm for CMPXCHG8B on ia32. No, I cannot use __sync_bool_compare_and_swap. It has to work with and without -fPIC.
So far the best I've come up with (EDIT: it does not work after all, see my own answer below for details) is
register int32 ebx_val asm("ebx") = set & 0xFFFFFFFF;
asm ("lock; cmpxchg8b %0;"
     "setz %1;"
     : "+m" (*a), "=q" (ret), "+A" (*cmp)
     : "r" (ebx_val), "c" ((int32)(set >> 32))
     : "flags");
However, I'm not sure whether this is in fact correct.
I cannot use "b" ((int32)(set & 0xFFFFFFFF)) for ebx_val because of PIC, but apparently a register asm("ebx") variable is accepted by the compiler.
BONUS: the ret variable is used for branching, so the code ends up looking like this:
cmpxchg8b [edi];
setz cl;
cmp cl, 0;
je foo;
Any idea how to describe output operands so that it becomes:
cmpxchg8b [edi]
jz foo
?
Thank you.
This is what I have:
It uses the asm goto feature, new with gcc 4.5, that allows jumps from inline assembly into C labels. (Oh, I see your comment about having to support old versions of gcc. Oh well. I tried. :-P)
How about the following, which seems to work for me in a small test:
If this also gets miscompiled could you please include a small snippet that triggers this behavior?
Regarding the bonus question, I don't think it is possible to branch after the assembler block using the condition code from the cmpxchg8b instruction (unless you use asm goto or similar functionality). From GNU C Language Extensions:
EDIT: I can't find any source that specifies one way or the other whether it is OK to modify the stack while also using the %N input values. (This ancient link says "You can even push your registers onto the stack, use them, and put them back." but the example doesn't have input.)
But it should be possible to do without by fixing the values to other registers:
Amazingly enough, the code fragment in the question still gets miscompiled in some circumstances: if the zero-th asm operand is indirectly addressable through EBX (PIC) before the EBX register is set up with register asm, then gcc proceeds to load the operand through EBX after EBX has been assigned set & 0xFFFFFFFF!
This is the code I am trying to make work now: (EDIT: avoid push/pop)
The idea here is to load the operands before clobbering EBX, and to avoid any indirect addressing while setting the EBX value for CMPXCHG8B. I fix the hard register ESI for the lower half of the operand because, if I didn't, GCC would feel free to reuse any other already-taken register if it could prove that the value was equal. The EDI register is saved manually, as simply adding it to the clobbered register list chokes GCC with "impossible reloads", probably due to high register pressure. PUSH/POP is avoided when saving EDI, as other operands might be ESP-addressed.
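A minimal sketch of the approach described above, not the original listing: the pointer is pinned to EDI and the low half of the new value to ESI before EBX is touched, and EBX is swapped with ESI around the instruction, so neither a stack access nor an EBX clobber entry is needed. The names cas64 and cas64_demo, the xchg-based save of EBX, and the fallback to __atomic_compare_exchange_n on non-x86 targets are assumptions made for illustration. It has not been checked against the old GCC versions discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__)
#define BX_REG "%%rbx"   /* full-width name so the upper half of RBX survives */
#elif defined(__i386__)
#define BX_REG "%%ebx"
#endif

/* cas64 is a hypothetical helper name, not taken from the original post.
 * Returns 1 on success; on failure returns 0 and stores the value that was
 * actually found in memory into *expected. */
static int cas64(volatile uint64_t *ptr, uint64_t *expected, uint64_t desired)
{
#if defined(BX_REG)
    uint32_t  old_lo = (uint32_t)*expected;          /* goes to EAX */
    uint32_t  old_hi = (uint32_t)(*expected >> 32);  /* goes to EDX */
    uint32_t  new_hi = (uint32_t)(desired >> 32);    /* goes to ECX */
    uintptr_t new_lo = (uint32_t)desired;            /* goes to ESI, then EBX */

    __asm__ __volatile__(
        "xchg " BX_REG ", %[lo]\n\t"   /* EBX = low half, ESI = saved EBX */
        "lock; cmpxchg8b (%[p])\n\t"   /* address is in EDI, never via EBX */
        "xchg " BX_REG ", %[lo]"       /* restore EBX; xchg leaves flags alone */
        : "+a"(old_lo), "+d"(old_hi), [lo] "+S"(new_lo)
        : [p] "D"(ptr), "c"(new_hi)
        : "memory", "cc");

    /* On failure CMPXCHG8B loads the current memory value into EDX:EAX,
     * so the swap succeeded exactly when EDX:EAX is unchanged. */
    uint64_t seen = ((uint64_t)old_hi << 32) | old_lo;
    int ok = (seen == *expected);
    *expected = seen;
    return ok;
#else
    /* Assumption: no CMPXCHG8B off x86, so fall back to the compiler builtin. */
    return __atomic_compare_exchange_n(ptr, expected, desired, 0,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
#endif
}

/* Small self-check: one swap that must succeed, one that must fail. */
static void cas64_demo(void)
{
    volatile uint64_t v = 0x1111111122222222ULL;
    uint64_t e = 0x1111111122222222ULL;

    assert(cas64(&v, &e, 0xAAAAAAAABBBBBBBBULL) == 1);
    assert(v == 0xAAAAAAAABBBBBBBBULL);

    e = 5;                                   /* stale expectation */
    assert(cas64(&v, &e, 7) == 0);
    assert(e == 0xAAAAAAAABBBBBBBBULL);      /* updated with the value seen */
}
```

Success is derived here by comparing EDX:EAX with the expected value in C rather than with setz, which keeps the operand list short when registers are scarce. On GCC 6 or newer, a flag output operand such as "=@ccz" should let the compiler branch on ZF directly, which would address the bonus question.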