I am interested in encapsulating a transactional xbegin and xend inside XBEGIN( ) and XEND( ) functions, in a static assembler lib. However I am unclear how (or if) the stack gets restored to the original xbegin calling state, given an xabort originating at some other stack level (higher or lower). In other words, is the dynamic stack context (including interrupts effects) managed and rolled back as just another part of the transaction?
This assembler approach is needed for a VC++ 2010 build that doesn't have _xbegin( ) and _xend( ) intrinsics supported or available, and x64 builds cannot use _asm { } inlining.
Related: see also David Kanter's TSX writeup for some theory on how it works under the hood and how software can benefit from it, and this blog post for some experimental performance numbers on Haswell (HSW), measured before the TSX bug was discovered and microcode updates disabled TSX on that hardware.
The Intel instruction-set reference manual entry for xbegin is pretty clear. (See the x86 tag wiki for links to Intel's official PDF, and other stuff.)
So the instruction works like a conditional branch, where the branch condition is "did an abort happen before XEND?"
If an abort happens, it's as if all the code that ran after xbegin didn't run at all: just a jump to the fallback address with eax modified. Register state, including the stack pointer, is restored to what it was at xbegin, and memory writes made inside the transaction (stack pushes included) are discarded, so it does not matter at what call depth the abort happened.
You might want to do something other than just retry forever on an abort, of course; a bare retry loop is not meant to be a real example. (This article does have a real example of the kind of logic you might want to use, using intrinsics. It looks like they just test eax instead of using the xbegin as the jump in an if, unless the compiler optimizes that check. IDK if it's the most efficient way.)
What do you mean "interrupts effects"? In current implementations, anything that changes privilege level (like a syscall or interrupt) causes a transaction abort. So ring-level changes never need to be rolled back. The CPU will just abort the transaction when it encounters anything it can't roll back. So the possible bugs are of the kind where something inside the transaction always causes an abort, not where you do something that can't be rolled back.
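To make the "something other than infinite retries" point concrete, here is a minimal sketch of the decision a fallback path might make from the status word left in eax. The bit positions follow Intel's documentation of the abort status; the names (kAbortRetry, decide_after_abort, xabort_code, and so on) and the retry-limit policy are hypothetical, chosen only for illustration. It is plain C++03 on purpose, on the assumption that an older compiler such as VC++ 2010 has to accept it.

```cpp
// Abort-status bits reported in EAX on the fallback path, per Intel's
// documentation of XBEGIN. Names are hypothetical, values are architectural.
static const unsigned int kAbortExplicit = 1u << 0;  // aborted by an XABORT instruction
static const unsigned int kAbortRetry    = 1u << 1;  // hardware hint: a retry may succeed
static const unsigned int kAbortConflict = 1u << 2;  // memory conflict with another logical CPU
static const unsigned int kAbortCapacity = 1u << 3;  // transactional buffering overflowed
static const unsigned int kAbortDebug    = 1u << 4;  // debug breakpoint was hit
static const unsigned int kAbortNested   = 1u << 5;  // abort happened in a nested transaction

enum AbortAction { RetryTransaction, TakeFallbackPath };

// The 8-bit argument of XABORT lives in bits 31:24 of the status.
// It is only meaningful when kAbortExplicit is set.
unsigned int xabort_code(unsigned int status)
{
    return (status >> 24) & 0xFFu;
}

// Assumed policy: retry only while the hardware says a retry may succeed
// and the caller-chosen attempt limit has not been reached. Anything else
// (capacity overflow, explicit abort, too many attempts) goes to the
// non-transactional fallback, e.g. taking a real lock.
AbortAction decide_after_abort(unsigned int status, int attempts_so_far, int max_attempts)
{
    if ((status & kAbortRetry) != 0 && attempts_so_far < max_attempts)
        return RetryTransaction;
    return TakeFallbackPath;
}
```

A wrapper's fallback path would pass the eax value it received to decide_after_abort and either jump back to xbegin or return a "not started" result to the caller; the limit of a few attempts is a tuning choice, not something the hardware dictates.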
You might want to try to get the compiler to emit the three-byte XEND instruction without a function call, so pushing the return address onto the stack isn't part of the transaction, e.g. by emitting the raw bytes with __asm _emit. I think this does still work in 64bit mode, since the doc mentions rax, and it looks like IACA's header file uses __asm _emit.
It'll be safer to put XEND in its own wrapper function, too, I guess. You just need a stop-gap until you can upgrade to a compiler with intrinsics, so it doesn't have to be perfect as long as the extra reads/writes from the ret and call don't cause too many aborts.