According to the Intel 64 and IA-32 Architectures Software Developer's Manual, the LOCK signal prefix "ensures that the processor has exclusive use of any shared memory while the signal is asserted". That can take the form of a bus lock or a cache lock.
But, and that's the reason I'm asking this question, it isn't clear to me whether this prefix also acts as a memory barrier.
I'm developing with NASM in a multi-processor environment and need to implement atomic operations with optional acquire and/or release semantics.
So, do I need to use the MFENCE, SFENCE and LFENCE instructions or would this be redundant?
No. From the IA32 manuals (Volume 3A, Chapter 8.2: Memory Ordering):
Therefore, a fence instruction is not needed with locked instructions.
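To illustrate the point above, here is a minimal sketch in C11 rather than NASM. The names counter_add, claim_once, shared_counter and claim_flag are hypothetical and not taken from the question. On x86-64, GCC and Clang typically compile these read-modify-write operations to a single LOCK-prefixed instruction (lock xadd and lock cmpxchg) with no separate MFENCE, SFENCE or LFENCE, because the locked instruction already orders the surrounding loads and stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state, for illustration only. */
static atomic_int shared_counter = 0;
static atomic_int claim_flag = 0;

/* Atomically add delta and return the previous value.
 * Typically emitted as `lock xadd` on x86-64; the locked
 * instruction gives acquire and release ordering by itself. */
int counter_add(int delta)
{
    return atomic_fetch_add_explicit(&shared_counter, delta,
                                     memory_order_acq_rel);
}

/* Try to change *flag from 0 to 1; returns true for exactly one caller.
 * Typically emitted as `lock cmpxchg` on x86-64. */
bool claim_once(atomic_int *flag)
{
    assert(flag != NULL);
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        flag, &expected, 1,
        memory_order_acq_rel,   /* ordering on success */
        memory_order_acquire);  /* ordering on failure */
}
```

Calling counter_add(5) on a fresh counter returns 0 and leaves the counter at 5; claim_once(&claim_flag) succeeds once and fails on every later call until the flag is reset. Inspecting the compiler output (for example with gcc -S) is a reasonable way to check which instructions your toolchain actually emits.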
No, there is no need to use the MFENCE, SFENCE and LFENCE instructions together with the LOCK prefix. MFENCE, SFENCE and LFENCE control the order in which memory accesses become visible to the other CPU cores. For instance, the MOV instruction can't be used with the LOCK prefix, so if the result of a memory move must be visible to all CPU cores before later loads are performed, a fence instruction (or an implicitly locked XCHG) is needed after it. Note that a fence does not flush the cache to RAM; it waits for earlier stores to leave the store buffer, and cache coherency makes them visible to the other cores.

EDIT: more about locked atomic operations from Intel manual:
The problem still occurs when intel_lock1.c (available at the URL above) is compiled on Linux with GCC 5 or 7 without either of the arguments '-D_WITH_CLFLUSH_' or '-D_WITH_HLE_' (so that neither CLFLUSH* nor HLE XACQUIRE is used). The mutex_lock assembler now looks like:
So, I'm trying to replace {L,S}FENCE with MFENCE.
I still don't quite understand how two threads can end up with the same -1 *lck value, though.
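For comparison, here is a minimal sketch of a lock whose mutual exclusion comes from a locked exchange alone, with no explicit fence. It is not the code from intel_lock1.c; the names demo_spinlock, demo_trylock, demo_lock_acquire and demo_unlock are hypothetical, and the 0/1 encoding of the lock word is an assumption made for the example. On x86-64 the exchange is typically compiled to XCHG with a memory operand, which is implicitly locked, and the release store to a plain MOV.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} demo_spinlock;

static demo_spinlock demo_lock = { 0 };

/* One attempt to take the lock. Only the caller that sees the old
 * value 0 owns the lock, so two threads cannot both succeed. */
bool demo_trylock(demo_spinlock *l)
{
    assert(l != NULL);
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

/* Spin until the lock is taken. The inner loop only reads, so
 * waiting threads do not keep issuing locked writes. */
void demo_lock_acquire(demo_spinlock *l)
{
    while (!demo_trylock(l)) {
        while (atomic_load_explicit(&l->locked,
                                    memory_order_relaxed) != 0) {
            /* busy-wait */
        }
    }
}

/* Release the lock. A release store is enough here; on x86-64 it
 * needs no fence because stores are not reordered with older
 * loads or stores. */
void demo_unlock(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

If two threads in a scheme like this ever observe that both hold the lock, the first thing to check is probably whether every path that writes the lock word while acquiring goes through the single atomic exchange, rather than a separate load followed by a store.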