I am working on a multithreaded algorithm which reads two shared atomic variables:
#include <atomic>

std::atomic<int> a(10);
std::atomic<int> b(20);

void func(int key) {
    int b_local = b;  // must be read first
    int a_local = a;  // must be read second
    /* Some operations on a & b */
}
The invariant of the algorithm is that b should be read before a.
The question is: can the compiler (say, GCC) reorder the instructions so that a is read before b? Using explicit memory fences would enforce the order, but what I want to understand is whether two atomic loads can be reordered.
Further, after going through the acquire/release semantics in Herb Sutter's talk (http://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/), I understand that a sequentially consistent system ensures an ordering between an acquire (like a load) and a release (like a store). What about the ordering between two acquires (like two loads)?
Edit: Adding more info about the code:
Consider two threads, T1 and T2, executing:
T1: reads the value of b, sleeps
T2: changes the value of a, returns
T1: wakes up and reads the new value of a
Now, consider this scenario with re-ordering:
int a_local =a;
int b_local = b;
T1: reads the value of a, sleeps
T2: changes the value of a, returns
T1: doesn't know anything about the change in the value of a.
The question is: "Can a compiler like GCC reorder two atomic loads?"
Description of memory_order_acquire:
no memory accesses in the current thread can be reordered before this load.
Since the default memory order when loading b is memory_order_seq_cst, which is the strongest one, the read from a cannot be reordered before the read from b.
Even weaker memory orders, as in the code below, provide the same guarantee:
int b_local = b.load(std::memory_order_acquire);
int a_local = a.load(std::memory_order_relaxed);
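To make the guarantee above concrete, here is a minimal sketch with a writer thread added. The writer, the function name read_b_then_a and the values 11 and 21 are my own assumptions for illustration; they are not part of the question's code. The writer updates a and then publishes through b with a release store, so a reader that sees the new b through an acquire load must also see the new a.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo function; names and the values 11/21 are illustrative only.
// Returns the value of `a` observed by the reader after it has seen the new `b`.
int read_b_then_a() {
    std::atomic<int> a(10);
    std::atomic<int> b(20);
    int a_local = 0;

    std::thread writer([&] {
        a.store(11, std::memory_order_relaxed);  // update a first
        b.store(21, std::memory_order_release);  // then publish through b
    });

    std::thread reader([&] {
        // Spin until the new value of b is visible (acquire load).
        while (b.load(std::memory_order_acquire) != 21) {
            std::this_thread::yield();
        }
        // This load comes after the acquire load in program order,
        // so it may not be moved before it.
        a_local = a.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    assert(b.load() == 21);  // the writer has finished by now
    return a_local;
}
```

Calling read_b_then_a() from main should always return 11. If the load of a could be hoisted above the load of b, the old value 10 would be possible.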
Yes, they can be reordered, since one order is no different from the other and you have put no constraints in place to force any particular order. There is only one relationship between these lines of code: int b_local = b; is sequenced before int a_local = a;. But since your code has only one thread and the two lines are independent, it is completely irrelevant to the third line of code (whatever that line might be) which of them completes first, and hence the compiler might reorder them.
So, if you need to force some particular order, you need:
Two or more threads of execution
A happens-before relationship established between two operations in those threads.
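As a rough illustration of the two requirements above, the sketch below uses two threads and establishes the happens-before relationship with explicit fences (std::atomic_thread_fence) around relaxed operations. The function name read_with_fences, the writer thread and the values 11 and 21 are assumptions made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo function; names and values are illustrative only.
// Returns the value of `a` observed by the reader.
int read_with_fences() {
    std::atomic<int> a(10);
    std::atomic<int> b(20);
    int a_local = 0;

    std::thread writer([&] {
        a.store(11, std::memory_order_relaxed);
        // Release fence: the store to a above is ordered before the store to b below.
        std::atomic_thread_fence(std::memory_order_release);
        b.store(21, std::memory_order_relaxed);
    });

    std::thread reader([&] {
        // Relaxed loads alone impose no ordering between b and a.
        while (b.load(std::memory_order_relaxed) != 21) {
            std::this_thread::yield();
        }
        // Acquire fence: pairs with the release fence in the writer, so the
        // writer's store to a happens before the load of a below.
        std::atomic_thread_fence(std::memory_order_acquire);
        a_local = a.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    assert(b.load() == 21);  // the writer has finished by now
    return a_local;
}
```

With both fences in place, read_with_fences() should return 11. Without them, the relaxed operations alone would permit the reader to see the new b and still read the old a.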
Here's what __atomic_base is doing when you read the atomic in an initialization such as int b_local = b; (the conversion operator simply calls load()):
operator __pointer_type() const noexcept
{ return load(); }

_GLIBCXX_ALWAYS_INLINE __pointer_type
load(memory_order __m = memory_order_seq_cst) const noexcept
{
  memory_order __b = __m & __memory_order_mask;
  __glibcxx_assert(__b != memory_order_release);
  __glibcxx_assert(__b != memory_order_acq_rel);
  return __atomic_load_n(&_M_p, __m);
}
As per the GCC docs on builtins like __atomic_load_n:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
"An atomic operation can both constrain code motion and be mapped to hardware instructions for synchronization between threads (e.g., a fence). To which extent this happens is controlled by the memory orders, which are listed here in approximately ascending order of strength. The description of each memory order is only meant to roughly illustrate the effects and is not a specification; see the C++11 memory model for precise semantics.
__ATOMIC_RELAXED
Implies no inter-thread ordering constraints.
__ATOMIC_CONSUME
This is currently implemented using the stronger __ATOMIC_ACQUIRE memory order because of a deficiency in C++11's semantics for memory_order_consume.
__ATOMIC_ACQUIRE
Creates an inter-thread happens-before constraint from the release (or stronger) semantic store to this acquire load. Can prevent hoisting of code to before the operation.
__ATOMIC_RELEASE
Creates an inter-thread happens-before constraint to acquire (or stronger) semantic loads that read from this release store. Can prevent sinking of code to after the operation.
__ATOMIC_ACQ_REL
Combines the effects of both __ATOMIC_ACQUIRE and __ATOMIC_RELEASE.
__ATOMIC_SEQ_CST
Enforces total ordering with all other __ATOMIC_SEQ_CST operations."
So, if I'm reading this right, it does "constrain code motion", which I read to mean prevent reordering. But I could be misinterpreting the docs.
Yes, I think the compiler can reorder them, in addition to applying several other optimizations.
Please check the following resources:
Atomic vs. Non-Atomic Operations
If you are still concerned about this issue, try using mutexes, which will certainly prevent memory reordering.
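A minimal sketch of the mutex approach, assuming a hypothetical helper locked_snapshot and plain int variables (neither is from the question). Note that the compiler may still reorder accesses inside a critical section. What the mutex guarantees is that another thread locking the same mutex sees either all of the writer's changes or none of them.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical demo function; names and values are illustrative only.
// Returns the pair (b, a) as read by the reader under the lock.
std::pair<int, int> locked_snapshot() {
    std::mutex m;
    int a = 10;  // plain ints: every access below is protected by m
    int b = 20;
    std::pair<int, int> snapshot{0, 0};

    std::thread writer([&] {
        std::lock_guard<std::mutex> lock(m);
        a = 11;
        b = 21;
    });

    std::thread reader([&] {
        std::lock_guard<std::mutex> lock(m);
        int b_local = b;  // read b first
        int a_local = a;  // then a
        snapshot = {b_local, a_local};
    });

    writer.join();
    reader.join();
    return snapshot;
}
```

Depending on which thread takes the lock first, the result is either (20, 10) or (21, 11), never a mix of old and new values.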