I am going through the assembly generated by GCC for an ARM Cortex M4, and noticed that atomic_compare_exchange_weak gets two DMB instructions inserted around the condition (compiled with GCC 4.9 using -std=gnu11 -O2):
// if (atomic_compare_exchange_weak(&address, &x, y))
dmb sy
ldrex r0, [r3]
cmp r0, r2
itt eq
strexeq lr, r1, [r3]
cmpeq.w lr, #0
dmb sy
bne.n ...
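For reference, here is a minimal sketch of the kind of C11 source that leads to a sequence like the one above. The variable `address` mirrors the comment in the listing; `try_swap` and `try_swap_demo` are hypothetical names used only for illustration, not the original code. The plain (non-`_explicit`) call implies `memory_order_seq_cst`, which is the ordering the compiler is implementing with the two barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared word; the name mirrors the comment in the listing. */
static _Atomic uint32_t address;

/* Tries once to replace `expected` with `desired`.
 * The non-_explicit form requests memory_order_seq_cst for both the
 * success and the failure case. */
bool try_swap(uint32_t expected, uint32_t desired)
{
    uint32_t x = expected;
    return atomic_compare_exchange_weak(&address, &x, desired);
}

/* Small usage check (hypothetical helper). */
void try_swap_demo(void)
{
    atomic_store(&address, 0);

    /* Expected value does not match: the exchange must fail and leave 0. */
    assert(!try_swap(5, 7));
    assert(atomic_load(&address) == 0);

    /* The weak form may fail spuriously on LL/SC targets, so retry. */
    while (!try_swap(0, 1)) {
    }
    assert(atomic_load(&address) == 1);
}
```

The retry loop in `try_swap_demo` is there because the weak variant is allowed to fail even when the values match, which is exactly what a failed STREX looks like at the C level.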
The programming guide to barrier instructions for ARM Cortex M4 states that:
Omitting the DMB or DSB instruction in the examples in Figure 41 and Figure 42 would not cause any error because the Cortex-M processors:
- do not re-order memory transfers
- do not permit two write transfers to be overlapped.
Is there any reason why these instructions couldn't be removed when targeting Cortex M?
I'm not aware of whether Cortex M4 can be used in a multi-cpu/multi-core configuration, but in general:
Presence or lack of reordering memory writes at the hardware level is irrelevant.
Of course I would expect the DMB instruction to be essentially free on chips that don't support SMP, so I'm not sure why you'd want to try to hack it out.
Please note that, because the question refers to the code the compiler produces for atomic intrinsics, I'm assuming the context is synchronization of atomics so that they match the high-level specification, not other uses such as IO barriers for MMIO. The above "never" should not be read as applying to that (unrelated) use (though I suspect, for the reasons you already cited, it doesn't apply to Cortex M4).
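On the "hack it out" point: if the barriers really are unwanted, the language-level way to say so is to request a weaker ordering rather than to patch the generated code. The following is a minimal sketch under that assumption; `counter`, `relaxed_increment` and `relaxed_increment_demo` are hypothetical names. I would expect GCC to emit the LDREX/STREX sequence without DMB for `memory_order_relaxed`, but I have not checked this against GCC 4.9 for Cortex M4, and note that relaxed ordering also drops the ordering guarantees the compiler itself gives around the operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter. */
static _Atomic uint32_t counter;

/* Increments `counter` with a CAS loop that requests no ordering at all.
 * Atomicity of the update is kept; ordering against other accesses is not. */
uint32_t relaxed_increment(void)
{
    uint32_t old = atomic_load_explicit(&counter, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               &counter, &old, old + 1,
               memory_order_relaxed, memory_order_relaxed)) {
        /* On failure `old` now holds the current value; just retry. */
    }
    return old + 1;
}

/* Small usage check (hypothetical helper). */
void relaxed_increment_demo(void)
{
    atomic_store(&counter, 0);
    assert(relaxed_increment() == 1);
    assert(relaxed_increment() == 2);
    assert(atomic_load(&counter) == 2);
}
```

Whether relaxed is actually acceptable depends on what the surrounding code relies on, so treat this as an illustration of the mechanism rather than a recommendation.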