I have heard that when dealing with mutexes, the necessary memory barriers are handled by the pthread API itself. I would like to have more details on this matter.
- Are these claims true, at least on the most common architectures around?
- Does the compiler recognize this implicit barrier, and avoid reordering operations or caching reads in local registers when generating the code?
- When is the memory barrier applied: after successfully acquiring a mutex, after releasing it, or both?
The POSIX specification lists the functions that must "synchronize memory with respect to other threads", which includes functions like pthread_mutex_lock() and pthread_mutex_unlock().
In Appendix A.4.11 it is spelt out that functions that "synchronize memory":

- ...would have to be recognized by advanced compilation systems so that memory operations and calls to these functions are not reordered by optimization; and
- ...would potentially have to have memory synchronization instructions added, depending on the particular machine.
It is never explicitly specified what kind of memory synchronization instructions are implied. The implicit contract is that if you use a pair of "synchronizing" functions to ensure that a read in one thread must happen after a write in another thread, then your program will operate correctly; this covers both compiler reordering and architectural reordering.