I have heard that when dealing with mutexes, the necessary memory barriers are handled by the pthread API itself. I would like to have more details on this matter.
- Are these claims true, at least on the most common architectures around?
- Does the compiler recognize this implicit barrier, and avoid reordering operations or caching reads in local registers when generating the code?
- When is the memory barrier applied: after successfully acquiring a mutex, after releasing it, or both?
The POSIX specification lists the functions that must "synchronize memory with respect to other threads", which includes functions like pthread_mutex_lock() and pthread_mutex_unlock().
In Appendix A.4.11 it is spelt out that functions that "synchronize memory":

- ...would have to be recognized by advanced compilation systems so that memory operations and calls to these functions are not reordered by optimization; and
- ...would potentially have to have memory synchronization instructions added, depending on the particular machine.
It is never explicitly specified what kind of memory synchronization instructions are implied. The implicit contract is that if you use a pair of "synchronizing" functions to ensure that a read in one thread must happen after a write in another thread, then your program will operate correctly; this covers both compiler reordering and architectural reordering.