If I have something like this...
volatile long something_global = 0;
long some_public_func()
{
    return something_global++;
}
Would it be reasonable to expect this code not to break (i.e. hit a race condition) when called from multiple threads? If the standard doesn't guarantee it, is it still a reasonable assumption to make about modern compilers?
NOTE: ALL I'm using this for is atomic increment and decrement - nothing fancier.
No - volatile does not mean synchronized. It only tells the compiler to perform every access to the variable as written, instead of working on a copy it has cached in a register. It does not make the increment atomic.
Post-increment is not an atomic operation; it is a memory read, an add, and then a memory write. If two threads interleave these steps, both can read the same old value, and the counter ends up incremented just once.
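To make the lost update concrete, here is a minimal sketch that replays one bad interleaving deterministically in a single thread. The helper names (load_step, add_step, store_step, simulate_lost_update) are hypothetical and exist only for illustration; a real race would depend on scheduling and would not reproduce this reliably.

```c
/* Stand-in for the shared counter from the question. */
static volatile long something_global = 0;

/* The three steps that something_global++ breaks down into, written out
   by hand so that two "threads" can be interleaved on purpose. */
static long load_step(void)    { return something_global; }
static long add_step(long v)   { return v + 1; }
static void store_step(long v) { something_global = v; }

/* Replays one bad interleaving: both callers load before either stores.
   Returns the final counter value. */
long simulate_lost_update(void)
{
    something_global = 0;

    long a = load_step();     /* "thread A" reads 0 */
    long b = load_step();     /* "thread B" reads 0 as well */

    store_step(add_step(a));  /* A writes 1 */
    store_step(add_step(b));  /* B writes 1, overwriting A's update */

    return something_global;  /* two increments ran, but the value is 1 */
}
```

Note that volatile is present here and changes nothing: every access really does go to memory, yet one of the two increments is still lost.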
No, you must use platform-dependent atomic accesses. There are several libraries that abstract these -- GLib provides portable atomic operations that fall back to mutex locks if necessary, and I believe Boost also provides portable atomics.
As I recently learned, truly atomic access needs a full memory barrier, which volatile does not provide. All volatile guarantees is that the memory will be re-read at each access and that accesses to volatile memory will not be reordered with respect to each other. The optimizer may still move a non-volatile access before or after a volatile read/write, possibly into the middle of your increment, so you must use actual atomic operations.
On modern fast multicore processors, atomic instructions carry significant overhead because of caching and write buffers, so compilers won't emit them just because you added the volatile keyword. You need to resort to inline assembly or compiler-specific extensions (e.g. gcc atomic builtins).
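As a sketch of the compiler-extension route, the function from the question could be written with the GCC/Clang __atomic builtins. This assumes GCC 4.7 or later, or a compatible Clang; the builtins are not standard C, and some_public_dec is a hypothetical companion added here for the decrement case.

```c
/* Assumes GCC or Clang: __atomic_fetch_add and __atomic_fetch_sub are
   compiler builtins, not part of standard C. */
static long something_global = 0;

long some_public_func(void)
{
    /* Atomically adds 1 and returns the previous value, which matches
       the result of the post-increment in the question. */
    return __atomic_fetch_add(&something_global, 1, __ATOMIC_SEQ_CST);
}

/* Hypothetical counterpart for the decrement case. */
long some_public_dec(void)
{
    /* Atomically subtracts 1 and returns the previous value. */
    return __atomic_fetch_sub(&something_global, 1, __ATOMIC_SEQ_CST);
}
```

__ATOMIC_SEQ_CST is the strongest ordering and the safest default. A weaker ordering may be enough for a plain counter, but that is a tuning decision rather than something this sketch assumes.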
I recommend using a library. The easy way is to just take a lock when you want to update the variable. Semaphores will probably be faster if they're appropriate to what you're doing. It seems GLib provides a reasonably efficient implementation.
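The "just take a lock" approach might look like the sketch below, assuming POSIX threads are available. The name something_lock is hypothetical, and volatile is dropped because the mutex already provides the ordering and visibility needed.

```c
#include <pthread.h>

static long something_global = 0;

/* Hypothetical lock guarding something_global. */
static pthread_mutex_t something_lock = PTHREAD_MUTEX_INITIALIZER;

long some_public_func(void)
{
    /* Only one thread at a time can run the read-add-write sequence. */
    pthread_mutex_lock(&something_lock);
    long previous = something_global++;
    pthread_mutex_unlock(&something_lock);

    return previous;
}
```

This is usually slower than a single atomic instruction when the counter is contended, but it is portable across POSIX systems and extends naturally if more than one variable has to be updated together.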
Windows provides InterlockedIncrement (and InterlockedDecrement) to do what you are asking.
Volatile just prevents optimizations, but atomicity needs more. On x86, the read-modify-write instruction must be preceded by a LOCK prefix; on MIPS, the RMW cycle must be wrapped in an LL/SC construct, ...
Your problem is that C doesn't guarantee atomicity of the increment operators on ordinary variables, and in practice they often won't be atomic. You have to use a library like the Windows API or compiler builtin functions (GCC, MSVC) for that.
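If a C11 compiler is available, which is an assumption since the answers above predate or do not mention it, the standard header <stdatomic.h> offers the same operation without platform-specific calls. A minimal sketch, with some_public_dec again a hypothetical name:

```c
#include <stdatomic.h>

/* atomic_long replaces "volatile long"; volatile is not needed here. */
static atomic_long something_global = 0;

long some_public_func(void)
{
    /* Atomically adds 1 and returns the value held before the add. */
    return atomic_fetch_add(&something_global, 1);
}

/* Hypothetical counterpart for the decrement case. */
long some_public_dec(void)
{
    /* Atomically subtracts 1 and returns the value held before it. */
    return atomic_fetch_sub(&something_global, 1);
}
```

Both calls default to sequentially consistent ordering. On compilers without C11 atomics, the platform APIs and builtins mentioned above remain the way to go.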