Consider the following example:
private int sharedState = 0;

private void FirstThread() {
    Volatile.Write(ref sharedState, 1);
}

private void SecondThread() {
    int sharedStateSnapshot = Volatile.Read(ref sharedState);
    Console.WriteLine(sharedStateSnapshot);
}
Until recently, I was under the impression that, as long as FirstThread() really did execute before SecondThread(), this program could not output anything but 1.
However, my understanding now is that:
- Volatile.Write() emits a release fence. This means no preceding load or store (in program order) may happen after the assignment of 1 to sharedState.
- Volatile.Read() emits an acquire fence. This means no subsequent load or store (in program order) may happen before the copying of sharedState to sharedStateSnapshot.
Or, to put it another way:
- When sharedState is actually released to all processor cores, everything preceding that write will also be released, and,
- When the value in the address sharedStateSnapshot is acquired, sharedState must already have been acquired.
If my understanding is therefore correct, then there is nothing to prevent the acquisition of sharedState being 'stale', if the write in FirstThread() has not already been released.
If this is true, how can we actually ensure (assuming the weakest processor memory model, such as ARM or Alpha), that the program will always print 1? (Or have I made an error in my mental model somewhere?)
Your understanding is correct: with these techniques alone you cannot ensure that the program will always print 1. To ensure that it prints 1, assuming thread 2 runs after thread 1, you need two fences on each thread.
The easiest way to achieve that is to use the lock keyword:
private int sharedState = 0;
private readonly object locker = new object();

private void FirstThread()
{
    lock (locker)
    {
        sharedState = 1;
    }
}

private void SecondThread()
{
    int sharedStateSnapshot;
    lock (locker)
    {
        sharedStateSnapshot = sharedState;
    }
    Console.WriteLine(sharedStateSnapshot);
}
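Note that the lock only protects the accesses; it does not decide which thread gets there first. Here is a minimal sketch of a complete program, assuming the ordering is established by joining the first thread before starting the second. The class name LockDemo, the Run helper, and the static fields are my own scaffolding, not part of the question:

```csharp
using System;
using System.Threading;

public static class LockDemo
{
    private static int sharedState = 0;
    private static readonly object locker = new object();

    private static void FirstThread()
    {
        lock (locker)
        {
            sharedState = 1;
        }
    }

    private static int SecondThread()
    {
        lock (locker)
        {
            return sharedState;
        }
    }

    // Hypothetical helper: runs the two threads strictly one after the other.
    public static int Run()
    {
        var first = new Thread(FirstThread);
        first.Start();
        first.Join(); // assumption: this is what makes "thread 2 runs after thread 1" true

        int snapshot = 0;
        var second = new Thread(() => snapshot = SecondThread());
        second.Start();
        second.Join();
        return snapshot;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 1
    }
}
```

If you drop the Join on the first thread, the program may print 0 simply because SecondThread took the lock first.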
I'd like to quote Eric Lippert:
Frankly, I discourage you from ever making a volatile field. Volatile fields are a sign that you are doing something downright crazy: you're attempting to read and write the same value on two different threads without putting a lock in place.
The same applies to calling Volatile.Read and Volatile.Write. In fact, they are even worse than volatile fields, since they require you to do manually what the volatile modifier does automatically.
You're right, there's no guarantee that release stores will be immediately visible to all processors. Volatile.Read and Volatile.Write give you acquire/release semantics, but no immediacy guarantees.
The volatile modifier seems to do this though. The compiler will emit an OpCodes.Volatile IL instruction, and the jitter will tell the processor not to keep the variable in any of its registers (see Hans Passant's answer).
But why do you need it to be immediate anyway? What if your SecondThread happens to run a couple of milliseconds sooner, before the values are actually written? Since scheduling is non-deterministic, the correctness of your program shouldn't depend on this "immediacy" anyway.
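What acquire/release semantics do give you is ordering: if the reader observes the release store, it also observes every write that preceded it. Here is a minimal sketch of that publishing pattern. The names PublishDemo, payload, ready, Writer, Reader and Run are hypothetical and not from the question:

```csharp
using System;
using System.Threading;

public static class PublishDemo
{
    private static int payload = 0;    // plain field, published via the flag
    private static bool ready = false; // only accessed through Volatile.Read/Write

    private static void Writer()
    {
        payload = 42;
        // Release store: the write to payload cannot move after this.
        Volatile.Write(ref ready, true);
    }

    private static int Reader()
    {
        // Acquire load: the read of payload cannot move before this.
        // The loop waits however long the store takes to become visible.
        while (!Volatile.Read(ref ready))
        {
            Thread.Yield();
        }
        return payload;
    }

    // Hypothetical helper: starts the reader first to show that timing does not matter.
    public static int Run()
    {
        int result = 0;
        var reader = new Thread(() => result = Reader());
        var writer = new Thread(Writer);
        reader.Start();
        writer.Start();
        writer.Join();
        reader.Join();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 42
    }
}
```

The reader never assumes the write is visible "by now"; it checks the flag and waits, which is why the result does not depend on immediacy.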
Until recently, I was under the impression that, as long as
FirstThread() really did execute before SecondThread(), this program
could not output anything but 1.
As you go on to explain yourself, this impression is wrong. Volatile.Read simply issues a read operation on its target followed by a memory barrier. The memory barrier prevents operation reordering on the processor executing the current thread, but this does not help here because:
- There are no operations to reorder (just the single read or write in each thread).
- The race condition across your threads means that even if the no-reorder guarantee applied across processors, it would simply mean that the order of operations, which you cannot predict anyway, would be preserved.
If my understanding is therefore correct, then there is nothing to
prevent the acquisition of sharedState being 'stale', if the write in
FirstThread() has not already been released.
That is correct. In essence you are using a tool designed to help with weak memory models against a possible problem caused by a race condition. The tool won't help you because that's not what it does.
If this is true, how can we actually ensure (assuming the weakest
processor memory model, such as ARM or Alpha), that the program will
always print 1? (Or have I made an error in my mental model
somewhere?)
To stress once again: the memory model is not the problem here. To ensure that your program will always print 1 you need to do two things:
- Provide explicit thread synchronization that guarantees the write will happen before the read (in the simplest case, SecondThread can use a spin lock on a flag which FirstThread uses to signal it's done).
- Ensure that SecondThread will not read a stale value. You can do this trivially by marking sharedState as volatile; while this keyword has deservedly gotten much flak, it was designed explicitly for such use cases.
So in the simplest case you could, for example, have:
private volatile int sharedState = 0;
private volatile bool spinLock = false;

private void FirstThread()
{
    sharedState = 1;
    // ensure lock is released after the shared state write!
    Volatile.Write(ref spinLock, true);
}

private void SecondThread()
{
    SpinWait.SpinUntil(() => spinLock);
    Console.WriteLine(sharedState);
}
Assuming no other writes to the two fields, this program is guaranteed to output nothing other than 1.
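If you would rather block than spin, the same "signal when done" idea can be sketched with ManualResetEventSlim, which is a different technique from the volatile flag above. The names SignalDemo, written and Run are hypothetical, and this is only one possible shape:

```csharp
using System;
using System.Threading;

public static class SignalDemo
{
    private static int sharedState = 0;
    private static readonly ManualResetEventSlim written = new ManualResetEventSlim(false);

    private static void FirstThread()
    {
        sharedState = 1;
        written.Set(); // signal only after the write has been issued
    }

    private static int SecondThread()
    {
        written.Wait(); // blocks until FirstThread has signalled
        return sharedState;
    }

    // Hypothetical helper: deliberately starts the reader before the writer.
    public static int Run()
    {
        int result = 0;
        var second = new Thread(() => result = SecondThread());
        var first = new Thread(FirstThread);
        second.Start();
        first.Start();
        first.Join();
        second.Join();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 1
    }
}
```

Here the event provides both the ordering between the threads and the visibility of the write, so sharedState does not need to be volatile.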