Do memory barriers guarantee a fresh read in C#?

Posted 2019-01-26 16:21

Question:

If we have the following code in C#:

int a = 0;
int b = 0;

void A() // runs in thread A
{
    a = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(b);
}

void B() // runs in thread B
{
    b = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(a);
}

The MemoryBarrier calls make sure that the write takes place before the read. However, is it guaranteed that the write of one thread is seen by the read on the other thread? In other words, is it guaranteed that at least one thread prints 1, or could both threads print 0?

I know that several questions already exist that are relevant to "freshness" and MemoryBarrier in C#, like this and this. However, most of them deal with the write-release and read-acquire pattern. The code in this question asks specifically whether a write is guaranteed to be seen by a read, beyond the fact that the instructions are kept in order.

Answer 1:

It is not guaranteed that both threads print 1. The barrier only guarantees the order of read/write operations, based on this rule:

The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier execute after memory accesses that follow the call to MemoryBarrier.

So this basically means that thread A will not use a value of the variable b that was read before the barrier call. But it may still cache the value if your code looks something like this:

void A() // runs in thread A
{
    a = 1;
    Thread.MemoryBarrier();
    // b may be cached here
    // some work here
    // b is changed by other thread
    // the old value of b is printed
    Console.WriteLine(b);
}

Race-condition bugs in parallel code are very hard to reproduce, so I can't give you code that will definitely produce the scenario above. I suggest you use the volatile keyword for variables shared between threads, as it works exactly as you want: it gives you a fresh read of a variable:

volatile int a = 0;
volatile int b = 0;

void A() // runs in thread A
{
    a = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(b);
}

void B() // runs in thread B
{
    b = 1;
    Thread.MemoryBarrier();
    Console.WriteLine(a);
}


Answer 2:

It depends on what you mean by "fresh". Thread.MemoryBarrier will force the first read of a variable to be obtained by loading it from its designated memory location. If that is all you mean by "fresh" and nothing more, then the answer is yes. Most programmers operate with a more rigid definition, whether they realize it or not, and that is where the problems and confusion begin. Note that a volatile read via volatile and other similar mechanisms would not produce a "fresh" read under this definition, but would under a different definition. Continue reading to find out how.

I will use a down arrow ↓ to represent a volatile read and an up arrow ↑ to represent a volatile write. Think of the arrow head as pushing away any other reads and writes. The reads and writes that generate these memory fences are free to move around as long as no instruction goes up through a down arrow or down through an up arrow. The memory fences (the arrows), however, are locked in place at the spot where they were originally declared in the code. Thread.MemoryBarrier generates a full-fence barrier, so it has both read-acquire and release-write semantics.

int a = 0;
int b = 0;

void A() // runs in thread A
{
    register = 1
    a = register
    ↑   // Thread.MemoryBarrier
    ↓   // Thread.MemoryBarrier
    register = b
    jump Console.WriteLine
    use register
    return Console.WriteLine
}

void B() // runs in thread B
{
    register = 1
    b = register
    ↑   // Thread.MemoryBarrier
    ↓   // Thread.MemoryBarrier
    register = a
    jump Console.WriteLine
    use register
    return Console.WriteLine
}

Keep in mind that the C# lines are actually multipart instructions once they get JIT compiled and executed. I have tried to illustrate that somewhat, but in reality the invocation of Console.WriteLine is still going to be far more complex than shown, so the time between the read of a or b and their first use could be significant, relatively speaking. Because Thread.MemoryBarrier produces an acquire-fence, the reads are not allowed to float up past the call. So the read is "fresh" relative to the Thread.MemoryBarrier call, but it could be "stale" relative to when it is actually used by the Console.WriteLine call.

Let us now consider what your code might look like if we replaced the Thread.MemoryBarrier call with the volatile keyword.

volatile int a = 0;
volatile int b = 0;

void A() // runs in thread A
{
    register = 1
    ↑              // volatile write
    a = register   
    register = b   
    ↓              // volatile read
    jump Console.WriteLine
    use register
    return Console.WriteLine
}

void B() // runs in thread B
{
    register = 1
    ↑              // volatile write
    b = register   
    register = a   
    ↓              // volatile read
    jump Console.WriteLine
    use register
    return Console.WriteLine
}

Can you spot the change? If you blinked, you missed it. Compare the arrangement of the arrows (memory fences) between the two blocks of code. In the first case (Thread.MemoryBarrier) the reads are not allowed to occur at a point in time prior to the memory barrier. In the second case (volatile) the reads can bubble up indefinitely, because the only fence near them is a down arrow placed after the read, which pushes them away upward. In this case one can make a reasonable argument that Thread.MemoryBarrier, if placed before the read, can produce a "fresher" read than the volatile solution. But can you still claim the read is "fresh"? Not really, because by the time it is used by Console.WriteLine it might not be the latest value anymore.

So what is the point of using volatile, you might ask? Because successive reads produce acquire-fence semantics, it does guarantee that a later read returns a value at least as new as the one returned by the previous read. Consider the following code.

volatile int a = 0;

void A()
{
    register = a;
    ↓               // volatile read
    Console.WriteLine(register);
    register = a;
    ↓               // volatile read
    Console.WriteLine(register);
    register = a;
    ↓               // volatile read
    Console.WriteLine(register);
}

Pay close attention to what can happen here. The lines register = a represent the read. Notice where the ↓ arrow is placed. Because it is placed after the read, there is nothing preventing the actual read from floating up. It can actually float up to before the previous Console.WriteLine call. So in this case there is no guarantee that Console.WriteLine is working with the latest value of a. However, it is guaranteed to be working with a value at least as new as the one it had the last time it was called. That is its usefulness in a nutshell. That is why you see a lot of lock-free code spinning in a while loop, making sure the previous read of a volatile variable is equal to the current read before assuming its intended operation was successful.
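
The spin-and-compare pattern mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the names SpinCounter, Add and RunDemo are hypothetical and not from the answer, and the loop uses Interlocked.CompareExchange to combine the "is my earlier read still current?" check with the write, which is one common way to write such a loop.

```csharp
using System;
using System.Threading;

// Hypothetical example type; not part of the original answer.
public static class SpinCounter
{
    private static int _value = 0;

    // Adds delta without taking a lock. The loop retries until no other
    // thread has changed _value between our read and our write.
    public static int Add(int delta)
    {
        while (true)
        {
            int seen = Volatile.Read(ref _value);   // may already be stale
            int wanted = seen + delta;
            // Stores wanted only if _value still equals seen, and returns
            // the value that was actually found in _value.
            int found = Interlocked.CompareExchange(ref _value, wanted, seen);
            if (found == seen)
                return wanted;                      // our read was still current
            // Another thread got in first; read again and retry.
        }
    }

    // Starts the given number of threads, each calling Add(1) perThread times.
    public static int RunDemo(int threads, int perThread)
    {
        _value = 0;
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int n = 0; n < perThread; n++) Add(1);
            });
            workers[i].Start();
        }
        foreach (var t in workers) t.Join();
        return Volatile.Read(ref _value);
    }

    public static void Main()
    {
        Console.WriteLine(RunDemo(4, 100000)); // prints 400000
    }
}
```

The read on its own is never trusted as "the latest"; the operation only counts as done once the compare confirms that nothing changed since that read.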

There are a couple of important points I want to make in conclusion.

  • Thread.MemoryBarrier will guarantee that a read appearing after it will return the latest value relative to the barrier. But by the time you actually make decisions or use that information, it may no longer be the latest value.
  • volatile guarantees that the read will return a value that is at least as new as the previous read of the same variable. At no time does it guarantee that the value is the latest, though.
  • The meaning of "fresh" needs to be clearly defined, but it can differ from situation to situation and from developer to developer. No one meaning is any more correct than another, as long as it can be formally defined and articulated.
  • It is not an absolute concept. You will find it more useful to define "fresh" in terms of being relative to something else like the generation of a memory barrier or a previous instruction. In other words, "freshness" is a relative concept like how velocities are relative to the observer in Einstein's theory of special relativity.


Answer 3:

The above answers are largely correct. However, to provide a more concise explanation to your question – "Is it guaranteed that at least one thread prints 1?" – Yes, the pair of memory barriers guarantees that.

Consider the representation below, where --- represents a memory barrier. Instructions can be moved backward or forward, but they may not cross the barrier.

If the A and B methods are called at exactly the same time, you could get two 1s:

|   Thread A   |   Thread B   |
|              |              |
|    a = 1     |    b = 1     |
| ------------ | ------------ |
|    read b    |    read a    |
|              |              |

However, in all likelihood, they will be called at different times, giving a 0 and a 1:

|   Thread A   |   Thread B   |
|              |              |
|    a = 1     |              |
| ------------ |              |
|    read b    |              |
|              |              |
|              |    b = 1     |
|              | ------------ |
|              |    read a    |

Memory reordering might cause the read and/or write operations on one of the variables to be shifted past each other, again giving two 1s:

|   Thread A   |   Thread B   |
|              |              |
|    a = 1     |              |
| ------------ |              |
|              |    b = 1     |
|              |              |
|    read b    |              |
|              | ------------ |
|              |    read a    |

However, there is no way to get the read and/or write of both variables shifted past each other, since the barriers prohibit that. Therefore, it is impossible to get two 0s.
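
The argument above can be checked by brute force on a small model. The sketch below is an illustration, not a proof about real hardware: it assumes a single shared memory in which every write is immediately visible, and it assumes that the barrier keeps each thread's write ahead of its own read. The names InterleavingModel, Outcomes and CountBits are hypothetical. It walks through all six ways the four operations can interleave and records what each thread would print.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the two-thread example; not from the original answer.
public static class InterleavingModel
{
    // Returns every "printedByA,printedByB" pair reachable when each
    // thread performs its write before its own read.
    public static SortedSet<string> Outcomes()
    {
        var results = new SortedSet<string>(StringComparer.Ordinal);
        // A 4-bit mask with exactly two bits set selects the two time
        // slots used by thread A; thread B uses the remaining two.
        for (int mask = 0; mask < 16; mask++)
        {
            if (CountBits(mask) != 2) continue;
            int a = 0, b = 0;
            int printedByA = -1, printedByB = -1;
            int stepA = 0, stepB = 0;
            for (int slot = 0; slot < 4; slot++)
            {
                bool isThreadA = (mask & (1 << slot)) != 0;
                if (isThreadA)
                {
                    if (stepA++ == 0) a = 1;        // first step: write a
                    else printedByA = b;            // second step: read b
                }
                else
                {
                    if (stepB++ == 0) b = 1;        // first step: write b
                    else printedByB = a;            // second step: read a
                }
            }
            results.Add(printedByA + "," + printedByB);
        }
        return results;
    }

    private static int CountBits(int x)
    {
        int n = 0;
        for (; x != 0; x >>= 1) n += x & 1;
        return n;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Outcomes())); // prints 0,1 1,0 1,1
    }
}
```

The pair 0,0 never appears, because whichever read runs first in time is always preceded by the other variable's write or followed by a read that sees its own thread's write.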

Take the second example above, where b has been read as 0. By the time b was read on thread A, a would already have been written as 1 and made visible to other threads, because of the memory barrier on thread A. However, a could not have been read or cached yet on thread B, because the memory barrier on thread B has not been reached yet, given that b is still 0.
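
For an empirical check, a small stress harness along these lines can be run. It is a sketch under assumptions: the names StoreLoadCheck, CountBothZero, seenByA and seenByB are hypothetical, and the values are stored in fields instead of being printed so that they can be counted. A run that reports zero does not prove the guarantee, and because the two threads rarely start at the same instant, removing the barriers may not make the count rise either; it only fails to contradict the reasoning above.

```csharp
using System;
using System.Threading;

// Hypothetical stress harness; not from the original answer.
public static class StoreLoadCheck
{
    private static int a, b;
    private static int seenByA, seenByB;

    private static void A() // runs in thread A
    {
        a = 1;
        Thread.MemoryBarrier();
        seenByA = b;         // stands in for Console.WriteLine(b)
    }

    private static void B() // runs in thread B
    {
        b = 1;
        Thread.MemoryBarrier();
        seenByB = a;         // stands in for Console.WriteLine(a)
    }

    // Runs A and B on fresh threads the given number of times and
    // returns how many runs ended with both threads reading 0.
    public static int CountBothZero(int iterations)
    {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++)
        {
            a = 0;
            b = 0;
            var threadA = new Thread(A);
            var threadB = new Thread(B);
            threadA.Start();
            threadB.Start();
            threadA.Join();
            threadB.Join();
            if (seenByA == 0 && seenByB == 0) bothZero++;
        }
        return bothZero;
    }

    public static void Main()
    {
        // With the barriers in place this is expected to print 0.
        Console.WriteLine(CountBothZero(2000));
    }
}
```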