Memory ordering issues

2019-04-21 10:13发布

问题:

I'm experimenting with C++0x support and there is a problem, that I guess shouldn't be there. Either I don't understand the subject or gcc has a bug.

I have the following code, initially x and y are equal. Thread 1 always increments x first and then increments y. Both are atomic integer values, so there is no problem with the increment at all. Thread 2 is checking whether the x is less than y and displays an error message if so.

This code fails sometimes, but why? The issue here is probably memory reordering, but all atomic operations are sequentially consistent by default and I didn't explicitly relax of those any operations. I'm compiling this code on x86, which as far as I know shouldn't have any ordering issues. Can you please explain what the problem is?

#include <iostream>
#include <atomic>
#include <thread>

std::atomic_int x;
std::atomic_int y;

void f1()
{
    while (true)
    {
        ++x;
        ++y;
    }
}

void f2()
{
    while (true)
    {
        if (x < y)
        {
            std::cout << "error" << std::endl;
        }
    }
}

int main()
{
    x = 0;
    y = 0;

    std::thread t1(f1);
    std::thread t2(f2);

    t1.join();
    t2.join();
}

The result can be viewed here.

回答1:

The problem could be in your test:

if (x < y)

the thread could evaluate x and not get around to evaluating y until much later.



回答2:

There is a problem with the comparison:

x < y

The order of evaluation of subexpressions (in this case, of x and y) is unspecified, so y may be evaluated before x or x may be evaluated before y.

If x is read first, you have a problem:

x = 0; y = 0;
t2 reads x (value = 0);
t1 increments x; x = 1;
t1 increments y; y = 1;
t2 reads y (value = 1);
t2 compares x < y as 0 < 1; test succeeds!

If you explicitly ensure that y is read first, you can avoid the problem:

int yval = y;
int xval = x;
if (xval < yval) { /* ... */ }


回答3:

Every now and then, x will wrap around to 0 just before y wraps around to zero. At this point y will legitimately be greater than x.



回答4:

First, I agree with "Michael Burr" and "James McNellis". Your test is not fair, and there's a legitime possibility to fail. However even if you rewrite the test the way "James McNellis" suggests the test may fail.

First reason for this is that you don't use volatile semantics, hence the compiler may do optimizations to your code (that are supposed to be ok in a single-threaded case).

But even with volatile your code is not guaranteed to work.

I think you don't fully understand the concept of memory reordering. Actually memory read/write reorder can occur at two levels:

  1. Compiler may exchange the order of the generated read/write instructions.
  2. CPU may execute memory read/write instructions in arbitrary order.

Using volatile prevents the (1). However you've done nothing to prevent (2) - memory access reordering by the hardware.

To prevent this you should put special memory fence instructions in the code (that are designated for CPU, unlike volatile which is for compiler only).

In x86/x64 there're many different memory fence instructions. Also every instruction with lock semantics by default issues full memory fence.

More information here:

http://en.wikipedia.org/wiki/Memory_barrier