One reader. One writer. Some general questions abo

2019-04-07 09:11发布

问题:

I have a parent and a worker thread that share a bool flag and a std::vector. The parent only reads (i.e., reads the bool or calls my_vector.empty()); the worker only writes.

My questions:

  • Do I need to mutex protect the bool flag?

  • Can I say that all bool read/writes are inherently atomic operations? If you say Yes or No, where did you get your information from?

  • I recently heard about GCC Atomic-builtin. Can I use these to make my flag read/writes atomic without having to use mutexes? What is the difference? I understand Atomic builtins boil down to machine code, but even mutexes boil down to CPU's memory barrier instructions right? Why do people call mutexes an "OS-level" construct?

  • Do I need to mutex protect my std::vector? Recall that the worker thread populates this vector, whereas the parent only calls empty() on it (i.e., only reads it)

  • I do not believe mutex protection is necessary for either the bool or the vector. I rationalize as follows, "Ok, if I read the shared memory just before it was updated.. thats still fine, I will get the updated value the next time around. More importantly, I do not see why the writer should be blocked while the reading is reading, because afterall, the reader is only reading!"

If someone can point me in the right direction, that would be just great. I am on GCC 4.3, and Intel x86 32-bit. Thanks a lot!

回答1:

Do I need to mutex protect the bool flag?

Not necessarily, an atomic instruction would do. By atomic instruction I mean a compiler intrinsic function that a) prevents compiler reordering/optimization and b) results in atomic read/write and c) issues an appropriate memory fence to ensure visibility between CPUs (not necessary for current x86 CPUs which employ MESI cache coherency protocol). Similar to gcc atomic builtins.

Can I say that all bool read/writes are inherently atomic operations? If you say Yes or No, where did you get your information from?

Depends on the CPU. For Intel CPUs - yes. See Intel® 64 and IA-32 Architectures Software Developer's Manuals.

I recently heard about GCC Atomic-builtin. Can I use these to make my flag read/writes atomic without having to use mutexes? What is the difference? I understand Atomic builtins boil down to machine code, but even mutexes boil down to CPU's memory barrier instructions right? Why do people call mutexes an "OS-level" construct?

The difference between atomics and mutexes is that the latter can put the waiting thread to sleep until the mutex is released. With atomics you can only busy-spin.

Do I need to mutex protect my std::vector? Recall that the worker thread populates this vector, whereas the parent only calls empty() on it (i.e., only reads it)

You do.

I do not believe mutex protection is necessary for either the bool or the vector. I rationalize as follows, "Ok, if I read the shared memory just before it was updated.. thats still fine, I will get the updated value the next time around. More importantly, I do not see why the writer should be blocked while the reading is reading, because afterall, the reader is only reading!"

Depending on the implementation, vector.empty() may involve reading two buffer begin/end pointers and subtracting or comparing them, hence there is a chance that you read a new version of one pointer and an old version of another one without a mutex. Surprising behaviour may ensue.



回答2:

From the C++11 standards point of view, you have to protect the bool with a mutex, or alternatively use std::atomic<bool>. Even when you are sure that your bool is read and written to atomically anyways, there is still the chance that the compiler can optimize away accesses to it because it does not know about other threads that could potentially access it.

If for some reason you absolutely need the latest bit of performance of your platform, consider reading the "Intel 64 and IA-32 Architectures Software Developer's Manual", which will tell you how things work under the hood on your architecture. But of course, this will make your program unportable.



回答3:

Answers:

  1. You will need to protect the bool (or any other variable for that matter) that has the possibility of being operated on by two or more threads at the same time. You can either do this with a mutex or by operating on the bool atomically.
  2. Bool reads and bool writes may be atomic operations, but two sequential operations are certainly not (e.g., a read and then a write). More on this later.
  3. Atomic builtins provide a solution to the problem above: the ability to read and write a variable in a step that cannot be interrupted by another thread. This makes the operation atomic.
  4. If you are using the bool flag as your 'mutex' (that is, only the thread that sets the bool flag to true has permission to modify the vector) then you're OK. The mutual exclusion is managed by the boolean, and as long as you're modifying the bool using atomic operations you should be all set.
  5. To answer this, let me use an example:

 

bool              flag(false);
std::vector<char> my_vector;

while (true)
{
    if (flag == false) // check to see if the mutex is owned
    {
        flag = true; // obtain ownership of the flag (the mutex)

        // manipulate the vector

        flag = false; // release ownership of the flag
    }
}

In the above code in a multithreaded environment it is possible for the thread to be preempted between the if statement (the read) and the assignment (the write), which means it possible for two (or more) threads with this kind of code to both "own" the mutex (and the rights to the vector) at the same time. This is why atomic operations are crucial: they ensure that in the above scenario the flag will only be set by one thread at a time, therefore ensuring the vector will only be manipulated by one thread at a time.

Note that setting the flag back to false need not be an atomic operation because you this instance is the only one with rights to modify it.

A rough (read: untested) solution may look something like:

bool              flag(false);
std::vector<char> my_vector;

while (true)
{
    // check to see if the mutex is owned and obtain ownership if possible
    if (__sync_bool_compare_and_swap(&flag, false, true)) 
    {
        // manipulate the vector

        flag = false; // release ownership of the flag
    }
}

The documentation for the atomic builtin reads:

The “bool” version returns true if the comparison is successful and newval was written.

Which means the operation will check to see if flag is false and if it is set the value to true. If the value was false true is returned, otherwise false. All of this happens in an atomic step, so it is guaranteed not to be preempted by another thread.



回答4:

I don't have the expertise to answer your entire question but your last bullet is incorrect in cases in which reads are non-atomic by default.

A context switch can happen anywhere, the reader can get context switched partway through a read, the writer can get switched in and do the full write, and then the reader would finish their read. The reader would see neither the first value, nor the second value, but potentially some wildly inaccurate intermediate value.