Suppose I have an application that may or may not have spawned multiple threads. Is it worth protecting operations that need synchronization with a conditionally taken std::mutex as shown below, or is the lock so cheap that it does not matter when running single-threaded?
#include <atomic>
#include <mutex>

std::atomic<bool> more_than_one_thread_active{false};

void operation_requiring_synchronization() {
    //...
}

void call_operation_requiring_synchronization() {
    if (more_than_one_thread_active) {
        static std::mutex mutex;
        std::lock_guard<std::mutex> lock(mutex);
        operation_requiring_synchronization();
    } else {
        operation_requiring_synchronization();
    }
}
Edit
Thanks to all who have answered and commented, very interesting discussion.
A couple of clarifications:
The application processes chunks of input, and for each chunk decides if it will be processed in a single-threaded or parallel or otherwise concurrent fashion. It is not unlikely that no multi-threading will be needed.
The operation_requiring_synchronization() will typically consist of a few inserts into global standard containers.
Profiling is, of course, difficult when the application is platform-independent and should perform well under a variety of platforms and compilers (past, present and future).
Based on the discussion so far, I tend to think that the optimization is worth it.
I also think the std::atomic<bool> more_than_one_thread_active should probably be changed to a non-atomic bool multithreading_has_been_initialized. The original idea was to be able to turn the flag off again when all threads other than the main one are dormant, but I see how this could be error-prone.
Abstracting the explicit conditional away into a customized lock_guard is a good idea (and facilitates future changes of the design, including simply reverting back to std::lock_guard if the optimization is not deemed worth it).
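A minimal sketch of what such a customized lock guard could look like (the names conditional_lock_guard and multithreading_has_been_initialized are illustrative, not from any library; the flag is assumed to be set once before any worker threads start and never written concurrently):

```cpp
#include <mutex>

// Assumption: set exactly once, before any worker threads are spawned,
// and never flipped back while threads may be running.
bool multithreading_has_been_initialized = false;

// Hypothetical guard: takes the lock only when multithreading is active.
class conditional_lock_guard {
public:
    explicit conditional_lock_guard(std::mutex& m)
        : mutex_(m), locked_(multithreading_has_been_initialized) {
        if (locked_) mutex_.lock();
    }
    ~conditional_lock_guard() {
        if (locked_) mutex_.unlock();
    }
    conditional_lock_guard(const conditional_lock_guard&) = delete;
    conditional_lock_guard& operator=(const conditional_lock_guard&) = delete;
private:
    std::mutex& mutex_;
    bool locked_;
};

int shared_counter = 0; // stand-in for the global containers

void operation_requiring_synchronization() { ++shared_counter; }

void call_operation_requiring_synchronization() {
    static std::mutex mutex;
    conditional_lock_guard lock(mutex); // no-op in the single-threaded case
    operation_requiring_synchronization();
}
```

Reverting to the unconditional design later is then a one-line change: replace conditional_lock_guard with std::lock_guard<std::mutex>.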
I disagree with the widespread idea that locking a mutex is cheap. If you really are after performance, you wouldn't want to do this.
Mutexes (even uncontended) hit you with three hammers: they penalize compiler optimizations (mutexes are optimization barriers), they incur memory fences (on un-pessimized platforms), and they may involve kernel calls. So if you are after nanosecond-level performance in tight loops, it is something worth considering.
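Whether the uncontended lock is cheap enough is easy to check empirically. A rough micro-benchmark sketch (illustrative only; time_plain_vs_locked is a made-up name, and absolute numbers vary wildly by platform, compiler, and standard-library mutex implementation, so treat the ratio rather than the raw values as the signal):

```cpp
#include <chrono>
#include <mutex>
#include <utility>

// Time a tight increment loop, once plain and once taking an
// uncontended std::mutex on every iteration. Returns nanoseconds
// for (plain loop, locked loop).
std::pair<long long, long long> time_plain_vs_locked(int iterations) {
    volatile long long counter = 0; // volatile to keep the loops from being optimized away
    std::mutex m;
    using clock = std::chrono::steady_clock;

    auto t0 = clock::now();
    for (int i = 0; i < iterations; ++i)
        counter = counter + 1;
    auto t1 = clock::now();

    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> lock(m);
        counter = counter + 1;
    }
    auto t2 = clock::now();

    auto ns = [](clock::duration d) {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
    };
    return {ns(t1 - t0), ns(t2 - t1)};
}
```

Note that a micro-benchmark like this still cannot show the lost compiler optimizations around the lock in real code, only the direct cost of the lock/unlock pair.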
Branching is not great, either - for multiple reasons. The real solution is to avoid operations requiring synchronization in a multithreaded environment. As simple as that.
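For the question's case of "a few inserts into global standard containers", one common way to avoid per-insert synchronization is to give each worker thread its own local container and merge into the global one once per chunk. A sketch under those assumptions (process_chunk and the doubling "work" are hypothetical stand-ins, not the question's actual code):

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::vector<int> global_results;   // the shared container
std::mutex global_results_mutex;

void process_chunk(const std::vector<int>& chunk, int num_threads) {
    std::vector<std::vector<int>> local(num_threads);
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            // Each thread writes only to its own vector: no locking here.
            for (std::size_t i = t; i < chunk.size(); i += num_threads)
                local[t].push_back(chunk[i] * 2); // stand-in for real work
        });
    }
    for (auto& w : workers)
        w.join();

    // One lock per chunk instead of one lock per insert.
    std::lock_guard<std::mutex> lock(global_results_mutex);
    for (auto& v : local)
        global_results.insert(global_results.end(), v.begin(), v.end());
}
```

This turns the question's conditional-locking dilemma into a non-issue: the single-threaded path never touches a mutex, and the multithreaded path pays for one lock per chunk rather than one per operation.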