void undefined_behaviour_with_double_checked_locking()
{
    if(!resource_ptr)  // #1
    {
        std::lock_guard<std::mutex> lk(resource_mutex);  // #2
        if(!resource_ptr)  // #3
        {
            resource_ptr.reset(new some_resource);  // #4
        }
    }
    resource_ptr->do_something();  // #5
}
if a thread sees the pointer written by another thread, it might not
see the newly-created instance of some_resource, resulting in the call
to do_something() operating on incorrect values. This is an example of
the type of race condition defined as a data race by the C++ Standard,
and thus specified as undefined behaviour.
Question> I have seen the above explanation of why double-checked locking in this code causes a race condition, but I still have difficulty understanding what the problem is. A concrete, step-by-step walkthrough with two threads would help me understand the race in the code above.
One of the solutions mentioned by the book is as follows:
std::shared_ptr<some_resource> resource_ptr;
std::once_flag resource_flag;
void init_resource()
{
    resource_ptr.reset(new some_resource);
}
void foo()
{
    std::call_once(resource_flag,init_resource);  // #1
    resource_ptr->do_something();
}
#1 This initialization is called exactly once
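To see the "exactly once" guarantee in action, here is a minimal sketch that calls foo() from several threads. The body of some_resource, the init_calls counter, and run_call_once_demo are assumptions added for illustration; they are not from the book.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the book's some_resource.
struct some_resource {
    int value = 42;
    int do_something() const { return value; }
};

std::shared_ptr<some_resource> resource_ptr;
std::once_flag resource_flag;
std::atomic<int> init_calls{0};  // illustration only: counts initializations

void init_resource() {
    ++init_calls;
    resource_ptr.reset(new some_resource);
}

int foo() {
    // One caller runs init_resource; any concurrent callers wait until it
    // has returned, so resource_ptr is fully set up before anyone uses it.
    std::call_once(resource_flag, init_resource);
    return resource_ptr->do_something();
}

// Calls foo() from thread_count threads and returns how many times
// init_resource was actually invoked.
int run_call_once_demo(int thread_count) {
    std::vector<int> results(thread_count, 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&results, i] { results[i] = foo(); });
    }
    for (auto& t : threads) t.join();
    for (int r : results) assert(r == 42);  // every thread saw a complete object
    return init_calls.load();
}
```

Calling run_call_once_demo(8) from main should report a single initialization, however the threads happen to be scheduled.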
Any comment is welcome
-Thank you
The simplest problem scenario is the case where the initialization of `some_resource` doesn't depend on `resource_ptr`. In that case, the compiler is free to assign a value to `resource_ptr` before it fully constructs `some_resource`.
For example, if you think of the operation of `new some_resource` as consisting of two steps:
- allocate the memory for `some_resource`
- initialize `some_resource` (for this discussion, I'm going to make the simplifying assumption that this initialization can't throw an exception)
Then you can see that the compiler could implement the mutex-protected section of code as:
1. allocate memory for `some_resource`
2. store the pointer to the allocated memory in `resource_ptr`
3. initialize `some_resource`
Now it becomes clear that if another thread executes the function between steps 2 and 3, then `resource_ptr->do_something()` could be called while `some_resource` has not been initialized.
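The interleaving can be replayed deterministically on a single thread, which may make it easier to follow than a real race. The sketch below is only a simulation under stated assumptions: some_resource is a hypothetical one-field struct, the 0xFF fill stands in for whatever garbage freshly allocated memory holds, and the "thread B" step is simply code placed between steps 2 and 3.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical stand-in for the book's some_resource.
struct some_resource {
    int value;
    some_resource() : value(42) {}
};

// Reads the bytes where `value` will live without assuming an object has
// been constructed yet. `value` is the first member of a standard-layout
// struct, so it sits at offset 0.
int peek_value(const void* storage) {
    int v;
    std::memcpy(&v, storage, sizeof v);
    return v;
}

struct replay_result {
    int seen_by_thread_b;  // what "thread B" reads between steps 2 and 3
    int seen_after_init;   // what a reader sees once step 3 has run
};

replay_result replay_reordered_init() {
    // Step 1 (thread A): allocate memory. The fill makes the
    // "uninitialized" contents visible and repeatable.
    void* raw = ::operator new(sizeof(some_resource));
    std::memset(raw, 0xFF, sizeof(some_resource));

    // Step 2 (thread A): the pointer is published before construction.
    void* published = raw;

    // Thread B runs here: the unlocked check finds a non-null pointer,
    // so it skips the mutex and uses the not-yet-constructed object.
    int seen_by_b = (published != nullptr) ? peek_value(published) : 0;

    // Step 3 (thread A): the constructor finally runs, too late for B.
    some_resource* obj = new (raw) some_resource;
    int seen_after = obj->value;
    assert(seen_after == 42);

    obj->~some_resource();
    ::operator delete(raw);
    return {seen_by_b, seen_after};
}
```

In the real program nothing forces thread B to land exactly in that window, which is why the bug tends to show up only occasionally.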
Note that it's also possible on some processor architectures for this kind of reordering to occur in hardware unless the proper memory barriers are in place (and such barriers would be implemented by the mutex).
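For comparison, here is a sketch of how double-checked locking is commonly made correct in C++11 by putting the barriers on the pointer itself. It swaps the book's std::shared_ptr for a std::atomic raw pointer, which is an assumption of this example, as are the body of some_resource, the constructions counter, and run_dclp_demo.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the book's some_resource.
struct some_resource {
    int value = 42;
    int do_something() const { return value; }
};

std::atomic<some_resource*> resource_ptr{nullptr};
std::mutex resource_mutex;
std::atomic<int> constructions{0};  // illustration only

some_resource* get_resource() {
    // Acquire load: if a non-null pointer is seen, everything written
    // before the matching release store (the constructor) is visible too.
    some_resource* p = resource_ptr.load(std::memory_order_acquire);
    if (!p) {
        std::lock_guard<std::mutex> lk(resource_mutex);
        // The mutex already orders this second check against other initializers.
        p = resource_ptr.load(std::memory_order_relaxed);
        if (!p) {
            p = new some_resource;  // fully constructed first...
            ++constructions;
            // ...then published; the release store keeps it in that order.
            resource_ptr.store(p, std::memory_order_release);
        }
    }
    return p;  // intentionally never deleted in this sketch
}

// Calls get_resource() from thread_count threads and returns the number
// of objects that were constructed.
int run_dclp_demo(int thread_count) {
    std::vector<some_resource*> seen(thread_count, nullptr);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&seen, i] { seen[i] = get_resource(); });
    }
    for (auto& t : threads) t.join();
    for (some_resource* p : seen) assert(p == seen[0]);  // one shared instance
    return constructions.load();
}
```

The unlocked read is no longer a data race because it is an atomic operation, and the acquire/release pair rules out the "pointer visible before object" ordering described above.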
In this case (depending on the implementation of `.reset` and `!`) there may be a problem when Thread 1 gets part-way through initializing `resource_ptr` and then gets paused/switched. Thread 2 then comes along, performs the first check, sees that the pointer is not null, and skips the lock/fully-initialized check. It then uses the partially-initialized object (probably resulting in bad things happening). Thread 1 then comes back and finishes initializing, but it's too late.
A partially-initialized `resource_ptr` is possible because the CPU is allowed to reorder instructions (as long as it doesn't change single-thread behaviour). So, while the code looks like it should fully initialize the object and then assign it to `resource_ptr`, the optimized assembly code might be doing something quite different, and the CPU is also not guaranteed to run the assembly instructions in the order they are specified in the binary!
The takeaway is that when multiple threads are involved, proper synchronization (memory fences, which locks provide) is the only way to guarantee that things happen in the right order.