I read the following article by Anthony Williams, and as I understand it, in std::experimental::atomic_shared_ptr the actual pointer to the shared object is atomic too, in addition to the atomic shared count that std::shared_ptr already has. Is that right?
But when I read about the reference-counted version of lock_free_stack described in Anthony's book about C++ concurrency, it seems to me that the same also applies to std::shared_ptr, because functions like std::atomic_load and std::atomic_compare_exchange_weak are applied to instances of std::shared_ptr.
#include <memory>

template <class T>
class lock_free_stack
{
public:
    void push(const T& data)
    {
        const std::shared_ptr<node> new_node = std::make_shared<node>(data);
        new_node->next = std::atomic_load(&head_);
        // On failure, new_node->next is updated to the current head, so just retry.
        while (!std::atomic_compare_exchange_weak(&head_, &new_node->next, new_node));
    }

    std::shared_ptr<T> pop()
    {
        std::shared_ptr<node> old_head = std::atomic_load(&head_);
        // On failure, old_head is updated to the current head.
        while (old_head &&
               !std::atomic_compare_exchange_weak(&head_, &old_head, old_head->next));
        return old_head ? old_head->data : std::shared_ptr<T>();
    }

private:
    struct node
    {
        std::shared_ptr<T> data;
        std::shared_ptr<node> next;
        node(const T& data_) : data(std::make_shared<T>(data_)) {}
    };

    std::shared_ptr<node> head_;
};
What is the exact difference between these two types of smart pointers, and if the pointer in a std::shared_ptr instance is not atomic, why is the above lock-free stack implementation possible?
The atomic "thing" in shared_ptr is not the shared pointer itself, but the control block it points to. This means that as long as you don't mutate the same shared_ptr instance from multiple threads, you are OK. Note that copying a shared_ptr only mutates the control block, not the shared_ptr being copied.
std::shared_ptr<int> ptr = std::make_shared<int>(4);
for (auto i = 0; i < 10; i++) {
    std::thread([ptr] { auto copy = ptr; }).detach(); // OK, only mutates the control block
}
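A joined variant of the snippet above can make the point checkable. This is only a sketch: the helper name sum_via_copies and the seen vector are made up for illustration. Each thread owns a private shared_ptr (captured by value), copies it, and reads the shared int through the copy, so the only state touched concurrently is the reference count in the control block.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: every thread copies its own shared_ptr instance and
// reads the shared int through that copy. Returns the sum of the values read.
int sum_via_copies(int thread_count) {
    assert(thread_count >= 0);
    std::shared_ptr<int> ptr = std::make_shared<int>(4);
    std::vector<int> seen(static_cast<std::size_t>(thread_count), 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        // [ptr] gives the thread a private shared_ptr instance; only the
        // control block's reference count is modified concurrently.
        threads.emplace_back([ptr, &seen, i] {
            auto copy = ptr;
            seen[static_cast<std::size_t>(i)] = *copy; // each thread writes its own slot
        });
    }
    for (auto& t : threads) t.join();
    int sum = 0;
    for (int v : seen) sum += v;
    return sum;
}
```

Calling sum_via_copies(10) runs ten such threads; no thread ever assigns to a shared_ptr instance that another thread can see.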
Mutating the shared pointer itself, such as assigning it different values from multiple threads, is a data race, for example:
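std::shared_ptr<int> ptr = std::make_shared<int>(4);
std::thread threadA([&ptr] {
    ptr = std::make_shared<int>(10); // unsynchronized write to ptr
});
std::thread threadB([&ptr] {
    ptr = std::make_shared<int>(20); // unsynchronized write to the same ptr: data race
});
threadA.join();
threadB.join();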
Here we are mutating the control block (which is OK), but also the shared pointer itself, by making it point to different values from multiple threads. This is not OK.
One solution to that problem is to guard the shared_ptr with a lock, but this does not scale well under contention and, in a sense, loses the automatic feel of the standard shared pointer.
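For illustration, the lock-based approach might look roughly like the following. The locked_shared_ptr class and the store_from_two_threads function are hypothetical names invented for this sketch, not standard library types. Every read and write of the pointer goes through a single mutex, which removes the race but serializes all access.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical wrapper: all access to the shared_ptr goes through one mutex.
template <class T>
class locked_shared_ptr {
public:
    explicit locked_shared_ptr(std::shared_ptr<T> p) : ptr_(std::move(p)) {
        assert(ptr_ && "this sketch assumes a non-null initial value");
    }

    // Returns a copy of the pointer, taken while holding the lock.
    std::shared_ptr<T> load() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return ptr_;
    }

    // Replaces the pointer while holding the lock.
    void store(std::shared_ptr<T> p) {
        std::lock_guard<std::mutex> lock(mutex_);
        ptr_ = std::move(p);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<T> ptr_;
};

// The racy example from above, made safe by the lock.
int store_from_two_threads() {
    locked_shared_ptr<int> ptr(std::make_shared<int>(4));
    std::thread a([&ptr] { ptr.store(std::make_shared<int>(10)); });
    std::thread b([&ptr] { ptr.store(std::make_shared<int>(20)); });
    a.join();
    b.join();
    return *ptr.load(); // 10 or 20, depending on which store ran last
}
```

Which of the two values wins is still unspecified; the lock only guarantees that the program has no data race.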
Another solution is to use the standard functions you quoted, such as std::atomic_compare_exchange_weak. This makes synchronizing shared pointers a manual job, which we don't like.
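As a rough sketch of that manual style, the following uses only the free functions from <memory> to update one shared_ptr from several threads. The names atomic_increment and run_increments are hypothetical helpers made up for this example. Note that every access to the shared variable has to go through an atomic_* call by hand, and that these overloads were deprecated in C++20 in favour of std::atomic<std::shared_ptr<T>>.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper: atomically replace target with a new object whose
// value is one greater than the current one.
void atomic_increment(std::shared_ptr<int>& target) {
    std::shared_ptr<int> expected = std::atomic_load(&target);
    assert(expected && "this sketch assumes target is never null");
    std::shared_ptr<int> desired;
    do {
        // On CAS failure, expected has been updated to the current value,
        // so desired is recomputed from it.
        desired = std::make_shared<int>(*expected + 1);
    } while (!std::atomic_compare_exchange_weak(&target, &expected, desired));
}

// Runs thread_count threads, each performing per_thread increments.
int run_increments(int thread_count, int per_thread) {
    std::shared_ptr<int> value = std::make_shared<int>(4);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&value, per_thread] {
            for (int j = 0; j < per_thread; ++j) atomic_increment(value);
        });
    }
    for (auto& t : threads) t.join();
    return *std::atomic_load(&value);
}
```

Replacing any of the std::atomic_load(&value) calls with a plain read of value while the threads are still running would compile without complaint and reintroduce the race.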
This is where the atomic shared pointer comes into play. You can mutate it from multiple threads without fearing a data race and without adding any locks of your own. The standalone functions become member functions, and using them is much more natural. This kind of pointer is extremely useful for lock-free data structures.
N4162, the proposal for atomic smart pointers, has a good explanation. Here's a quote of the relevant part:
Consistency. As far as I know, the [util.smartptr.shared.atomic] functions are the only atomic operations in the standard that are not available via an atomic type. And for all types besides shared_ptr, we teach programmers to use atomic types in C++, not atomic_* C-style functions. And that’s in part because of...
Correctness. Using the free functions makes code error-prone and racy by default. It is far superior to write atomic once on the variable declaration itself and know all accesses will be atomic, instead of having to remember to use the atomic_* operation on every use of the object, even apparently-plain reads. The latter style is error-prone; for example, “doing it wrong” means simply writing whitespace (e.g., head instead of atomic_load(&head)), so that in this style every use of the variable is “wrong by default.” If you forget to write the atomic_* call in even one place, your code will still successfully compile without any errors or warnings, it will “appear to work” including likely pass most testing, but will still contain a silent race with undefined behavior that usually surfaces as intermittent hard-to-reproduce failures, often/usually in the field, and I expect also in some cases exploitable vulnerabilities. These classes of errors are eliminated by simply declaring the variable atomic, because then it’s safe by default and to write the same set of bugs requires explicit non-whitespace code (sometimes explicit memory_order_* arguments, and usually reinterpret_casting).
Performance. atomic_shared_ptr<> as a distinct type has an important efficiency advantage over the functions in [util.smartptr.shared.atomic] — it can simply store an additional atomic_flag (or similar) for the internal spinlock as usual for atomic<bigstruct>. In contrast, the existing standalone functions are required to be usable on any arbitrary shared_ptr object, even though the vast majority of shared_ptrs will never be used atomically. This makes the free functions inherently less efficient; for example, the implementation could require every shared_ptr to carry the overhead of an internal spinlock variable (better concurrency, but significant overhead per shared_ptr), or else the library must maintain a lookaside data structure to store the extra information for shared_ptrs that are actually used atomically, or (worst and apparently common in practice) the library must use a global spinlock.
Calling std::atomic_load() or std::atomic_compare_exchange_weak() on a shared_ptr is functionally equivalent to calling atomic_shared_ptr::load() or atomic_shared_ptr::compare_exchange_weak(). There shouldn't be any performance difference between the two. Calling std::atomic_load() or std::atomic_compare_exchange_weak() on an atomic_shared_ptr would be syntactically redundant and might or might not incur a performance penalty.
atomic_shared_ptr is an API refinement. shared_ptr already supports atomic operations, but only when using the appropriate atomic non-member functions. This is error-prone, because the non-atomic operations remain available and are too easy for an unwary programmer to invoke by accident. atomic_shared_ptr is less error-prone because it doesn't expose any non-atomic operations.
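To make the "API refinement" idea concrete, here is a minimal sketch of a wrapper that exposes only atomic operations and forwards them to the existing free functions. The atomic_only_ptr class and atomic_only_ptr_demo function are hypothetical names for this illustration; this is not how std::experimental::atomic_shared_ptr is specified or implemented, only a picture of the narrower interface.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical illustration type: the wrapped shared_ptr is private, so the
// only way to touch it is through the atomic member functions below.
template <class T>
class atomic_only_ptr {
public:
    atomic_only_ptr() = default;
    explicit atomic_only_ptr(std::shared_ptr<T> p) : ptr_(std::move(p)) {}
    atomic_only_ptr(const atomic_only_ptr&) = delete;
    atomic_only_ptr& operator=(const atomic_only_ptr&) = delete;

    std::shared_ptr<T> load() const { return std::atomic_load(&ptr_); }

    void store(std::shared_ptr<T> desired) {
        std::atomic_store(&ptr_, std::move(desired));
    }

    // On failure, expected is updated to the currently stored pointer.
    bool compare_exchange_weak(std::shared_ptr<T>& expected,
                               std::shared_ptr<T> desired) {
        return std::atomic_compare_exchange_weak(&ptr_, &expected,
                                                 std::move(desired));
    }

private:
    std::shared_ptr<T> ptr_;
};

// Small single-threaded walk through the interface.
int atomic_only_ptr_demo() {
    atomic_only_ptr<int> p(std::make_shared<int>(1));
    std::shared_ptr<int> expected = p.load();
    assert(expected && *expected == 1);
    // The weak form may fail spuriously, so it is used in a loop.
    while (!p.compare_exchange_weak(expected, std::make_shared<int>(2))) {
    }
    return *p.load();
}
```

With this shape there is no way to write a plain, non-atomic read or assignment of the wrapped pointer by accident, which is the error class the paragraph above describes.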
shared_ptr and atomic_shared_ptr expose different APIs, but they don't necessarily need to be implemented differently; shared_ptr already supports all the operations exposed by atomic_shared_ptr. Having said that, the atomic operations of shared_ptr are not as efficient as they could be, because it must also support non-atomic operations. Therefore there are performance reasons why atomic_shared_ptr could be implemented differently. This is related to the single responsibility principle. "An entity with several disparate purposes... often offers crippled interfaces for any of its specific purposes because the partial overlap among various areas of functionality blurs the vision needed for crisply implementing each." (Sutter & Alexandrescu 2005, C++ Coding Standards)