I'm currently reinventing the wheel of a thread pool in C++. I've eliminated almost all locks from the code, except for multiple instances of the following construct:
std::atomic_size_t counter;
void produce() {
++counter;
}
void try_consume() {
if (counter != 0) {
--counter;
// ...
}
else {
// ...
}
}
So, I need a thread-safe lock-free version of this function:
bool take(std::atomic_size_t& value) {
if (value != 0) {
--value;
return true;
}
return false;
}
There is one solution that I know of: use boost::lockfree::queue<std::monostate>
, where pop()
does the job. Is there any better/faster solution?
The construct you're implementing is a counting lock, or counting semaphore. Use one from a library that has a
trylock
version instead of rolling your own, so you can get optimized OS-supported sleep / wakeup. Or do you always have useful work to do if thetrylock
(akatake
) fails?Note that you can avoid using any traditional locks while implementing your own, but "lock free" is a technical term with a different meaning than lockless. A consumer almost by definition can't be lock-free in the computer-science sense, because it can be stuck waiting forever if the producer thread(s) are blocked. Related: Lock-free Progress Guarantees
CAS is good. Just make sure that your function doesn't run
compare_exchange_weak
at all if it sees the counter already 0 with a pure load (typically withmemory_order_relaxed
). You don't want your CPU hammering on the location while other threads are trying to increment it so your thread will see a non-zero value.Another option is a signed counter, and change the compare to
>= 0
. Check for overshoot in the result offetch_add(-1)
, and if so correct it. (Threads that see the counter as temporarily negative just see it "locked"). But this is typically not better than a CAS retry loop; failed CAS is rare unless contention is very high. And extra atomic ops to correct overshoot will probably cost about as much (or more) than CAS retries.