__sync_val_compare_and_swap vs __sync_bool_compare_and_swap

Posted 2019-05-04 02:35

Question:

I've been thinking about the return values of these two functions. The __sync_bool_compare_and_swap function's return value seems to have obvious benefits, i.e. I can use it to tell whether the swap operation took place. However I can't see a good use of __sync_val_compare_and_swap's return value.

Firstly, let's have a function signature for reference (from the GCC docs, minus the var args):

type __sync_val_compare_and_swap (type *ptr, type oldval, type newval);

The problem I see is that the return value of __sync_val_compare_and_swap is the old value of *ptr. To be precise, it's the value that was seen by the implementation of this function once the appropriate memory barriers had been put in place. I state this explicitly because the value of *ptr could easily change between the call to __sync_val_compare_and_swap and the execution of the instructions that enforce the memory barrier.

Now, when the function returns what can I do with that return value? There's no point trying to compare it to *ptr because *ptr can now be changed on other threads. Likewise comparing newval and *ptr doesn't really help me either (unless I lock *ptr which probably undermines my use of atomics in the first place).

So all that's really left for me to do is ask whether the return value == oldval, which is effectively (see below for a caveat) asking whether the swap operation took place. So I could have just used __sync_bool_compare_and_swap.
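The comparison described above can be sketched as a small wrapper. This is only an illustration for int32_t; the name cas_succeeded is hypothetical and not part of GCC:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: derive the boolean result from the
 * value-returning builtin by comparing what it saw with oldval. */
static bool cas_succeeded(int32_t *ptr, int32_t oldval, int32_t newval)
{
    /* The builtin returns the value *ptr held when the atomic
     * compare was performed, whether or not the swap happened. */
    int32_t seen = __sync_val_compare_and_swap(ptr, oldval, newval);
    return seen == oldval;
}
```

Written this way, the call site gets exactly the information __sync_bool_compare_and_swap would have given, and the observed value is thrown away.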

The caveat I just mentioned is the only subtle difference I can see: comparing the return value with oldval doesn't strictly tell me whether the swap occurred; it just tells me that, at some point before the memory barrier was released, *ptr had the same value as oldval. I'm considering the possibility that oldval == newval (although I'd struggle to see a way of implementing the function efficiently so that it could check these values first and not swap if they were the same, so it's probably a moot point). However, I can't see a situation where knowing this difference would matter to me at the call site. In fact, I can't imagine a situation where I would set oldval and newval to be equal.

My question is thus:

Is there any use case in which using __sync_val_compare_and_swap and __sync_bool_compare_and_swap would not be equivalent, i.e. is there a situation where one provides more information than the other?

ASIDE

The reason I was thinking about this is that I found an implementation of __sync_val_compare_and_swap in terms of __sync_bool_compare_and_swap which has a race:

inline int32_t __sync_val_compare_and_swap(volatile int32_t* ptr, int32_t oldval, int32_t newval)
{
    int32_t ret = *ptr;
    (void)__sync_bool_compare_and_swap(ptr, oldval, newval);
    return ret;
}

The race is on the storing of *ptr in ret, as *ptr could change before __sync_bool_compare_and_swap is called. It made me realise that there doesn't seem to be a safe way (without extra barriers or locks) of implementing __sync_val_compare_and_swap in terms of __sync_bool_compare_and_swap. This got me thinking that the former must provide more "information" than the latter, but as per my question I don't see that it really does.

Answer 1:

The operation provided by __sync_val_compare_and_swap can always be implemented in terms of __sync_bool_compare_and_swap (and the other direction is obviously possible), so in terms of power the two are equivalent. However, implementing __sync_val_compare_and_swap in terms of __sync_bool_compare_and_swap is not very efficient. It looks something like:

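for (;;) {
    /* Success means *ptr held oldval at the moment of the swap. */
    bool success = __sync_bool_compare_and_swap(ptr, oldval, newval);
    if (success) return oldval;
    /* Failure: read the current value to report what was seen instead. */
    type tmp = *ptr;
    __sync_synchronize();
    /* A value equal to oldval cannot explain the failure, so retry. */
    if (tmp != oldval) return tmp;
}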

The extra work is needed because you could observe failure of __sync_bool_compare_and_swap but then read a new value from *ptr that happens to match oldval.
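For reference, the loop above can be wrapped into a complete function. This is a minimal sketch specialised to int32_t; the name val_cas_via_bool is hypothetical and chosen only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: emulates the value-returning CAS using only
 * the boolean builtin, a plain read, and a full barrier. */
static int32_t val_cas_via_bool(volatile int32_t *ptr,
                                int32_t oldval, int32_t newval)
{
    for (;;) {
        if (__sync_bool_compare_and_swap(ptr, oldval, newval))
            return oldval;      /* swap happened, so *ptr was oldval */

        int32_t tmp = *ptr;     /* volatile read of the current value */
        __sync_synchronize();   /* full memory barrier */

        if (tmp != oldval)
            return tmp;         /* a value that can explain the failure */
        /* tmp == oldval: the value changed back in between, so retry */
    }
}
```

Unlike the racy version in the question, this never returns oldval unless a swap was actually performed by this call.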

As for why you might prefer the __sync_val_compare_and_swap behavior, the value that caused failure may give you a starting point to retry the operation more efficiently or might indicate a meaningful cause of the failure for some operation that won't be "retried". As an example, see the code for pthread_spin_trylock in musl libc (for which I am the author):

http://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_spin_trylock.c?id=afbcac6826988d12d9a874359cab735049c17500

There, a_cas is equivalent to __sync_val_compare_and_swap. In some ways this is a stupid example, since it's just saving a branch or conditional move by using the old value, but there are other situations where multiple old values are possible and knowing the one that caused the operation to fail matters.
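Both uses of the returned value can be illustrated with a short sketch. Everything here is hypothetical and not taken from musl: atomic_max reuses the observed value as the next expected value instead of reloading *ptr, and try_start reports which of several possible states caused the failure.

```c
#include <stdint.h>

/* Hypothetical helper: raise *ptr to at least candidate and
 * return the value that was there before. */
static int32_t atomic_max(int32_t *ptr, int32_t candidate)
{
    /* Adding 0 is used here as an atomic load with the legacy builtins. */
    int32_t expected = __sync_fetch_and_add(ptr, 0);
    while (expected < candidate) {
        int32_t seen = __sync_val_compare_and_swap(ptr, expected, candidate);
        if (seen == expected)
            return seen;        /* swap succeeded */
        expected = seen;        /* failed: retry from the value we were shown */
    }
    return expected;            /* already large enough, nothing to do */
}

/* Hypothetical three-state object, where a failed CAS can have
 * more than one cause. */
enum { ST_IDLE = 0, ST_RUNNING = 1, ST_CLOSED = 2 };

/* Returns 0 if the IDLE -> RUNNING transition was made, otherwise
 * the state that prevented it (ST_RUNNING or ST_CLOSED). */
static int try_start(int *state)
{
    int seen = __sync_val_compare_and_swap(state, ST_IDLE, ST_RUNNING);
    return seen == ST_IDLE ? 0 : seen;
}
```

With the boolean variant, try_start would need a separate read after a failure to tell "busy" from "closed", and that read could observe a later state than the one that actually made the CAS fail.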