The situation I'll describe occurs on an iPad 4 (ARMv7s), using the POSIX threads library for mutex lock/unlock. I've seen similar behaviour on other ARMv7 devices (see below), so I suppose any solution will require a more general look at the behaviour of mutexes and memory fences on ARMv7.
Pseudo code for the scenario:
Thread 1 – Producing Data:
void ProduceFunction() {
    MutexLock();
    int TempProducerIndex = mSharedProducerIndex; // Take a copy of the int member variable for Producers Index
    mSharedArray[TempProducerIndex++] = NewData; // Copy new Data into array at Temp Index
    mSharedProducerIndex = TempProducerIndex; // Signal consumer data is ready by assigning new Producer Index to shared variable
    MutexUnlock();
}
Thread 2 – Consuming Data:
void ConsumingFunction () {
    while (mConsumerIndex != mSharedProducerIndex) {
        doWorkOnData (mSharedArray[mConsumerIndex++]);
    }
}
Previously (when the problem cropped up on iPad 2), I believed that mSharedProducerIndex = TempProducerIndex was not being performed atomically, and hence changed to use an AtomicCompareAndSwap to assign mSharedProducerIndex. This has worked up until this point, but it turns out I was wrong and the bug has come back. I guess the 'fix' just changed some timing.
I have now come to the conclusion that the actual problem is an out of order execution of the writes within the mutex lock, i.e. if either the compiler or the hardware decided to reorder:
mSharedArray[TempProducerIndex++] = NewData; // Copy new Data into array at Temp Index
mSharedProducerIndex = TempProducerIndex; // Signal consumer data is ready by assigning new Producer Index to shared variable
... to:
mSharedProducerIndex = TempProducerIndex; // Signal consumer data is ready by assigning new Producer Index to shared variable
mSharedArray[TempProducerIndex++] = NewData; // Copy new Data into array at Temp Index
... and the consumer then ran in between those two writes, the data would not yet have been written when the consumer tried to read it.
After some reading on memory barriers, I therefore thought I'd try moving the signal to the consumer to after the mutex_unlock, believing that the unlock would produce a memory barrier/fence which would ensure mSharedArray had been written to:
mSharedArray[TempProducerIndex++] = NewData; // Copy new Data into array at Temp Index
MutexUnlock();
mSharedProducerIndex = TempProducerIndex; // Signal consumer data is ready by assigning new Producer Index to shared variable
This, however, still fails, and leads me to question whether a mutex_unlock will definitely act as a write fence or not.
I've also read an article from HP which suggested that compilers could move code into (but not out of) crit_secs. So even after the above change, the write of mSharedProducerIndex could be moved before the barrier. Is there any mileage in this theory?
By adding an explicit fence the problem goes away:
mSharedArray[TempProducerIndex++] = NewData; // Copy new Data into array at Temp Index
OSMemoryBarrier();
mSharedProducerIndex = TempProducerIndex; // Signal consumer data is ready by assigning new Producer Index to shared variable
I therefore think I understand the problem, and that a fence is required, but any insight into the behaviour of the unlock and why it doesn’t appear to be performing a barrier would be really useful.
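For comparison, the ordering that the explicit fence enforces can also be written with C++11 atomics. This is only a minimal sketch, not the code from the project: it assumes a C++11 compiler, a single producer (or producers that still hold the mutex around the call), and no index wrap-around. The names kCapacity, Produce and ConsumeAll are hypothetical; the shared members are redeclared here as globals so the example is self-contained.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the members used in the pseudocode.
constexpr int kCapacity = 1024;            // assumed array size
int mSharedArray[kCapacity];
std::atomic<int> mSharedProducerIndex{0};  // written by producer, read by consumer
int mConsumerIndex = 0;                    // touched by the consumer only

// Producer: fill the slot first, then publish the new index.
// The release store keeps the array write from being reordered after it.
void Produce(int newData) {
    int temp = mSharedProducerIndex.load(std::memory_order_relaxed);
    assert(temp < kCapacity);              // this sketch does not wrap around
    mSharedArray[temp] = newData;
    mSharedProducerIndex.store(temp + 1, std::memory_order_release);
}

// Consumer: the acquire load pairs with the release store, so every slot
// below the index it observes is visible before the slot is read.
// Returns the sum of the consumed values as a stand-in for doWorkOnData.
int ConsumeAll() {
    int sum = 0;
    while (mConsumerIndex != mSharedProducerIndex.load(std::memory_order_acquire)) {
        sum += mSharedArray[mConsumerIndex++];
    }
    return sum;
}
```

The point of the sketch is that the ordering requirement is attached to the index itself, on both the writing and the reading side, rather than to the mutex, which the consumer never takes.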
EDIT:
Regarding the lack of a mutex in the consumer thread: I'm relying on the write of the int mSharedProducerIndex being a single instruction and therefore hoping the consumer would read either the new or the old value. Either is a valid state, and provided that mSharedArray is written in sequence (i.e. prior to writing mSharedProducerIndex) this would be OK, but from what has been said so far, I can't rely on this.
By the same logic it appears that the current barrier solution is also flawed, as the mSharedProducerIndex write could be moved inside the barrier and could therefore potentially be incorrectly re-ordered.
Is it recommended to add a mutex to the consumer, just to act as a read barrier, or is there a pragma or instruction for disabling out-of-order execution on the producer, like EIEIO on PPC?
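To make the "read barrier on the consumer" idea concrete, here is a hedged sketch using standalone C++11 fences instead of a mutex. It assumes C++11 is available and a single producer; gSlots, gProducerIndex, ProduceWithFence and ConsumeWithFence are hypothetical names standing in for the members above. Whether this is preferable to a consumer-side mutex in the real code is exactly the open question.

```cpp
#include <atomic>
#include <cassert>

constexpr int kSlots = 256;            // assumed capacity, no wrap-around
int gSlots[kSlots];                    // stands in for mSharedArray
std::atomic<int> gProducerIndex{0};    // stands in for mSharedProducerIndex

// Producer: the release fence sits between the data write and the index
// write, so the index cannot become visible before the data it announces.
void ProduceWithFence(int value) {
    int index = gProducerIndex.load(std::memory_order_relaxed);
    assert(index < kSlots);
    gSlots[index] = value;
    std::atomic_thread_fence(std::memory_order_release);
    gProducerIndex.store(index + 1, std::memory_order_relaxed);
}

// Consumer: the acquire fence after reading the index is the "read barrier";
// it keeps the slot reads from being performed before the index read.
// Returns the sum of the consumed values as a stand-in for doWorkOnData.
int ConsumeWithFence(int& consumerIndex) {
    int sum = 0;
    int published = gProducerIndex.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    while (consumerIndex != published) {
        sum += gSlots[consumerIndex++];
    }
    return sum;
}
```

Because the index is a std::atomic, the compiler may not tear or cache its accesses, and the two fences pair with each other, so neither side depends on what the mutex unlock happens to do.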