I have a setup where I need to lock, read some data, process, write some data, and then unlock. To this end, I made a locking texture as a layout(r32ui) coherent uniform uimage2D. The critical section's data is declared similarly.
Unfortunately, none of my attempts at a spinlock prevents race conditions, so I get incorrect results. I tried several different approaches.
I thought I'd collect all the information I could find on GLSL locking, along with my results (GTX 580M). I have added a Community Wiki answer with this exhaustive list. I would appreciate edits/comments about possible issues each presents, ultimately creating a list of valid approaches.
I have standardized the locking texture to be img0.
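For reference, here is a minimal sketch of the setup the lock variants below plug into. It assumes a fragment shader under GLSL 4.20 with one lock per pixel. The data image name img1, the derivation of coord from gl_FragCoord, and the increment used as the "process" step are placeholders of mine, not part of the original setup.

```glsl
#version 420 core

// Lock image: 0u means unlocked, 1u means locked.
layout(r32ui) coherent uniform uimage2D img0;
// Hypothetical data image protected by the lock (the name img1 is assumed).
layout(r32ui) coherent uniform uimage2D img1;

void main() {
    // Assumption: one lock per pixel, addressed by the fragment position.
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // <lock acquisition on img0 goes here>

    // Critical section: read, process, write.
    uint value = imageLoad(img1, coord).x;
    value += 1u; // placeholder for the real processing
    imageStore(img1, coord, uvec4(value, 0u, 0u, 0u));

    // <memoryBarrier() and lock release on img0 go here>
}
```

Without a working lock around it, the read-modify-write in the middle is exactly the race the variants below try to prevent.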
Lock Type 1:
This is the naive spinlock. The threads in a warp share a program counter, so if a single thread grabs the lock, the other threads in the warp are still stuck in the loop and the lock holder cannot advance past the loop to release it. In practice, this compiles but results in a deadlock.
Examples: StackOverflow, OpenGL.org
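    // Spin until the exchange returns 0u, i.e. this thread took the lock.
    while (imageAtomicExchange(img0, coord, 1u) == 1u);
    // <critical section>
    memoryBarrier();
    // Release the lock.
    imageAtomicExchange(img0, coord, 0u);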
Lock Type 2:
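To work around the problem with type 1, one instead enters the critical section conditionally, inside the loop body, so the thread that acquires the lock does not have to leave the loop first. Some of the variants below are written with a do-while loop, but a plain while loop doesn't work correctly either.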
Lock Type 2.1:
The first thing one tries is a simple loop. Apparently due to buggy optimizations, this can result in a crash (I haven't tried recently).
Example: NVIDIA
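    while (true) {
        // The exchange returns the previous value; anything but 1u means we took the lock.
        bool can_write = (imageAtomicExchange(img0, coord, 1u) != 1u);
        if (can_write) {
            // <critical section>
            memoryBarrier();
            // Release the lock, then leave the loop.
            imageAtomicExchange(img0, coord, 0u);
            break;
        }
    }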
Lock Type 2.2:
The above example uses imageAtomicExchange(...), which might not be the first thing one tries. The most intuitive choice is imageAtomicCompSwap(...). Unfortunately, this doesn't work due to buggy optimizations. It should otherwise be sound.
Example: StackOverflow
    bool have_written = false;
    do {
        // Swap 0u -> 1u; a returned 0u means the lock was free and is now ours.
        bool can_write = (imageAtomicCompSwap(img0, coord, 0u, 1u) == 0u);
        if (can_write) {
            // <critical section>
            memoryBarrier();
            // Release the lock.
            imageAtomicExchange(img0, coord, 0u);
            have_written = true;
        }
    } while (!have_written);
Lock Type 2.3:
Switching back from imageAtomicCompSwap(...) to imageAtomicExchange(...) is the other common variant. The difference from 2.1 is the way the loop is terminated. This doesn't work correctly for me.
Examples: StackOverflow, StackOverflow
    bool have_written = false;
    do {
        // The exchange returns the previous value; anything but 1u means we took the lock.
        bool can_write = (imageAtomicExchange(img0, coord, 1u) != 1u);
        if (can_write) {
            // <critical section>
            memoryBarrier();
            // Release the lock.
            imageAtomicExchange(img0, coord, 0u);
            have_written = true;
        }
    } while (!have_written);