Broken GLSL Spinlock/GLSL Locks Compendium

Posted 2019-02-23 17:03

Question:

I have a setup where I need to lock, read some data, process, write some data, and then unlock. To this end, I made a locking texture as a layout(r32ui) coherent uniform uimage2D. The critical section's data is declared similarly.

Unfortunately, all my attempts at a spinlock don't prevent race conditions, leading to incorrect results. I tried several different approaches.

I thought I'd collect all the information I could find on GLSL locking, along with my results (GTX 580M). I have added a Community Wiki answer with this exhaustive list. I would appreciate edits/comments about possible issues each presents, ultimately creating a list of valid approaches.

Answer 1:

I have standardized the locking texture to be img0.
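For reference, the snippets below assume declarations roughly like the following. This is only a minimal sketch based on the description in the question: the #version line, the binding points, the name img1 for the critical-section data, the fragment-shader stage, and deriving coord from gl_FragCoord are all assumptions for illustration, not details from the original setup.

```glsl
#version 420 core

// Lock texture, as described in the question: 0u = free, 1u = held.
// The binding point is an assumption.
layout(binding = 0, r32ui) coherent uniform uimage2D img0;

// Hypothetical image holding the data protected by the lock,
// "declared similarly" per the question. Name and binding are assumptions.
layout(binding = 1, r32ui) coherent uniform uimage2D img1;

out vec4 frag_color;

void main() {
    // Assumption: one lock per pixel, addressed by the fragment position.
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // Placeholder body: read the current lock state so the sketch compiles
    // on its own. The lock variants below would go here instead.
    uint lock_state = imageLoad(img0, coord).x;
    frag_color = vec4(float(lock_state), 0.0, 0.0, 1.0);
}
```

Both images are marked coherent so that writes made by one invocation can become visible to others after memoryBarrier().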

Lock Type 1:

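The threads of a warp share a program counter. If a single thread acquires the lock, the other threads in the warp keep spinning in the loop, and because the warp executes in lockstep, the thread holding the lock never gets past the loop to release it. In practice, this compiles but results in a deadlock.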

Examples: StackOverflow, OpenGL.org

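// Spin until the exchange returns 0u, i.e. the lock was free and is now ours.
while (imageAtomicExchange(img0,coord,1u)==1u);

//<critical section>
memoryBarrier();

// Release the lock.
imageAtomicExchange(img0,coord,0u);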

Lock Type 2:

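To work around the problem with type 1, one instead puts the critical section and the unlock inside a conditional within the loop, so that the thread that acquires the lock can release it before the warp leaves the loop. Below, I have sometimes written the loop as a do-while loop, but a plain while loop doesn't work correctly either.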

Lock Type 2.1:

The first thing one tries is a simple loop. Apparently due to buggy optimizations, this can result in a crash (I haven't tried recently).

Example: NVIDIA

while (true) {
    // The exchange returns the previous value; anything but 1u means the lock was free.
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);

    if (can_write) {
        //<critical section>
        memoryBarrier();

        // Release the lock and leave the loop.
        imageAtomicExchange(img0,coord,0u);
        break;
    }
}

Lock Type 2.2:

The above example uses imageAtomicExchange(...), which might not be the first thing one tries. The most intuitive choice is imageAtomicCompSwap(...). Unfortunately, this doesn't work, due to buggy optimizations. It should otherwise be sound.

Example: StackOverflow

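bool have_written = false;
do {
    // Set the lock to 1u only if it is currently 0u; a return value of 0u means we got it.
    bool can_write = (imageAtomicCompSwap(img0,coord,0u,1u)==0u);

    if (can_write) {
        //<critical section>
        memoryBarrier();

        // Release the lock, then flag completion so the loop ends.
        imageAtomicExchange(img0,coord,0u);
        have_written = true;
    }
} while (!have_written);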

Lock Type 2.3:

Switching back from imageAtomicCompSwap(...) to imageAtomicExchange(...) gives the other common variant. The difference from 2.1 is the way the loop is terminated. This doesn't work correctly for me.

Examples: StackOverflow, StackOverflow

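bool have_written = false;
do {
    // The exchange returns the previous value; anything but 1u means the lock was free.
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);

    if (can_write) {
        //<critical section>
        memoryBarrier();

        // Release the lock, then flag completion so the loop ends.
        imageAtomicExchange(img0,coord,0u);
        have_written = true;
    }
} while (!have_written);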